DeepSeek reveals efficient AI training method as China tries to beat chip curbs

Chinese artificial intelligence startup DeepSeek has released new research that sheds light on how Chinese AI developers are adapting to hardware constraints while continuing to push model performance forward.
The paper outlines a more efficient method for training advanced AI systems, highlighting how Chinese firms are working around limits imposed by restricted access to top-tier chips.
The publication comes as competition intensifies between Chinese AI companies and global leaders such as OpenAI.
With access to the most advanced semiconductors curtailed, Chinese startups are increasingly turning to architectural and software-level innovation.
DeepSeek's latest work offers a window into how those constraints are shaping the next generation of AI development.
A different approach to AI efficiency
At the centre of the research is a framework called Manifold-Constrained Hyper-Connections.
The technique is designed to improve how large AI models scale while reducing both computational load and energy consumption during training.
The research also addresses issues such as training instability, which often become more pronounced as models grow larger.
The technique allows language models to share more internal information in a controlled manner while maintaining stability and efficiency as they are scaled up.
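The article does not describe the paper's exact formulation, but the general idea behind hyper-connections, which the work builds on, can be sketched in a few lines. The following PyTorch-style snippet is a minimal illustration only: the class and parameter names are hypothetical, and the softmax normalisation is used here as a stand-in for whatever manifold constraint the paper actually applies.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HyperConnectionBlock(nn.Module):
    """Minimal sketch of a hyper-connection residual block (hypothetical).

    A standard transformer block updates one residual stream: x + f(x).
    Hyper-connections instead keep `n_streams` parallel hidden streams and
    learn how each block reads from, writes to, and mixes those streams.
    As a stand-in for a manifold constraint, the mixing weights here are
    softmax-normalised so each set of weights sums to one, which keeps the
    activation scale bounded as depth grows.
    """

    def __init__(self, dim: int, n_streams: int = 4):
        super().__init__()
        self.sublayer = nn.Linear(dim, dim)  # stand-in for an attention/FFN sublayer
        self.read = nn.Parameter(torch.zeros(n_streams))   # streams -> sublayer input
        self.write = nn.Parameter(torch.zeros(n_streams))  # sublayer output -> streams
        self.mix = nn.Parameter(torch.eye(n_streams))      # stream-to-stream mixing

    def forward(self, streams: torch.Tensor) -> torch.Tensor:
        # streams: (n_streams, batch, seq_len, dim)
        read_w = F.softmax(self.read, dim=0)
        write_w = F.softmax(self.write, dim=0)
        mix_w = F.softmax(self.mix, dim=-1)  # each row sums to 1

        x = torch.einsum("s,sbtd->btd", read_w, streams)       # combined read
        out = self.sublayer(x)                                  # apply the layer
        mixed = torch.einsum("sr,rbtd->sbtd", mix_w, streams)   # mix the streams
        return mixed + write_w.view(-1, 1, 1, 1) * out          # broadcast write-back


# Toy usage: expand one hidden state into 4 streams, apply a block, collapse back.
h = torch.randn(2, 16, 64)                    # (batch, seq_len, dim)
streams = h.unsqueeze(0).repeat(4, 1, 1, 1)   # (4, batch, seq_len, dim)
block = HyperConnectionBlock(dim=64, n_streams=4)
print(block(streams).mean(dim=0).shape)       # torch.Size([2, 16, 64])
```

The softmax projection is only one plausible choice of constraint: it confines the learned mixing weights to the probability simplex so no single stream's contribution can grow without bound, which is one way the kind of training instability mentioned above might be kept in check.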
Research as a signal of what comes next
DeepSeek's technical papers have historically served as early indicators of upcoming products.
About a year ago, the company drew attention across the industry with its R1 reasoning model, which was developed at a significantly lower cost than comparable systems built by Silicon Valley firms.
The company had released foundational training research ahead of R1's launch.
Since then, DeepSeek has released several smaller platforms, maintaining a steady pace of experimentation.
Anticipation is now building around its next flagship system, widely referred to as R2 and expected around the Spring Festival in February.
While the new paper does not explicitly reference this model, its timing and depth have fuelled expectations that it underpins future releases.
Innovation under external constraints
US export controls continue to prevent Chinese companies from accessing the most advanced semiconductors used to train and run cutting-edge AI.
These restrictions have become a defining factor in China's AI strategy, encouraging firms to explore unconventional model architectures and efficiency-driven designs.
DeepSeek's research fits squarely into this trend.
By focusing on scalability and infrastructure optimisation, the company is attempting to narrow the performance gap with global competitors without matching their hardware budgets.
The paper was published this week on the open research repository arXiv and the open-source platform Hugging Face.
It lists 19 authors, with founder Liang Wenfeng named last.
Liang has consistently guided DeepSeek's research agenda, encouraging teams to rethink how large-scale AI systems are built.
Tests described in the paper were conducted on models ranging from 3 billion to 27 billion parameters.
The work also builds on hyper-connection architecture research published by ByteDance in 2024.
