How small Chinese AI start-up DeepSeek shocked Silicon Valley
A small Chinese artificial intelligence lab stunned the world this week by disclosing the technical recipe for its top model, turning its leader into a national hero who has defied US attempts to stop China's high-tech ambitions.
DeepSeek, founded by hedge fund manager Liang Wenfeng, released its R1 model on Monday, explaining in a detailed paper how to build, on a shoestring budget, a large language model that can automatically learn and improve itself without human supervision.
US companies, including OpenAI and Google DeepMind, pioneered developments in reasoning models, a relatively new field of AI research that attempts to make models match human cognitive capabilities. In December, San Francisco-based OpenAI released the full version of its o1 model but kept its methods secret.
DeepSeek’s release of R1 sparked a furious debate in Silicon Valley over whether US-based AI companies, including Meta and Anthropic, can defend their technical edge.
Meanwhile, Liang has become a focal point of national pride at home. This week, he was the only AI leader chosen to attend a publicised meeting of entrepreneurs with the country's second most powerful leader, Li Qiang. The entrepreneurs were told to “concentrate efforts to break through key fundamental technologies.”
In 2021, Liang began buying thousands of Nvidia graphics processing units for his AI side project while running his quantitative trading fund, High-Flyer. Industry insiders viewed it as the eccentric behaviour of a billionaire looking for a new hobby.
“When we first met him, he was this very nerdy guy with a terrible hairstyle, talking about building a 10,000-chip cluster to train his own models. We didn’t take him seriously,” said one of Liang’s business partners.
“He couldn’t articulate his vision other than saying: I want to build this, and it will be a game changer. We thought this was only possible for giants like ByteDance and Alibaba,” the person added.
Liang’s status as an outsider in the AI field proved an unexpected source of strength. At High-Flyer, he built a fortune by using AI and algorithms to identify patterns that could affect stock prices. His team became adept at using Nvidia chips to make money trading stocks. In 2023, he launched DeepSeek, announcing his intention to develop human-level AI.
“Liang built an exceptional infrastructure team that really understands how the chips work,” said one founder at a rival LLM company. “He took his best people with him from the hedge fund to DeepSeek.”
After Washington banned Nvidia from exporting its most powerful chips to China, local AI companies were forced to find innovative ways to maximise the computing power of a limited number of onshore chips, a problem Liang’s team already knew how to solve.
“DeepSeek’s engineers know how to unlock the potential of these GPUs, even if they are not state of the art,” said one AI researcher close to the company.
Industry insiders say DeepSeek’s singular focus on research makes it a dangerous competitor because it is willing to share its breakthroughs rather than protect them for commercial gain. DeepSeek has not raised money from outside funds or made significant moves to monetise its models.
“DeepSeek is run like the early days of DeepMind,” said one AI investor in Beijing. “It’s purely focused on research and engineering.”
Liang, who is personally involved in DeepSeek’s research, uses proceeds from his hedge fund trading to pay top salaries for the best AI talent. Along with TikTok owner ByteDance, DeepSeek is known for offering the highest pay available to AI engineers in China, with staff based in offices in Hangzhou and Beijing.
“DeepSeek’s offices feel like a university campus for serious researchers,” said the business partner. “The team believes in Liang’s vision: to show the world that the Chinese can be creative and build something from zero.”
DeepSeek and High-Flyer did not respond to a request for comment.
Liang has styled DeepSeek as a uniquely “local” company, staffed with PhDs from top Chinese schools such as Peking, Tsinghua and Beihang universities rather than experts from US institutions.
In an interview with the domestic press last year, he said his core team “did not have people who came back from abroad. They are all local ... We have to develop the top talent ourselves.” DeepSeek’s identity as a purely Chinese LLM company has won it praise at home.
DeepSeek claimed it used just 2,048 Nvidia H800s and $5.6 million to train a model with 671 billion parameters, a fraction of what OpenAI and Google spent to train models of comparable size.
Ritwik Gupta, an AI policy researcher at the University of California, Berkeley, said DeepSeek’s recent model releases show that “there is no moat when it comes to AI capabilities.”
“The first person to train models has to expend a lot of resources to get there,” he said. “But the second mover can get there cheaper and more quickly.”
Gupta added that China has a much larger talent pool of systems engineers than the US, engineers who understand how to make the best use of computing resources to train and run models more cheaply.
Industry insiders say that although DeepSeek has shown impressive results with limited resources, it remains an open question whether it can stay competitive as the industry evolves.
Returns at High-Flyer, its big backer, lagged behind in 2024, which one person close to Liang blamed on the founder’s attention being mostly focused on DeepSeek.
Its US rivals are not standing still. They are building mega “clusters” of Nvidia’s next-generation Blackwell chips, creating computing power that threatens to open up a gap with Chinese rivals once again.
This week, OpenAI said it was creating a joint venture with Japan’s SoftBank, called Stargate, with plans to spend at least $100 billion on AI infrastructure in the US. Elon Musk’s xAI is massively expanding its Colossus supercomputer to contain more than 1 million GPUs to help train its Grok AI models.
“DeepSeek has one of the largest advanced computing clusters in China,” said Liang’s business partner. “They have enough capacity for now, but not for much longer.”
Additional reporting by Wenjie Ding in Beijing