Jack Ma-backed Ant Group touts AI breakthrough built on Chinese chips

Jack Ma-backed Ant Group Co. used Chinese-made semiconductors to develop techniques for training AI models that would cut costs by 20%, according to people familiar with the matter.
Ant used domestic chips, including those from affiliate Alibaba Group Holding Ltd. and Huawei Technologies Co., to train models using the so-called Mixture of Experts machine-learning approach, the people said. It got results similar to those from Nvidia Corp. chips like the H800, they said, asking not to be named because the information isn't public.
Hangzhou-based Ant still uses Nvidia for AI development, but it now relies mainly on alternatives, including chips from Advanced Micro Devices Inc. and Chinese suppliers, for its latest models, one of the people said.
The models mark Ant's entry into a race between Chinese and US companies that has accelerated since DeepSeek demonstrated how capable models can be trained for far less than the billions invested by OpenAI and Alphabet Inc.'s Google. It underscores how Chinese companies are trying to use local alternatives to the most advanced Nvidia semiconductors. While not the most advanced, the H800 is a relatively powerful processor and is currently barred from export to China.
The company published a research paper this month claiming that its models at times outperformed Meta Platforms Inc. on certain benchmarks, which Bloomberg News hasn't independently verified. But if they work as advertised, Ant's platforms could mark another step forward for Chinese AI development by slashing the cost of inferencing or supporting AI services.
As companies pour significant money into AI, MoE models have emerged as a popular option, gaining recognition through their use by Google and the Hangzhou startup DeepSeek, among others. The technique divides tasks into smaller sets of data, much like having a team of specialists who each focus on a segment of a job, making the process more efficient. Ant declined to comment in an emailed statement.
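To make that idea concrete, here is a minimal, generic sketch of how an MoE layer routes a token to a few "expert" sub-networks. It is purely illustrative, written in plain NumPy with made-up sizes, and is not Ant's or DeepSeek's implementation:

```python
# Generic Mixture-of-Experts sketch (illustrative only, not Ant's code).
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 4, 2

# Each "expert" is a small feed-forward block; the router scores which
# experts should handle a given token.
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route one token vector to its top-k experts and mix their outputs."""
    scores = x @ router                      # one score per expert
    top = np.argsort(scores)[-top_k:]        # pick the k highest-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                 # softmax over the chosen experts only
    # Only the selected experts run for this token, which is where the
    # compute savings come from:
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
print(moe_layer(token).shape)                # (8,) -- same shape as the input
```

Because each token activates only a fraction of the experts, total compute per token stays well below that of a dense model with the same overall parameter count.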
Still, MoE training typically relies on high-performing chips such as the graphics processing units Nvidia sells. The cost has so far been prohibitive for many small firms and has limited broader adoption. Ant has been working on ways to train LLMs more efficiently and eliminate that constraint. Its paper title makes this plain, as the company sets the goal of scaling a model "without premium GPUs."
That goes against the grain for Nvidia. CEO Jensen Huang has argued that computing demand will grow even with more efficient models like DeepSeek's R1, positing that companies will need better chips to generate more revenue, not cheaper ones to cut costs. He has stuck to a strategy of building big GPUs with more processing cores, transistors and increased memory capacity.
Ant said it cost about 6.35 million yuan ($880,000) to train the model on three trillion tokens using high-performance hardware, but its optimized approach would cut that to 5.1 million yuan using lower-specification hardware. Tokens are the units of information a model ingests in order to learn about the world and deliver useful answers to user queries.
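As a rough sanity check, those two figures line up with the roughly 20% saving reported earlier; the trivial calculation below uses only the numbers quoted in this article:

```python
# Back-of-the-envelope check using only the figures quoted in the article.
high_perf_cost_yuan = 6_350_000   # reported training cost on high-performance hardware
optimized_cost_yuan = 5_100_000   # reported cost of the optimized, lower-spec approach

savings = 1 - optimized_cost_yuan / high_perf_cost_yuan
print(f"Cost reduction: {savings:.1%}")   # -> 19.7%, consistent with the ~20% claim
```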
The company plans to use the recent breakthrough in the large language models it has developed, Ling-Plus and Ling-Lite, for industrial AI solutions including healthcare and finance, the people said.
Ant bought the Chinese internet platform Haodf.com this year to boost its artificial intelligence services in healthcare. Ant has built an AI doctor assistant to support Haodf's 290,000 doctors with tasks such as managing medical records, the company said in a separate statement on Monday.
The company also offers an AI "life assistant" app called Zhixiaobao and an AI financial advisory service called Maxiaocai.
On English-language understanding, Ant said in its paper that the Ling-Lite model did better on a key benchmark than one of Meta's Llama models. Both the Ling-Lite and Ling-Plus models outperformed DeepSeek's equivalents on Chinese-language benchmarks.
"If you find one point of attack to beat the world's best kung fu master, you can still say you beat them, which is why real-world application is important," said Robin Yu, chief executive officer of Beijing-based AI solutions firm Shengshang Tech Co.
Ant made the Ling models open source. Ling-Lite contains 16.8 billion parameters, which are adjustable settings that work like knobs and dials to direct the model's performance. Ling-Plus has 290 billion parameters, which is considered relatively large in the realm of language models. For comparison, experts estimate that ChatGPT's GPT-4.5 has 1.8 trillion parameters, according to MIT Technology Review. DeepSeek-R1 has 671 billion.
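For a rough sense of where such counts come from, a single dense layer already contributes millions of parameters; the toy calculation below is a generic illustration, not a description of Ant's architecture:

```python
# Generic illustration of how parameter counts add up (not Ant's architecture).
# A single dense layer mapping 1,024 inputs to 4,096 outputs has a weight
# matrix plus a bias vector:
d_in, d_out = 1024, 4096
layer_params = d_in * d_out + d_out
print(f"{layer_params:,} parameters in one layer")   # 4,198,400

# Stacking many such layers (plus attention blocks and embeddings) is how
# full LLMs reach the billions of parameters quoted above.
```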
The company faced challenges in some areas of training, including stability. Even small changes in the hardware or the model's structure led to problems, including jumps in the models' error rate, it said in the paper.
Ant said on Monday that it has built large models dedicated to healthcare, which are being used by seven hospitals and healthcare institutions in cities including Beijing and Shanghai. The healthcare model is built on DeepSeek's R1, Alibaba's Qwen and Ant's own LLM, and can perform medical consultations, it said.
The company also said it has rolled out two medical AI agents: Angel, which serves more than 1,000 medical institutions, and Yibaoer, which supports health insurance services. Last September, it launched AI healthcare manager services within Alipay, its payment app.
This story was originally featured on Fortune.com