DeepSeek, China's AI model: News & Discussion

冖_冖 · Aug 19, 2025

https://www.cnbc.com/2025/08/18/openai-altman-china-ai.html

OpenAI’s Altman warns the U.S. is underestimating China’s next-gen AI threat

Published Mon, Aug 18 2025, 2:07 PM EDT. Updated 4 hours ago.

OpenAI CEO Sam Altman said the U.S. may be underestimating the complexity and seriousness of China’s progress in artificial intelligence.
His comments come as Washington adjusts its policies designed to curb China’s AI ambitions.
Altman said competition from Chinese models — particularly open-source systems like DeepSeek and Kimi K2 — was a factor in OpenAI’s recent decision to release its own open-weight models.

Sam Altman admits OpenAI ‘totally screwed up’ its GPT-5 launch and says the company will spend trillions of dollars on data centers | Fortune

Altman also said that he thinks we’re in an AI “bubble.”

fortune.com

神威98 · Aug 19, 2025

Imagine spending millions on Huawei Ascend GPUs only to get half a product that can't even be used to train AI. IMO, DeepSeek should spearhead an initiative to seek out the "Xiaomi" of domestic GPUs, get others to join in, and reward it financially. By which I mean Huawei makes utter crap outside of 5G base stations and smartphones, and I had to go to Xiaomi to get an actual working product!

Why should you need to buy inference chips from Huawei and then have to buy another set of GPUs or rent GPU farms to train your AI model? Chinese companies should forgo Huawei and buy from another domestic company whose products actually do both decently!

冖_冖 · Aug 22, 2025

Chinese semiconductor shares surge after DeepSeek gives boost to homegrown chips

Chipmaker Cambricon jumps 20% to record high after AI start-up adapts model for ‘next generation of domestic chips’

www.ft.com

an hour ago

Cambricon Technologies led a rally of Chinese chipmakers on Friday after artificial intelligence start-up DeepSeek unveiled an updated model that would be compatible with domestically made semiconductors.

Shanghai-listed shares of Cambricon, one of China’s leading AI chipmakers, jumped 20 per cent to a record high, taking the company’s market capitalisation past Rmb500bn ($70bn) for the first time. Its value has more than quadrupled in the past year.

Chip foundries also gained on Friday, with Hong Kong-listed Semiconductor Manufacturing International Corporation and Hua Hong Semiconductor up 8 per cent and 12 per cent, respectively.

The moves come after DeepSeek released an update of its V3 model on Thursday, which it said was adapted “for the next generation of domestic chips”.

While the start-up did not specify individual chipmakers’ products, industry insiders and analysts said it was a positive signal that Chinese AI companies would use more domestic chips.

“If DeepSeek can use China-made chips, then the rest of the semiconductor universe will fly,” said Wee Khoon Chong, a senior strategist at BNY. “The potential demand for Chinese chips is going to be huge.”

Huawei is widely seen as the main competitor in China to Nvidia, with its Ascend AI chip series being broadly adopted by state-owned enterprises and telecoms companies. But the Shenzhen-based tech conglomerate is not listed, meaning investors have looked elsewhere to capitalise on news that China is pushing for domestic alternatives to the US chip giant.

Nvidia Said To Halt H20 Chip Output After Beijing Directive

Shares were indicated modestly lower overnight.

www.investors.com

Nvidia Orders Halt to H20 Production After China Directive Against Purchases

Updated 09:39 PM ET 08/21/2025

China’s AI chip drive eyes 82% autonomy by 2027

China's top provinces are intensifying efforts to reduce reliance on foreign semiconductors, setting aggressive targets to boost self-sufficiency in AI and data center chips.

www.digitimes.com

22 August 2025

John Smith · Aug 22, 2025

冖_冖 said:
Chinese semiconductor shares surge after DeepSeek gives boost to homegrown chips

Still think it'll be a good 5-7 years before Chinese chips catch up to their Taiwanese/Western counterparts.
China can help speed this up by invading Taiwan. I hope WW3 starts before my semester finals start.

Beijingwalker · Aug 27, 2025

China's push for global AI dominance

冖_冖 · Sep 16, 2025

Leading AI scientist Alex Kot moves to a Sino-Russian university in China

Information and electronic engineering scholar joins Shenzhen MSU-BIT University (SMBU), a Chinese-foreign cooperative university.

www.scmp.com

Singapore’s leading AI scientist Alex Kot moves to a Sino-Russian university in China

Published: 7:00pm, 14 Sep 2025

Alex Kot, a top AI and computer expert who has been with Nanyang Technological University (NTU) in Singapore for more than 30 years, has joined a university in China that was jointly built with Russia.

Kot will serve as a distinguished professor and chief scientist at Shenzhen MSU-BIT University (SMBU), a Chinese-foreign cooperative university.

SMBU was jointly founded by the Shenzhen government, Lomonosov Moscow State University (MSU), and the Beijing Institute of Technology (BIT), a top Chinese defence university that has been sanctioned by the US.

Kot, a fellow of the Academy of Engineering, Singapore and of the Institute of Electrical and Electronics Engineers, said he would “devote himself fully to the construction and development of the school, contributing his efforts to breakthroughs in engineering disciplines and the establishment of research platforms”, according to an SMBU statement.

He has published more than 300 technical papers in the areas of signal processing for communication, biometric recognition, authentication, image forensics, machine learning and artificial intelligence.

Kot graduated with a bachelor’s degree and MBA from the University of Rochester in the United States before obtaining a master’s degree and doctorate from the University of Rhode Island in 1984 and 1989, respectively.

冖_冖 · Sep 18, 2025

Alibaba-developed AI processor on par with Nvidia’s H20 chip, CCTV report shows

The broadcast offers fresh evidence that Chinese developers are designing advanced chips that could replace imports.

www.scmp.com

Published: 9:00pm, 17 Sep 2025

Alibaba Group Holding’s semiconductor design unit, T-Head, has developed an artificial intelligence chip with capabilities that are on par with Nvidia’s H20 graphics processing unit (GPU), according to a report by state broadcaster China Central Television (CCTV).

The report, which aired on Tuesday, showed T-Head’s PPU, an application-specific integrated circuit, being compared with Nvidia’s H20 and A800 GPUs in a performance benchmark during Premier Li Qiang’s visit to a data centre operated by China Unicom in northwestern Qinghai province.

Li was briefed by China Unicom on the use of mainland-developed chips in the telecommunications network operator’s infrastructure.

This marked the first time that Alibaba’s proficiency in semiconductor design was highlighted in a state broadcast, which offered fresh evidence that Chinese developers are designing advanced chips that could replace imports like Nvidia’s GPUs. Alibaba owns the South China Morning Post.

Alibaba’s Hong Kong-listed shares closed 5.28 per cent higher at HK$161.60 on Wednesday, as the CCTV report seized the market’s attention.

A China Central Television report on Tuesday shows Alibaba Group Holding’s PPU, an AI chip developed by the company’s T-Head semiconductor design unit, being compared to Nvidia’s H20 and A800 processors in a performance benchmark during Premier Li Qiang’s visit to China Unicom’s data centre in northwestern Qinghai province. Photo: CCTV

The footage aired by CCTV showed a chart that compared a number of locally designed AI accelerators with Nvidia’s two GPUs, which were tailored for China to comply with US tech export restrictions.

T-Head’s PPU card, which had 96 gigabytes of high-bandwidth memory per unit, matched Nvidia’s H20 and surpassed Huawei Technologies’ Ascend 910B, according to the CCTV footage. Alibaba’s PPU card also featured chip-to-chip bandwidth of 700 gigabytes per second; support for the high-speed Peripheral Component Interconnect Express (PCIe) standard, which connects hardware components within a computer; and power consumption of 400 watts, which was lower than what the H20 needed.

Another chart shown on the broadcast listed China Unicom’s contracts with four domestic chip providers, which totalled 22,832 cards that provided 3,579 petaflops, a measure of computing speed. A petaflop is equal to 1,000 trillion calculations per second.

T-Head’s PPU cards accounted for 16,384 units, which provided 1,945 petaflops.
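
The totals quoted from the CCTV chart allow a quick back-of-the-envelope check of T-Head's share of the order and of the average throughput per card. The sketch below uses only the four figures in the article. The report does not say which numeric precision the petaflops figures refer to, so the per-card averages are rough and should not be compared with vendor spec sheets.

```python
# Figures as quoted from the CCTV chart in the SCMP article.
TOTAL_CARDS = 22_832
TOTAL_PFLOPS = 3_579
THEAD_CARDS = 16_384
THEAD_PFLOPS = 1_945


def share(part: float, whole: float) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def tflops_per_card(pflops: float, cards: int) -> float:
    """Average teraflops per card (1 petaflop = 1,000 teraflops)."""
    return 1_000.0 * pflops / cards


thead_card_share = share(THEAD_CARDS, TOTAL_CARDS)
thead_compute_share = share(THEAD_PFLOPS, TOTAL_PFLOPS)
thead_avg = tflops_per_card(THEAD_PFLOPS, THEAD_CARDS)
# The other three suppliers combined (the article gives no per-supplier split).
others_avg = tflops_per_card(TOTAL_PFLOPS - THEAD_PFLOPS, TOTAL_CARDS - THEAD_CARDS)

print(f"T-Head share of cards:   {thead_card_share:.1f}%")
print(f"T-Head share of compute: {thead_compute_share:.1f}%")
print(f"T-Head average per card: {thead_avg:.1f} TFLOPS")
print(f"Others average per card: {others_avg:.1f} TFLOPS")
```

On these figures T-Head supplies roughly 72 per cent of the cards but about 54 per cent of the quoted compute, so the other three suppliers' cards average more petaflops each. That may simply reflect different precisions or card classes on the chart, which the article does not specify.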

The other suppliers to China Unicom were MetaX, a GPU start-up that collaborates with the Chinese Academy of Sciences; Biren Technology, a Shanghai-based AI chip designer; and Zhonghao Xinying Technology, founded by former Google tensor processing unit engineer Yanggong Yifan.

Other chip suppliers mentioned in the CCTV report as potential partners included Tecorigin, a company based in Wuxi in eastern Jiangsu province, as well as Moore Threads Technology and Tencent Holdings-backed Enflame.

China’s growing number of mainland AI chip suppliers gave credence to Alibaba CEO Eddie Wu Yongming’s assurance during the firm’s earnings call last month that the company had prepared “backup plans” to secure AI chip supplies amid US restrictions and heightened geopolitical tensions.

China’s demand for high-performance computing has grown rapidly alongside the expansion of AI development projects. This has prompted local governments and telecoms network operators to invest in large-scale data centres for AI-related projects.

Li visited China Unicom’s 2.77 billion yuan (US$389 million) Sanjiangyuan Green Energy Intelligent Computing Centre, which broke ground in August 2024. It covers an area of 5.3 hectares, where the facility will be built in four phases and provide a total capacity of more than 20,000 petaflops when completed, according to a report by the local Qinghai Daily.

Alibaba lands China Unicom as flagship client for its AI chips

Alibaba Group secures China Unicom as a customer for its AI chips.

www.cryptopolitan.com

Updated: September 17 2025 10:47 AM UTC

Alibaba Group Holding has won China Unicom as a client for its AI chips. Per a CCTV report late Tuesday, China’s second-biggest wireless carrier will use Alibaba’s Pingtouge, or T-Head, AI accelerators.

The carrier will place the chips in its expansive new data center in northwestern China, together with accelerators from MetaX and Biren Technology, which are already in use. Alibaba’s recent interest in T-Head and chip development aligns with Jack Ma’s stepped-up participation in the company’s strategy this year.

Alibaba has been advancing its development in AI infrastructure

Alibaba has recently invested more in AI infrastructure to compete with Chinese tech companies like Huawei and reduce its dependence on Nvidia Corporation’s designs. So far, it has committed 380 billion yuan ($53.5 billion) to the initiative over three years. Alibaba Cloud has also begun delivering large volumes of AI chips to Unicom’s data facilities, though more details have not been disclosed.

Nonetheless, more information about its AI chip efforts surfaced earlier this week during CCTV’s coverage of Premier Li Qiang’s visit to Qinghai. The report briefly showed a billboard at Unicom’s Sanjiangyuan data center outlining the telecom’s deployment of Alibaba chips. In a separate briefing, Unicom added that Alibaba’s AI chip outperforms Huawei’s Ascend 910B in several key hardware metrics, including more advanced memory.

Huawei, for its part, is introducing the more powerful Ascend 910C. Last month, the Wall Street Journal also reported that Alibaba has designed an AI chip that can run AI services, including DeepSeek’s R1 and its own Qwen series.

As recently reported by Cryptopolitan, DeepSeek has delayed the launch of its latest AI model after encountering persistent technical challenges with Huawei’s Ascend processors.

The Chinese artificial intelligence company had been encouraged by authorities to use Huawei’s chips instead of US-made Nvidia products after the successful release of its R1 model in January. Still, the firm ran into major issues during the training phase of its R2 model. These issues forced DeepSeek to rely on Nvidia chips for training, while using Huawei’s Ascend chips for inference.

Meanwhile, Alibaba’s push into chipmaking parallels similar initiatives by other Chinese tech giants working on homegrown AI silicon amid restrictions on Nvidia’s most advanced products. Nvidia’s AI accelerators are considered the industry standard for training next-generation models from OpenAI and Anthropic. Baidu said in August that it had landed a 10 billion yuan deal to deliver servers using its Kunlun chips to China Mobile, Unicom’s bigger rival.

https://www.reuters.com/world/china/alibaba-baidu-begin-using-own-chips-train-ai-models-information-reports-2025-09-11/

Alibaba, Baidu begin using own chips to train AI models, The Information reports

Tencent, Alibaba, Baidu turn to local AI chips, shaking Nvidia's grip

Tencent said this week it has integrated AI chips from Chinese suppliers and plans to scale up its use in cloud-based AI services. Alongside Alibaba and Baidu, which have already adopted domestic or in-house processors, the 'BAT' tech giants are accelerating a pivot away from Nvidia's dominance...

www.digitimes.com

https://www.reuters.com/business/me...-computing-power-plans-first-time-2025-09-18/

Huawei unveils chipmaking, computing power plans for the first time

September 18, 2025, 12:31 PM GMT+8. Updated 22 mins ago.

SHANGHAI, Sept 18 (Reuters) - Huawei said on Thursday it would roll out four new iterations of its Ascend AI chip over the next three years, breaking years of secrecy to reveal its chipmaking progress and ambitions to compete against Nvidia (NVDA.O) for the first time.

The Chinese technology giant has been one of the key players in leading efforts to develop a domestic semiconductor manufacturing industry, aiming to reduce reliance on a supply chain dominated by the United States.

After the launch of the Ascend 910C in the year's first quarter, Vice Chairman Eric Xu said the company plans to launch next year two variants of its successor, the Ascend 950, and follow up with the 960 version in 2027 and the 970 in 2028.

"Computing power has always been, and will continue to be, key to artificial intelligence, and even more so to China's AI," Xu told the annual Huawei Connect conference in the commercial hub of Shanghai, the company said.

The Ascend 950 chip would be powered by the company's own proprietary high-bandwidth memory, he said, revealing that it had overcome a key bottleneck China faced in the technology, limited for years to South Korean and U.S. suppliers.

Huawei also plans to roll out new computing power supernodes called the Atlas 950 and Atlas 960, which Xu described as the world's most powerful, supporting 8,192 and 15,488 Ascend chips respectively.

The supernodes are successors to the Atlas 900, also known as the CloudMatrix 384, which uses 384 of Huawei's latest 910C chips.

On some metrics, the Huawei product outperforms Nvidia's GB200 NVL72, which uses 72 B200 chips, research group SemiAnalysis has said.

Huawei says the system uses "supernode" architecture that allows the chips to interconnect at super-high speeds.

冖_冖 · Sep 19, 2025

https://www.reuters.com/world/china/chinas-deepseek-says-its-hit-ai-model-cost-just-294000-train-2025-09-18/

BEIJING, Sept 18 (Reuters) - Chinese AI developer DeepSeek said it spent $294,000 on training its R1 model, much lower than figures reported for U.S. rivals, in a paper that is likely to reignite debate over Beijing's place in the race to develop artificial intelligence.

The rare update from the Hangzhou-based company - the first estimate it has released of R1's training costs - appeared in a peer-reviewed article in the academic journal Nature published on Wednesday.

DeepSeek's release of what it said were lower-cost AI systems in January prompted global investors to dump tech stocks as they worried the new models could threaten the dominance of AI leaders including Nvidia (NVDA.O).

Since then, the company and founder Liang Wenfeng have largely disappeared from public view, apart from pushing out a few new product updates.

The Nature article, which listed Liang as one of the co-authors, said DeepSeek's reasoning-focused R1 model cost $294,000 to train and used 512 Nvidia H800 chips. A previous version of the article published in January did not contain this information.

冖_冖 · Oct 23, 2025

https://www.bloomberg.com/news/features/2025-10-22/china-s-deepseek-pushes-into-africa-making-ai-accessible-to-millions

DeepSeek’s Surge in Africa Reveals China’s AI Power Grab

冖_冖 · Oct 30, 2025

https://finance.yahoo.com/news/airbnb-picks-alibabas-qwen-over-093000045.html

Airbnb picks Alibaba's Qwen over ChatGPT in a win for Chinese open-source AI

MBZUAI unveils K2 Think reasoning model based on Qwen 2.5

dataconomy.com

The Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) in Abu Dhabi has unveiled K2 Think, a low-cost reasoning model designed to rival systems developed by DeepSeek and OpenAI.

K2 Think is built on Alibaba’s Qwen 2.5 large language model and runs on hardware from Cerebras. MBZUAI claims it is among the fastest and most efficient reasoning systems, processing around 2,000 tokens (about 1,500 words) per second. This balance of speed and accuracy is intended to give it a competitive edge against larger, more resource-intensive models.

冖_冖 · Nov 26, 2025

AI Singapore picks Alibaba’s Qwen to drive new regional language model

In a boost for China’s AI ambitions, AI Singapore has chosen Alibaba’s Qwen to train its new language model.

www.scmp.com

Singapore picks Alibaba’s Qwen to drive regional language model in big win for China tech

Published: 7:00pm, 25 Nov 2025. Updated: 9:08pm, 25 Nov 2025

AI Singapore (AISG) – a national programme by the city state of Singapore to accelerate the adoption of artificial intelligence – has chosen to base its latest large language model on Alibaba’s Qwen, in a significant win for the Chinese technology giant as it promotes its AI services in Southeast Asia.

AI Singapore, designed to enhance the city state’s national AI capabilities, had released a new model, Qwen-SEA-LION-v4, based on Alibaba’s Qwen3-32B foundation model to better address the linguistic and cultural demands of the region, Alibaba Cloud said in a statement. Alibaba Group Holding owns the South China Morning Post.

An early version of the SEA-LION models was based on Llama, the open-source large language model developed by US tech giant Meta.

AISG in August also released a multimodal model for Southeast Asia based on Gemma, an open-source model developed by Google DeepMind.

Competition between Chinese open-source models and US models in third-party countries like Singapore is closely watched, as it has broad implications for the world’s AI landscape.

The roll-out of models such as DeepSeek as well as Qwen has significantly boosted China’s competitiveness in the arena.

The cooperation between Alibaba Cloud and AISG underscores how Alibaba’s open-source strategy has helped the company’s AI model family maintain its appeal among global AI developers.

Alibaba’s Qwen series, first open-sourced in August 2023, has now become one of the world’s largest open-source model families. The series had achieved over 600 million downloads and more than 170,000 derivative models created by developers worldwide, the company said in September.

The Qwen3-32B model is part of Alibaba’s Qwen3 series launched this April, which consists of eight enhanced models that range from 600 million to 235 billion parameters.

The collaboration marked “an important milestone in advancing AI inclusivity and making it more representative of Southeast Asia”, said Leslie Teo, senior director of AI products at AI Singapore.

“It embodies our shared vision of accelerating AI innovation across the region and ensuring that developers, enterprises and public institutions have access to AI that is open, affordable and locally relevant,” he added.

The latest model was trained on over 100 billion tokens of Southeast Asian languages, a move that is expected to improve its ability to interpret local expressions and regional knowledge domains, as mainstream models typically have a better understanding of English or Chinese.

The model currently ranks first among open-source models under 200 billion parameters on the SEA-HELM leaderboard, a benchmark for regional language performance.

Meanwhile, the Qwen model series has also been winning over tech firms in the West, reflecting its appeal compared with major AI models created by US developers.

Airbnb, the San Francisco-based online accommodation booking giant, relied heavily on the Qwen models to power its AI-driven customer service agent, the company’s co-founder and CEO Brian Chesky said in October, while adding that ChatGPT’s integration abilities were not “quite ready” for Airbnb’s needs.

Singapore’s national AI program drops Meta model and switches to Alibaba’s Qwen · TechNode

technode.com

Singapore’s national AI program has moved its Sea-Lion large language model off Meta’s model family and adopted Alibaba Cloud’s Qwen architecture, according to information cited by foreign media from AI Singapore (AISG). The latest version, Qwen-Sea-Lion-v4, was trained with technical support from Alibaba Cloud and built on the Qwen3-32B foundation model, which covers 119 languages and dialects and was trained on 36 trillion tokens. Alibaba Cloud said the model received an additional 100 billion Southeast Asian language tokens for this collaboration, while AISG contributed regional datasets and handled evaluation.

冖_冖 · Nov 26, 2025

China leapfrogs US in global market for ‘open’ AI models

Beijing-backed technology gains ground as American giants hold fast to ‘closed’ AI strategies

www.ft.com

Published 9 hours ago

A study by the Massachusetts Institute of Technology and open-source AI start-up Hugging Face found that the total share of downloads of new Chinese-made open models rose to 17 per cent in the past year.

The figure surpasses the 15.8 per cent share of downloads from American developers such as Google, Meta and OpenAI — the first time Chinese groups have beaten their American counterparts.

Open models — which are free to download, modify and integrate by developers — make it easier for start-ups to create products and researchers to improve them. Widespread adoption will confer outsized influence over AI’s future.

China’s push to release open models comes in stark contrast to the “closed” approach of most of the biggest US tech companies.

冖_冖 · Dec 2, 2025

https://venturebeat.com/ai/deepseek-just-dropped-two-insanely-powerful-ai-models-that-rival-gpt-5-and

DeepSeek just dropped two insanely powerful AI models that rival GPT-5 and they're totally free

December 2, 2025

Chinese artificial intelligence startup DeepSeek released two powerful new AI models on Sunday that the company claims match or exceed the capabilities of OpenAI's GPT-5 and Google's Gemini-3.0-Pro — a development that could reshape the competitive landscape between American tech giants and their Chinese challengers.

The Hangzhou-based company launched DeepSeek-V3.2, designed as an everyday reasoning assistant, alongside DeepSeek-V3.2-Speciale, a high-powered variant that achieved gold-medal performance in four elite international competitions: the 2025 International Mathematical Olympiad, the International Olympiad in Informatics, the ICPC World Finals, and the China Mathematical Olympiad.

The release carries profound implications for American technology leadership. DeepSeek has once again demonstrated that it can produce frontier AI systems despite U.S. export controls that restrict China's access to advanced Nvidia chips — and it has done so while making its models freely available under an open-source MIT license.

"People thought DeepSeek gave a one-time breakthrough but we came back much bigger," wrote Chen Fang, who identified himself as a contributor to the project, on X (formerly Twitter). The release drew swift reactions online, with one user declaring: "Rest in peace, ChatGPT."

How DeepSeek's sparse attention breakthrough slashes computing costs

At the heart of the new release lies DeepSeek Sparse Attention, or DSA — a novel architectural innovation that dramatically reduces the computational burden of running AI models on long documents and complex tasks.

Traditional AI attention mechanisms, the core technology allowing language models to understand context, scale poorly as input length increases. Processing a document twice as long typically requires four times the computation. DeepSeek's approach breaks this constraint using what the company calls a "lightning indexer" that identifies only the most relevant portions of context for each query, ignoring the rest.
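
To make the scaling argument concrete, here is a toy sketch of top-k sparse attention in plain Python. It is not DeepSeek's implementation: `cheap_index_scores` is a hypothetical stand-in for the "lightning indexer" (here just a dot product over the first few dimensions), and `top_k`, the vector sizes, and all function names are assumptions made for illustration. The only point is that each query attends to a fixed number of selected positions rather than to every position.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def dense_attention(query, keys, values):
    """Standard attention: the query is scored against every key."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def cheap_index_scores(query, keys, index_dims=2):
    """Hypothetical indexer: a low-cost relevance score that uses
    only the first `index_dims` dimensions of each vector."""
    return [dot(query[:index_dims], k[:index_dims]) for k in keys]


def sparse_attention(query, keys, values, top_k):
    """Attend only to the top_k positions chosen by the cheap indexer."""
    scores = cheap_index_scores(query, keys)
    keep = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:top_k]
    return dense_attention(query, [keys[i] for i in keep], [values[i] for i in keep])


def full_score_count(seq_len):
    """Full query-key scores when every token attends to every token."""
    return seq_len * seq_len


def sparse_score_count(seq_len, top_k):
    """Full query-key scores when each token attends to at most top_k tokens."""
    return seq_len * min(seq_len, top_k)


random.seed(0)
dim, seq_len = 8, 64
keys = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(seq_len)]
values = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(seq_len)]
query = [random.gauss(0, 1) for _ in range(dim)]

out = sparse_attention(query, keys, values, top_k=8)
print(len(out))  # same width as a value vector
print(full_score_count(2000) / full_score_count(1000))                # 4.0
print(sparse_score_count(2000, 256) / sparse_score_count(1000, 256))  # 2.0
```

In this sketch the indexer still looks at every position, only at a lower cost per position, so the saving comes from running the expensive attention step over `top_k` entries instead of the whole sequence.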

According to DeepSeek's technical report, DSA reduces inference costs by roughly half compared to previous models when processing long sequences. The architecture "substantially reduces computational complexity while preserving model performance," the report states.

Processing 128,000 tokens — roughly equivalent to a 300-page book — now costs approximately $0.70 per million tokens for decoding, compared to $2.40 for the previous V3.1-Terminus model. That represents a 70% reduction in inference costs.
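
The percentage quoted above follows directly from the two per-million-token prices. A quick check, using only the figures in the article:

```python
def percent_reduction(old_price: float, new_price: float) -> float:
    """Percentage drop from old_price to new_price."""
    return 100.0 * (old_price - new_price) / old_price


# Decoding prices per million tokens at a 128,000-token context, as quoted in the article.
V31_TERMINUS_USD = 2.40
V32_USD = 0.70

reduction = percent_reduction(V31_TERMINUS_USD, V32_USD)
print(f"{reduction:.1f}% cheaper")           # 70.8% cheaper
print(f"{V31_TERMINUS_USD / V32_USD:.2f}x")  # 3.43x
```

That is about 71 per cent, which the article rounds to 70%. It is a larger drop than the "roughly half" attributed to the technical report in the preceding paragraph, so the two figures presumably describe different settings; the article does not reconcile them.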

The 685-billion-parameter models support context windows of 128,000 tokens, making them suitable for analyzing lengthy documents, codebases, and research papers. DeepSeek's technical report notes that independent evaluations on long-context benchmarks show V3.2 performing on par with or better than its predecessor "despite incorporating a sparse attention mechanism."

The benchmark results that put DeepSeek in the same league as GPT-5

DeepSeek's claims of parity with America's leading AI systems rest on extensive testing across mathematics, coding, and reasoning tasks — and the numbers are striking.

On AIME 2025, a prestigious American mathematics competition, DeepSeek-V3.2-Speciale achieved a 96.0% pass rate, compared to 94.6% for GPT-5-High and 95.0% for Gemini-3.0-Pro. On the Harvard-MIT Mathematics Tournament, the Speciale variant scored 99.2%, surpassing Gemini's 97.5%.

The standard V3.2 model, optimized for everyday use, scored 93.1% on AIME and 92.5% on HMMT — marginally below frontier models but achieved with substantially fewer computational resources.

Most striking are the competition results. DeepSeek-V3.2-Speciale scored 35 out of 42 points on the 2025 International Mathematical Olympiad, earning gold-medal status. At the International Olympiad in Informatics, it scored 492 out of 600 points — also gold, ranking 10th overall. The model solved 10 of 12 problems at the ICPC World Finals, placing second.

These results came without internet access or tools during testing. DeepSeek's report states that "testing strictly adheres to the contest's time and attempt limits."

On coding benchmarks, DeepSeek-V3.2 resolved 73.1% of real-world software bugs on SWE-Verified, competitive with GPT-5-High at 74.9%. On Terminal Bench 2.0, measuring complex coding workflows, DeepSeek scored 46.4%—well above GPT-5-High's 35.2%.

The company acknowledges limitations. "Token efficiency remains a challenge," the technical report states, noting that DeepSeek "typically requires longer generation trajectories" to match Gemini-3.0-Pro's output quality.

Why teaching AI to think while using tools changes everything

Beyond raw reasoning, DeepSeek-V3.2 introduces "thinking in tool-use" — the ability to reason through problems while simultaneously executing code, searching the web, and manipulating files.

Previous AI models faced a frustrating limitation: each time they called an external tool, they lost their train of thought and had to restart reasoning from scratch. DeepSeek's architecture preserves the reasoning trace across multiple tool calls, enabling fluid multi-step problem solving.

To train this capability, the company built a massive synthetic data pipeline generating over 1,800 distinct task environments and 85,000 complex instructions. These included challenges like multi-day trip planning with budget constraints, software bug fixes across eight programming languages, and web-based research requiring dozens of searches.

The technical report describes one example: planning a three-day trip from Hangzhou with constraints on hotel prices, restaurant ratings, and attraction costs that vary based on accommodation choices. Such tasks are "hard to solve but easy to verify," making them ideal for training AI agents.
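"Easy to verify" can be made concrete with a checker for a trip plan of this kind. The following is a rough sketch under stated assumptions: the plan layout, field names such as `cost_by_hotel_tier`, and all prices and ratings are invented for illustration and do not come from the technical report.

```python
def verify_plan(plan, max_hotel_price, min_restaurant_rating, total_budget):
    """Return the list of violated constraints; an empty list means the plan passes."""
    problems = []
    hotel = plan["hotel"]
    if hotel["price_per_night"] > max_hotel_price:
        problems.append("hotel price over limit")
    for r in plan["restaurants"]:
        if r["rating"] < min_restaurant_rating:
            problems.append(f"restaurant {r['name']} rated too low")
    # Hypothetical rule: each attraction's cost depends on the chosen hotel tier.
    attraction_cost = sum(
        a["cost_by_hotel_tier"][hotel["tier"]] for a in plan["attractions"]
    )
    total = hotel["price_per_night"] * plan["nights"] + attraction_cost
    if total > total_budget:
        problems.append("total cost over budget")
    return problems

# Made-up candidate plan for a three-day (two-night) trip.
plan = {
    "nights": 2,
    "hotel": {"name": "Lakeside Inn", "price_per_night": 500, "tier": "mid"},
    "restaurants": [
        {"name": "Tea Garden", "rating": 4.5},
        {"name": "Noodle House", "rating": 4.2},
    ],
    "attractions": [
        {"name": "Boat tour", "cost_by_hotel_tier": {"budget": 120, "mid": 90}},
        {"name": "Museum", "cost_by_hotel_tier": {"budget": 60, "mid": 60}},
    ],
}
```

Checking a finished plan takes a single pass over its items, whereas producing one means searching over combinations of hotels, restaurants, and attractions whose costs interact. That asymmetry is what lets a simple checker serve as a training signal.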

DeepSeek employed real-world tools during training — actual web search APIs, coding environments, and Jupyter notebooks — while generating synthetic prompts to ensure diversity. The result is a model that generalizes to unseen tools and environments, a critical capability for real-world deployment.

F-22Raptor · Dec 10, 2025

DeepSeek is Using Banned Nvidia Chips in Race to Build Next Model

DeepSeek, the Chinese AI startup, has been developing its next major model using several thousand of Nvidia’s state-of-the-art Blackwell chips, which the U.S. has forbidden from being exported to China, according to six people with knowledge of the matter. The chips DeepSeek is using were smuggled ...

www.theinformation.com

DeepSeek is using illegally obtained Nvidia Blackwell chips to train its next models

冖_冖 · Dec 12, 2025

https://www.cnbc.com/2025/12/10/nvidia-report-china-deepseek-ai-blackwell-chips.html

Published Wed, Dec 10 2025, 9:45 AM EST. Updated Wed, Dec 10 2025, 2:02 PM EST

Nvidia rebutted a report that the Chinese AI startup DeepSeek has been using smuggled Blackwell chips to develop its upcoming model.
“We haven’t seen any substantiation,” an Nvidia spokesperson said in a statement.

Nvidia on Wednesday responded to a report that the Chinese artificial intelligence startup DeepSeek has been using smuggled Blackwell chips to develop its upcoming model.

The U.S. has banned the export of Nvidia’s Blackwell chips, which are considered the company’s most advanced offerings, to China in an effort to stay ahead in the AI race.

DeepSeek is reportedly using chips that were snuck into the country without authorization, according to The Information.

“We haven’t seen any substantiation or received tips of ‘phantom data centers’ constructed to deceive us and our [original equipment manufacturer] partners, then deconstructed, smuggled and reconstructed somewhere else,” an Nvidia spokesperson said in a statement. “While such smuggling seems far-fetched, we pursue any tip we receive.”

AI race: Meta reportedly uses Alibaba’s Qwen for its ‘Avocado’ model

Reported move to reinvigorate Meta’s faltering AI effort with new ‘Avocado’ model is another likely win for Chinese AI.

www.scmp.com

Published: 7:00pm, 11 Dec 2025

American tech giant Meta Platforms is reportedly using an open-source artificial intelligence model developed by Alibaba Group Holding to reinvigorate its faltering AI efforts in another likely win for Chinese AI.

According to a Bloomberg report on Wednesday, Facebook owner Meta was using Alibaba’s Qwen model, along with other open-source models from Google and OpenAI, as part of the training process for a new model code-named Avocado, which was expected to be released in the spring.

The report did not specify which Alibaba Qwen model was being used.

Bloomberg reported that Meta’s new model would mark a departure from its previous strategy of open-sourcing its models, meaning that users would only be able to access the model through an official application programming interface, or API. Meta has not officially announced a change of AI strategy from its open-source Llama models to a closed-source, revenue-generating model.
