DeepSeek, China's AI model: News & Discussion

China is also making fast progress on chips, and then the game is really going to be on.
Anyway, I've been using it and it is a charm.
 
Practically every major American AI CEO has said it. A VP of R&D at TSMC also believes DeepSeek illegally used it. Unfortunately her X account was deleted. Probably too much info released.

The Chinese use front companies out of Singapore, Vietnam etc.
Of course every American AI CEO would say it; they collectively got hammered on the stock market yesterday. You think they're going to say good things about DeepSeek when it hits their bottom line?

Ironically, many of their workers are openly praising DeepSeek.
 
LMFAO!!!!

 
Your point is efficiency, mine is access to the most advanced chips/compute.

Compute/chips will win out in the end, and DeepSeek’s founder acknowledges that above. Without it you can’t achieve the leading-edge models and ultimately ASI.

You’re coping. NVIDIA’s moat is significant, and SMIC can’t even mass-produce A100-level chip performance.

China is only a fast follower at this point. The US remains at the frontier.


I think everyone needs to chill. I don't believe the Chinese are even being hugely triumphalist; it's more the West reminding itself it can't lose to China and so creating its own hysteria. You think they don't have more capability in the West? lol

The big tech mag 7 needed a correction anyway

This China Vs USA will go on for years, this is the first sugar rush


And rest assured, we have Tata consultants ready to lend their expertise on QA projects 😇
 
One must see the philosophy behind DeepSeek differently. From what I gather, its purpose is to allow smaller companies and individuals with lower-end infrastructure to run a top-end model. Unlike ChatGPT 4o/o1 etc., they don't need all the compute power to try to cover every topic and have it learn everything like an encyclopaedia; they can focus on the topics they want, like medicine, finance etc.

Having it open source and with so many different models makes this accessible to so many people now. It breaks a monopoly the big tech firms have had, as you don't need to spend millions of dollars setting up decent AI models for your own purposes. Of course, the AI models backed by the billion-dollar firms are no joke either, but with what DeepSeek has done they can probably make their models leaner and more efficient with all that compute behind them.
 
Try asking an American AI about Palestine or the Holocaust and see what answers you get.

The Australian media continues its panic tantrum, comparing answers from ChatGPT and DeepSeek. Needless to say, they didn't ask about these 'sensitive' areas.

LOL, what hypocrisy!
Go ahead and ask!
 
Western media panic tantrum continues ... to the amusement of one and all!

Westerners and Indians are 'geniuses' and 'visionaries', others are 'nerds'.

--
source: https://www.smh.com.au/business/com...old-nerd-behind-deepseek-20250129-p5l7wy.html

The 40-year-old reclusive ‘nerd’ behind DeepSeek eyes massive fortune


January 29, 2025 — 6.28am

Three years ago, Liang Wenfeng’s quantitative hedge fund firm apologised profusely to investors for losing money during a tumultuous period for China’s sharemarket.

It was a surprising stumble for Zhejiang High-Flyer Asset Management, which used artificial intelligence to pick stocks and had grown rapidly to become one of the country’s largest quant funds. As the firm navigated through that crisis and its assets shrunk by more than a third from a peak of more than $US12 billion ($19.2 billion), behind the scenes, Liang was laying the groundwork for a new AI startup, DeepSeek.

DeepSeek, which grew out of High-Flyer, is now threatening to upend the global artificial intelligence supply chain and challenge the seemingly-unassailable US lead in critical frontier AI technologies. The sudden popularity of the 20-month-old firm’s breakthrough technology and its namesake app sparked a massive US and European stock rout on Monday, wiping out close to $US1 trillion in combined market value from chip giant Nvidia and other peers.

It has also drawn shock and awe over how Liang, an engineering graduate who has never studied or worked outside of mainland China, pulled off such a feat. He has demonstrated that with local artificial intelligence engineers, constrained access to the latest semiconductor technologies and limited resources, it is possible to match — and even surpass — the best in the field.

“Every country in the world could have that kind of a project going on, if they can acquire the talent and be able to work on it, of course. The rest of the industry is going to learn from this,” said Shuman Ghosemajumder, co-founder and chief executive of Reken, a San Francisco-based AI startup.

The question now gripping investors, companies and policymakers is whether artificial intelligence requires hundreds of billions of dollars in capital expenditure to come up with the latest innovations and vanguard AI models — and whether export controls can hold off Chinese competition.

Liang has been compared to OpenAI founder Sam Altman, but the Chinese citizen keeps a much lower profile and seldom speaks publicly. “OpenAI is not a god and cannot always be at the forefront,” Liang told Chinese media outlet 36Kr in July 2024.

The previous year, Liang said more investment doesn’t necessarily lead to more innovation. He has also opined on how Chinese companies have long been mostly followers as opposed to technology innovators. The problem has been a “lack of confidence and not knowing how to organise high-density talents to achieve effective innovation,” he was quoted as saying.

An outlier

Liang was born in 1985 in Zhanjiang, an economically poor city in China’s southern Guangdong province. His father was an elementary school teacher. He studied electronic engineering at Zhejiang University, a prestigious college in the city of Hangzhou, and also earned a master’s degree in information and communication engineering there.

High-Flyer was as much an outlier in China’s quant industry as DeepSeek is to the global AI industry.

Liang and two of his former university classmates started dabbling in domestic stocks in 2008. Unlike the founders of most Chinese quant funds, none of them had overseas or institutional trading experience.

The trio tried different strategies from discretionary trading to arbitrage, before settling on using a systematic approach to implement trading ideas in 2015, the year they set up High-Flyer. They initially built a model based on price-and-volume factors, before trying machine learning in 2016.

The new tool allowed the firm to dig deeper to find new factors and identify “non-lineal” connections between factors, its chief executive officer Simon Lu said in an interview in 2020. The founders integrated machine learning into High-Flyer products in 2018.

AI allowed High-Flyer to achieve “a lot of innovations” and develop a multi-strategy, multi-cycle investment model to “pile up” returns from different sources of returns, according to a 2020 brochure for the firm. Its flagship product benchmarked against the CSI 500 Index integrated low-risk strategies like intra-day trading, allowing it to beat the gauge by a combined 120 percentage points in the previous three years, it showed.

High-Flyer grew assets quickly as a result, reaching more than 90 billion yuan in 2021 before it stumbled later that year.

In December 2021, after experiencing record drawdowns at some funds, High-Flyer said its artificial intelligence mistimed some trades and performed poorly during periods of large stock swings. “We feel deeply guilty,” it told investors. The firm also stopped accepting fresh inflows and said it would reduce its assets under management and adjust its strategies.

Three months later, its marketing head warned that certain volatility-sensitive clients should redeem their money — a highly unusual move.

Last year, High-Flyer said it would wind down products that had made two-way bets on the markets, and focus on “long-only” strategies in which it took only bullish positions on stocks. Its assets under management have dropped to around 60 billion yuan.

The research hub

DeepSeek’s research was funded by High-Flyer’s R&D budget, Liang said previously. It drew computing resources from the quant fund, which had amassed 10,000 Nvidia GPUs in 2021, prior to US bans on exports of sophisticated Nvidia chips and other graphics processing units.

Liang recruited engineering talent almost exclusively from China. Many were fresh out of top universities, interns in their final leg of doctoral studies and Olympiad medal holders.

“He’s a nerd but nerd in this context is not a negative,” said Zihan Wang, a PhD student at Northwestern University who did a six-month internship at DeepSeek in 2024.

Wang said Liang ran many experiments on his own, and DeepSeek operated much like a research lab. “It started small, but as they got real progress, they started to get excited,” he said.

The startup began periodically releasing models, seemingly impervious to — even stirred up by — the US ban on exports of cutting-edge AI accelerator chips.

DeepSeek released its R1 advanced AI reasoning model on January 20, the same day Donald Trump was sworn in as America’s 47th president.

Earlier that Monday, Liang attended a closed-door business symposium in Beijing that was hosted by Chinese Premier Li Qiang. There, experts in technology, science, education and other fields offered their opinions and suggestions for a draft government work report, according to the official Xinhua news agency. Video footage on YouTube shows Liang sitting across the table from Li and speaking, with the Chinese leader nodding attentively.

Significantly, DeepSeek open sourced its R1, allowing researchers and developers to freely use, modify and commercialise the model. That sent a signal that it wants to collaborate and innovate with others in the global AI community.

Liang stands out among Chinese entrepreneurs because of that non-commercial goal, his laser-focus on research and the realisation of Artificial General Intelligence, said Thomas Qitong Cao, assistant professor of technology policy at Tufts University.

Liang is assumed to own 51 per cent of High-Flyer. That would give him a stake worth $US71 million based on a comparative analysis, according to the Bloomberg Billionaires’ Index. If DeepSeek reaches the same potential as OpenAI, valued at roughly $US150 billion, the founder could potentially be in line for a massive windfall.

Some have questioned whether Liang’s DeepSeek is as promising as it appears. Shortcomings include the startup’s infrastructure’s ability to handle global traffic waiting to try its service, or the app’s handling of sensitive subjects such as the 1989 protests in Tiananmen Square and queries on Chinese leader Xi Jinping.

Experts have also questioned the assumption that DeepSeek was building with 10,000 A100 Nvidia chips, with analysts like Dylan Patel speculating that DeepSeek needs at least 50,000 of Nvidia’s far-more powerful chips, the H100s. Meta for instance, operates the equivalent of 600,000 Nvidia H100s.

Still, Liang is prompting a rethink and recalibration in the global AI ecosystem. It’s made obvious that “The AI race won’t be won by creating the most sophisticated model; it’ll be won by embedding AI into business systems to generate tangible economic value,” said Mike Capone, chief executive officer of Qlik, a data analytics and artificial intelligence platform.

Bloomberg
 
