DeepSeek: the 'grey swan' forcing the US to innovate and Europe to wake up

Daniele Venanzi

28/01/2025

Horizons

With about USD 1 trillion in value wiped out in the space of a few hours, and with the shares of Nvidia (which burned through about USD 600 billion in market capitalisation) and Microsoft plummeting by 17 and 7 per cent respectively, 27 January 2025 has all the hallmarks of a ‘Black Monday’ in high tech.

One week was all it took to convince shareholders that the launch of the new R-1 model by DeepSeek, a Chinese start-up developing the artificial intelligence language model of the same name, represents a very concrete threat to absolute US hegemony in the sector, on both the hardware and the software side. The financial earthquake began just as the app rose to the top of the free apps chart in the iPhone App Store. In fact, the distinguishing feature of the new Chinese competitor is not so much that it outperforms the competition in benchmark tests (where it achieves results similar to, or slightly better than, those of OpenAI’s and Anthropic’s flagship models) but that it does so in an extremely energy-efficient and, consequently, cost-effective manner, while offering a completely free service.

DeepSeek, let it be understood, is not perfect, but the markets, as of today, do not care whether it is. Knowing that it needs a fraction of the computing power of its main competitors is enough to convince the financial markets that hardware and investment capital will become less and less relevant in the race for AI. Hence the prediction that Nvidia will sell fewer and fewer GPUs, which was enough to inflict on the company the most significant stock market drop in its history.

In detail, the DeepSeek R-1 model reportedly needs only $2.19 to process one million tokens (the fundamental unit of ‘measurement’ of language models), whereas OpenAI has to shell out a good $60 to generate the same amount with its o1. Moreover, the Chinese start-up stated that 5.58 million dollars and about two months were enough to train its V3 model: crumbs, when compared to the 100 million needed for Anthropic’s flagship models, with development costs that, for some projects, can come close to a billion dollars. Finally, there is perhaps the most politically relevant aspect. According to the man of the moment, DeepSeek’s founder and CEO Liang Wenfeng, the company was able to optimise the operation of its models to such an extent because it was forced to resort to hardware that was severely ‘castrated’ in both number of units and performance, such as the H800 GPUs. Nvidia developed these specifically to get around the ban on the sale of its flagship products to Chinese companies, which the Biden administration imposed on the manufacturer in the summer of 2022 as a strategic move to try to hinder the development of Chinese artificial intelligence technologies, especially in the military and cybersecurity fields.
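To put these figures in proportion, the arithmetic can be sketched in a few lines of Python. This is only an illustration built on the numbers quoted above, which are taken at face value and not independently verified; the helper names and the 10-million-token workload are hypothetical, chosen purely for the example.

```python
# Figures as quoted in the article (USD); taken at face value, not verified.
R1_PRICE_PER_MILLION_TOKENS = 2.19
O1_PRICE_PER_MILLION_TOKENS = 60.00
V3_TRAINING_COST = 5.58e6
ANTHROPIC_TRAINING_COST = 100e6


def cost_ratio(expensive: float, cheap: float) -> float:
    """Return how many times larger the first figure is than the second."""
    return expensive / cheap


def cost_for_tokens(price_per_million: float, tokens: int) -> float:
    """Return the cost in USD of processing `tokens` tokens at a given price."""
    return price_per_million * tokens / 1_000_000


token_ratio = cost_ratio(O1_PRICE_PER_MILLION_TOKENS, R1_PRICE_PER_MILLION_TOKENS)
training_ratio = cost_ratio(ANTHROPIC_TRAINING_COST, V3_TRAINING_COST)

# Hypothetical workload of 10 million tokens (an assumption for illustration).
WORKLOAD_TOKENS = 10_000_000
r1_bill = cost_for_tokens(R1_PRICE_PER_MILLION_TOKENS, WORKLOAD_TOKENS)
o1_bill = cost_for_tokens(O1_PRICE_PER_MILLION_TOKENS, WORKLOAD_TOKENS)

print(f"o1 costs about {token_ratio:.1f}x more per token than R-1")
print(f"Quoted training budgets differ by about {training_ratio:.1f}x")
print(f"10M tokens: R-1 ${r1_bill:.2f} vs o1 ${o1_bill:.2f}")
```

On the quoted prices, the gap is roughly 27 to 1 per token and roughly 18 to 1 on training budget, which is the order of magnitude the markets reacted to.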

A ‘grey swan’ shrouded in myths to be dispelled

Only time will tell whether DeepSeek fits, in scope and impact, Nassim Taleb’s definition of a ‘black swan’: a unique and unpredictable event capable of provoking authentic financial earthquakes. The alternative, as some analysts are already suggesting, is that the markets have greatly overestimated its significance, above all on a wave of inaccurate, if not completely erroneous, information circulating on the web and routinely picked up, without any fact-checking, even by authoritative international publications.

To dot the i’s, DeepSeek is not really open source, nor does it really use the MIT licence, opting, on the contrary, for a hybrid and customised solution. On this and many other pivotal points, the company’s communication is particularly ambiguous. On the one hand, DeepSeek’s website clearly states that the R-1 model is open source; on the other, it is enough to ask the chatbot itself about the nature of its licence to discover that it is an ‘open weight, not completely open source’ system. In practice, this means that we do not really have access to the complete source code and, crucially, we cannot access the data on which the model was trained or the way it was trained. This is no small shortcoming, considering that the model, as a result, is not really replicable, nor does it lend itself to genuine improvements and innovations by third-party developers.

Another myth to be debunked, in which a press that is sloppy, to say the least, is complicit, is that Chinese AI runs without any recourse to Nvidia’s flagship products. While it is true that the start-up was only founded in the summer of 2023, it is equally true that Liang, through the Chinese investment fund High-Flyer, had already secured, several years earlier, the capital needed to acquire an entire fleet of the most powerful GPUs produced by Nvidia, such as the A100, the main model subject to the US ban. As reported by Associated Press (which is to be applauded for doing its job so well, unlike so many of its colleagues), some Chinese media had reported that Liang bought 10,000 A100 chips in 2022, just a few months before the ban came into force. This crucial point is also confirmed by MIT Technology Review in its article on how DeepSeek managed to circumvent US sanctions, which reports experts’ estimates that the company may have as many as 50,000 units available. Moreover, back in March 2024, an interesting article by Marco Silvestri in Tom’s Hardware detailed how even Nvidia’s other flagship model, the H100, is widespread in China, despite the export ban.

A miracle of efficiency or a Chinese propaganda tool?

The doubts do not end here, given the concerns about the impartiality and objectivity of a language model that is strictly censored by the Beijing government, with the chatbot carefully avoiding questions on political issues vetoed by the Party. It would certainly be obtuse to ask a Chinese AI chatbot about the situation in Taiwan, the treatment of the Uighurs or the events in Tiananmen Square. Too bad, though, that the bias and manipulation certainly extend to most of the answers provided by the bot, even when we would not suspect it. After all, as already mentioned, since DeepSeek is not really open source, there is no way of knowing what data it was trained on or how. In this light, the scenario in which the language model in question becomes an arm of Chinese propaganda is more than a hypothesis or an unfounded concern. Then there is the question of data processing and privacy, an aspect of no small importance, given that the terms of use of the service clearly state that the AI monitors the IP address, the model of device used, and even the user’s keystrokes. There is no way of knowing where and how this data might be used, and there is a real fear that it might end up directly in the hands of the Chinese government.

In short, DeepSeek appears to be a miracle of efficiency and a breakthrough innovation only if we uncritically accept all the declarations of its creators, issued from a country (it is worth remembering) whose statements are very difficult to fact-check because of extensive and oppressive government control over all information, both internally and externally. We would, in the end, have to supinely accept the start-up CEO’s word on the funds needed to train its top models, without assuming huge and strategic ‘hidden’ funding from the Chinese government. In the same way, we would have to believe the company’s explanation that new registrations to the service are currently closed because of cyber attacks, without in the least questioning whether DeepSeek’s infrastructure is really able to handle data traffic comparable to that carried on a daily basis, without batting an eyelid, by its main US competitors.

Understandably, the market, i.e. the end users, does not care about any of this: individuals are constantly driven by utility maximisation and DeepSeek, as it stands, represents a saving of 20 dollars/euros per month on ChatGPT’s Plus subscription and 200 on its Pro package, and that is all that matters. Whether it is a bluff or a genuine innovation, DeepSeek today embodies what Austrian economist Joseph Schumpeter called ‘creative destruction’: that process, made possible only by the market and the capitalist system of production, of ‘industrial mutation [...] that incessantly revolutionizes the economic structure from within, incessantly destroying the old one, incessantly creating a new one’. It is the miracle of competition: now, echoing Schumpeter’s own distinction between invention and innovation, the US AI giants have an added incentive to invent new solutions, knowing that their position of dominance is threatened by a competitor who has brought an innovation that the market values, however successful it ultimately proves to be.

We regret to note that in this new frontier, full of implications for international relations, where America invents and China innovates, the European Union passively regulates, unable to take part in the race with its own companies, in a hell of rules and restrictions that make it impossible to do business. As argued in our analysis, the AI Act purports to regulate a sector that, because of obtuse policies, has simply not arrived in Europe. Reversing course before it is too late, as the French authorities are urging Brussels to do, is the only solution if we are not to miss the last train of global competitiveness, from which the Old Continent is already cut off in so many sectors: just think of the automotive industry which, as we have analysed, lies in agony under the blows of competition from both Washington and Beijing.