·"Technology giants have a high tolerance for error. Revolutions in technology do not originate from the mistakes of large companies but from the rapid rise of an emerging technology. The decline of old companies and the rise of new ones are both caused by the changing times." In the chip field, more specialized chips suited to large models may appear in the future, and whoever brings this revolution may be an emerging company with no historical baggage.
AI chips are the "shovels" of the artificial intelligence gold rush, and they have become hard currency.
The GPU (graphics processing unit) is a key chip for the development of artificial intelligence. Technology giant Nvidia dominates the AI chip market with its GPUs, holding about 80% of the market.
On March 19, 2024, local time, Nvidia held its GTC 2024 AI conference in San Jose, California, and officially launched a new generation of AI graphics processor (GPU) called Blackwell. (Photo: Visual China, file photo)
AI chips are expensive and still scarce. Technology giants such as Meta, Google, AMD, Intel, and Microsoft are challenging Nvidia's market dominance and have launched rival products. The chip war is intensifying. Who are the potential disruptors?
Chip war intensifies
The H100 (a GPU that Nvidia announced in 2022) is in short supply, even though Nvidia launched its successor, the new-generation AI chip B200, in March this year. Meanwhile, companies large and small are looking for alternatives to Nvidia's GPUs, trying to break the dominance of a single player in the AI chip market.
On April 10, local time, Meta announced that the new generation of its Meta Training and Inference Accelerator (MTIA) has more than twice the compute and memory bandwidth of the previous generation. The latest version of the chip helps power Meta's ranking and recommendation ad models on Facebook and Instagram.
Just the day before, Intel also announced details of its AI chip designed to compete with Nvidia. Intel's Gaudi 3 accelerator for AI training and inference is expected to significantly reduce training time for the 7-billion- and 13-billion-parameter Llama 2 models and the 175-billion-parameter GPT-3 model. Intel says the Gaudi 3 chip can train specific large language models 50% faster than the Nvidia H100.
Google's next-generation accelerator, TPU v5p, which is dedicated to training large generative AI models, is also available through Google Cloud. Microsoft's first AI chip, Maia 100, is expected to launch this year.
In December last year, AMD launched the MI300 series of AI accelerators. Among them, the MI300X has 153 billion transistors and is an advanced GPU for AI computing, designed specifically for training large language models. The MI300A combines graphics processing with a standard central processing unit (CPU) for AI and scientific research. AMD said the MI300 is becoming the fastest-growing product by revenue in its history and that the MI300X delivers better inference performance than Nvidia's H100.
"Google's TPU has been iterated to its fifth generation, which shows very strong chip development capabilities. Combined with its understanding of its own business lines, Google can define more efficient products," Yu Guang, executive director of Yaotu Capital, recently told The Paper (www.thepaper.cn). Among Nvidia's pursuers, AMD is also a major player. As a veteran CPU and GPU manufacturer, AMD has previously caught up and pulled ahead in the server CPU field. Now it is focusing on AI chips and is one of Nvidia's stronger competitors.
Some technology investors also told The Paper that Intel has been lagging behind Nvidia in AI chips.
Intel has been manufacturing Gaudi chips since 2019 and released the Gaudi 3 chip at its "AI Everywhere" conference in December last year. But Rosenblatt Securities analyst Hans Mosesmann said, "Apart from Intel, AI seems to be everywhere." Russ Mould, investment director of AJ Bell, a British investment services firm, said that as chips from companies such as Nvidia and AMD play an increasingly important role in the AI industry, Intel is in danger of being left behind.
"Nvidia's technology, products, and businesses operate in the most competitive of fields, characterized by rapid technological change and evolving industry standards," Zhang Jun, chairman of Shenzhen China-Europe Fund Investment Co., Ltd. (also known as "China-Europe Capital"), told The Paper (www.thepaper.cn). Competition will become more intense in the future, he said, coming from both existing competitors and new market entrants. New entrants in particular are potential disruptors.
Instead of waiting for big companies to make mistakes, it is better to innovate
Nvidia’s AI chips have become hard currency.
Yu Guang said that Nvidia has invested deeply in AI chips and adapts its products flexibly. During the AI wave, chip players with various innovative architectures have appeared and tried to unseat Nvidia, but none has achieved widespread success. In the end, Nvidia was the first to seize the opportunity presented by large models. "Judging from this wave, many players already hope to make AI chips that are better optimized for current large models, but Nvidia is still difficult to unseat. At present, AI companies cannot do without Nvidia; they basically have to rely on Nvidia's GPU chips and even its overall solutions."
"No chip in any field has ever had a market as big as today's AI chips. It is so big it is almost absurd," Xue Xiang, a veteran of the domestic GPU industry, told The Paper (www.thepaper.cn). "There is a saying in the chip industry that only two companies can survive well in any one chip field: the first takes 80% of the market and the second takes 20%."
In the past, Intel's CPU market was large, but unit prices were low, whereas the unit price of Nvidia's H100 chip has exceeded 200,000 yuan. Tesla CEO Elon Musk once revealed that Tesla would spend US$500 million on Nvidia's AI chips this year alone. That seems like a huge sum, but it is equivalent to only about 10,000 H100s. "Many companies have found that buying 100,000 Nvidia accelerator cards a year is not cost-effective at all; it is better to fund an R&D team of their own," Xue Xiang said, calling this a business choice.
Meta has spent billions of dollars buying AI chips from Nvidia and other companies. Faced with ever-growing demand for computing power, it has embarked on in-house development to reduce its reliance on Nvidia and cut costs. Sam Altman, CEO of OpenAI and known as the father of ChatGPT, is also courting Middle Eastern investors and chip manufacturers to raise billions of dollars to set up a chip company and build a network of semiconductor factories. SoftBank CEO Masayoshi Son plans to raise $100 billion for AI chip companies to compete with Nvidia.
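The arithmetic behind these figures can be checked with a short Python sketch. The budget and unit count are the ones quoted above; the exchange rate `ASSUMED_CNY_PER_USD` is an illustrative assumption rather than a figure from the article, and the function name is hypothetical.

```python
# Figures quoted in the article.
TESLA_BUDGET_USD = 500_000_000   # Musk's stated spend on Nvidia AI chips this year
APPROX_H100_COUNT = 10_000       # the article's rough equivalent in H100 units
CARDS_PER_YEAR = 100_000         # the purchase volume Xue Xiang mentions

# Assumption (not from the article): an illustrative exchange rate.
ASSUMED_CNY_PER_USD = 7.2


def implied_unit_price_usd(budget_usd: float, units: int) -> float:
    """Average price per card implied by a total budget and a unit count."""
    return budget_usd / units


unit_usd = implied_unit_price_usd(TESLA_BUDGET_USD, APPROX_H100_COUNT)
unit_cny = unit_usd * ASSUMED_CNY_PER_USD
yearly_cost_usd = unit_usd * CARDS_PER_YEAR

print(f"Implied price per H100: ${unit_usd:,.0f} (about {unit_cny:,.0f} yuan)")
print(f"Cost of {CARDS_PER_YEAR:,} cards at that price: ${yearly_cost_usd:,.0f}")
```

Under these assumptions the implied average price is $50,000 per card, which is consistent with a unit price above 200,000 yuan, and 100,000 cards a year would cost about $5 billion.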
Technology giants have a high tolerance for error. Revolutions in technology do not originate from mistakes made by large companies but from an emerging technology. "The decline of old companies and the rise of new ones are both caused by the changing times," Xue Xiang said. In the chip field, he said, more specialized chips suited to large models may appear in the future, and whoever brings this revolution may be an emerging company with no historical baggage. So rather than waiting for big companies to make mistakes, it is better to innovate for oneself.
But Yu Guang believes that in the overseas chip innovation ecosystem, it is difficult for startups to unseat the chip giants. Unlike large models, which iterate rapidly, AI chips have long development cycles and require heavy investment in people and resources. "You can raise financing on the strength of an innovative chip architecture and then iterate through the first few generations of products. The final path may still be acquisition by one of the big manufacturers, because when it comes to actual engineering, mass production, and commercialization, big manufacturers have more advantages."
AI chip development cannot focus only on computing performance
AI technology, led by large models, has advanced rapidly, and demand for computing power has surged, making chip upgrades inevitable. Xue Xiang said that major overseas technology manufacturers such as Nvidia have been "unable to hit the brakes" since the A100 chip: each generation of chips is better than the last, and their performance is far ahead. "It is now like driving a race car on a straight road. Until chip manufacturing technology and cluster technology hit their physical limits, you can keep speeding ahead along this road, and after two or three product generations there will be a very large gap." But Xue Xiang said this is normal and there is no need to worry too much.
A GPU's memory performance and computing power jointly determine the inference speed of large models. However, Xue Xiang said that current usage scenarios for large models do not demand very high generation speed, so when AI chips are used for large-model inference, the bottleneck is mainly memory.
Yu Guang said that during the AI wave of the past few years, technology giants such as Google and Meta developed customized AI chips based on their own business needs. Now that large models have brought rapid change to AI, how these players adapt their AI chips to current large-model algorithms is a shared challenge. He also believes that AI chip development cannot focus only on computing performance: memory and interconnect are also problems AI chips need to solve.
"On the one hand, large models have many parameters and require a large amount of memory. For now, it seems HBM (high-bandwidth memory, a new type of CPU/GPU memory chip) is needed to fit such large models. On the other hand, data needs to move at high speed from the HBM to the GPU's compute die (the bare, unpackaged chip). High-speed interconnects between GPUs are also required, such as NVLink (a bus and communication protocol developed by Nvidia) and NVSwitch (Nvidia's high-speed switch technology), and a higher-level high-speed network is needed to form a computing cluster," Yu Guang said.
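The memory and interconnect argument can be made concrete with a rough back-of-the-envelope sketch in Python. This is a simplified model under stated assumptions, not a description of any vendor's hardware: it counts only weight memory, assumes every generated token requires reading all weights from memory once, and ignores activations, caches, and compute time. The parameter counts come from the models named earlier in the article; the bytes per parameter, per-device memory capacity, and bandwidth are illustrative values, and all function names are hypothetical.

```python
import math


def weight_footprint_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the model weights, in gigabytes."""
    return num_params * bytes_per_param / 1e9


def min_devices_for_weights(footprint_gb: float, memory_per_device_gb: float) -> int:
    """Fewest devices whose combined memory can hold the weights."""
    return math.ceil(footprint_gb / memory_per_device_gb)


def memory_bound_tokens_per_s(footprint_gb: float, bandwidth_gb_per_s: float) -> float:
    """Upper bound on single-stream generation speed when memory-bound.

    Simplification: each token needs one full pass over the weights,
    so speed is limited by bandwidth divided by weight size.
    """
    return bandwidth_gb_per_s / footprint_gb


# Assumptions for illustration only (not from the article).
BYTES_PER_PARAM = 2               # 16-bit weights
MEMORY_PER_DEVICE_GB = 80         # hypothetical HBM capacity per accelerator
BANDWIDTH_GB_PER_S = 2000         # hypothetical HBM bandwidth per accelerator

large = weight_footprint_gb(175e9, BYTES_PER_PARAM)   # 175-billion-parameter model
small = weight_footprint_gb(7e9, BYTES_PER_PARAM)     # 7-billion-parameter model

devices = min_devices_for_weights(large, MEMORY_PER_DEVICE_GB)
speed = memory_bound_tokens_per_s(small, BANDWIDTH_GB_PER_S)

print(f"175B weights: {large:.0f} GB, needs at least {devices} devices")
print(f"7B weights: {small:.0f} GB, at most {speed:.0f} tokens/s per stream")
```

With these illustrative inputs, the 16-bit weights of a 175-billion-parameter model come to 350 GB, more than four 80 GB devices can hold, so the model has to be split across at least five devices. That split is what makes high-speed interconnects between GPUs necessary.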
Furthermore, in Xue Xiang's view, endlessly stacking parameters and chips should be more of a scientific research activity than an application activity. "It is very expensive to run such a large model even once, which does not make business sense." Turning to the present, he said, "there is a concept called AI infra (artificial intelligence infrastructure), which means optimizing hardware and models on the basis of existing large models in order to improve efficiency." Xue Xiang estimated: "Optimize the efficiency of computing power usage and optimize the model structure without stacking so many parameters. If every aspect saves 20%, the overall cost can be cut by more than 50%."
Luo Xuan, chief operating officer of Shenzhen Yuanshi Intelligence Co., Ltd. ("RWKV Yuanshi Intelligence"), told The Paper (www.thepaper.cn) that most Chinese models are fine-tuned or retrained from open-source models with low computational efficiency, and most resemble Meta's Llama 2, a large language model based on the Transformer architecture. From a first-principles perspective, the Transformer is too complex and requires a great deal of computing power, and the future development of multi-agent systems, embodied intelligence, and world models will be limited by that computational complexity. This has also left large-model companies "choked" by Nvidia's GPUs. Meanwhile, Nvidia chips are sold at a tenfold premium. "This is a very abnormal cost of computing power."
The Transformer architecture and chips have held back both the commercial deployment of large models and frontier research. Luo Xuan believes a new architecture must be found before a company can achieve PMF (product-market fit). He added: "Be sure not to buy computing power at a high premium; most of the money goes to Nvidia. We believe new computing power will emerge that cuts the cost to 1/10 or even 1% of the current price." Speaking as a chip user, Luo Xuan said domestic computing power is currently not cheap enough, and reducing the cost of training and inference is a problem domestic chips need to solve.
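Xue Xiang's estimate above, that saving 20% in every aspect yields an overall saving of more than 50%, can be illustrated with a small Python sketch. It assumes the savings from separate aspects multiply independently, which is a modeling assumption and not something the article states; the function names are hypothetical.

```python
def overall_saving(saving_per_aspect: float, num_aspects: int) -> float:
    """Total fraction saved when each aspect independently cuts cost
    by the same fraction and the reductions compound multiplicatively."""
    remaining = (1.0 - saving_per_aspect) ** num_aspects
    return 1.0 - remaining


def min_aspects_to_exceed(saving_per_aspect: float, target: float) -> int:
    """Smallest number of aspects whose compounded saving exceeds the target."""
    if not 0.0 < saving_per_aspect < 1.0 or not 0.0 <= target < 1.0:
        raise ValueError("saving_per_aspect must be in (0, 1) and target in [0, 1)")
    count = 1
    while overall_saving(saving_per_aspect, count) <= target:
        count += 1
    return count


for n in range(1, 6):
    print(f"{n} aspect(s) at 20% each: {overall_saving(0.20, n):.1%} saved overall")

print("Aspects needed to exceed 50%:", min_aspects_to_exceed(0.20, 0.50))
```

Under this compounding assumption, three aspects at 20% each save 48.8% overall and four save about 59%, so the estimate holds once at least four parts of the stack are optimized.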