As a guest in the final session of the opening ceremony, Dark Side of the Moon (Moonshot AI) CEO Yang Zhilin was surrounded by people the moment he stepped off the stage. Visibly embarrassed, he kept repeating "there will be opportunities in the future" as staff helped him hurry out of the venue.
For outsiders, this was the spectacle. On June 14, the Beijing Zhiyuan Conference opened, and we witnessed this large-model "star-chasing" scene from the spot closest to Yang Zhilin.
On the same day, Yang Zhilin joined a summit dialogue with Wang Xiaochuan, CEO of Baichuan Intelligence; Zhang Peng, CEO of Zhipu AI; and Li Dahai, CEO of Wall-Facing Intelligence. From the seats closest to the hottest entrepreneurs in the AI field, we heard their discussion of AGI, their "belief" in the scaling law, and their verdicts on the large-model price war.
For insiders, this was the substance. The two scenes together portray China's AI market from the outside in. They are connected, spanning both the present and the future.
What is AGI?
The breakthrough of large models has brought humanity closer than ever to the ultimate dream of AGI. Yet the industry seems to have no unified definition of AGI itself.
Yang Zhilin believes AGI does not necessarily need a precise, quantitative definition today; it may instead be a qualitative, even emotional notion, whose significance lies in preparing society for what is coming next. In the short term, however, he argues that a quantitative definition is indeed needed, "because without any quantification, there is naturally no way to measure the progress of AGI, which would affect its overall development."
Zhang Peng likewise prefers to see AGI as a belief and a symbol whose connotation and extension keep changing. If something can be described precisely and quantitatively, its ceiling is probably within reach. In his view, the fact that no one can clearly explain what AGI is may actually be a good thing for the development of artificial intelligence: it means there is more unknown space waiting to be explored.
Wang Xiaochuan tried to understand AGI from a "human" perspective, asking whether large models can "create a doctor." He chose this benchmark because the first change AGI brings is that large models begin to think, learn, communicate, empathize, and even handle multimodal processing. A doctor, likewise, needs multimodality and reduced hallucination, along with strong memory, literature-retrieval, and reasoning abilities. If the capabilities the industry can agree on are projected onto the standards for being a doctor, that projection becomes an indicator: the "artificial doctor" is AGI.
Li Dahai tried to define AGI from an economic perspective. In his view, an ideal AGI means the marginal cost of performing any task has dropped to zero. When the industry pushed the deployment of large models last year, many scenarios still required fine-tuning, and the marginal cost of that process was very high. "We believe that as model capabilities improve, the threshold for using large models keeps falling and the cost approaches zero. At that point, AGI has basically arrived."
Is the scaling law still a "creed"?
One term came up repeatedly at the Beijing Zhiyuan Conference: the scaling law, a large-model version of the maxim that brute force works miracles, and the rule that made OpenAI. But as time passes and AI develops, this golden rule that more parameters make a stronger model has, much like Moore's Law, increasingly been debated as approaching its limits.
Speaking on the topic, Yang Zhilin argued there is essentially nothing wrong with the scaling law: with more computing power and more parameters, models will keep evolving. But if the current methods are simply continued, the upper limit is obvious. The more critical question is how to implement the scaling law more efficiently.
Wang Xiaochuan also believes that beyond the scaling law, new paradigm shifts must be sought in data, algorithms, and computing power. "Whether as strategy or as belief, I think there are possibilities for paradigm change beyond the scaling law. It is not just about switching to a compression mode, but about stepping outside that system. Only then do we have a chance to move toward AGI and compete at the cutting edge of technology."
"So far, there are no signs that the scaling law is failing. It will remain effective for a considerable period to come. Of course, this effectiveness is itself dynamic; what the law contains will keep evolving." Zhang Peng agreed with Wang Xiaochuan, adding that the scaling law initially focused on parameter scale but has gradually expanded, with parameter count, data quantity, and data quality all now taken seriously. "As our understanding of the law deepens and its essence is gradually revealed, we can gradually grasp the key to the future."
Bringing large models to end devices is one of Wall-Facing Intelligence's current priorities, with its "small steel cannon" MiniCPM as an important lever. Whether the scaling law still holds for lightweight models has become a new question.
On this question, Li Dahai still chooses to believe in the scaling law. He said it is an empirical formula, the industry's summary of experience from observing a complex system such as a large model. As more experiments are run during training, understanding becomes clearer and finer-grained. Model training methods and data quality, for example, have a significant impact on the scaling law and on intelligence.
Is the price war good or bad?
In late May, as domestic large-model vendors waged a fierce "price war," Wang Xiaochuan displayed a certain detachment, in the spirit of "let them be strong; the breeze still sweeps over the hills." The remark of his most widely quoted at the time was: "Startups are not within the firing range of the big manufacturers."
Large-model pricing went from being calculated in fractions of a cent to "flipping the table" and going completely free. Throughout May, large models were caught up in a price-war frenzy, with Alibaba, Baidu, iFlytek, and Tencent Cloud all following suit, and the price war entered a fierce stage.
But now Wang Xiaochuan's attitude has shifted slightly. At the Beijing Zhiyuan Conference, he stated plainly that the price war is something quite particular to China's development model: "I view this matter positively."
In Wang Xiaochuan's view, price wars are typically competition-driven market behavior that will let more companies and more people use large models, spreading them rapidly. At the same time, driven by industry "anxiety," many companies that should have been users of large models tried to become suppliers instead, wasting talent and capital. After the price war, many of these companies began to sober up, reconsider their competitive advantages, and return to being users of large models, reducing waste while finding a development position that suits them.
Li Dahai agrees: although the current price war has a certain marketing component, prices will certainly fall below today's levels in the future. "A healthy path, where everyone earns a profit, is what will truly enable applications to land across thousands of industries."
Before ByteDance's aggressive price cuts, Zhipu AI had been seen as the initiator of this round of price wars. At the conference, Zhang Peng smilingly clarified: "That's not the case."
Zhang Peng said that, from a macro perspective, price cuts can help large models become true infrastructure, but he also cautioned against paying too much attention to the matter, let alone publicizing it. "Selling at a loss is not normal business logic. If it is unsustainable, we will eventually return to user value and productivity value."
Returning to value itself is also Yang Zhilin's core view. He offered three judgments. First, an important future milestone will come when the computing power spent on inference significantly exceeds that spent on training, marking the beginning of value release. The second milestone is when consumer-side inference cost falls significantly below customer-acquisition cost, a marked departure from previous business models. Given these two premises, the third key point is that what AI can do may exceed what humans can do, creating new business models.
"That may no longer be the B-side API price war being discussed today, but inclusive AI, with a business model built on the value it generates. These three points may change the large-model business model itself, or the way its ROI is weighed," Yang Zhilin said.
When interviewed by Beijing Business Daily and other media, Wang Zhongyuan, president of the Zhiyuan Research Institute, was inevitably asked about the price war. In his view, price cuts cut both ways.
On the one hand, lower prices make it easier for developers to build application scenarios and access large models for more experimentation. On the other hand, if prices keep falling below actual cost, the losses may well outweigh the gains.
"After all, iterating large models still requires huge capital investment, and China's large models should not stop at the GPT-4 level. We would rather see a sound industrial ecosystem develop healthily, which also means the industry needs to find its own industrial model and business model," Wang Zhongyuan said.
For developers, Wang Zhongyuan believes the priority when choosing a large model should be performance, not cost-effectiveness. "On that basis, I believe prices will eventually return to a reasonable level. Large, easy-to-use models also enjoy economies of scale: as usage grows, engineers will naturally optimize the engineering systems, prices will naturally fall, and bad money will be kept from driving out good."
Beijing Business Daily reporter Yang Yuehan