
Zhang Peng, founder and president of Geek Park, Jiang Daxin, founder and CEO of Step Star, Yang Zhilin, founder of Dark Side of the Moon (Kimi), and Zhu Jun, deputy dean of the Institute of Artificial Intelligence at Tsinghua University and chief scientist of Shengshu Technology, at the Yunqi Conference. Photo provided by interviewee

Recently, o1 and o1-mini, the first models with "reasoning" capabilities from the American artificial intelligence company OpenAI, were officially opened to enterprise and education users. Reportedly, users can send up to 50 messages per week.

The OpenAI o1 series is considered by the industry to be a major step toward AGI (artificial general intelligence). It not only resolves the previously controversial question of "which is larger, 13.11 or 13.8", but can also handle scientific and programming problems more complex than previous large models could manage.

At the recently held Yunqi Conference, several influential leaders of large model companies known as the "Six Little Dragons of AI": Jiang Daxin, founder of Step Star; Yang Zhilin, founder of Dark Side of the Moon (Kimi); and Zhu Jun, deputy dean of the Institute of Artificial Intelligence at Tsinghua University and chief scientist of Shengshu Technology, held a lively discussion on topics such as "What impact might OpenAI o1 have?", "Is the development of large models currently accelerating or decelerating?", and "What impact is AI having on industry today?"

The more human-like o1 brings new entrepreneurial opportunities

The launch of the OpenAI o1 large model has once again drawn people's attention. However, its release has also prompted differing views in the industry.

In the view of Jiang Daxin, founder and CEO of Step Star, OpenAI o1 has proved for the first time that language models can actually perform the human brain's slow thinking, an ability called "System 2". "System 1" is a kind of linear thinking: with System 1 ability, GPT-4 can break a complex problem into many steps and solve them one by one, but it is still thinking linearly. System 2 ability can explore different paths, reflect on itself and correct errors, and keep trying until a correct path is found. OpenAI o1 combines earlier imitation learning with reinforcement learning, so that a single model has both the "System 1" and "System 2" capabilities of the human brain.
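To make the contrast concrete, below is a minimal, purely illustrative Python sketch, not OpenAI's actual training or inference method; every function in it is hypothetical. The "System 1" solver commits to its first answer in one pass, while the "System 2" solver proposes candidates, checks them against a verifier, and keeps searching within a compute budget.

```python
# Toy illustration only (hypothetical, not OpenAI's method): contrast a one-pass
# "System 1" guess with a "System 2" loop that proposes, checks, and retries.
import random

def verify(candidate: int, target: int) -> bool:
    """Stand-in for a verifier or reward signal; here just an exact check."""
    return candidate == target

def system1_answer() -> int:
    """One linear pass: commit to the first guess, with no reflection."""
    return random.randint(0, 9)

def system2_answer(target: int, budget: int = 20) -> int | None:
    """Trial and error with self-checking: explore candidates until one verifies."""
    tried = set()
    for _ in range(budget):          # extra "thinking" costs extra inference compute
        candidate = random.randint(0, 9)
        if candidate in tried:       # crude self-reflection: do not repeat a failed path
            continue
        tried.add(candidate)
        if verify(candidate, target):
            return candidate
    return None                      # give up once the budget is exhausted

if __name__ == "__main__":
    target = 7
    print("System 1 guess:", system1_answer())
    print("System 2 result:", system2_answer(target))
```

The only point of the sketch is that the second loop spends additional inference-time compute on exploration and self-checking, which is the behavior Jiang Daxin attributes to o1.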

Jiang Daxin believes: "In the past, reinforcement learning was designed for specific scenarios. For example, AlphaGo can only play Go, and AlphaFold can only predict protein structures. But the emergence of OpenAI o1 has made reinforcement learning more general and generalizable. Although o1 has not yet reached a very mature stage and is still in its infancy, that is exactly what makes people excited. It is as if OpenAI has found a path with a high ceiling; when you think carefully about the methods behind it, you find that this road can keep going."

At present, academia and industry classify AGI (artificial general intelligence) into levels L1 to L5. L1 is roughly a chatbot, similar to ChatGPT; L2 is a reasoner, able to think deeply about complex problems; L3 is an agent, able to act and interact, moving from the digital world into the physical world; L4 is an innovator, able to discover or create something new, or uncover new knowledge; L5 is an organizer, able to collaborate or operate more efficiently through some form of organization. Each level also has a narrow and a broad sense. Zhu Jun, deputy dean of the Institute of Artificial Intelligence at Tsinghua University and chief scientist of Shengshu Technology, believes that in this sense OpenAI o1 has achieved high-level human intelligence on certain specific L2 tasks, and that from a grading perspective it does represent a big step forward for the entire industry.

Yang Zhilin, founder of Dark Side of the Moon (Kimi), also said that OpenAI o1 has raised the upper limit of AI. Previously, people might only be able to use AI to raise productivity by 5%-10%, but with o1, AI may raise productivity tenfold. In Yang Zhilin's view, this will also bring changes to the industry structure and to startups. "I think a key point is that the ratio of training to inference computing power will change greatly, and changes in this ratio will essentially create many new opportunities."

Li Dahai, CEO of Wall-Facing Intelligence, said that the o1 model once again shows that original, foundational innovation is the core driver of the development of artificial intelligence. With this technological change, the center of gravity of large models' computing power may gradually shift from the training stage to the inference stage, and the research focus may likewise shift from the self-supervised pre-training paradigm toward the reinforcement learning paradigm and the alignment stage.
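As a rough back-of-the-envelope illustration of why that shift matters (all numbers below are assumptions chosen only for illustration, not figures cited by Yang Zhilin or Li Dahai): if a reasoning model emits far more hidden "thinking" tokens per query, inference can come to dominate a model's lifetime compute.

```python
# Hypothetical back-of-the-envelope split of lifetime compute between training and inference.
# Every constant here is an assumption for illustration, not a figure cited by the panelists.
TRAIN_FLOPS = 1e25              # assumed one-off training cost
FLOPS_PER_OUTPUT_TOKEN = 2e12   # assumed inference cost per generated token
QUERIES = 1e10                  # assumed number of user queries over the model's lifetime

def inference_share(tokens_per_query: float) -> float:
    """Fraction of total lifetime compute spent on inference."""
    infer = QUERIES * tokens_per_query * FLOPS_PER_OUTPUT_TOKEN
    return infer / (infer + TRAIN_FLOPS)

# A short direct answer versus a long hidden reasoning chain per query.
for tokens in (200, 20_000):
    print(f"{tokens:>6} tokens/query -> inference share ~ {inference_share(tokens):.0%}")
```

Under these made-up numbers, moving from short answers to long reasoning chains pushes inference from a minority of lifetime compute to the overwhelming majority, which is the kind of ratio change the speakers describe.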

Cloud facilities and computing power are ready

It has been about 18 months since the launch of ChatGPT made the whole world start paying attention to AGI. In these 18 months, has the development of large model technology been accelerating or decelerating? What stage has AGI reached?

Jiang Daxin said that the development of large models has clearly accelerated in the past 18 months. From a "quantity" perspective, new models, new products, and new applications emerge every month. "Looking at models alone: OpenAI released Sora around the Chinese New Year, which stunned everyone; GPT-4o was released in May, and o1 last week. Anthropic, OpenAI's old rival, has gone from Claude 3 to Claude 3.5, and there are also Google's Gemini series, the Llama series, and so on. In the past, our feeling was that OpenAI was dominant and far ahead; this year it has become a race in which everyone is chasing one another, and each company's development feels like it is accelerating."

From a "qualitative" perspective, Jiang Daxin said that many landmark events have occurred in the past 18 months. For example, the release of gpt-4o has brought multi-modal integration to a higher level, integrating original functions such as visual understanding, sound, and video generation. Isolated models are fused together. The important thing about multimodality is that the physical world itself is a multimodal world, and multimodal fusion helps the model better simulate the physical world.

"In addition," Tesla released. The end-to-end large model fsd v12 is also a landmark event. "Jiang Daxin believes that intelligent driving is a real application scenario that moves from the digital world to the physical world. The significance of fsd v12 lies not only in intelligent driving itself, but this methodology can provide guidance on how to combine smart devices with large models in the future and how to better explore The physical world points out a direction.

Regarding the current state of large model development, Yang Zhilin said that on the vertical dimension, models' "IQ" keeps improving, reflected in mathematical ability, programming ability, and the length of context they can understand. "For example, on math competition problems, models failed completely last year but can now score over 90; in coding, they can basically beat many competitive programmers, which has also created many new application opportunities. In addition, consider the context length current language models can support: at this time last year most models could handle only 4K-8K, whereas 128K is now standard, and many models can even support long contexts of 1M or even 10M. This is actually a very important foundation for the continued improvement of model IQ."

On the horizontal dimension, the different modalities of models are also advancing, giving models more skills and allowing them to complete more tasks. "There have been many new breakthroughs horizontally as well; Sora, in video generation, is perhaps the most influential. Recently many new products and technologies have appeared. It is now possible to generate a two-person conversation directly from a paper, and it is basically impossible to tell whether it is real or fake. This kind of transformation, interaction, and generation across modalities will become more and more mature."

Zhu Jun also said that overall progress is accelerating: everyone is solving new problems, and the speed of solving them is also increasing. In February this year, Sora shocked many people because it had no public data or details, and at the time people were still wondering how to break through; yet within about half a year, video models have been built and have achieved good results.

Zhu Jun believes the core reason for the acceleration is that people's understanding of the technical route and their preparations have reached a certain level, and that, on the physical side, cloud facilities and computing resources are also better prepared; we are no longer at a loss as we were when ChatGPT first came out. "Of course, there are differences across industries in how these capabilities reach actual users, but from a technical point of view the progress curve is actually getting steeper. In the future, the development toward higher levels of AGI may be even faster than before."

AI further changes the physical world and product form

" In the past 22 months, AI has developed faster than any historical period, but we are still in the early stages of the AGI revolution. It’s not about making one or two new super apps on the mobile phone screen, but about taking over the digital world and changing the physical world.” At the Yunqi Conference, Wu Yongming, CEO of Alibaba Group and Chairman and CEO of Alibaba Cloud Intelligence Group, mentioned that generative. AI gives the world a unified language - token. AI models can understand the real world through tokenization of physical world data. Wu Yongming also focused on the two major industries of automobiles and robots, and asserted that in the future, all moving objects will become intelligent robots.

Zhu Jun also mentioned that AI currently points in two directions. One is making the digital content consumers see more beautiful and natural. The other points to entities and the physical world, where a good point of combination is robots. "Take the quadruped robot we built in our own laboratory. In the past, running it on different terrains required a lot of manual parameter tuning. Now we can train it at scale in a simulation environment, or use AI to generate some synthetic data, and then feed the trained policies into the robot. It is like replacing its brain, letting the limbs coordinate better, and the same set of policies can adapt to various venues. This is in fact still a preliminary step; for example, people are now also paying attention to more complex control and decision-making, such as spatial intelligence and embodied intelligence," Zhu Jun said.
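The following is a minimal, hypothetical sketch of the workflow Zhu Jun describes; the Simulator and Policy classes are toy stand-ins defined here for illustration, not any lab's actual code. The idea is to train one policy across randomized simulated terrains and then load that single policy onto the robot, instead of hand-tuning parameters for each venue.

```python
# Toy sim-to-real sketch (hypothetical): one policy trained across randomized terrains.
import random

class Policy:
    """Stand-in for a learned locomotion controller."""
    def __init__(self):
        self.skill = 0.0

    def update(self, reward: float) -> None:
        self.skill += 0.01 * reward   # placeholder for a real learning update

class Simulator:
    """Stand-in for a physics simulator with domain-randomized terrain."""
    def __init__(self, terrain: str):
        self.terrain = terrain

    def rollout(self, policy: Policy) -> float:
        # Pretend to run one episode and return a reward signal.
        return random.random() + policy.skill

terrains = ["grass", "gravel", "stairs", "ice"]   # domain randomization across venues
policy = Policy()
for _ in range(1000):                             # large-scale training in simulation
    sim = Simulator(random.choice(terrains))
    policy.update(sim.rollout(policy))

# The same trained policy would then be deployed on the physical robot ("replacing the
# brain"), so one controller coordinates the limbs across venues without per-venue tuning.
```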

Zhu Jun also said that when AGI develops to the L3 agent stage, robots will be able to do better reasoning and planning, interact with the environment more efficiently, and better complete complex human tasks. "In the near future we will see robots accept complex instructions and complete complex tasks through built-in chains of thought or process-learning methods. By then, intelligent capability will reach another huge level."

" Yang Zhilin also said that the current progress of AI will definitely change the form of chat products. "In the future, AI may not only think for 20 seconds or 40 seconds like it is now, but it may have to call various tools to perform tasks at the minute level, hour level or even day level. The product form may be closer to a person. It is closer to the concept of an 'assistant', helping you complete asynchronous tasks (tasks that are executed concurrently in a process). The product form design may also undergo great changes, and there is quite a lot of room for new imagination.


Written by: Nandu reporter Lin Wenqi
