A bald head, a smiling face, and a friendly expression.

On June 14, when MiniMax founder Yan Junjie appeared at the 36Kr WAVES 2024 conference, his face was still unfamiliar to many people. Although he is the founder of a high-profile large-model company, this was his first time attending an offline summit.

In the fierce battle among China's large models, which has dragged on for a year and a half, MiniMax is actually one of only two startups founded before ChatGPT emerged. It is also one of the most highly valued Chinese large-model startups and one of the fastest in exploring consumer-facing (to-C) commercialization. But Yan Junjie seems particularly low-key.

In this dialogue, titled "Non-Consensus and Long-term Optimal Solution," Yan Junjie spoke with Huang Mingming, an early investor in MiniMax and the founding partner of Mingshi Capital. In a rare move, Yan Junjie discussed topics he had never disclosed before, such as how the initial decision came about, the difficult choices on the model route, his experience in organizational management, how he views the inflection point for the explosion of AI applications, and the position of Chinese AGI in the world.

"AGI is a game that China cannot lose," Huang Mingming said.

Currently, in addition to Mingshi Capital, MiniMax's investors include Hillhouse Ventures, IDG Capital, Sequoia China, Yunqi Capital, miHoYo, Alibaba, Tencent, Xiaohongshu, and other institutions. WAVES is a new summit series launched by 36Kr last year; this year's is the second edition.

Yan Junjie and Huang Mingming at the WAVES conference.

The following are the key points we extracted from Yan Junjie's conversation:

1. One of the reasons I decided to move from artificial intelligence to general artificial intelligence: one day my grandfather said he wanted to write a book about his decades of experience. But he couldn't write it, because that requires very good language organization skills and at least the ability to type. I think only artificial intelligence can help him achieve this.

2. AI development will have three stages.

The first stage was before 2021. At that time, AI did not exist independently but was more of a link within businesses and products.

The second stage is that AI begins to have some general capabilities and can solve some common problems. At this point, AI can independently drive some products. We are at this stage now.

The third stage is when AI's capability stably exceeds that of ordinary individuals, and the time users spend on AI-driven products will inevitably exceed the time they spend on traditional products.

3. GPT-4-generation models have an error rate of about 20-30% in various evaluations and in many real-world scenarios. The future turning point will come when the model error rate falls by another order of magnitude and the application scale grows by two orders of magnitude.

4. If there are only five AGI companies in the world in the future, at least two of them will be Chinese, and at least second place will go to a Chinese company.

5. Whether it is a cash-rich large company or a cash-poor startup, Chinese companies' investment in computing power may be 1-2 orders of magnitude smaller than that of American companies. This is quite certain for the next two to three years.

6. Moving from dense models to MoE (mixture-of-experts) models is one of the necessary conditions for a better model. The technology stack required for better models keeps growing, including synthetic data, attention mechanisms, and multimodal fusion.

7. When do you feel you have made the right choice in starting a business? When you find that you have no choice and can see only one way forward, there is a high probability that it is the right choice.

8. I believe that the value of AI lies in serving ordinary people, because most people in the entire society are ordinary people.

9. Talent is a company's most essential asset, because people and the way they are organized create everything else.

10. I grew up in a relatively backward area. It is an obvious observation that people in these areas may need the help of artificial intelligence more than people in cities.

The initial decision

Huang Mingming: Hello, io (io is what Junjie is called within the company). Many people have never seen you in public and joke that you might be an AI hiding behind the scenes. I am very happy to accept the invitation of 36Kr WAVES and bring the real person on stage for a face-to-face conversation.

You and I met for the first time at the end of 2021. We were introduced by Liu Wei, the co-founder of miHoYo. At that time you had just started your business, and yours was the first startup in China to propose doing general artificial intelligence. There were three of us at that first meeting. Frankly, the first time we talked I didn't understand what you were saying. You talked about dialogue, voice, and digital humans. At one point the market spread the rumor that you were a metaverse company that had pivoted to AGI. Fortunately, at least one of the three of us understood, and that was Xia Ling. After talking with you again, he told me we had to invest in this project, so we came into MiniMax that early, which gave us a ticket to a new world.

Mingshi Capital has therefore set a rule. When meeting relatively important founders, we bring at least three people. After all, what if one of them turns out to understand?

Let's go back to 2021. Because the previous wave of AI had fallen far short of expectations in both social and commercial value, it was a moment when the whole world took a dim view of AI. ChatGPT was not released until November 30, 2022. What huge opportunity did you see in 2021, and why did you believe in the arrival of AGI so early?

Yan Junjie: That was actually just over two years ago, but it feels as if a century has passed.

Huang Mingming: A day in heaven is a thousand years on earth.

Yan Junjie: I first thought of this three years ago, at the end of 2020. Why did I think of working on general artificial intelligence? I had actually experienced two things at opposite extremes, and they made me realize I had to do this.

I had always worked on technology, writing papers and doing a lot of research. One of my classmates works on algebraic geometry, one of the most cutting-edge areas of mathematics. One day he told me that his advisor's advisor had passed away. I realized that in such an important frontier field, there may be only about 20 people in the world who understand it. Progress in the field is very random, and the field is increasingly difficult to enter.

If progress depends on a few random individuals, it will certainly run into trouble. How can a cutting-edge field keep advancing? Besides cultivating better people, I began to wonder at that time whether better AI could also achieve this. If technological progress is important, then besides cultivating better talent, another way to do research is through artificial intelligence, because relying on technology offers the highest certainty.

The same is true for ordinary people, not only for frontier fields.

My hometown is in a county, and I often go back to observe how people there live. My grandfather was in his seventies or eighties. One day he said he wanted to write a book about his experiences over the past few decades. Maybe not many people care about those experiences, but I do. I found that he couldn't write it, because it requires strong language organization skills and typing, neither of which he had. How could he turn his experiences into a book? There was nothing I could do to help him, but I thought AI could.

I realized that whether for the most cutting-edge work or for ordinary people's needs, more general artificial intelligence technology would make a big difference.

But at that time, artificial intelligence relied heavily on customizing models for particular needs and could only solve specific problems, such as face recognition and speech recognition. AI mattered in the long run, but its actual value at that point in time was very limited. The method or the route had to be wrong.

I began to realize that the only way to solve this problem was to make artificial intelligence more general and make it part of ordinary people's lives. That was when I decided I had to work on general artificial intelligence and on consumer-facing (to-C) AI. But at that point the term "large model" did not yet exist. If you described it in simplified language as an interactive agent, it was easily mistaken for a digital human.

Huang Mingming: As a pioneer in this field, can you share your thoughts on the development of AGI in the next five or ten years?

Yan Junjie: We can look at the history first. I think the development of AI has three stages.

In the first stage, before 2021, AI lived mostly in university laboratories and in large companies, which used better algorithms to solve specific business problems. In this generation AI did not exist independently. It was more of a link within businesses and products, making a specific function more efficient. This was the stage after the emergence of deep learning and before the emergence of large models.

Examples include face recognition, speech recognition, photo beautification, and similar things.

Now we are in the second stage. Companies of this kind started appearing in the United States in 2020, and we started at the end of 2021. AI can already exist as an independent product. The core variable is that AI has become general. General means that it can serve more scenarios without customization. Only then can it have independent value.

For example, AI-native products such as AI assistants and AIGC content communities can now appear. But the problem is that user penetration is still not that high. Raising it relies mainly on technological progress and product innovation. We have found that, at least in our own products, the major turning points in user growth have basically come from improvements in model capability, which is a very pronounced phenomenon.

The third stage will come after another round of improvement in model capability. The error rate will drop by another order of magnitude, and the model's ability will stably exceed that of ordinary individuals. This will inevitably lead to users interacting with AI products more frequently than with applications based on recommendation systems. The inflection point can be defined as the model error rate falling by one order of magnitude and the application scale growing by two orders of magnitude.

Huang Mingming: For the model to enter the next era, the error rate must fall by an order of magnitude, and the product must exceed 100 million DAU (daily active users).
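The inflection point described above is simple arithmetic, sketched here for illustration. The 20-30% error rate is the range quoted for GPT-4-generation models in this conversation; the starting user count of 1,000,000 and the function name are hypothetical, chosen only so that a two-orders-of-magnitude increase lands on the 100 million figure Huang Mingming mentions.

```python
def inflection_targets(error_rate, users, error_drop_orders=1, scale_up_orders=2):
    """Return (target_error_rate, target_users) after the stated shifts."""
    target_error = error_rate / (10 ** error_drop_orders)
    target_users = users * (10 ** scale_up_orders)
    return target_error, target_users


# Error-rate range quoted in the conversation for GPT-4-generation models.
quoted_error_rates = (0.20, 0.30)
# Hypothetical starting user base; an assumption for illustration only.
hypothetical_users = 1_000_000

for rate in quoted_error_rates:
    err, users = inflection_targets(rate, hypothetical_users)
    print(f"error {rate:.0%} -> {err:.0%}, users {hypothetical_users:,} -> {users:,}")
```

Under these assumptions, a 20-30% error rate would need to fall to roughly 2-3%, while a product with one million users would need to reach one hundred million.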

The future position of Chinese AGI in the world: at least second place will be a Chinese company

Huang Mingming: As an AGI company born in China, we are destined to have 1-2 orders of magnitude fewer resources than OpenAI and some of the world-class tech giants. You have even mentioned that none of the 50 people with the greatest influence on large models in the world are in China. As a Chinese AGI startup, how can we catch up with top companies like OpenAI? What opportunities are there to surpass them in the future?

Yan Junjie: We can look at some objective figures. Besides OpenAI, the leading startups have each raised more than 1 billion US dollars. But this field is not a race among startups alone; it is a race that startups run together with larger companies. Look at American companies such as Google, Microsoft, and AWS. They will invest hundreds of billions of dollars in the next few years.

Huang Mingming: Each company will invest 100 billion US dollars over three years.

Yan Junjie: That is the consensus among several major companies in the United States. OpenAI has a similar level of investment. China's ByteDance, Tencent, or Alibaba may well have that much money. But given the constraints on computing power, they actually cannot spend it. Whether it is a cash-rich large company or a cash-poor startup, the investment in computing power may be 1-2 orders of magnitude smaller. This is quite certain for the next two to three years.

There is no point in complaining about this. Let's think about what we should do to make AGI better, given that the restrictions objectively exist.

Making AGI requires computing power, data, and algorithms. But there is another core element that has been overlooked: users.

AI is embodied not only in the model; part of it is embodied in what users create. Objectively speaking, we will lag behind on models, and through a great deal of effort we will keep narrowing that generational gap. But we can do better with users and bridge the gap through them. Put simply: catch up technically, and then work more closely with users to realize AGI together.

Huang Mingming: First, we have to admit this reality, the gap in computing power and resources. But we also have advantages: continuous iteration, user experience and a user base, and user-centered thinking. The dividend from our large pool of engineers can go a long way.

To extend the previous question: in the last era we invested in electric vehicles and were early investors in Li Auto. I remember BYD's Wang Chuanfu once said, "Together, we are Chinese cars." As the pioneer and leader of AGI in China, how do you see the position of Chinese AGI companies in the future global landscape? I personally think AGI is a game that China cannot lose. If we lose this game, I think it will be like the 18th and 19th centuries, when China first faced the world and confronted an industrial age with an agricultural civilization.

Yan Junjie: R&D investment in AI will certainly keep increasing. That is undeniable. There will be a lot of competition in the short term, both domestically and overseas, and a lot of randomness that cannot be clearly foreseen. But look at it in the long term, five or ten years from now, and suppose there are only three companies in the world, or only five...

Huang Mingming: If there are five AGI companies.

Yan Junjie: Then at least second place should be a Chinese company.

First of all, there are 1 billion Internet users in China. At least in terms of user scale, China is the absolute leader.

Secondly, in terms of talent, China's current overall environment and innovation capabilities are still far behind those of the United States. But we can also see that many outstanding people will come back or grow up here, and we shouldn't think of AI as something particularly mysterious. It is the same as other disciplines, such as new energy and biopharmaceuticals. I believe that although there is a gap today, China's overall talent quality and talent ecosystem will keep getting better. By then, the best company in China may not be as good as the first company in the United States, but there is a high probability that it will be better than the second, because the leading US companies will also consolidate.

We are lagging behind in terms of short-term computing resources, computing power, and chip manufacturing processes, but we are leading in communications and interconnection.

Huang Mingming: Our communications and interconnection are world-class.

Yan Junjie: Although we will face many challenges in the short term and there are gaps in all aspects, in the long run, if there are five companies, at least two will be Chinese, and at least second place will be Chinese.

Huang Mingming: Eight years ago, when we looked at smart EVs and tore down a Tesla, from the vehicle's electronic and electrical architecture to its battery pack, our first feeling was that the traditional automobile industry was finished.

The second feeling was that China's automobile industry had no way of catching up; the gap was several generations wide. But in seven or eight years we achieved "overtaking on the curve," and the progress of China's EVs is plain for all to see.

Musk has said he believes that among the top ten car companies in the world in the future, one will be Tesla and the remaining nine will all be Chinese. In Chinese AGI, to borrow io's optimistic words, if there are five companies in the future, at least two or three will come from China. We acknowledge the gap, but we still have great hope of catching up.

Regarding the choice of MoE: this is the only way

Huang Mingming: From day one, many of MiniMax's decisions have been very non-consensus. It was the first to propose doing general artificial intelligence in 2021, and last year it bet on MoE (mixture-of-experts) networks. In fact, as of June 2023, MoE was not a consensus even in Silicon Valley. Only OpenAI was fully betting on MoE, while Google was fully betting on dense models. Even the person who proposed MoE did not believe in the MoE path.

MiniMax's internal decision in June was also to go all in on this, committing almost 80% of its available computing resources. At that time MiniMax was raising funds at a valuation of about US$1 billion. Although the bet would pay off in the long term, domestic peers that did not make this choice might more easily ship features whose value investors and users could see right away. At such a moment, why did you dare to make this decision?

Yan Junjie: It came from two things. As an entrepreneur and a rational person, I do a lot of analysis. At that time, we found that we were processing tens of billions of tokens every day. With a dense model, we could not serve that many tokens a day; inference costs would soon burn through all our money.

Huang Mingming: And you had only a small number of users at that time.

Yan Junjie: We already knew clearly at the time that, although ours looked like consumer products, the value they brought to users essentially came from improvements in model capability. We could easily see that the ceiling of dense models was right there.

If we wanted a higher ceiling, we had to make this kind of technological innovation. It's not that there were two paths to choose from; this was the only path to the goal.
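The cost argument can be illustrated with a back-of-the-envelope sketch. It assumes the common rule of thumb that a forward pass costs roughly 2 FLOPs per active parameter per token. Every number below (parameter counts, expert count, top-k, and the daily token volume) is a hypothetical value chosen for illustration, not a MiniMax figure.

```python
def forward_flops_per_token(active_params):
    # Rule of thumb: ~2 FLOPs per active parameter per token (forward pass only).
    return 2 * active_params


def moe_active_params(shared_params, expert_params, top_k):
    # Only top_k experts run for each token; the remaining experts stay idle.
    return shared_params + top_k * expert_params


# Hypothetical sizes, chosen only for illustration.
shared = 10e9        # always-on parameters (attention, embeddings, etc.)
per_expert = 10e9    # parameters in a single expert
num_experts = 8
top_k = 2

total_params = shared + num_experts * per_expert
dense_cost = forward_flops_per_token(total_params)  # dense model of equal total size
moe_cost = forward_flops_per_token(moe_active_params(shared, per_expert, top_k))

tokens_per_day = 30e9  # "tens of billions" of tokens per day; exact value assumed
print(f"dense: {dense_cost * tokens_per_day:.2e} FLOPs/day")
print(f"MoE:   {moe_cost * tokens_per_day:.2e} FLOPs/day")
print(f"dense / MoE compute ratio: {dense_cost / moe_cost:.1f}x")
```

With these assumed sizes, the MoE model activates a third of the parameters per token that an equally large dense model would, so the same daily token volume costs about a third of the inference compute.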

Huang Mingming: It is a necessary condition on the road to AGI.

Yan Junjie: It is a necessary condition for a better model. This applies not only to choosing MoE but to all kinds of decisions in starting a business: I found that what I thought was a choice at the beginning was actually not a choice. When do you feel that you made the right choice? When you find that it is not a choice at all, but the only way you can think of, the only way that can lead to your goal. That was all I could do at that time. If I couldn't do it, it would be over.

Huang Mingming: I have always said that entrepreneurship is like life. There may be only five or six truly important decisions in a life, and the same is true for a startup. Every such decision sets you apart from your peers. It seems that there are many choices, but after thinking clearly, there may be only one. Is it because you are looking at longer-term things that you come to this conclusion?

Yan Junjie: That's right. We knew this would be difficult when we started, but optimizing for a 3-6 month goal means little. It is a long-term endeavor anyway.

Huang Mingming: Short-term optimization can be perceived by the outside world, but it means little for long-term goals.

Yan Junjie: Yes. It is very simple once you think it through. Internally we call it "no shortcuts." We have taken some shortcuts internally, but every time we did, we were slapped in the face. In the end, "no shortcuts" became the company's number one value. Even so, sometimes I still can't help taking shortcuts.

Huang Mingming: It is human nature to want shortcuts, especially in our industry, where there are many smart people. I have talked with your executives. The only company in the world that had made MoE work was OpenAI, and this technology is indeed very difficult, as you said. You failed twice. I know some of your executives were actually quite panicked. They asked you about it, but what they told me is that every time they came to you, whether you were pretending or truly resolute inside, you never showed an iota of hesitation. Did you have any hesitation in your heart at that time, especially after failing twice with almost all of the company's computing resources and manpower at stake?

Yan Junjie: I was actually very confused at times. When you haven't thought it through, you are confused, but once you have, you find that it is the only way. When you know that no other path will lead to success, this is the only path you can take. There is no point in being anxious. You can only move forward, because you are already convinced.

Huang Mingming: It was the only way because of the long-term goal. Your bet on MoE was not communicated to most shareholders. There were rumors in the market last year: some companies had shipped good features, some were continuing to iterate on dense models, and people said MiniMax's large model had stalled at the March version without much progress.

Many people came to ask. Many were worried about you and didn't know what you were doing: the original model was not being iterated, and the product had hit a bottleneck.

Your MoE was not completely finished in January this year, but by then you were already confident. During one of our meetings, io told Xia Ling and me, in an understated way, that he had bet almost 80% of the company's resources and failed twice, but that it was now nearly done.

He said it lightly, and my face stayed calm, but I can share my true feelings at that moment: I felt that the person across from me was either a madman or a genius, to dare to bet all his resources on this. Every investor has had the thought that they invested too little in a company. That was how I felt the last time I talked with Junjie about his bet on MoE. The first thing I said to Xia Ling after we walked out was that we had invested too little in MiniMax, too little in Junjie.

About users: For every ordinary person

Huang Mingming: OpenAI is doing two of the most difficult things in the world at the same time: one is AGI, and the other is a super-large killer app. MiniMax is also the first AGI company in China to propose building large models and a killer app at the same time. Why must the two be done together?

Yan Junjie: It comes from a philosophy formed internally during the entrepreneurial process. We realized two things. The value of AI lies in serving ordinary people, because most people in society are ordinary people. Greater value means that more ordinary people can use your product.

If you want to serve ordinary people, the only way is to reach that many people in the form of products. The value of this company also lies in how much value it creates for users. The more users you have, the greater your value. And the advancement of technology relies on interactive feedback from many users. Feedback is not necessarily an explicit like; it includes many kinds of information. User feedback makes the model better, which is the core element.

Huang Mingming: This reminds me of the electric vehicle field a few years ago. There was a wave of companies heading straight for L4 and L5. But EV companies like Tesla and Li Auto needed as many cars on the road as possible, getting feedback from users' driving behavior, in order to build better autonomous driving models. This is similar to what you just said.

Yan Junjie: In our view, we work with users to create better artificial intelligence, rather than building a good technology and handing it to others. This is our understanding of technology and products. Users and what they create are part of the model and the product, not two separate things. It is not about making the best thing and then, like God, bestowing it on everyone.

Huang Mingming: Users and user creation are part of the product and the model. We found a very interesting phenomenon. People in Silicon Valley say that the road to AGI is destined to be full of power struggles. Whether in Silicon Valley or in China, most people doing AI take an elite perspective: if I make the most awesome thing, you people, all 1 billion or 6 billion of you, can just use it. They all take an elite perspective, looking down on all living beings. You said that you are not developing this technology for users, but co-creating it with them.

Huang Mingming: Apart from what was just mentioned, does this have anything to do with your own upbringing?

Yan Junjie: I grew up in a relatively backward area. Now I spend most of my time living in the city, but I still often get to see how my family back home lives. It is an obvious observation that they may need the help of artificial intelligence more than people in the city do, whether they are the elderly, working people like me, or even younger students.

Huang Mingming: When Junjie mentioned this to me for the first time, I felt very ashamed. In the past, whether we discussed AGI internally or from the angle of so-called social and ethical constraints, we mostly took an elite perspective.

This reminds me of a recent joke. This year's college entrance examination apparently had an essay topic about the impact of AI on human society. A child in the mountains of Yunnan may never have touched a computer or the Internet. What he is thinking about is how to finish the farm work at home and still get to class. How should such a child write about the impact of AI on social development? So co-creating with users and creating value for every ordinary person is what struck me most about MiniMax.

About organizational management: ruthless decisiveness beneath a gentle appearance

Huang Mingming: Seeing io for the first time, it is easy to be misled by his appearance, smiling and cheerful. I later talked to your colleagues and found that they went from distrusting you to trusting you, because every bet you made was ahead of domestic peers, and even ahead of the world.

Our biggest concern at the time was whether someone with such a gentle appearance could manage a company. After all, running a company is different from developing technology. Later, after talking more, I discovered that you are the complete opposite of what you appear to be: you are extremely efficient and decisive in making decisions. When you make a decision, you don't hesitate at all. You only consider whether it will help achieve the next, better model or longer-term progress. If it doesn't, you cut it away.

At your last company you managed a team of 1,000 people, while MiniMax has only a little over 300 people today. Is this way of organizing and managing something you had thought through when you founded the company, or something you iterated on rapidly as you ran into problems along the way?

Yan Junjie: This is a very critical question. Suppose this company had no employees and all that was left was some money, models, and users. There would actually be no way to make it better.

Talent is the core asset of a company, because people and organizations create what follows.

I figured this out when I started the company, because your resources are very limited, competition is fierce, the goals are extremely difficult, there are all kinds of uncertainties, and much is completely out of your control. The only way is to think about the most essential things and not be distracted by superficial ones.

If you look for the most fundamental thing, it is the efficiency of technological progress. The efficiency and the effect of technological progress can be converted into each other. Given limited computing resources, higher efficiency means faster iterations and better results. These two things were not equivalent in traditional AI, but in this era, efficiency and effect are almost equivalent.

If your only goal is R&D efficiency, you can naturally deduce what form of R&D organization achieves relatively high efficiency. From that you can deduce almost everything: what a good organization should look like, how it should operate, what kind of people to find, and how to take the organization from good to great.

The only way is to find a few streamlined core principles and work out what to do based on them. Keep adjusting when you run into errors. The more clearly you think through the fundamentals, the lower the probability of making mistakes.

Huang Mingming: Like Zhang Yiming and Li Auto's Li Xiang, you are very pure of heart. Most people weigh a hundred or ten thousand factors when making decisions, such as what the outside world thinks, what investors think, what employees think, and what the media thinks. But most of us have not thought clearly or thoroughly enough about the fundamentals you just mentioned, the truly long-term goals.

In fact, only people who are pure of heart and have true faith in and love for what they do can resolutely and decisively shut out the noise and arrive at a long-term optimal solution. At the time, many people may regard that optimal solution as non-consensus.