On December 20, the "Digital Power, Exploring the Uncensored" 2023 Discovery Conference, hosted by the Internet Society of China, Weibo, and Sina News, opened in Beijing. Zhipu AI COO Zhang Fan delivered a speech entitled "New Business Paradigm in the Big Model Era".
Zhipu AI COO Zhang Fan - Keynote speech
The following is a transcript of Zhang Fan's speech, lightly edited and abridged:
I am very happy to have the opportunity to share with you today Zhipu's experience with, and thinking on, the implementation of large models. Today I will report on some of our thoughts.
Everyone has clearly seen that ChatGPT has been a phenomenal breakout. In the two months after December last year, its global user count exceeded 100 million, a growth rate that is almost unprecedented. There is another point that differs from the past. Previously, for any new idea, whether the metaverse or blockchain, the concept came first, and then we waited for it to be implemented. With large models, what we perceived first was not the concept but the application. We first saw a product like ChatGPT, and only after being shocked by it did we realize that the technology behind it is a large model. So I think this time is different. Today, when implementation is king, the large model is already a product that has the conditions for implementation.
Why is the large model technology we see this time different from past attempts to implement AI? AI has a history of almost a hundred years. From its emergence in the 1940s to today, there have been countless moments when AI was supposedly about to replace humans. Whether it was Deep Blue or AlphaGo, the excitement soon faded every time. Why is almost the whole of society paying attention to and applying large models this time? What are the reasons behind it? We have some simple thoughts.
The entire history of AI development is a process of making AI broadly accessible. At the beginning, in the mobile Internet era, the cost of using AI was very high. Each task had to be defined independently, and the corresponding data sets, methods, and models had to be selected for it before results could be obtained. Therefore, only the major Internet companies had the opportunity to use AI at that time. By 2013 and 2014, the algorithm layer began to be unified. The maturing of neural networks greatly reduced the threshold and cost of using AI, by an order of magnitude compared with before. We no longer needed so many algorithm scientists; a few data engineers and product managers could do some AI work. At that point, AI began to spread from the major Internet companies into industry.
This brought the first wave of broad access to AI. In today's 2.0 era, we find that large models have unified everything: data, algorithms, models, and tasks are basically all handled by one model. So we have moved from an era of training models to an era of using models. On the one hand, this gives AI far more capability, and we can feel that it is much stronger than before. More importantly, its production cost has dropped by two orders of magnitude. We can often see a decent AI application being built with 1% of the original cost and time, so AI is beginning to become a basic element of social production.
AI has become ubiquitous, quietly permeating every product and changing our product experience and business models. We can see that AI can bring great changes in production efficiency and production quality. With today's rapid growth, the AI sea level keeps rising, submerging one after another the mountaintops that we once thought only humans could stand on. So we have to prepare early and turn ourselves into amphibians, able to live both on the mountain and in the sea.
We believe that the logic behind it is essentially a change in the underlying interactive capabilities. We see that human needs have never changed much. In different eras, we have different underlying interaction capabilities, which bring different product experiences, different efficiencies, different costs, and even different collaboration methods and business models.
From the earliest command line era, to the desktop era, to touch control in the mobile era, to natural language and multi-modality in today's AI era, the way we use and interact with technology has become more and more instinctive and intuitive. Interaction has become more efficient, carries more information, and grows stronger and stronger in expressive power.
We can see that each era begins with its own scenarios. In the DOS era, for example, learning DOS was still very difficult. In later eras you no longer had to learn anything: the threshold became lower and lower, and interaction and expressiveness became stronger and stronger. Every era has its own applications. The earliest era was about basic input and expressing limited content. The desktop era brought forms of interaction such as client games and perhaps BBS. In the mobile era, mobile games and Douyin appeared, and the way we obtain information changed again. So in the AI era, when the large model becomes our operating system, will we have new applications native to the AI era, or new products that meet existing needs? I think that will definitely happen, and this is the direction everyone is working towards.
Zhipu AI has gone far in the implementation of large models. We started commercializing large models in March this year. In about 9 months, we have met with more than 2,000 customers and co-created with more than 200 companies. In many scenarios, large models can be fully implemented and are already in use. The value of large models is therefore very clear. Whether by industry, scenario, or business chain, we can always find the greatest common denominator and implement it today. On this point, we feel we have done a good job.
Gradually, AI has gone from being peripheral to being able to transform parts of a product. It is slowly moving on to the role of a copilot, and then to an independent capability linked through the whole workflow, reshaping the entire product experience. Further in the future, although we cannot touch it yet, we believe that products rebuilt around AI will offer an experience completely different from today's.
We also pay attention to how enterprises use AI capabilities. From the very beginning, an enterprise needs to ensure that all processes are informatized, so that the business becomes digital, data is gathered in one place under unified definitions with a unified business interpretation, and intelligent decision-making and intelligent processes become possible, with augmented analytics informing decisions. We roughly divide enterprises' use of AI into several levels. At the beginning, it is just a matter of trying out some apps. At L1, can the company use AI for some internal empowerment? At L2, can AI empower independent business modules to improve efficiency, for example in intelligent customer service, user tagging, and sales quality inspection? These are non-core businesses, where AI is used only to improve efficiency and quality.
The next step is to start using AI to reconstruct core business scenarios, such as assisting diagnosis in the medical system, generating advertising copy in advertising companies, and linking steps together in the main business process. In this area, we can already see some companies putting it into practice. As for the final stage, whether AI can reconstruct the business model, we have indeed seen some signs. In some companies, many individuals used to be connected through platforms to provide services, and now the way of collaboration has begun to change, so reconstructing business models with AI is also possible. Let me briefly introduce a few cases that are currently being implemented.
We have thought about many scenarios. For example, before large models emerged, text generation could only rely on rules, and its expressive power was very limited. Today we have implemented advertising copywriting, where copy can be generated quickly, so work at an advertising company shifts from writing copy to selecting copy, which improves efficiency. This also includes office assistance, which can quickly generate drafts and outlines of structured articles, or expand, abridge, or restyle parts of a text. Of course, it also includes generating news and novels. Some people use it to assist in writing novels, generating paragraphs and fragments to improve efficiency. In e-commerce, it can generate product copy in different language styles and with different emphases for different platforms, so that descriptions tailored to each platform can be produced quickly by machine.
In the office field, examples include generating and expanding meeting minutes and converting them into schedules, generating job descriptions (JDs) in recruitment, and generating data reports. Whether or not a company is in the enterprise services business, there are scenarios where this can be implemented.
There is also information extraction. For example, in the sales process, a salesperson and a customer may talk for an hour, producing 10,000 words. How can we quickly extract the customer's needs so that the user profile can be entered directly into the CRM? There is also sales quality inspection, which checks that everything the salesperson said is correct and that the salesperson said everything that should be said. This can even be turned into sales training: the model simulates a customer asking questions, scores the salesperson's answers, and suggests revisions. Public opinion analysis, information extraction, and data processing also have application cases.
There is also information retrieval. For example, searching for a question in a search engine returns 10 results. If reading each one takes an hour, can today's large model read those ten results for you and turn them into one targeted answer, on which you can then ask follow-up questions? This applies especially to video. Can we build an index from the content of the subtitles, or even combine multiple modalities and scenes to build the index, and so deliver a different experience? For structured search over products, resumes, and real estate, large models can also bring experiences completely different from the past. There are also intelligent conversation scenarios, such as customer service, mobile assistants, and smart cockpits, which already have some applications.
Then there is natural language to code. Besides turning natural language directly into executable programs to improve efficiency, we can also turn it directly into websites, and generate results quickly when doing analysis, so that every salesperson and operations person gets their own analyst. There is also natural language to RPA, which extends applications into more advanced scenarios. The boundaries of large models are very wide, and what we see today is only part of the picture. As our cooperation with customers deepens, we keep discovering new scenarios, so this work is very meaningful.
How should an enterprise implement a large model? One aspect is the perspective of the enterprise itself. Without clear expectations, implementation may not succeed. We ask enterprises to assess themselves along several dimensions. First, is the degree of digitalization sufficient, and does every core business have data? Second, is it possible to find a clear and independent pilot that both fits the logic of large models and is valuable to the business, which is what we often call the greatest common denominator between the large model and the business? Third, is there a clear and measurable indicator, and even a full-time person in charge, to ensure the implementation and continuity of the project? More important still are reasonable expectations. Don't overestimate the short-term effects and treat large models as a wishing well, and don't underestimate the long-term potential of large models.
In addition to the enterprise's own perspective, we also need to look, from the perspective of the model, at how to make its capability your own. Today there are only about three ways to adapt or optimize a large model, and doing so is easier than customizing systems used to be. These include pre-training and fine-tuning, and they address general capabilities, domain capabilities, and task-level capabilities respectively. Therefore, when you use a model, which data you select and where you put it becomes very important. It is not just a matter of throwing all the data into the model and training it; there is a methodology for doing this.
Finally, companies should consider how to build their own large model strategy and their competitiveness in the new era. First, you need to choose a reliable base model. Look at the total cost across sustainability, effectiveness, and engineering capability, and consider whether the provider has a complete enough model matrix. Cheaper is not always better, and free is sometimes the most expensive. You should choose the base model that fits your scenario and build a matching organization. Second, in terms of organization, applying large models requires fine-tuning engineers, data engineers, and so on, and you should even have large-model business partners (BPs) who go deep into implementation in each business line. Third, after the first two steps, quickly establish your positive cycle, your flywheel. The data assets you continuously accumulate will become your competitiveness in the new era. You should also consider how the large model can be introduced seamlessly into business processes, so that everyone uses it without even noticing, rather than only through a chat window.
I think today's companies should think holistically when implementing large models. I believe 2023 may have been the year when the model was king, so everyone looked only at parameters and leaderboards. 2024 will surely be the year when implementation and commercial value are king, and everyone will pay attention to how models can be turned into user value and commercial value. So I hope everyone can find a good implementation scenario in 2024. Thank you, everyone!