

China News Service, May 14 (China News Finance, Wu Jiaju) It is not GPT-5, but GPT-4o.

On May 14, Beijing time, OpenAI, the company that develops ChatGPT, launched GPT-4o, a model that can "listen, see, and speak."

OpenAI's website says that the "o" in GPT-4o stands for "omni." In English, "omni" is commonly used as a prefix expressing the concept of "all" or "everything."

The company's CEO, Sam Altman, had said ahead of the launch that the latest product release "is not GPT-5, it is not a search engine, but we have been working hard to develop some new things that we think people will like."

According to reports, GPT-4o can reason across audio, vision, and text in real time. It accepts any combination of text, audio, and image input and generates any combination of text, audio, and image output.

According to OpenAI, GPT-4o can respond to audio input in as little as 232 milliseconds, with an average response time of 320 milliseconds, which is similar to human response time in a conversation. In addition, its performance is comparable to GPT-4 Turbo on English text and code, and significantly improved on text in non-English languages. At the same time, it is faster and 50% cheaper in the API (application programming interface).

At the launch event, OpenAI demonstrated several application scenarios for GPT-4o.

For example, in one demonstration, Mark Chen, head of frontier research at OpenAI, spoke to ChatGPT through his phone. Chen told ChatGPT that he was a little nervous about the demonstration, and ChatGPT comforted him "like a friend." ChatGPT could also hear Chen's nervousness in his rapid breathing, and said to him, "Slow down. Mark, you are not a vacuum cleaner. Inhale and count to four."


GPT-4o detects a person's expression. Image from the X platform.

In another demonstration, Barret Zoph, who leads OpenAI's post-training team, turned his face to the camera so that GPT-4o could read his emotions. Zoph initially had the phone's rear camera switched on, which captured the wooden tabletop, so ChatGPT said, "What I saw seemed to be a wooden surface." When Zoph asked ChatGPT to try again, it told him, "You look very happy, smiling brightly, and maybe a little excited."

OpenAI also released a series of demonstration videos on its official website. In them, GPT-4o helps with learning mathematics, learning Spanish, and preparing for interviews; it works out from what the camera shows that someone is celebrating a birthday and then sings "Happy Birthday" to them; it sings lullabies in different styles on request; and it even referees a game of rock, paper, scissors. In these videos, the conversation between GPT-4o and the demonstrator flows smoothly, and the tone of voice is "like a real person."

Some netizens said that, judging from the videos shown so far, GPT-4o is a big improvement in voice interaction. Others believe that GPT-4o will place greater demands on computing power. Still others asked whether GPT-4o, now that it has "vision," could see the world on behalf of blind people.

OpenAI said that with GPT-4o, the company trained a single new model end-to-end across text, vision, and audio, which means that all inputs and outputs are processed by the same neural network. Because GPT-4o is OpenAI's first model to combine all of these modalities, the company is still at an early stage of exploring what the model can do and where its limitations lie.

On this point, Sam Altman wrote on social media that the original ChatGPT showed a prototype of the language interface, while the new ChatGPT feels completely different: it is fast, smart, fun, natural, and "helpful." "As we add (optional) personalization features, the ability to access information, the ability to take action on your behalf, etc., I really see an exciting future where we can do more with computers than ever before."
