Picture source @Visual China
Text | Value Planet, Author | Zhu Ming, Editor | Maji
Sora came out of the blue, capturing the attention of film and television professionals and filling the industry with anxiety.
Although the Spring Festival season set a box-office record of 8.016 billion yuan, half of the eight Spring Festival films ended in disappointment, with "Let's Shake the Sun Together" and "Mr. Red Carpet" withdrawn one after another. This shows that audience demands in the film and television market are becoming increasingly complex and diverse.
Another source of anxiety for the industry is that on February 15, OpenAI, on the other side of the ocean, released the generative AI model Sora. Its specialty is generating videos of up to 60 seconds from text prompts, creating realistic scenes such as a fashionable woman walking the streets of Tokyo and mammoths walking across a snowy field.
Now that people can use AI tools to turn abstract text into concrete images, a new era of visual storytelling may be arriving, with a profound impact on multiple industries.
Some people believe that Sora has accelerated the arrival of an era in which "everyone can be a director" and will in turn take away jobs that once belonged to film and television professionals. Some have even declared that "filmmakers will lose their jobs." Others, however, believe that Sora's current limitations cannot yet support the vision of "everyone can be a director" and that it will not have a huge impact on the film and television industry.
Amid this wide-ranging debate, is the anxiety in the film and television industry really warranted, or is it unfounded?
Technology craze and employment anxiety
The technology world has not seen an application like Sora, one that sets off a craze and leaves people eager to try it, for a long time. The previous one was ChatGPT, also from OpenAI.
After Sora was released, a blogger entered the same text prompt into four models, Sora, Pika, Runway, and Stable Video:
Beautiful, snowy Tokyo is bustling. The camera moves through the bustling city streets, following a few people enjoying the beautiful snowy day and shopping at nearby stalls. Gorgeous cherry blossom petals flutter in the wind along with the snowflakes.
Dynamic picture source: https://twitter.com/gabor/status/1758282791547232482
Clearly, compared with the output of the other three models, the video generated by Sora is smoother and more natural.
The demonstration videos released for Sora, such as ships battling fiercely inside a cup of coffee, show the imaginative range of video that text prompts can generate across different times and places, and the results are very realistic.
This means that in the future, perhaps anyone will be able to write a script from their own ideas and then produce distinctive videos with Sora, with no need to find photographers, translators, or even actors.
Guotai Junan Securities said that Sora has three major highlights: first, 60-second long videos; second, multi-angle shots within a single video; and third, the ability to understand the real world. Together these improve users' ability to make videos.
Of the three highlights, the most thought-provoking is "the ability to understand the real world."
In the past, when the film and television industry produced video by computer, it relied on industrial methods such as 3D modeling and lighting, followed by frame-by-frame rendering. This way of simulating images is not only inefficient but also very costly.
Sora's approach to making video is not to "copy" directly but to learn various styles and genres from a large amount of video data, accumulate its own experience, and then, once the user issues a text instruction, carry out complex tasks based on that experience and finally simulate the video.
According to Zhou Hongyi, Sora's way of making videos closely resembles human dreaming, because the images seen in dreams draw on human experience rather than on 3D modeling and rendering.
Therefore, OpenAI does not regard Sora simply as a "video model" but as a "world simulator".
Once a "video wonder tool" like Sora reaches the market, the first to be affected will be software companies that specialize in video production. Coincidentally, on the second day after Sora was launched, the stock price of Adobe, which specializes in image processing and video production software, fell by more than 7%.
Subsequently, more practitioners in the film and television industry also began to feel the anxiety. Looking back at history, every major technological innovation has threatened some jobs. For example, in the era of the "Red Flag Law" in the early days of the Second Industrial Revolution, coachmen collectively boycotted the automobile because they feared it would take their jobs. Sequoia Capital also published an article titled "Generative AI: A Creative New World", noting that generative AI bears on the work of billions of people.
The emergence of Sora has once again prompted two completely different voices and opened a debate on the prospects of AI.
Will AI take away human jobs?
After Sora was released, many prominent figures and institutions expressed their admiration for its capabilities.
Tesla CEO Musk said that "human beings are willing to admit defeat." The Hollywood Reporter, an authoritative American film industry publication, speculates that OpenAI will use Sora to enter Hollywood on a large scale and may replace a large part of the workforce.
In fact, the reason many prominent figures and institutions believe that Sora will have a disruptive impact on the film and television industry comes down mainly to the following two aspects:
The first is user acceptance and commercialization.
In China, short-video users account for 94.8% of all Internet users, making short video one of the few Internet sectors still seeing traffic growth. Not only China but the whole world has entered the short-video era, and TikTok has become the most popular application.
A massive user base means huge demand for video supply. When Sora is later rolled out at scale, this will therefore go much more smoothly than it has for other products. The idea that "everyone can be a director" has a huge user base behind it.
On the other hand, any product that wants to keep iterating and build major influence must be commercialized, and Sora's commercial space in the film and television industry is large enough.
Zak Kukoff, an early-stage investor in San Francisco, predicts that within 5 years a team of fewer than 5 people will use text-to-video models and non-union labor to produce a movie with box-office revenue of more than US$50 million.
Almost no film and television team can currently match such an excellent input-output ratio.
At present, Sora's commercial value in the film and television industry lies mainly in helping content production and distribution cut costs and improve efficiency.
Before Sora was released, a domestic university team stated that AI-generated video had already helped cut about one third of labor costs in live-action shooting, and that these costs may be compressed further in the future.
Recently, a blogger said that the "Journey to the West" animated short film he generated with AI not only showed very high video quality but also completed in just one week what would normally be at least half a year of work.
Zheshang Securities believes that Sora and similar products will help reshape both information production and distribution. PGC (professionally generated content) will widely use AI tools to assist production, and UGC (user-generated content) will use AI tools to gradually replace PGC.
Overall, with its cost-cutting and efficiency gains in film and television production, Sora may first change the US$200 billion short-video creation ecosystem, giving rise to a large number of "short-video directors", and then gradually spread to short dramas, long-form video, animation, games, and other scenarios.
Particularly worth mentioning are short dramas, which are currently popular. Production costs for this type of drama are not high to begin with, and the plots are simple and melodramatic, yet once one becomes a hit the profit is often very high. If Sora is used to cut costs and improve efficiency further, the commercial space will undoubtedly grow considerably.
However, AI-driven cost reduction and efficiency gains are often accompanied by changes to human jobs. In January this year, industry research company CVL Economics released a survey of 300 Hollywood industry leaders. Three-quarters of the respondents admitted that AI tools had contributed to the reduction or consolidation of positions at their companies.
Some organizations estimate that in the next three years, nearly 204,000 positions, including those related to visual effects and other post-production, will be adversely affected by AI.
It can be seen that Sora not only gives ordinary people the potential to "become directors" but also brings a higher risk of unemployment to some existing "professional talents".
People and AI: who is the protagonist?
While some people are optimistic about Sora's huge potential in the film and television industry, others have raised doubts.
These doubts span results, cost, logic, and law. They are also the obstacles Sora must overcome if it is to truly upend the film and television industry.
At present, the image quality and color in Sora videos are excellent. However, when it comes to camera movement and finer content control, such as light and shadow or slow motion, Sora cannot yet reach the level of professional film and television production, and the results are not impressive enough.
Some may say that Sora has only just launched and that results will improve with iterative upgrades, but this will also raise costs. A report from OpenAI states that the cost of training large models is expected to rise from US$100 million to US$500 million by 2030.
Although Sora can in theory help the film and television industry cut costs and improve efficiency, the computing power consumed by AI video applications far exceeds that of text, audio, and images. Whether investment and income can be balanced remains an open question. If they cannot, these costs may be passed on to users, who will ultimately not see significant cost savings.
Even if costs are brought under control, there is still a hard nut to crack: logic.
A video editor found many logical errors in videos generated by Sora, such as a candle flame that does not move when blown on, liquid flowing in the wrong direction after a glass breaks, and a man running in the opposite direction.
OpenAI explained that Sora may currently struggle to accurately simulate the physics of complex scenes, may not understand specific instances of cause and effect, and may confuse spatial details in text prompts.
In fact, making a truly excellent film requires not only visual logic but also plot logic from beginning to end. Although AI can produce videos with excellent visuals, it may not accurately grasp the script, the lines, or the logic of the plot.
Sometimes a single simple line reflects hundreds of years of accumulated human culture, and there are also huge cultural differences between countries and peoples. Whether large AI models can accurately reflect these differences in video production remains questionable.
A considerable number of film and television practitioners believe that, whether in terms of film quality, production cost, or plot logic, the current Sora is not yet ready for real-world mass production of film and television works.
In addition, copyright is also a new issue.
Under the traditional legal framework, copyright protects the results of people's creative labor. There is currently no clear legal boundary as to whether video content generated by Sora counts as a "work" and enjoys copyright protection. If there is no copyright, then what Sora produces is merely bulk material that cannot constitute a complete film and television work, and Sora's users cannot become real directors.
In fact, the film and television industry has formed a complete and mature industrial chain. In addition to a huge industrial system, it has an extremely deep cultural core. It is difficult to truly upend the film and television production industry with generative tools like Sora alone.
Of course, Sora also has its own value in the film and television industry. For example, it can be used in pre-development, concept design, material generation, publicity, and promotion, where it can greatly improve output efficiency. For the near term it is a good assistant to users, nothing more.
People are still the protagonists of this era, and AI plays a supporting role.
As for the many film and television practitioners, there is no need to be too anxious about the arrival of AI. Throughout history, whenever a technological revolution destroyed some jobs, it often created more new ones: while cars made horse-drawn carriages disappear, they also created new demand for drivers, car mechanics, and other occupations, and people have always played the leading role.