By Wang Shuoguo and Jiang Yubin | Editor: Jiang Yubin
A post-95 woman from Hangzhou takes the lead as a new star rises in the AI video generation race
In early June, Pika received US$80 million in Series B financing. The Silicon Valley startup was founded only about a year ago and has already completed five rounds of financing, backed by a number of star Silicon Valley investors. Its founders are two "academic" women, Guo Wenjing and Meng Chenlin, both doctoral students at Stanford University's AI Lab.
Guo Wenjing's father is the actual controller of Xinyada, a listed software company, and her mother graduated from the Massachusetts Institute of Technology (MIT) in the United States.
In April, Pika was named to the Forbes AI 50 list. According to reports, the new round of financing values the company at more than US$470 million, close to 3.5 billion yuan.
Winning big with small
Enter "Musk wearing a space suit, 3D animation" in the dialog box, and an "astronaut" Musk immediately appears on the screen while the SpaceX rocket behind him spews flames and soars into the sky. This is a scene from the Pika 1.0 demo promotional video.
In November last year, Guo Wenjing's team released this text-to-video product, which attracted attention with its cinematic texture and animation-grade special effects.
Pika 1.0 can generate videos in a range of styles, including 3D animation, anime, cartoon and film. Starting from a generated video, users can enter further short instructions to modify parts of it, or apply edits such as extending the canvas or lengthening the clip.
Guo Wenjing's winning formula is to "win big with small", that is, to use fewer resources to get better results.
She explained that video is a kind of high-dimensional data. For example, a 1080p video at 24 frames per second has a resolution of 1920×1080, so each second contains roughly 150 million dimensions (counting three color channels per pixel). Multiplied by the video's duration, this is a volume of data that is very difficult for AI to process.
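The 150 million figure can be checked with a short calculation. The sketch below is only a back-of-the-envelope illustration: the frame size and frame rate come from the article, while the factor of three color channels is an assumption (the article does not say how the figure was derived), and the function names are made up for this example.

```python
# Rough count of raw values ("dimensions") in uncompressed video.
WIDTH, HEIGHT = 1920, 1080  # 1080p frame size quoted in the article
FPS = 24                    # frame rate quoted in the article
CHANNELS = 3                # assumption: one value per RGB color channel


def values_per_second(width=WIDTH, height=HEIGHT, fps=FPS, channels=CHANNELS):
    """Number of raw values in one second of uncompressed video."""
    return width * height * channels * fps


def values_for_clip(seconds, **kwargs):
    """Number of raw values in a clip of the given length in seconds."""
    return seconds * values_per_second(**kwargs)


if __name__ == "__main__":
    print(f"per second: {values_per_second():,}")  # 149,299,200
    for seconds in (4, 60):  # illustrative clip lengths
        print(f"{seconds:>2} s clip: {values_for_clip(seconds):,}")
```

Under these assumptions one second comes to about 149 million values, which matches the rounded figure of 150 million, and the total grows linearly with clip length.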
"Every dimension of the video is interrelated." Guo Wenjing's team captured this feature.
"We only need to know the information of the first frame, for example, the background of the person walking, the details of his clothing, and the subsequent frames do not need the complete picture."
In other words, the team gave up large-scale training on high-dimensional data and concentrated instead on efficient architectures and data compression methods.
This approach can remove 90% of redundant information, save computing power, reduce the cost of training the model, and deliver better results.
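The intuition that later frames mostly repeat the first one can be illustrated with a toy example. The sketch below is not Pika's method, which the article does not describe in detail; it is a simple lossless delta encoding, chosen only to show how much of a mostly static clip is redundant. The function names and the synthetic 100-pixel clip are hypothetical.

```python
def delta_encode(frames):
    """Keep the first frame whole; for later frames keep only changed pixels."""
    first = list(frames[0])
    deltas = []
    prev = first
    for frame in frames[1:]:
        # Map pixel index -> new value, only where the pixel differs.
        changes = {i: v for i, (p, v) in enumerate(zip(prev, frame)) if p != v}
        deltas.append(changes)
        prev = frame
    return first, deltas


def delta_decode(first, deltas):
    """Rebuild every frame from the first frame plus the stored changes."""
    frames = [list(first)]
    for changes in deltas:
        frame = list(frames[-1])
        for i, v in changes.items():
            frame[i] = v
        frames.append(frame)
    return frames


def stored_values(first, deltas):
    """Values kept: the full first frame plus (index, value) per change."""
    return len(first) + sum(2 * len(c) for c in deltas)


if __name__ == "__main__":
    # Synthetic clip: a single bright pixel moving across a dark 100-pixel frame.
    frames = [[255 if i == k else 0 for i in range(100)] for k in range(10)]
    first, deltas = delta_encode(frames)
    assert delta_decode(first, deltas) == frames
    raw = sum(len(f) for f in frames)
    kept = stored_values(first, deltas)
    print(f"raw values: {raw}, stored values: {kept}")
```

Real video models work with learned, lossy representations rather than exact pixel differences, so the share of data saved in this toy clip should not be read as a reproduction of the 90% figure quoted above.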
Guo Wenjing also embedded motion priors, image priors and the like into the Pika 1.0 model.
A so-called "prior" is a pre-existing understanding of common patterns or behaviors that helps the model better understand and predict the content of a video.
For example, a user may want a video of someone walking in a certain posture that is difficult to describe in words. The solution is to provide a guide, such as a reference video of the walking posture, as the first frame, making it easier for users to control the generated result.
"We want to build a model that thinks like a human."
Based on user suggestions, Guo Wenjing launched the Lip Sync feature in February, which synchronizes a video character's mouth movements with audio. Users can enter text to generate audio or upload their own.
"In the next few years, generating and editing videos will be as easy as taking photos on a mobile phone is now."
Genius team
Guo Wenjing is a veritable "second-generation scientist".
Her mother graduated from the Computer Science Department of MIT, and her father, Guo Huaqiang, graduated from Zhejiang University with a master's degree and is now the actual controller of Xinyada.
In high school, Guo Wenjing was in the competition class at Hangzhou No. 2 Middle School. She was invited by MIT to take part in the North American Programming Invitational and won second place, competing against teams from universities such as Harvard and Stanford.
She studied at Harvard for her undergraduate and master's degrees. Alongside her studies, Guo interned at Microsoft, Google and other companies. In her sophomore year she also worked as an engineer in Meta's AI research department, and she later entered the Stanford University AI Lab to pursue a Ph.D.
The idea of founding Pika originated from a competition during her Ph.D. studies.
Guo Wenjing took part in the first AI Film Festival held by Runway, a veteran AI video company, and found that the tools from Runway and Adobe Photoshop were not easy to use. She wondered whether she could develop a "better and smarter" AI video generation tool herself.
In April last year, Guo Wenjing and her classmate Meng Chenlin dropped out of Stanford and founded Pika.
Guo Wenjing (left) and Meng Chenlin
Although the team is small, all of its members are talented.
Co-founder Meng Chenlin has published many papers; among them, the denoising diffusion implicit model (DDIM) has become a default method for content generation in the industry and is used by OpenAI, Google and others.
Chen Siyu, a member of the founding team, was Guo Wenjing's classmate at Hangzhou No. 2 Middle School. They were also members of the national training teams in informatics and physics, and were recommended for admission to the Turing Class at Peking University for undergraduate study.
At present, Pika has a team of only 13 people, including six IOI (International Olympiad in Informatics) gold medalists, three of whom ranked first in the world.
Pika's advisor lineup is also star-studded: Christopher Manning, director of the Stanford AI Lab; Ron Fedkiw, a two-time winner of Oscar scientific and technical awards...
"We are competing for people with OpenAI and Elon Musk every day." Guo Wenjing is hungry for talent. "We want to be the next Sora, or even surpass Sora, so that the top people can give full play to their abilities."
With industry leaders working together, the team operates efficiently.
Once, an angel investor suggested to the team the idea of embedding text in videos. At 3 a.m., he received a reply saying the feature was ready. The investor immediately decided to invest in Pika's next round.
"We will be more aggressive in making large-scale video models," Guo Wenjing said. Pika plans to rapidly expand its research and engineering teams after this round of financing.
In April, Adobe announced that it would integrate three major external partners, OpenAI, Runway and Pika, into its video editing tool Premiere.
Commercial breakthrough
Sora can generate videos up to 60 seconds long, and Vidu, a large text-to-video model developed in China, reaches 16 seconds; Pika, by comparison, manages only 4 seconds.
Sufficient length is the basis for narrative and plot development, and it is a problem that rising stars must solve.
"There has been a breakthrough in video length. Reaching 60 seconds is not difficult," Guo Wenjing revealed.
Pika currently has millions of users and generates millions of videos every week.
Attention has declined, however. According to Similarweb data, the Pika website received 2 million visits in April, down 64% from its peak.
Fortunately, Pika is still favored by capital.
Pika's latest US$80 million Series B round was led by Spark Capital. The company's valuation exceeded US$470 million, double that of the previous round.
Guo Wenjing has led the team to complete five rounds of financing, with a total financing amount of US$135 million, equivalent to approximately 1 billion yuan.
Its investors include almost all the big names in Silicon Valley, among them Quora founder Adam D'Angelo, former GitHub CEO Nat Friedman and Silicon Valley investor Daniel Gross.
Guo Wenjing has begun commercial exploration, and Pika's revenue comes mainly from membership subscription fees.
The company began charging for its products in January. According to the official website, monthly subscriptions come in two tiers, Standard and Pro, priced at US$10/month and US$60/month respectively.
"I think the consumer (to-C) market still has a chance in the United States. If 100,000 users are willing to pay US$100 per month, we will have a revenue of US$100 million." Guo Wenjing said frankly that it is very difficult to make money from consumers in China.
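The revenue figure in the quote can be checked with simple arithmetic. The sketch below is only an illustration: it assumes the US$100 million refers to roughly one year of subscriptions, which the quote does not state, and the helper name is hypothetical. The tier prices are the ones listed on the official website according to the article.

```python
def annual_revenue(subscribers, monthly_price_usd):
    """Yearly subscription revenue if every subscriber pays all 12 months."""
    return subscribers * monthly_price_usd * 12


if __name__ == "__main__":
    # Scenario from the quote: 100,000 users paying US$100 per month.
    print(f"quoted scenario: ${annual_revenue(100_000, 100):,} per year")

    # Same subscriber count at the two listed tier prices (hypothetical mix).
    for tier, price in (("Standard", 10), ("Pro", 60)):
        print(f"{tier}: ${annual_revenue(100_000, price):,} per year")
```

Read this way, the quoted scenario works out to US$10 million per month, or US$120 million per year, which is consistent with the rounded US$100 million in the quote.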
She revealed that the cost for Pika to generate a 3-second video is currently much lower than Sora's.
"If you exclude the investment in large model training and the cost of GPU (graphics processing unit), the company as a whole is profitable." Guo said.
The most important thing is to launch the next exciting new product.
It is reported that Pika will release a major update within the year, and the upgrade will focus on "controllability."
Young academics and industry leaders have formed a team, and this AI dark horse is galloping ahead.
"We are different from OpenAI. Our goal is not to build AGI (artificial general intelligence), but to make products that serve creators. The essence is to help everyone realize their creativity," Guo Wenjing said.