Our special correspondent Ren Zhong; our special correspondent in the United States Liu Yiran
"Artificial intelligence (AI) music technology race: Udio, another ChatGPT for music production." Under this headline, the American magazine "Rolling Stone" reported that just weeks after the AI music generator Suno caused a stir, a new contender called Udio has emerged, backed by heavyweights in the tech and music worlds. The voice of rock star Tom Petty, "resurrected" by this music generator, is almost indistinguishable from Petty himself. As recently as last year, many experts believed that AI models capable of generating complete, high-fidelity songs from text prompts would not arrive anytime soon, but now a technology race has begun around music production models.
The escaped "Google Lyria"
According to PR Newswire, Udio's AI music generator attracted attention as soon as it was unveiled on the 10th of this month. The generator, dubbed the "music version of Sora", was developed by Udio, a company founded in New York last year. The company says its mission is to bring world-changing products to market, making it easy for anyone to create emotionally resonant music in an instant. Whether users are creating professional tracks or generating fun soundtracks for memes, Udio expands the ways in which everyone can create and share music. For this reason, the industry has praised Udio as the first company to achieve freedom in song creation.
Udio was built mainly by a group of former Google AI engineers and researchers. Of the five co-founders, all except Andrew Sanchez come from DeepMind, Google's AI research division: Ding Fengning, Connor Durkan, Charlie Nash and Jaroslav Ganin. Although these four researchers are not well-known "big names" in the industry, Udio attracted investment as soon as it was founded, from top US venture capital firms and from Silicon Valley figures including Instagram Chief Technology Officer Mike Krieger.
Alex, an expert in the field of artificial intelligence, wrote on social media that Udio is a "Google Lyria" that has escaped and been opened to the public (Lyria is Google's AI music generation model). Several former DeepMind researchers raised funds and trained their model on their own machines over the course of three months. Indeed, Ding Fengning, Nash, Durkan and Ganin all made important research contributions to Lyria before leaving Google. In its introduction to Lyria, DeepMind proudly stated: "Generative music technology can change the future of music creation and use. Our cutting-edge work in this field will further inspire the creativity of artists, music producers and fans everywhere."
"A group of very pragmatic people"
Of the founders mentioned above, Ding Fengning worked at DeepMind the longest and is now the CEO of Udio. Ding Fengning is a Chinese-American who graduated from Phillips Academy in Andover. In 2011, he entered PRIMES, MIT's research program in mathematics, engineering and science for high school students, to conduct research on representation theory. During this period, his paper on basic algebra won fourth place in the Intel Science Talent Search, a competition known as the "Junior Nobel Prize". After completing the program, he entered Harvard University in 2012 and received a bachelor's degree in mathematics and a master's degree in computer science. In July 2018, Ding Fengning joined the Google DeepMind team as a senior research engineer, leading a team of 30 people working on reinforcement learning and multimodal modeling. During his five years there, he took part in the development of the Lyria model. Lyria was officially released on November 16, 2023. That same month, Ding Fengning resigned from Google and subsequently founded Udio.
Andrew Sanchez is a co-founder of Udio and its Director of Operations. From July 2022 to October 2023, Sanchez headed the AI team at the search company Yext. Before entering the technology industry, he completed his undergraduate studies at Harvard University and obtained his master's and doctoral degrees at Oxford University. His doctoral thesis was on "The History of Cybernetics."
In contrast to the world-famous schools attended by Ding Fengning and Sanchez, Durkan graduated from University College Cork in Ireland, where he majored in mathematics, and spent a year as an exchange student at the University of California, Berkeley. He then obtained a master's degree in mathematics and a PhD in machine learning from the University of Edinburgh in the UK. During his doctoral studies, he interned at DeepMind for five months. After graduation, Durkan joined DeepMind as a senior researcher. In January 2024, he moved to Udio and became a co-founder. Another co-founder, Charlie Nash, also graduated from the University of Edinburgh, with undergraduate and postgraduate degrees in mathematics-related fields. Nash joined DeepMind as a researcher in 2019 and joined Udio last year.
Another top student on the team, Ganin, is from Russia. He graduated from Moscow State University with bachelor's and master's degrees in mathematics. Ganin then went to Canada to pursue a doctorate in computer science at the University of Montreal, graduating in 2019. While studying for his PhD, he interned at DeepMind for eight months. After graduation, he continued to work at DeepMind until he left in November last year, and then took part in founding Udio. Investor Krieger praised this brand-new team: "These technical partners are a group of very pragmatic people, and the project has been progressing very quickly."
Has received US$10 million in seed funding
Although it has been established for only a short time, Udio has already received US$10 million in seed funding. Ding Fengning emphasized: "Currently there is no product that can compete with Udio's ease of use, voice quality and musicality. This is the best testament to the people we have involved."
The "Maginative" website reported that what sets Udio apart is its user-friendly approach to music creation. Users only need to enter a description of the type of music they want and provide personalized lyrics or keywords, and they receive a piece of music within a few seconds. Currently, Suno can create two-minute musical clips based on a given prompt. Udio provides more customization options: it generates music clips of at least 30 seconds, and users can extend the length as needed.
However, the launch of Udio has also raised concerns among some musicians, who worry that AI music generators may use copyrighted material to train their models without permission. "Rolling Stone" magazine stated that although neither Udio nor Suno has explicitly confirmed or denied it, there is sufficient reason to believe that both companies have used unauthorized copyrighted music for AI training. Whether copyrighted material can legally be used for AI training remains an open question in a number of pending legal cases.