A two-armed robot stands beside the stove, picks up the edge of a bowl with two fingers, and pours the shrimp into the pan with a "pop". It holds the pan handle in its left hand and stir-fries with a spatula in its right. After a while of sizzling, the robot transfers the shrimp to a nearby plate. The fried shrimp dish is done.
Unlike previous robots, which required complicated training, this robot learned to fry shrimp fully autonomously after 50 rounds of "teleoperation" training. A technology blogger explained that the robot has a strong learning ability: "It directly clones human behavior through imitation learning, which allows it to learn any skill."
This is Mobile ALOHA, a "housework robot" developed by a team of Chinese researchers at Stanford University. Besides frying shrimp, it can fold quilts, do laundry, water flowers, use a vacuum cleaner, ride elevators, and high-five humans. On the morning of January 4, local time, team members uploaded the demonstration video, which quickly drew wide attention on the Internet in China and abroad, and the related topic received millions of views. Netizens weighed in with comments such as "the elderly can be taken care of by robots," "freeing people from housework," and "robots are the last mile in solving the smart home problem."
According to public information, the project is led by two Stanford doctoral students, with Stanford University Assistant Professor Chelsea Finn serving as their advisor.
"The project information was uploaded in the morning, US Pacific time, while China was asleep. More than ten hours later, I found that people were already posting about it in WeChat Moments." On January 10, a reporter from the Beijing News contacted Fu Zipeng, a member of the project's research team. Fu is a PhD student in computer science at Stanford University, focusing on AI and robotics. He explained that Mobile ALOHA is a control system that he and two other team members developed independently to let robots do housework fully automatically.
Fu Zipeng explained that the team was the first to combine "teleoperation", the AI technique of "imitation learning", and a "collaborative training" pipeline, which greatly improves the robot's learning efficiency and results and lets it learn more human actions. First, a human commands the robot through "teleoperation", that is, by controlling the robot's movements from a certain distance. For example, if a human performs the motions of pouring or frying shrimp, the robot follows suit. This "following" is a form of training and generates corresponding data. The data is then analyzed by AI algorithms so that the robot can imitate it. As the number of "teleoperation" training sessions increases, the robot learns specific human movements and can then operate autonomously.
The team's experimental data shows that after the same human action has been demonstrated through "teleoperation" more than 50 times, the robot's autonomous success rate exceeds 90%.
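The learning loop described above can be pictured with a toy sketch. This is not the team's code: it assumes a one-dimensional observation, a linear policy, and a hypothetical `human_expert` function standing in for the human operator. The "collaborative training" step is represented only loosely, by pooling the new demonstrations with an older dataset, since the article does not give its details.

```python
import random

def collect_demonstrations(expert, n_episodes, steps_per_episode, rng):
    """Stand-in for teleoperation: log (observation, action) pairs
    while the human 'expert' drives the robot."""
    data = []
    for _ in range(n_episodes):
        for _ in range(steps_per_episode):
            obs = rng.uniform(-1.0, 1.0)
            data.append((obs, expert(obs)))
    return data

def fit_linear_policy(data):
    """Imitation learning as plain supervised regression:
    fit action = w * obs + b by least squares."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in data)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return lambda obs: w * obs + b

def human_expert(obs):
    # Hypothetical operator behavior, chosen only for illustration.
    return 2.0 * obs + 0.5

rng = random.Random(0)
# 50 teleoperated episodes of the new task, 10 logged steps each (assumed).
new_task_demos = collect_demonstrations(human_expert, 50, 10, rng)
# An older dataset pooled in, as a loose stand-in for "collaborative training".
prior_demos = collect_demonstrations(human_expert, 200, 10, rng)

policy = fit_linear_policy(new_task_demos + prior_demos)
print(policy(0.3), human_expert(0.3))
```

The point of the sketch is only that nobody writes rules for the task: the policy is fitted to recorded demonstrations, so more demonstrations mean more training data.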
"In the past, humans might need to write tens of thousands of lines of code to tell the robot where to stop, what it was seeing, and what to hand over. Now 'teleoperation' provides enough data, the AI algorithm analyzes and uses it, and after 'collaborative training' you get a robot that can operate autonomously. Even ordinary people who did not major in computer science can perform 'teleoperation' and train the robot," Fu Zipeng said.
According to information released by the team, the robot costs approximately US$32,000 to build. Its software and hardware designs are open source, and the relevant data is published on team members' GitHub pages. This means that any individual or institution can use the free public data to assemble and study their own version of the robot.
The robot rides the elevator autonomously. Photo provided by interviewee
Is the era of robots coming? On related topics, we had the following conversation with Fu Zipeng:
“At the moment, robots are successfully learning simple, short movements.”
Beijing News: When did you start this research? Some commenters say that Mobile ALOHA is much cheaper than other dual-arm robots on the market. How did you manage that?
Fu Zipeng: During the summer vacation of 2023, we started purchasing the hardware. In early October, once all the hardware was in place, we began assembling the robot and working on improved algorithms.
We bought the hardware and assembled it ourselves, the chassis and robotic arms we chose were relatively affordable, and the cost of design and assembly labor is not counted, so the total cost was not high.
Beijing News: Which human actions has the robot successfully learned so far?
Fu Zipeng: After 50 rounds of "teleoperation" training, the robot can now autonomously fry shrimp, put pots in cupboards, and ride elevators, with a success rate above 90%.
Some of the other actions in the video we uploaded, such as cracking eggs and cooking a complete dinner, were not performed autonomously but through our "teleoperation". Actions of this type are relatively complex and too costly to train, so the robot has not yet learned to perform them on its own.
Right now, robots are successfully learning simple, short movements.
Beijing News: Can more "teleoperation" training push the success rate beyond 90%?
Fu Zipeng: It is possible, but there are diminishing marginal returns. Take frying shrimp as an example: the first 20 or 30 "teleoperation" sessions can raise the robot's autonomous success rate to 70%, and another 20 sessions, for a total of 50, can raise it to 90%. But if you keep training a thousand or ten thousand times, that is, frying shrimp a thousand or ten thousand times, you may only raise the success rate to 95% to 98%.
Pushing the success rate further is therefore a job for industry. Our team has only two or three people working on this day to day. We have limited funds and time, so it is difficult for us to make breakthroughs in this area.
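The diminishing returns Fu describes can be made concrete with a little arithmetic. The sketch below uses the figures from his answer, with three labeled assumptions: "20 or 30" sessions is taken as 30, an untrained robot is taken to start at 0%, and 1,000 and 10,000 sessions are paired with 95% and 98% respectively. It computes the gain in percentage points per additional training session.

```python
# (training sessions, success rate in %) taken from the interview;
# the (0, 0) starting point and the pairing of the last two rows are assumptions.
points = [(0, 0), (30, 70), (50, 90), (1000, 95), (10000, 98)]

def marginal_gains(points):
    """Percentage points gained per extra session between consecutive rows."""
    gains = []
    for (n0, s0), (n1, s1) in zip(points, points[1:]):
        gains.append((s1 - s0) / (n1 - n0))
    return gains

gains = marginal_gains(points)
for (n0, _), (n1, _), g in zip(points, points[1:], gains):
    print(f"sessions {n0:>5} to {n1:>5}: {g:.4f} points per session")
```

Under these assumptions each session is worth more than two points at the start, one point between 30 and 50 sessions, and well under a hundredth of a point beyond that.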
Beijing News: What are the main difficulties and breakthroughs in this research?
Fu Zipeng: The first difficulty is hardware. Today, most robots with "teleoperation" systems can only perform simple pick-and-place actions; or they can perform fine manipulation but cannot move around, such as medical robots; or they can move around but perform only limited operations, such as robot vacuum cleaners. We want to build a multi-functional robot that can be used for household chores, in offices, and in other scenarios, while keeping costs under control.
There is also a breakthrough in software. This time we combined the data collected through "teleoperation" with the AI algorithm and used the "collaborative training" pipeline to further improve the robot's learning efficiency. This is something no one had tried before.
Fu Zipeng "teleoperating" the robot. Picture provided by the interviewee
“We currently have no plans to commercialize this project”
Beijing News: Why did you choose to open source this project?
Fu Zipeng: We hope to better promote this project and allow more people to participate and study together. We currently have no plans to commercialize this project.
Beijing News: Do you think this robot can now be put into production and made available to ordinary consumers?
Fu Zipeng: It works for researchers, but not for ordinary consumers. Researchers can buy it for their own research, but consumers who want to use it for housework first need to do a lot of "teleoperation" training. For example, if you want it to learn to fry shrimp, you have to guide it through frying them more than fifty times. The cost of such training is too high.
For this robot to enter ordinary households, the data collection process must first be simplified. This is currently a technical bottleneck.
Beijing News: What are the future research and improvement directions for Mobile ALOHA?
Fu Zipeng: The first is the robot's generalization ability. For example, if I teach it to wash one dish, can it generalize by analogy to washing dishes in general? At the current level of research, it is difficult to achieve this in one step. What we envision is teaching it to wash dishes, and also cups, pots, and as many different utensils as possible. That makes it much more likely that it will learn to clean a utensil it has never seen before.
In other words, the more tasks a robot has done, the stronger its generalization ability will be.
In addition, to "teleoperate" the robot today, a person must stand behind it. In the future, we hope to "teleoperate" robots remotely, for example over video, which will make training them more convenient.
Through "teleoperation", robots are learning to shave humans. Picture provided by interviewee
“It may take a long time to have an all-round robot”
Beijing News: Besides housework robots, what other robot applications is the academic community studying?
Fu Zipeng: The ones I know of include medical robots, logistics sorting robots, and production line robots. Different fields accept robots to different degrees. Logistics sorting, for example, does not place such high demands on safety and accuracy. In the medical field, by contrast, it is hard to accept a complete operation performed by a fully automatic robot. The robot can only serve as an aid that saves surgeons some operating time.
Beijing News: Will robots be able to autonomously do all housework for humans in the future?
Fu Zipeng: It will be difficult in the short term. Housework tasks involve too many variables: different styles of refrigerators, tabletops of different heights, and carpets of different materials all require the robot to learn to varying degrees. It is therefore difficult to build a fully autonomous housework robot that can do all housework for humans.
By contrast, the relatively mature autonomous robots today are self-driving vehicles. Studies have shown that self-driving systems generally perform better than ordinary human drivers. Traffic is actually a very fixed scenario, and the robot has only one task: to go from A to B without a collision. No matter where you drive, the underlying logic is the same, so it is relatively easy for the robot to learn.
Beijing News: What do you think the relationship between robots and humans will be like in the future?
Fu Zipeng: The robots actually deployed today are very unintelligent. What they can do falls far short of what a human child can do. They can only improve human efficiency in certain areas, such as driving. A general-purpose, all-round robot that handles most things better than humans is hard to imagine for now, and it may take a long time to realize.
Beijing News reporter, Feng Yuxin
editor, Chen Xiaoshu
proofreader, Wang Xin