Google Robot Tech Can Understand You on a Human Level
With AI language skills, you can command robots in plain English. The same AI tech helps them navigate the chaos of the real world.
Teaching robots to understand language also helps them cope with the open-ended complexity of the real world, Google says.
Google has grafted its latest artificial intelligence technology for handling language, called PaLM, onto robots from Everyday Robots, an experimental division of parent company Alphabet. It showcased the resulting technology, called PaLM-SayCan, on Tuesday.
With PaLM, Google's AI language model brings broad knowledge of the real world to help a robot interpret human commands and piece together a sequence of actions in response. That's a big difference compared with the precisely scripted actions most robots follow in tightly controlled circumstances, like installing windshields on a car assembly line. Crucially, Google also factors in the robot's abilities, so the chosen course of action is one that's actually possible given the robot's skills and environment.
Google's PaLM-SayCan robots use AI language models to work out that picking up a sponge is useful for someone who needs help with a spilled drink.
This is a research project that's not yet ready for prime time. But Google has been testing it in an actual office kitchen, rather than a more controlled lab environment, in an effort to build robots that can be useful in the unpredictable activities of our actual lives. Along with projects such as Tesla's bipedal Optimus bot, Boston Dynamics' creations and Amazon's Astro, it shows how robots could eventually move out of science fiction and into the real world.
When a Google AI researcher says to a PaLM-SayCan robot, "I spilled my drink, can you help?" the robot glides on its wheels through a kitchen in a Google office building, uses digital camera vision to spot a sponge on the counter, grasps it with its motorized arm and carries it back to the researcher. The robot can also recognize cans of Pepsi and Coke, open drawers and locate bags of chips. With PaLM's abstraction abilities, it can even understand that yellow, green and blue bowls can metaphorically represent a desert, a jungle and an ocean, respectively.
"As we improve the language models, the robotic performance also improves," said Karol Hausman, a senior research scientist at Google who helped demonstrate the robots.
The rapid advancement of AI has profoundly transformed how computer technology works and what it can do. With modern neural network technology, which is loosely modeled on human brains and also known as deep learning, AI systems are trained on vast quantities of real-world data. After seeing many photos of a cat, for example, an AI system can recognize one without being told that it usually has four legs, pointy ears and whiskers.
Google used a machine with 6,144 processors to train PaLM, short for Pathways Language Model, on a vast multilingual collection of web documents, books, Wikipedia articles, conversations and programming code from Microsoft's GitHub site. The result is an AI model that can explain jokes, complete sentences, answer questions and follow a chain of thought to reason.
The PaLM-SayCan work marries this language understanding with the robot's own abilities. When the robot receives a command, it pairs the language model's suggestions with a set of about 100 skills it's learned, then selects the action that best fits both the command and the robot's abilities.
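The selection idea can be sketched in a few lines of Python. This is a simplified illustration, not Google's actual code: the function names and the hard-coded scores below are hypothetical stand-ins for PaLM's likelihood estimates and the robot's learned feasibility ("affordance") estimates described in the research.

```python
# Minimal sketch of PaLM-SayCan-style skill selection.
# All scores here are made-up placeholders for illustration.

def language_score(instruction, skill):
    """Stand-in for how relevant the language model judges a skill
    to be for the instruction (the real system uses PaLM's
    likelihood of the skill's text description)."""
    relevance = {
        ("clean up the spill", "find a sponge"): 0.8,
        ("clean up the spill", "pick up the sponge"): 0.6,
        ("clean up the spill", "go to the drawer"): 0.1,
    }
    return relevance.get((instruction, skill), 0.05)

def affordance_score(skill, robot_state):
    """Stand-in for the robot's estimate that it can currently
    complete the skill given what it sees around it."""
    return robot_state.get(skill, 0.0)

def pick_next_skill(instruction, skills, robot_state):
    # The key idea: weight what the language model suggests by what
    # the robot can actually do right now, and take the best skill.
    return max(
        skills,
        key=lambda s: language_score(instruction, s) * affordance_score(s, robot_state),
    )

skills = ["find a sponge", "pick up the sponge", "go to the drawer"]
# The robot hasn't located the sponge yet, so picking it up is infeasible.
state = {"find a sponge": 0.9, "pick up the sponge": 0.2, "go to the drawer": 0.9}
print(pick_next_skill("clean up the spill", skills, state))  # -> find a sponge
```

Even though "pick up the sponge" is nearly as relevant in language terms, the low feasibility score steers the robot toward finding the sponge first, which is the grounding effect the article describes.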
Though the robot's abilities are limited by its training and circumstances, it's far more flexible than an industrial robot. When a researcher asked a PaLM-SayCan robot to "build me a burger," it stacked wooden block versions of buns, a patty, lettuce and a ketchup bottle in the correct order.
The robot's skills and environment offer a real-world grounding for the broader possibilities of the language model, Google said. "The skills will act as the [language model's] 'hands and eyes,'" the researchers wrote in a PaLM-SayCan research paper.
The outcome is a robot that can cope with a more complicated environment, identifying objects by what they're for and when they're needed. "Our performance level is high enough that we can run this outside a laboratory setting," Hausman said.
At Google's robotics offices in Mountain View, California, up to 30 wheeled Everyday Robots patrol the office. Each has a broad base for balance and locomotion; a thicker stalk rising to a human's chest height that supports an articulated "head"; a face with various cameras and a green ring that glows when the robot is active; an articulated grasping arm; and a spinning lidar sensor that uses a laser to create a 3D scan of its environment. On the back is a big red stop button, but the robots are built to avoid collisions.
Some robots learn to pick up objects at training stations. Though that's a time-consuming process, once one robot has learned a skill, it can be transferred to the others.
Other robots glide around the offices, each with a single arm folded behind it and a face pointed toward QR codes taped to windows, fire extinguishers and a large Android robot statue. These ambulatory robots are learning how to behave politely around humans, said Vincent Vanhoucke, a Google distinguished scientist and director of the robotics lab.
“AI has been very successful in digital worlds, but it still has to make a significant dent solving real problems for real people in the real physical world,” Vanhoucke said. “We think it’s a great time right now for AI to migrate into the real world.”