An endless variety of virtual creatures scamper and scuttle across the screen, struggling over obstacles or dragging balls toward a target. They look like half-formed crabs made of sausages—or perhaps Thing, the disembodied hand from The Addams Family. But these “unimals” (short for “universal animals”) could in fact help researchers develop more general-purpose intelligence in machines.
Agrim Gupta of Stanford University and his colleagues (including Fei-Fei Li, who co-directs the Stanford Institute for Human-Centered AI and led the creation of ImageNet) used these unimals to explore two questions that often get overlooked in AI research: how intelligence is tied to the way bodies are laid out, and how abilities can be developed through evolution as well as learned.
“This work is an important step in a decades-long attempt to better understand the body-brain relationship in robots,” says Josh Bongard, who studies evolutionary robotics at the University of Vermont and was not involved in the work.
If researchers want to re-create intelligence in machines, they might be missing something, says Gupta. In biology, intelligence arises from minds and bodies working together. Aspects of body plans, such as the number and shape of limbs, determine what animals can do and what they can learn. Think of the aye-aye, a lemur that evolved an elongated middle finger to probe deep into holes for grubs.
AI typically focuses only on the mind part, building machines to do tasks that can be mastered without a body, such as using language, recognizing images, and playing video games. But this limited repertoire could soon get old. Wrapping AIs in bodies that are adapted to specific tasks could make it easier for them to learn a wide range of new skills. “One thing every single intelligent animal on the planet has in common is a body,” says Bongard. “Embodiment is our only hope of making machines that are both smart and safe.”
Unimals have a head and multiple limbs. To see what they could do, the team developed a technique called deep evolutionary reinforcement learning (DERL). The unimals are first trained using reinforcement learning to complete a task in a virtual environment, such as walking across different types of terrain or moving an object.
The unimals that perform the best are then selected and mutations are introduced, and the resulting offspring are placed back in the environment, where they learn the same tasks from scratch. The process repeats hundreds of times: evolve and learn, evolve and learn.
The mutations involve adding or removing limbs, or changing the length or flexibility of limbs. The number of possible body configurations is vast: there are 10^18 unique variations with 10 limbs or fewer. Over time, the unimals’ bodies adapt to different tasks. Some unimals evolved to move across flat terrain by falling forwards; some evolved a lizard-like waddle; others evolved pincers to grip a box.
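For readers who think in code, the evolve-and-learn cycle described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the team's implementation: the `Limb` fields, the `learn_and_score` stand-in (which replaces reinforcement learning with a made-up scoring formula), the keep-the-top-quarter selection rule, and every numeric range are hypothetical choices made only to keep the example short and runnable.

```python
import random
from dataclasses import dataclass

MAX_LIMBS = 10  # the article cites body plans with 10 limbs or fewer


@dataclass(frozen=True)
class Limb:
    length: float       # arbitrary units (assumption)
    flexibility: float  # 0 = rigid, 1 = very flexible (assumption)


def mutate(body, rng):
    """Return a copy of `body` with one random mutation applied."""
    body = list(body)
    choice = rng.choice(["add", "remove", "length", "flexibility"])
    if choice == "add" and len(body) < MAX_LIMBS:
        body.append(Limb(rng.uniform(0.2, 1.0), rng.uniform(0.0, 1.0)))
    elif choice == "remove" and len(body) > 1:
        body.pop(rng.randrange(len(body)))
    else:
        # Change one limb; also the fallback when add/remove is not allowed.
        i = rng.randrange(len(body))
        limb = body[i]
        if choice == "length":
            body[i] = Limb(rng.uniform(0.2, 1.0), limb.flexibility)
        else:
            body[i] = Limb(limb.length, rng.uniform(0.0, 1.0))
    return tuple(body)


def learn_and_score(body, rng):
    """Hypothetical stand-in for lifetime learning.

    In the work described, each body is trained from scratch with
    reinforcement learning in a physics simulation. Here a made-up
    formula plus noise plays that role so the loop can run.
    """
    total_length = sum(limb.length for limb in body)
    mean_flex = sum(limb.flexibility for limb in body) / len(body)
    return -abs(total_length - 3.0) + mean_flex + rng.gauss(0, 0.05)


def evolve(generations=50, population_size=16, seed=0):
    """Alternate 'learn' (score each body) and 'evolve' (select, mutate)."""
    rng = random.Random(seed)
    population = [(Limb(0.5, 0.5),) for _ in range(population_size)]
    for _ in range(generations):
        ranked = sorted(population,
                        key=lambda b: learn_and_score(b, rng),
                        reverse=True)
        parents = ranked[:max(1, population_size // 4)]  # assumed rule
        # Offspring start over: only the body is inherited, not the score.
        population = [mutate(rng.choice(parents), rng)
                      for _ in range(population_size)]
    return population
```

Calling `evolve()` returns the final population of body plans as tuples of `Limb` objects. The point of the sketch is the shape of the loop: offspring inherit a body but nothing their parents learned, so each generation has to learn the task again.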
The researchers also tested how well the evolved unimals could adapt to a task they hadn’t seen before, an essential feature of general intelligence. Those that had evolved in more complex environments, containing obstacles or uneven terrain, were faster at learning new skills, such as rolling a ball instead of pushing a box. They also found that DERL selected body plans that learned faster, even though there was no selective pressure to do so. “I find this exciting because it shows how deeply body shape and intelligence are connected,” says Gupta.
“It’s already known that certain bodies accelerate learning,” says Bongard. “This work shows that AI can search for such bodies.” Bongard’s lab has developed robot bodies that are adapted to particular tasks, such as giving callus-like coatings to feet to reduce wear and tear. Gupta and his colleagues extend this idea, says Bongard. “They show that the right body can also speed up changes in the robot’s brain.”
Ultimately, this technique could reverse the way we think of building physical robots, says Gupta. Instead of starting with a fixed body configuration and then training the robot to do a particular task, you could use DERL to let the optimal body plan for that task evolve and then build that.
Gupta’s unimals are part of a broad shift in how researchers are thinking about AI. Instead of training AIs on specific tasks, such as playing Go or analyzing a medical scan, researchers are starting to drop bots into virtual sandboxes—such as POET, OpenAI’s virtual hide-and-seek arena, and DeepMind’s virtual playground XLand—and getting them to learn how to solve multiple tasks in ever-changing, open-ended training dojos. Instead of mastering a single challenge, AIs trained in this way learn general skills.
For Gupta, free-form exploration will be key for the next generation of AIs. “We need truly open-ended environments to create intelligent agents,” he says.