Fei-Fei Li's team draws new AI ideas from animals and proposes an evolutionary RL computing framework

2021-02-10 13:47:36

Yang Jing, reporting from Aofei Temple
Qubit Report | WeChat official account QbitAI

What if machines could learn and evolve like animals?

This is the latest research from Fei-Fei Li's team.


Over the past 600 million years, animals have learned and evolved into diverse forms in complex environments, and have used those evolved forms to learn complex tasks. This repeated interplay of learning and evolution has produced the cognitive intelligence of animals.

However, the relationship between environmental complexity, evolved morphology, and the learnability of intelligent control remains elusive.

The paper proposes Deep Evolutionary Reinforcement Learning (DERL), a computational framework that can evolve diverse morphologies and learn challenging locomotion and manipulation tasks in complex environments.


Finally, using DERL, the researchers demonstrated several relationships among environmental complexity, morphological intelligence, and the learnability of control.

Morphological intelligence through learning and evolution

Creating adaptive morphologies that learn to act in complex environments is challenging for two reasons.

The first is searching over the enormous space of possible morphological combinations. The second is the computational time required to assess each morphology's fitness through lifetime learning.

As a result, previous work has either evolved within a limited morphological space, focused on finding the best parameters for a fixed morphology, or learned only on flat terrain.

In order to overcome these substantial limitations, this paper proposes a Deep Evolutionary Reinforcement Learning (DERL) computing framework.


The paper introduces an efficient asynchronous method that parallelizes the basic computations of learning and evolution across many compute nodes.

As shown in Figure (b), the outer evolutionary loop optimizes the agent's morphology through mutation operations on attributes such as limb length, position, and size.

The inner reinforcement-learning loop optimizes the parameters of the neural controller.
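The outer-loop/inner-loop structure described above can be sketched as a toy bi-level optimization. This is a minimal illustration, not the paper's implementation: the morphology is reduced to a tuple of limb lengths, the inner RL loop is stubbed as a convergent learning curve whose ceiling depends on the body, and the outer loop uses tournament selection with mutation. All names and numbers here are hypothetical.

```python
import random

random.seed(0)

# Toy "morphology": a tuple of three limb lengths in [0, 1]. The fitness
# ceiling is highest near an arbitrary ideal body (a stand-in for how
# well a body can be controlled; the real framework learns this in simulation).
IDEAL = (0.5, 0.5, 0.5)

def inner_rl_loop(morphology, steps=20):
    """Stand-in for the inner RL loop: 'skill' converges toward a ceiling
    set by the morphology (better bodies support better control)."""
    ceiling = 1.0 - sum(abs(m - i) for m, i in zip(morphology, IDEAL))
    skill = 0.0
    for _ in range(steps):
        skill += (ceiling - skill) * 0.2  # geometric convergence to ceiling
    return skill  # final reward after lifetime learning = fitness

def mutate(morphology):
    """Outer-loop mutation: perturb one randomly chosen limb attribute."""
    m = list(morphology)
    k = random.randrange(len(m))
    m[k] = min(1.0, max(0.0, m[k] + random.uniform(-0.1, 0.1)))
    return tuple(m)

def derl(pop_size=16, generations=30):
    """Outer evolutionary loop: tournament selection over morphologies,
    each evaluated by running the inner learning loop to completion."""
    population = [tuple(random.random() for _ in range(3))
                  for _ in range(pop_size)]
    for _ in range(generations):
        a, b = random.sample(range(pop_size), 2)
        winner = a if inner_rl_loop(population[a]) >= inner_rl_loop(population[b]) else b
        loser = b if winner == a else a
        population[loser] = mutate(population[winner])  # child replaces loser
    return max(population, key=inner_rl_loop)

best = derl()
print(best, round(inner_rl_loop(best), 3))
```

In the real framework each inner-loop evaluation is a full reinforcement-learning run in physics simulation, which is why the asynchronous parallelization across many compute nodes matters.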

The paper also introduces UNIMAL, a UNIversal aniMAL morphological design space, shown in Figure (d), which is both highly expressive and rich in useful, controllable morphologies.

The complex environment combines three types of randomly generated obstacles: hills, steps, and gravel. The agent must start from its initial position (green object in Figure e) and move a box to the target position (red square).


In addition, DERL produces embodied agents that not only learn from less data but also generalize to solve multiple new tasks, mitigating the sample inefficiency of reinforcement learning.

DERL operates by mimicking the intertwined processes of morphological search and neural learning across many generations of agents, as in Darwinian evolution, evaluating how quickly and how well a given morphology learns to solve complex tasks through intelligent control.

There are eight test tasks in total, covering stability, agility, and manipulation, used to evaluate how well each morphology facilitates reinforcement learning.


The researchers selected the ten best-performing morphologies from three evolutionary runs in each environment, then trained each morphology from scratch on all eight test tasks.

Finally, the best morphology evolved in each environment was selected.
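The selection protocol above (top morphologies per run, pooled, then retrained on every test task) can be sketched as follows. This is a hedged illustration with placeholder scores: the structure of the selection is from the article, but `evaluate_protocol`, the random fitness values, and the retraining stub are all hypothetical.

```python
import random

random.seed(1)

def evaluate_protocol(run_results, top_k=10, test_tasks=8):
    """Pick the top-k morphologies from each evolutionary run, pool them,
    'retrain' each from scratch on every test task, and rank by mean score."""
    pool = []
    for run in run_results:
        pool.extend(sorted(run, key=lambda m: m["fitness"], reverse=True)[:top_k])
    for morph in pool:
        # Stand-in for training from scratch on each of the 8 test tasks;
        # here a noisy fraction of evolutionary fitness, not a real RL run.
        morph["test_scores"] = [morph["fitness"] * random.uniform(0.8, 1.0)
                                for _ in range(test_tasks)]
        morph["mean_score"] = sum(morph["test_scores"]) / test_tasks
    return max(pool, key=lambda m: m["mean_score"])

# Three evolutionary runs of 40 morphologies each, with placeholder fitness.
runs = [[{"name": f"run{r}-morph{i}", "fitness": random.random()}
         for i in range(40)] for r in range(3)]
best = evaluate_protocol(runs)
print(best["name"], round(best["mean_score"], 3))
```

Retraining from scratch, rather than reusing the controller evolved alongside each body, is what isolates the contribution of the morphology itself to learnability.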


The researchers found that, through the Baldwin effect, an agent's adaptability can be rapidly transferred from its learned phenotypic abilities to its genotypically encoded morphology over just a few generations of evolution.

(Baldwin effect: behaviors and habits initially acquired through learning, with no genetic basis, can over many generations of transmission eventually become innate, genetically encoded traits.)

These evolved morphologies give the agents better and faster learning capabilities for adapting to new tasks. The team hypothesizes that this is achieved by increasing passive stability and energy efficiency.


In addition, the following relationships were confirmed among environmental complexity, morphological intelligence, and the learnability of control.

First, environmental complexity promotes the evolution of morphological intelligence, quantified as a morphology's ability to facilitate the learning of new tasks.

Second, evolution rapidly selects morphologies that learn faster. This result constitutes the first demonstration of the long-conjectured morphological Baldwin effect.

Third, experiments show that the Baldwin effect and the emergence of morphological intelligence share a mechanistic basis: evolution favors physically more stable and more energy-efficient morphologies, which in turn facilitate learning and control.

Team introduction

The research was led by Fei-Fei Li's team, a collaboration among groups from Stanford University's Department of Computer Science, Department of Applied Physics, and the Wu Tsai Neurosciences Institute.


The first author is Agrim Gupta, a second-year doctoral student at Stanford University whose research focuses on computer vision.


Link to the paper:
https://arxiv.org/abs/2102.02202

- End -

Origin blog.csdn.net/weixin_42137700/article/details/113818239