New progress in NYU embodied intelligence: a robot learns to open cans through visual feedback, task success rate up 135%, and LeCun gave it a like...

Cressy from Aofei Temple
Qubit | Official account QbitAI

Watch how the robot easily snips a piece of wire with a pair of pliers.

It also flips open the lid of a tin box in just a few quick moves.

It also handles tasks such as picking up objects with ease.

Behind this robot is the latest embodied-intelligence work from New York University and Meta AI.

The researchers propose a new training method called TAVI, which combines vision and touch to more than double the robot's task success rate.

The team's paper is publicly available, and the code has been open sourced.

Seeing the robot in action, Meta chief scientist Yann LeCun remarked that this is impressive progress.

So what else can a robot trained in this way do?

Picking things up and putting them away is a piece of cake

It can separate two bowls that are stacked together and take the top one.

Watch closely: during the separation, the robot's hand makes a following motion that lets the yellow bowl slide along the inner wall of the green bowl.

The robot can not only "take apart" but also "put together".

After picking up the red object, the robot accurately placed it into the purple lid.

Or flip an eraser over.

The robot picks up a large eraser and then uses the box beneath it to adjust the angle.

It is not clear why it didn't use more fingers, but it has at least learned to use a tool.

In short, the movements of robots trained with the TAVI method look somewhat human-like.

According to the results, TAVI significantly outperforms methods that use only tactile or only visual feedback across six typical tasks.

Compared with the AVI method, which uses no tactile information, TAVI's average success rate is 135% higher; compared with an image-plus-tactile reward-model method, it is also roughly double.

And the success rate of T-DEX, a training method that likewise mixes visual and tactile models, is less than a quarter of TAVI's.

Robots trained with TAVI also show strong generalization: they can complete tasks with objects they have never seen before.

In the "picking up the bowl" and "packing the box" tasks, the robot's success rate with previously unseen objects exceeded 50%.

In addition, a TAVI-trained robot can not only complete individual tasks well but also perform multiple sub-tasks in sequence.

As for robustness, the research team ran tests with adjusted camera angles, and the robot still maintained a high success rate.

So, how does the TAVI method achieve such an effect?

Evaluating robot performance using visual information

The core of TAVI is to use visual feedback to train the robot. The work is mainly divided into three steps.

The first step is to collect human demonstrations in two modalities: vision and touch.
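
To make this concrete, here is a minimal sketch, in Python, of what a single step of such a multimodal demonstration could look like. The field names, array shapes, and the `load_demo` helper are illustrative assumptions, not the format actually used by the paper.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class DemoStep:
    image: np.ndarray      # RGB camera frame, e.g. (224, 224, 3)
    tactile: np.ndarray    # flattened readings from the hand's touch sensors
    action: np.ndarray     # commanded fingertip / joint positions

def load_demo(num_steps: int = 50) -> list:
    """Stand-in for loading one human teleoperation demonstration."""
    return [
        DemoStep(
            image=np.zeros((224, 224, 3), dtype=np.uint8),
            tactile=np.zeros(16 * 15, dtype=np.float32),  # assumed sensor layout
            action=np.zeros(12, dtype=np.float32),
        )
        for _ in range(num_steps)
    ]
```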

The visual information collected is used to build a reward function for use in the subsequent learning process.

In this step, the system uses contrastive learning to extract visual features that are useful for the task and to score how well the robot's actions complete it.
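
The general shape of this idea can be sketched as follows: a contrastively trained visual encoder embeds both demonstration frames and the robot's own frames, and the closer the robot's frames are to the demonstration in that feature space, the higher the reward. The encoder argument, the cosine-similarity score, and the per-frame matching below are all illustrative assumptions; the paper's actual incentive is computed over whole trajectories.

```python
import torch
import torch.nn.functional as F

def encode(frames: torch.Tensor, encoder: torch.nn.Module) -> torch.Tensor:
    """Embed a batch of frames (B, C, H, W) with a contrastively trained visual encoder."""
    with torch.no_grad():
        features = encoder(frames)
    return F.normalize(features, dim=-1)  # unit-length feature vectors

def visual_reward(robot_frames: torch.Tensor,
                  demo_frames: torch.Tensor,
                  encoder: torch.nn.Module) -> torch.Tensor:
    """Toy per-frame reward: how similar each robot frame is to the matching demo frame.

    The real method scores entire trajectories against the demonstrations; this
    frame-by-frame cosine similarity is only meant to convey the general idea.
    """
    z_robot = encode(robot_frames, encoder)
    z_demo = encode(demo_frames, encoder)
    return (z_robot * z_demo).sum(dim=-1)  # cosine similarity, higher = closer to the demo
```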

Then tactile information and visual feedback are combined for reinforcement learning, letting the robot try again and again until it earns a higher completion score.

TAVI learns step by step: as training progresses, the reward signal becomes more refined and the robot's movements become more precise.
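
Put together, the trial-and-error phase might be organized roughly as below. The environment interface, the policy object, and the update call are stand-ins for illustration; the team's actual agent and training settings are in the open-source repository.

```python
def train_with_visual_incentives(env, policy, visual_reward_fn, demo_frames,
                                 num_episodes: int = 100):
    """Sketch of the trial-and-error phase: act from vision + touch, score the rollout
    with the learned visual reward, then update the policy."""
    for _ in range(num_episodes):
        obs = env.reset()          # obs is assumed to hold both camera and tactile readings
        trajectory, done = [], False
        while not done:
            action = policy.act(obs["image"], obs["tactile"])
            next_obs, done = env.step(action)
            trajectory.append((obs, action))
            obs = next_obs

        # The reward comes from the learned visual model,
        # not from a hand-written task-success checker.
        robot_frames = [step_obs["image"] for step_obs, _ in trajectory]
        rewards = visual_reward_fn(robot_frames, demo_frames)

        policy.update(trajectory, rewards)   # any standard RL update could slot in here
    return policy
```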

To make TAVI more flexible, the team also introduced a residual policy.

When the required behavior differs from the base policy, only the difference needs to be learned, rather than starting from scratch.

Ablation results show that without the residual policy, where the robot has to learn from scratch every time, the task success rate drops.
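
As a generic illustration of the idea, a residual policy can be written as a fixed base action plus a small trainable correction, so that reinforcement learning only has to adjust the correction. The class below is a sketch under that assumption, not the authors' exact implementation.

```python
import numpy as np

class ResidualPolicy:
    """Base action from imitation plus a small learned correction on top."""

    def __init__(self, base_policy, residual_net, scale: float = 0.1):
        self.base_policy = base_policy    # frozen: obtained from the human demonstrations
        self.residual_net = residual_net  # trainable: learned during reinforcement learning
        self.scale = scale                # keeps the correction small

    def act(self, observation: np.ndarray) -> np.ndarray:
        base_action = self.base_policy(observation)               # never retrained
        correction = self.scale * self.residual_net(observation)  # only this part is learned
        return base_action + correction
```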

If you are interested in embodied intelligence, you can read the research team’s paper for more details.

Paper address:
https://arxiv.org/abs/2309.12300
GitHub project page:
https://github.com/irmakguzey/see-to-touch

Source: blog.csdn.net/QbitAI/article/details/133532320