New Ideas for Robot Development: Reinforcement Learning

A robot is an intelligent machine that works semi-autonomously or fully autonomously. Through programming and automatic control, robots can carry out tasks such as manipulation and movement, and during execution they rely most heavily on conditional judgments. In other words, a robot can execute a task and produce the results we expect only because of the judgment logic behind it.

Today we will look at future robot development from the perspective of this judgment logic. First, imagine a scenario: a mouse walking through a maze.

The green line in the picture is the correct route through the maze. From the starting point to the end point, 26 intersections must be passed. From this we can imagine two development approaches:

  • Design the program in the most direct way: knowing the answer in advance, hard-code the 26 correct judgments so the mouse gets out of the maze. The development process is tedious, but the mouse expends the least effort. For a maze of higher difficulty, however, this method does not scale.

  • Design the program by conditions: at each intersection the mouse tries every option until it finds the correct route. This method is simple to develop but time-consuming and labor-intensive; in a more difficult maze, the mouse may run out of power before finding the way out.
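The two approaches above can be contrasted in a minimal sketch. The tiny maze, cell coordinates, and move names below are invented for illustration; they are not from the article's 26-intersection maze.

```python
# A toy maze contrasting the two development paths described above.
# 0 = open cell, 1 = wall; the mouse starts at (0, 0) and exits at (2, 2).
MAZE = [
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
START, GOAL = (0, 0), (2, 2)

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

# Approach 1: hard-code the answer -- every judgment is written by the programmer.
HARDCODED_MOVES = ["down", "right", "down", "right"]

def follow(moves):
    """Replay a fixed move list and return the final cell."""
    r, c = START
    for m in moves:
        dr, dc = MOVES[m]
        r, c = r + dr, c + dc
    return (r, c)

# Approach 2: let the mouse try every branch (depth-first search) until it
# stumbles on the exit -- simple to write, but wasteful on large mazes.
def search(cell=START, visited=None):
    """Return a move list from `cell` to GOAL, or None if unreachable."""
    if visited is None:
        visited = {cell}
    if cell == GOAL:
        return []
    r, c = cell
    for name, (dr, dc) in MOVES.items():
        nxt = (r + dr, c + dc)
        if (0 <= nxt[0] < 3 and 0 <= nxt[1] < 3
                and MAZE[nxt[0]][nxt[1]] == 0 and nxt not in visited):
            visited.add(nxt)
            rest = search(nxt, visited)
            if rest is not None:
                return [name] + rest
    return None

print(follow(HARDCODED_MOVES))  # → (2, 2)
print(follow(search()))         # → (2, 2)
```

Both reach the exit, but the first requires the programmer to already know the route, while the second makes the mouse do all the trying.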

Is there a way to skip the tedious middle steps, reduce the programmers' development effort, and make the little mouse look smarter at the same time?

Solving the Problem with Reinforcement Learning

By artificially defining a reward function, we let the little mouse compute, try, make mistakes, and hit walls on its own in a simulated environment. After accumulating enough experience from these trials, the mouse automatically builds an optimal policy, an experience set suited to the maze.

When the little mouse reaches an intersection in the real environment and needs to decide, it looks up the most suitable path or action in its experience set to make the best judgment, achieving results that can surpass a hand-designed program.
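The training-then-lookup loop described above can be sketched with tabular Q-learning, one standard reinforcement-learning method. All constants below (grid size, reward values, learning rate, episode count) are illustrative choices, not values from the article.

```python
import random

# A minimal tabular Q-learning sketch: the "mouse" earns a reward for
# reaching the exit and a small penalty per step, and builds up an
# experience table Q entirely on its own.
random.seed(0)

N = 4                       # 4x4 open grid; exit at the bottom-right corner
GOAL = (N - 1, N - 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
Q = {}                      # experience set: (state, action) -> value

def step(s, a):
    """Apply action a in state s; bumping a border leaves s unchanged."""
    nxt = (min(max(s[0] + a[0], 0), N - 1),
           min(max(s[1] + a[1], 0), N - 1))
    return nxt, (10.0 if nxt == GOAL else -0.1)

def best(s):
    """Look up the highest-valued action for state s in the experience set."""
    return max(ACTIONS, key=lambda a: Q.get((s, a), 0.0))

# Training: trial and error in simulation, accumulating experience in Q.
for episode in range(1000):
    s = (0, 0)
    while s != GOAL:
        a = random.choice(ACTIONS) if random.random() < 0.2 else best(s)
        s2, reward = step(s, a)
        target = reward + 0.9 * max(Q.get((s2, b), 0.0) for b in ACTIONS)
        old = Q.get((s, a), 0.0)
        Q[(s, a)] = old + 0.5 * (target - old)
        s = s2

# "Deployment": at each intersection the mouse just consults its experience.
s, steps = (0, 0), 0
while s != GOAL and steps < 30:
    s, _ = step(s, best(s))
    steps += 1
print(s == GOAL, steps)
```

The programmer only specifies the reward function and the update rule; the route itself emerges from the mouse's own trials, which is the division of labor the article describes.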

In this development process, most of the experience accumulation is done by the machine itself. Compared with a human, the machine learns faster and more efficiently, and it can cope with mazes of different difficulty levels, making the approach more broadly applicable.

From this we can see that reinforcement learning plays a large role even in a small robot-development scenario such as maze solving. For robot development involving many degrees of freedom, multiple control domains, and multiple scenarios, could development methods based on reinforcement learning become the mainstream approach in the future?

Recently, Amu Lab also tested our KKSwarm unmanned vehicle using reinforcement learning. Below, we share our experimental results:

Unmanned vehicles transporting goods in an orderly manner via reinforcement learning

Today, the KKSwarm swarm test platform connects reinforcement-learning theory with development, providing a mapping between the reinforcement-learning demo's simulation environment and the real physical environment. The KKSwarm project is still being improved, and we look forward to exploring the future of robotics together with leaders in the field of artificial intelligence.

-END-

Special benefits: To encourage open-source development, anyone who develops a new algorithm based on the KKSwarm system and meets the following criteria will receive a cash reward (awarded only once, at the highest qualifying contribution level), so that we can build the open-source project together.

Category A contributors (reward of 20,000 yuan) must meet all of the following requirements:

1. The open-source algorithm is published in an SCI journal;

2. The corresponding algorithm code or a download link is provided;

3. A corresponding demo video is provided.

Category B contributors (reward of 8,000 to 10,000 yuan) must meet all of the following requirements:

1. The results are published in an EI journal;

2. The corresponding algorithm code or a download link is provided;

3. A corresponding demo video is provided.

Category C contributors (reward of 2,000 to 5,000 yuan), subject to review:

The results are published as papers, blog posts, tutorials, etc.; or a distinctive demo video is provided.

Supporting learning materials

1. Open source project address:

https://github.com/amov-lab/kk-robot-swarm

https://github.com/kkswarm/kk-robot-swarm

2. Wiki information: https://wiki.amovlab.com/public/misaro-doc/


Source: blog.csdn.net/msq19895070/article/details/127088676