VR Table Tennis in Unity3D, Development Notes, Part 5: Training the AI with Machine Learning

Continuing from the previous post: the next step is to analyze which machine learning methods could be used here.

The first candidate is the most common one, supervised learning. I would need to collect training data and optimize model parameters, and more importantly I would have to hand-write both the training framework and the inference framework, since this is a specialized scenario and the result has to run inside the game itself. I could of course build some data-export interfaces and train with a mature external framework, but then inference would also depend on an external service at runtime, which is troublesome and bad for distribution.

Data: incoming velocity, outgoing velocity

Labels: racket velocity, racket angle (sketched below)
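To make the data format concrete, here is a minimal sketch of what one training sample could look like; the RallySample struct and its field names are my own placeholders, not code from this project:

using UnityEngine;

// One supervised sample: ball velocities as features, racket motion as labels.
// All names here are hypothetical placeholders.
[System.Serializable]
public struct RallySample
{
    public Vector3 incomingVelocity;  // feature
    public Vector3 outgoingVelocity;  // feature
    public Vector3 racketVelocity;    // label
    public Vector2 racketAngle;       // label: dx, dy in degrees
}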


Then there is reinforcement learning, which has become popular recently. I looked into it a bit: there are three main modules, state encoding, action selection, and reward feedback from observation. No training data is required; the model learns everything from the feedback it receives after each decision. But acting and waiting for feedback take time, and I'm not sure how long it would take before the learning starts to work. I also don't understand how the learned knowledge is stored. Is it a lookup table mapping each state to the correct decision? Wouldn't that be miserable once the state space explodes?

Based on my experience, reinforcement learning seems the better fit for this scenario.

State: incoming velocity, desired outgoing velocity

Decision: racket velocity, racket angle

Feedback: the difference between the actual outgoing velocity and the desired outgoing velocity (a tabular sketch of this loop follows)
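Since each return is a single decision followed by immediate feedback, this loop is essentially one-step tabular learning. A minimal sketch, assuming the state and decision have already been quantized to integer ids; the QTable class, the Alpha learning rate, and the update rule are my assumptions, not this project's code:

using System.Collections.Generic;

public class QTable
{
    // Estimated value of taking action a in state s; missing entries default to 0.
    private readonly Dictionary<(int s, int a), float> q =
        new Dictionary<(int s, int a), float>();
    private const float Alpha = 0.1f; // learning rate (assumed)

    private float Get(int s, int a) => q.TryGetValue((s, a), out float v) ? v : 0f;

    // reward: e.g. the negative distance between actual and desired outgoing
    // velocity, matching the feedback definition above.
    public void Update(int s, int a, float reward) =>
        q[(s, a)] = Get(s, a) + Alpha * (reward - Get(s, a));

    // Greedy choice over a flat range of action ids.
    public int BestAction(int s, int actionCount)
    {
        int best = 0;
        for (int a = 1; a < actionCount; a++)
            if (Get(s, a) > Get(s, best)) best = a;
        return best;
    }
}

Some exploration (for example, occasionally picking a random action instead of BestAction) would be needed on top of this, and the dictionary is exactly the state-to-decision table whose size worried me above.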


I'll see how much of this I can do myself, learning as I go.

Updates may come more slowly from here on, or even stop for a while. The main problem is that a perfectly good engineering question has now turned into an academic one, which I'm afraid will scare many readers off.

First, let me list some learning materials and see whether there is a simple, easy-to-understand approach.

https://baike.baidu.com/item/%E5%BC%BA%E5%8C%96%E5%AD%A6%E4%B9%A0/2971075?fr=aladdin

https://www.cnblogs.com/jsfantasy/p/12177216.html

https://blog.csdn.net/CaiDaoqing/article/details/92969238

https://blog.csdn.net/qq_39388410/article/details/88795124

https://blog.csdn.net/u011649885/article/details/75276392/?utm_medium=distribute.pc_relevant.none-task-blog-title-3&spm=1001.2101.3001.4242


----------------------- Dividing line, everything below is useless -----------------------

It suddenly occurred to me that if I actually build a state-decision table, and ignore the state-space problem for the moment, the difficulty drops considerably. P.S.: and the academic question turns back into an engineering one. (A sketch of such a table follows.)
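What I mean by a state-decision table, as a minimal sketch; the Decision struct, the Vector3Int quantization, and all names here are assumptions of mine:

using System.Collections.Generic;
using UnityEngine;

// A hypothetical decision: how to swing the racket.
public struct Decision
{
    public Vector3 racketVelocity; // vx, vy, vz
    public Vector2 racketAngle;    // dx, dy (dz stays 0 below)
}

public class DecisionTable
{
    // Key: quantized (incoming velocity, desired outgoing velocity).
    private readonly Dictionary<(Vector3Int, Vector3Int), Decision> table =
        new Dictionary<(Vector3Int, Vector3Int), Decision>();

    public void Record(Vector3Int incoming, Vector3Int desired, Decision d) =>
        table[(incoming, desired)] = d;

    public bool TryLookup(Vector3Int incoming, Vector3Int desired, out Decision d) =>
        table.TryGetValue((incoming, desired), out d);
}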

First attempt: write out the value ranges by hand.

Incoming velocity range: x (-100, 100), y (-100, 100), z (0, 100), rounded to integers, step size 1

Outgoing velocity range: x (-100, 100), y (-100, 100), z (0, 100), rounded to integers, step size 1

Decision range: vx (-100, 100), vy (-100, 100), vz (-100, 100), rounded to integers, step size 1

dx (-90, 90), dy (-90, 90), dz (0, 0), rounded to integers, step size 1

Sweep every combination of state and decision, and the mapping between incoming velocity plus decision and outgoing velocity falls out naturally.

Number of combinations: (200 × 200 × 100) × (200 × 200 × 200 × 180 × 180) = 1,036,800,000,000,000,000
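A quick sanity check of that number (the factor counts are taken directly from the ranges above):

// Table size if every state/decision combination is enumerated.
long states  = 200L * 200L * 100L;                 // incoming velocity grid
long actions = 200L * 200L * 200L * 180L * 180L;   // vx, vy, vz, dx, dy grid
Debug.Log(states * actions);                       // 1036800000000000000, about 1.04e18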

Forget it.

Let me measure a reasonable range and try again.

// Probe a few boundary cases with calcLine and calcSpeed (the trajectory
// helpers from the earlier posts in this series) to measure realistic ranges.
Debug.Log(calcLine(1, 0.5f, 1.5f, 1, 0, -0.1f, 0.3f));
Debug.Log(calcLine(1, 0.5f, 1.5f, 0, 0, -1.5f, 0.3f));
Debug.Log(calcLine(1, 0.5f, 1.5f, -1, 0, -1.5f, 0.3f));
Debug.Log(calcLine(1, 0.5f, 1.5f, 1, 0, -0.1f, 1f));
Debug.Log(calcLine(1, 0.5f, 1.5f, 0, 0, -1.5f, 2f));
Debug.Log(calcLine(1, 0.5f, 1.5f, -1, 0, -1.5f, 2f));

Debug.Log(calcSpeed(0.2f, 0.0025f, Vector3.zero, calcLine(1, 0.5f, 1.5f, 1, 0, -0.1f, 0.3f)));
Debug.Log(calcSpeed(0.2f, 0.0025f, Vector3.zero, calcLine(1, 0.5f, 1.5f, 0, 0, -1.5f, 0.3f)));
Debug.Log(calcSpeed(0.2f, 0.0025f, Vector3.zero, calcLine(1, 0.5f, 1.5f, -1, 0, -1.5f, 0.3f)));
Debug.Log(calcSpeed(0.2f, 0.0025f, Vector3.zero, calcLine(1, 0.5f, 1.5f, 1, 0, -0.1f, 1f)));
Debug.Log(calcSpeed(0.2f, 0.0025f, Vector3.zero, calcLine(1, 0.5f, 1.5f, 0, 0, -1.5f, 2f)));
Debug.Log(calcSpeed(0.2f, 0.0025f, Vector3.zero, calcLine(1, 0.5f, 1.5f, -1, 0, -1.5f, 2f)));
Output (first the six calcLine results, then the six calcSpeed results):
(0.0, 5.3, 1.5)
(1.4, 3.5, 4.2)
(2.8, 3.5, 4.2)
(0.0, 9.3, 0.8)
(0.8, 6.3, 2.3)
(1.6, 6.3, 2.3)

(0.0, 2.7, 0.8)
(0.7, 1.8, 2.1)
(1.4, 1.8, 2.1)
(0.0, 4.7, 0.4)
(0.4, 3.2, 1.2)
(0.8, 3.2, 1.2)

Velocity range (incoming and outgoing combined): x (-2.8, 2.8), y (0, 9.3), z (0, 4.2), step size 0.1 (see the quantization sketch after these ranges)

Decision range: vx (-1.4, 1.4), vy (0, 4.7), vz (0, 2.1), step size 0.1

dx (-45, 45), dy (-45, 45), dz (0, 0), step size 1
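To index a table with these continuous values, each component has to be snapped to its grid first. A minimal quantization sketch, with the bounds above hard-coded; Mathf.RoundToInt and the helper names are my choices, not the project's:

using UnityEngine;

public static class VelocityGrid
{
    // Map a value in (min, max) with the given step to a 0-based index.
    public static int ToIndex(float value, float min, float step) =>
        Mathf.RoundToInt((value - min) / step);

    // Snap a measured velocity onto the 0.1-step grid listed above.
    public static Vector3Int Quantize(Vector3 v) =>
        new Vector3Int(
            ToIndex(v.x, -2.8f, 0.1f),  // x in (-2.8, 2.8) -> 0..56
            ToIndex(v.y,  0f,   0.1f),  // y in (0, 9.3)    -> 0..93
            ToIndex(v.z,  0f,   0.1f)); // z in (0, 4.2)    -> 0..42
}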

Sweep every combination of state and decision, and again we get the mapping between incoming velocity plus decision and outgoing velocity.

Number of combinations: 56 × 93 × 42 × 28 × 47 × 21 × 90 × 90 = 48,964,403,577,600

I've already done quite a lot to shrink the state space, but apparently no matter how much I cut, it's still far too big.

----------------------- I said it was useless, and you're still reading -----------------------

Conclusion: there's no way around writing the learning algorithm properly myself. Time to study first.


Source: blog.csdn.net/u010752777/article/details/108284937