1. Software version
MATLAB R2019a
2. Theoretical knowledge of this algorithm
For the theoretical background, see reference [1].
Our reinforcement learning control structure is shown in the following figure:
Evaluation function design:
Parameter adjustment rule design:
Our rules here are as follows:
y = alpha*(1-Vt);
We do not use the paper's rule directly. When we tested it, we found that if the adjustment amount Pt is unchanged or changes only slightly between two consecutive steps, deltaP becomes extremely large, which destroys the stability of the algorithm.
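The stability problem described above can be illustrated numerically. The following is only a sketch: it assumes the unstable update behaves like a ratio whose denominator is the change in Pt between two steps, and it assumes Vt lies in [0, 1]. The names `numer`, `dPt`, and `deltaP_ratio` are hypothetical and do not come from the model.

```matlab
% Illustration only: the ratio form below is an assumed stand-in for the
% paper's update, not something taken from the original model.
alpha = 1;                     % value used in this post
Vt    = 0.4;                   % example evaluation value, assumed in [0, 1]
numer = 0.2;                   % hypothetical numerator of a ratio-style update
dPt   = [1e-1, 1e-3, 1e-6];    % shrinking change in Pt between two steps

deltaP_ratio = numer ./ dPt;   % grows without bound as the change in Pt -> 0
y = alpha * (1 - Vt);          % rule from the text: stays within [0, alpha]

fprintf('ratio-style deltaP: %g %g %g\n', deltaP_ratio);
fprintf('bounded rule y:     %g\n', y);
```

Under these assumptions the ratio-style value grows as the change in Pt shrinks, while y remains bounded by alpha.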
We then adjust P, I, and D separately.
This corresponds to the five adjustment modules in the model. For this problem, however, the five-level adjustment range does not seem to work well. I set alpha to 1 here, so the scheme is effectively three-level. I did not delete the corresponding five-level model; it is left in the original model for your reference.
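One way to see why alpha = 1 reduces five levels to three is sketched below. The level layout `[-1, -alpha, 0, alpha, 1]` is an assumption made for illustration, as are the names `levels`, `distinctLevels`, and `candidateSteps`. Only the rule `y = alpha*(1-Vt)` comes from the text.

```matlab
% Hypothetical illustration: the five-level layout is an assumption,
% not the structure of the adjustment modules in the original model.
alpha  = 1;                            % value used in this post
Vt     = 0.4;                          % example evaluation value, assumed in [0, 1]

levels = [-1, -alpha, 0, alpha, 1];    % assumed five adjustment levels
distinctLevels = unique(levels);       % duplicates merge when alpha = 1

y = alpha * (1 - Vt);                  % rule from the text
candidateSteps = distinctLevels * y;   % candidate adjustments for one gain

disp(distinctLevels);
disp(candidateSteps);
```

With any alpha strictly between 0 and 1, `unique` would keep all five levels, so the same sketch can be used to compare the two variants.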
Decision-making mechanism:
3. Core code
4. Operation steps and simulation conclusion
The RL learning process is as follows:
5. References
[1] Gao Ruijuan, Wu Mei. Principle and application of PID parameter tuning based on improved reinforcement learning [J]. Modern Electronic Technology, 2014, 37(4): 4.