MATLAB Reinforcement Learning Toolbox (14): Import Policy and Value Function Representations

Import Policy and Value Function Representations

To create function approximators for reinforcement learning, you can import pretrained deep neural networks or deep neural network layer architectures using the Deep Learning Toolbox™ network import functions. You can import:

  1. Open Neural Network Exchange (ONNX™) models, which require the Deep Learning Toolbox Converter for ONNX Model Format support package. For more information, see importONNXLayers.
  2. TensorFlow™-Keras networks, which require the Deep Learning Toolbox Importer for TensorFlow-Keras Models support package. For more information, see importKerasLayers (a brief import sketch follows this list).
  3. Caffe convolutional networks, which require the Deep Learning Toolbox Importer for Caffe Models support package. For more information, see importCaffeLayers.
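
For example, a minimal import of a TensorFlow-Keras layer architecture might look like the sketch below; the file name actorNetwork.h5 is a placeholder, and the corresponding support package must be installed.

% Sketch: import a Keras layer architecture (hypothetical file name).
actorLayers = importKerasLayers('actorNetwork.h5');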

When importing a deep neural network architecture, consider the following.

  1. The imported architecture must have a single input layer and a single output layer. Therefore, importing an entire critic network that has both observation and action input layers is not supported.

  2. The sizes of the input and output layers of the imported network architecture must match the sizes of the corresponding actions, observations, or rewards of your environment.

  3. After importing the network architecture, you must set the names of the input and output layers to match the names of the corresponding action and observation specifications (see the sketch after this list).
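
As a hedged illustration of points 2 and 3, the sketch below checks the imported input layer size against the observation specification and renames the input layer using replaceLayer; the variable names and layer sizes here are assumptions for illustration, not part of the example files.

% Assume lgraph is an imported layer graph and obsInfo is the observation spec.
lgraph.Layers(1).InputSize      % should match obsInfo.Dimension, e.g. [50 50 1]

% Rename the input layer so it matches the observation specification name.
newInput = imageInputLayer(obsInfo.Dimension, ...
    'Normalization','none','Name',obsInfo.Name);
lgraph = replaceLayer(lgraph,lgraph.Layers(1).Name,newInput);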

For more information about the deep neural network architectures supported for reinforcement learning, see Create Policy and Value Function Representations.

Import Actor and Critic for an Image Observation Application

For example, suppose you have an environment with a 50-by-50 grayscale image observation signal and a continuous action space. To train a policy gradient (PG) agent, you need the following function approximators, both of which must have a single 50-by-50 image input observation layer and a single scalar output value (an architecture of this shape is sketched after the list below).

  1. Actor - Selects an action value based on the current observation

  2. Critic - Estimates the expected long-term reward based on the current observation
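
For illustration only, an architecture of that shape might be built from Deep Learning Toolbox layers as in the sketch below; the hidden-layer size and layer names are assumptions, not part of the example files.

% Sketch: 50-by-50 image input, single scalar output (assumed hidden layer).
layers = [
    imageInputLayer([50 50 1],'Normalization','none','Name','observation')
    fullyConnectedLayer(64,'Name','fc1')
    reluLayer('Name','relu1')
    fullyConnectedLayer(1,'Name','output')];   % single scalar output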

In addition, suppose you want to import the following network architectures:

  1. A deep neural network architecture for the actor, with a 50-by-50 image input layer and a scalar output layer, saved in the ONNX format (actorNetwork.onnx).

  2. A deep neural network architecture for the critic, with a 50-by-50 image input layer and a scalar output layer, saved in the ONNX format (criticNetwork.onnx).

To import the critic and actor networks, use the importONNXLayers function without specifying an output layer.

criticNetwork = importONNXLayers('criticNetwork.onnx');
actorNetwork = importONNXLayers('actorNetwork.onnx');

These commands generate a warning stating that the network is not trainable until an output layer is added. When you use an imported network to create an actor or critic representation, Reinforcement Learning Toolbox™ software automatically adds the output layer for you.

After importing the networks, create the actor and critic function approximator representations. To do so, first obtain the observation and action specifications from the environment.

obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

Create the critic, specifying the name of the input layer of the critic network as the observation name. Since the critic network has a single observation input and a single scalar output, use a value function representation.

critic = rlValueRepresentation(criticNetwork,obsInfo,...
    'Observation',{criticNetwork.Layers(1).Name});

Create the actor, specifying the name of the input layer of the actor network as the observation name and the name of the output layer of the actor network as the action name. Since the actor network has a single scalar output, use a deterministic actor representation.

actor = rlDeterministicActorRepresentation(actorNetwork,obsInfo,actInfo,...
    'Observation',{actorNetwork.Layers(1).Name},...
    'Action',{actorNetwork.Layers(end).Name});

Then you can:

  1. Use these representations to create an agent. For more information, see Reinforcement Learning Agents.

  2. Use setActor and setCritic to set the actor and critic representations of an existing agent, respectively (both options are sketched below).
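
For example, a minimal sketch of both options, assuming the actor, critic, and environment variables from above; the option value is illustrative.

% Create a PG agent that uses the critic as a baseline.
agentOpts = rlPGAgentOptions('UseBaseline',true);
agent = rlPGAgent(actor,critic,agentOpts);

% Or replace the representations in an existing agent.
agent = setActor(agent,actor);
agent = setCritic(agent,critic);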
