MATLAB Reinforcement Learning Toolbox (9): Create continuous or discrete action/observation specifications for a reinforcement learning environment

Continuous action/observation specification

The rlNumericSpec object specifies the continuous action or observation data specification used in the reinforcement learning environment.

Syntax

spec = rlNumericSpec(dimension)
spec = rlNumericSpec(dimension,Name,Value)

Description

spec = rlNumericSpec(dimension)
creates a data specification for continuous actions or observations and sets the Dimension property.

spec = rlNumericSpec(dimension,Name,Value)
sets additional properties using name-value pair arguments.
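For instance, a minimal sketch of the name-value form (the bounds and dimensions here are illustrative, not from the original example):

```matlab
% Illustrative: a 2-by-1 continuous action bounded in [-1, 1]
actInfo = rlNumericSpec([2 1], ...
    'LowerLimit',-1, ...
    'UpperLimit',1);
```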

Properties

LowerLimit — Lower limit of the data space
-Inf (default) | scalar | matrix

The lower limit of the data space, specified as a scalar or as a matrix the same size as the data space. When LowerLimit is specified as a scalar, rlNumericSpec applies it to all entries in the data space.

UpperLimit — Upper limit of the data space
Inf (default) | scalar | matrix

The upper limit of the data space, specified as a scalar or as a matrix the same size as the data space. When UpperLimit is specified as a scalar, rlNumericSpec applies it to all entries in the data space.
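As a sketch, limits can also be set per element after construction (the three-channel layout below is illustrative):

```matlab
% Bound only the first of three observation channels
obsInfo = rlNumericSpec([3 1]);
obsInfo.LowerLimit = [-1; -Inf; -Inf];  % per-element lower bounds
obsInfo.UpperLimit = [ 1;  Inf;  Inf];  % per-element upper bounds
```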

Name — Name of the rlNumericSpec object
string

The name of the rlNumericSpec object, specified as a string.

Dimension — Dimension of the data space
numeric vector

This property is read-only.
The dimension of the data space, specified as a numeric vector.

DataType — Information about the type of data
string

This property is read-only.
Information about the data type, specified as a string.

Object Functions

Function    Description
rlSimulinkEnv    Create a reinforcement learning environment using a dynamic model implemented in Simulink
rlFunctionEnv    Specify custom reinforcement learning environment dynamics using functions
rlRepresentation    (Not recommended) Model representation for a reinforcement learning agent
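For instance, a minimal sketch of passing both specification objects to rlFunctionEnv; myStepFcn and myResetFcn are hypothetical user-defined functions, not part of the toolbox:

```matlab
% Hypothetical custom environment built from the two spec objects
obsInfo = rlNumericSpec([3 1]);
actInfo = rlFiniteSetSpec([-2 0 2]);
env = rlFunctionEnv(obsInfo,actInfo,'myStepFcn','myResetFcn');
```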

Example: reinforcement learning environment for a Simulink model

For this example, consider the rlSimplePendulumModel Simulink model. The model is a simple frictionless pendulum that initially hangs in a downward position.

Open the model

mdl = 'rlSimplePendulumModel';
open_system(mdl)

Create the observation and action specifications using rlNumericSpec and rlFiniteSetSpec, respectively.

obsInfo = rlNumericSpec([3 1]) % vector of 3 observations: sin(theta), cos(theta), d(theta)/dt


actInfo = rlFiniteSetSpec([-2 0 2]) % 3 possible values for torque: -2 Nm, 0 Nm and 2 Nm

You can use dot notation to assign property values to the rlNumericSpec and rlFiniteSetSpec objects.

obsInfo.Name = 'observations';
actInfo.Name = 'torque';

Assign the agent block path, and create a reinforcement learning environment for the Simulink model using the information extracted in the previous steps.

agentBlk = [mdl '/RL Agent'];
env = rlSimulinkEnv(mdl,agentBlk,obsInfo,actInfo)

You can also set a reset function using dot notation. For this example, the reset function randomly initializes theta0 in the model workspace.

env.ResetFcn = @(in) setVariable(in,'theta0',randn,'Workspace',mdl)


Discrete action/observation specification

The rlFiniteSetSpec object specifies a discrete action or observation data specification for a reinforcement learning environment.

Syntax

spec = rlFiniteSetSpec(elements)

Description

spec = rlFiniteSetSpec(elements) creates a data specification with a set of discrete actions or observations, and sets the Elements property.

Properties

Elements — Set of valid actions or observations
vector | cell array

The set of valid actions or observations for the environment, specified as one of the following:

  1. Vector—Specify valid values for a single action or a single observation.

  2. Cell array—Specify valid combinations of values when you have multiple actions or observations. Each entry of the cell array must have the same dimensions.

Name — Name of the rlFiniteSetSpec object
string (default)

The name of the rlFiniteSetSpec object, specified as a string. Use this property to set a meaningful name for the finite set.

Description — Description of the rlFiniteSetSpec object
string (default)

The description of the rlFiniteSetSpec object, specified as a string. Use this property to provide a meaningful description of the finite set of values.

Dimension — Size of each element
vector

This property is read-only.
The size of each element, specified as a vector. If you specify Elements as a vector, Dimension is [1 1]. Otherwise, if you specify Elements as a cell array, Dimension indicates the size of the entries in Elements.
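A short sketch of how Dimension follows from Elements (the values below are illustrative):

```matlab
% Vector elements: a single discrete action or observation
spec1 = rlFiniteSetSpec([-2 0 2]);   % spec1.Dimension is [1 1]

% Cell-array elements: each entry is a 1-by-2 value combination
spec2 = rlFiniteSetSpec({[1 10],[2 20]});   % spec2.Dimension is [1 2]
```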

DataType — Information about the type of data
string

This property is read-only.
Information about the data type, specified as a string.

Object Functions

Function    Description
rlSimulinkEnv    Create a reinforcement learning environment using a dynamic model implemented in Simulink
rlFunctionEnv    Specify custom reinforcement learning environment dynamics using functions
rlRepresentation    (Not recommended) Model representation for a reinforcement learning agent

For a usage example that combines rlFiniteSetSpec with rlNumericSpec, see the Simulink pendulum example in the continuous specification section above.

Specify discrete value sets for multiple actions

If the actor of your reinforcement learning agent has multiple outputs, each with a discrete action space, you can use an rlFiniteSetSpec object to specify the possible discrete action combinations.

Suppose that the valid values for a two-output system are [1 2] for the first output and [10 20 30] for the second output. Create a discrete action space specification for all possible output combinations.

actionSpec = rlFiniteSetSpec({[1 10],[1 20],[1 30],...
                              [2 10],[2 20],[2 30]})



Origin blog.csdn.net/wangyifan123456zz/article/details/109501783