Interpretation of the results files generated by Unity ML-Agents PushBlock training

Foreword

Training result file path: E:\ml-agents-release_19\results\push_block_test_02 (The specific path depends on your computer)

For the ML-Agents installation and the PushBlock training process, please refer to the blog below. (Note: the push_block_test_02 run has not been fully trained.)

Reference blog: https://blog.csdn.net/aaaccc444/article/details/130253172

1. push_block_test_02

 1.1 PushBlock

1.1.1 checkpoint.pt

checkpoint.pt is a checkpoint file saved by ML-Agents during training, and it contains all the model parameters of the current training state. The file is saved with PyTorch's torch.save() function and holds a dictionary object that records all the neural network parameters and optimizer parameters at that point in training.

The main purpose of saving checkpoint files is to make subsequent model inference or continued training easier.

In the inference phase, you can load a previously saved checkpoint file and use the saved model parameters for prediction; when continuing training, you can load the checkpoint and resume from the last training state instead of retraining from scratch, which saves training time.

checkpoint.pt is a PyTorch model weight file, and the weight information in it can be loaded and read with a PyTorch loader such as torch.load(). For specific usage, please refer to the relevant chapters of the official PyTorch documentation.
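
As a minimal sketch (the path below is just an example from this run, and the exact keys stored in the dictionary depend on the ML-Agents version), the checkpoint can be inspected like this:

import torch

# Load the checkpoint dictionary onto the CPU; the path is an example
checkpoint = torch.load(
    "results/push_block_test_02/PushBlock/checkpoint.pt",
    map_location="cpu",
)

# Print the top-level entries (network weights, optimizer state, etc.)
for key in checkpoint.keys():
    print(key)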

1.1.2 events.out.tfevents.1681914599.DESKTOP-EL11195.21140.0

(1) events.out.tfevents.* is a TensorBoard log file that records various information during training, such as changes in the loss function, changes in accuracy, and so on. TensorBoard can read these log files and generate the corresponding visualization charts. To view an events.out.tfevents.* file with TensorBoard, run the following command on the command line:

tensorboard --logdir=path/to/logs

where path/to/logs is the directory containing the events.out.tfevents.* file. After the command is executed, TensorBoard starts a web server and prints the address to open in a browser. On that page, you can select different charts to view how the various metrics change during training.

For example:

(mlagents) E:\ml-agents-release_19>tensorboard --logdir=E:\ml-agents-release_19\results\push_block_test_02\PushBlock
TensorFlow installation not found - running with reduced feature set.
Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
TensorBoard 2.12.2 at http://localhost:****/ (Press CTRL+C to quit)

(2) 1681914599.DESKTOP-EL11195.21140.0: this part is the identifier attached when the log is saved. It consists of three parts:

  • 1681914599: the timestamp of the experiment run, indicating when the run started, expressed in seconds. Each event file carries such a timestamp to help distinguish events from different runs. The timestamp is the total number of seconds elapsed since January 1, 1970 00:00:00 UTC, a representation commonly used in computing because it makes time comparison and arithmetic easy. In ML-Agents it generally marks the start time of the experiment run (see the snippet after this list for converting it to a readable date).
  • DESKTOP-EL11195: computer name. "DESKTOP-EL11195" is the hostname of the computer running TensorBoard. It is a unique identifier for a computer on a network and helps distinguish between different computers.
  • 21140.0: the ID of the process that wrote the log (here the mlagents-learn training process), followed by an index. The process ID is an integer assigned by the operating system, which makes it possible to tell apart log files written by different processes running at the same time; the number after the dot simply distinguishes multiple event files created by the same process, starting at 0. So 21140.0 means process 21140 wrote its first event file.
  • Together, these parts make it possible to distinguish logs produced by different runs, processes, and machines.
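
For example, the timestamp in this file name can be converted to a readable date with the Python standard library:

from datetime import datetime, timezone

# Convert the Unix timestamp from the event file name into a readable UTC time
timestamp = 1681914599
print(datetime.fromtimestamp(timestamp, tz=timezone.utc))
# 2023-04-19 14:29:59+00:00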

1.1.3 PushBlock-16269.onnx

PushBlock-16269.onnx is a model file produced by reinforcement learning training in ML-Agents. ONNX (Open Neural Network Exchange) is an open format for representing deep learning models. In ML-Agents, after training, the trained model is saved as a file in ONNX format for subsequent inference. In this example, PushBlock-16269.onnx is the model file generated when training reached a specific step, and the number "16269" is the number of training steps at the time the file was written.

.onnx files can be opened with ONNX Runtime or other deep learning frameworks that support the ONNX format.

Here is sample code that opens an .onnx file using Python and ONNX Runtime:

import numpy as np
import onnxruntime

# Load the model
model_path = "path/to/model.onnx"
session = onnxruntime.InferenceSession(model_path)

# Query the model for the name and shape of its input node
input_meta = session.get_inputs()[0]
print(input_meta.name, input_meta.shape)

# Generate input data as a numpy array matching the expected shape
# (the shape used here is only a placeholder)
input_data = np.random.rand(1, 3).astype(np.float32)

# Run inference
output = session.run(None, {input_meta.name: input_data})

# Process the output data
# ...

Note that the input data must be generated in the correct format according to the names and shapes of the model's input and output nodes. In addition, different models may depend on different libraries and runtime environments, which need to be installed and configured for the specific case.

1.1.4 PushBlock-16269.pt

PushBlock-16269.pt is a model weight file generated while training the PushBlock scene; it saves the neural network's weight parameters (its state_dict) in PyTorch format. During training, according to the keep_checkpoints and related parameters set by the user in the configuration file, key steps of the training process are saved as checkpoint files so that training can be resumed when needed, or the model can be used for tasks such as inference and prediction.

PushBlock-16269.pt contains the model weights at training step 16269, and it can be used to load the model and continue training, or to perform inference, prediction, and similar tasks. These model parameters are usually saved in a file with the .pt suffix, and the file name usually contains information such as the model name and the number of training steps.

A .pt file is a binary file saved by PyTorch, and the model can be loaded with PyTorch.

Here is a simple example that loads a .pt file and uses the model for inference:

import torch

# Load the model. This works when the .pt file stores a whole serialized model;
# if it only stores a state_dict or a checkpoint dictionary (as ML-Agents does),
# build the network first and call load_state_dict() instead.
model = torch.load("model.pt")
model.eval()

# Prepare input data; this example uses a tensor of size (1, 3, 64, 64)
input_data = torch.randn(1, 3, 64, 64)

# Run inference
with torch.no_grad():
    output = model(input_data)

Note that if your model was trained on the GPU, you may need to call torch.cuda.set_device() before loading the model to specify which GPU device to use (or pass a map_location argument to torch.load()), otherwise loading may fail. If your model was trained on the CPU, you can ignore this step.

# Specify GPU device 0
torch.cuda.set_device(0)

# Load the model
model = torch.load("model.pt")
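
Alternatively, if a GPU-trained model needs to be loaded on a machine without a GPU, a common sketch is to remap the storages to the CPU while loading:

import torch

# Load a GPU-trained checkpoint onto the CPU by remapping its storage location
model = torch.load("model.pt", map_location=torch.device("cpu"))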

1.2 run_logs

 1.2.1 timers.json

timers.json is a log file that records how much time each stage of the training process took; it helps developers understand where time is spent during training so they can tune accordingly. During ML-Agents training, the time is divided into many different stages, such as Policy.Evaluate, Environment.Step, and so on, and timers.json records the name, average time, maximum time, minimum time, and other statistics for each stage.

For example:

{
  "Policy.Evaluate": {
    "count": 1602,
    "mean": 0.00036819224020337235,
    "max": 0.002402134418487072,
    "min": 0.00023801150512695312,
    "stddev": 0.00010763366970706554,
    "total": 0.5895571707486954
  },
  "Environment.Step": {
    "count": 1602,
    "mean": 0.013603404338185787,
    "max": 0.08366680145263672,
    "min": 0.006999969482421875,
    "stddev": 0.007571674113586825,
    "total": 21.780235171318054
  },
  ...
}

The above is an example of timers.json, in which two stages are recorded: Policy.Evaluate and Environment.Step. For each stage it records how many times the stage occurred during training, together with the average, maximum, and minimum time consumed. By inspecting this file, developers can judge which operations take a long time and optimize the training process accordingly.
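
As a quick way to rank the stages, the file can be loaded with Python's json module. The sketch below assumes the flattened layout shown above; the actual timers.json written by ML-Agents nests its timers under a root entry (with a "children" field), so you may need to walk that hierarchy first. The path is only an example.

import json

# Rank training stages by total time, assuming a flat
# {name: {count, mean, max, min, stddev, total}} layout
with open("results/push_block_test_02/run_logs/timers.json") as f:
    timers = json.load(f)

for name, stats in sorted(timers.items(), key=lambda kv: kv[1]["total"], reverse=True):
    print(f"{name}: total={stats['total']:.3f}s over {stats['count']} calls")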

1.2.2 training_status.json

training_status.json is a status file saved by ML-Agents during training, which contains some information about the training run. Here is an example:

"checkpoints": [
            {
                "steps": 262415,
                "file_path": "results\\push_block_test_02\\PushBlock\\PushBlock-262415.onnx",
                "reward": 4.971129466943881,
                "creation_time": 1681915062.002365,
                "auxillary_file_paths": [
                    "results\\push_block_test_02\\PushBlock\\PushBlock-262415.pt"
                ]
            },

This snippet is an example of the checkpoints entry in the training_status.json file; it describes the checkpoint files generated after the model has been trained to a certain step. Specifically:

  • steps: Indicates the number of model training steps corresponding to the checkpoint file.
  • file_path: Indicates the path of the checkpoint file.
  • reward: Indicates the average reward value corresponding to the checkpoint.
  • creation_time: Indicates the creation time of the checkpoint file.
  • auxillary_file_paths: Indicates the paths of auxiliary files associated with the checkpoint, such as the corresponding .pt file holding the state saved during training.

"metadata": {
        "stats_format_version": "0.3.0",
        "mlagents_version": "0.28.0",
        "torch_version": "1.7.1+cpu"
    }

This snippet is the metadata field in the training_status.json file, which records metadata about the training run, including:

  • stats_format_version: The version number of the training statistics format.
  • mlagents_version: The version number of the ML-Agents toolkit used.
  • torch_version: The version number of PyTorch used.
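
Putting these fields together, here is a minimal sketch that finds the saved checkpoint with the highest reward; it assumes (as in this run) that the checkpoint list sits under the behavior name "PushBlock", and the path is only an example:

import json

# Find the saved checkpoint with the highest mean reward
with open("results/push_block_test_02/run_logs/training_status.json") as f:
    status = json.load(f)

checkpoints = status["PushBlock"]["checkpoints"]
best = max(
    checkpoints,
    key=lambda c: c["reward"] if c["reward"] is not None else float("-inf"),
)
print(best["steps"], best["file_path"], best["reward"])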

1.3 configuration.yaml 

configuration.yaml is the main configuration file used to configure training and inference parameters in the ML-Agents toolkit. The file contains many configuration options for defining the agent, the environment, the neural network, and the training parameters. During training and inference, ML-Agents reads the required configuration information from this file to determine how to perform training and inference.

Here are some common configuration sections in configuration.yaml (matching the top-level keys of the file shown below):

  • behaviors: defines the training setup for each behavior (here PushBlock), including the trainer type, hyperparameters, network settings, and reward signals.
  • hyperparameters: nested under each behavior; defines training hyperparameters such as the learning rate, batch size, buffer size, and so on.
  • env_settings: defines properties of the environment, such as the environment path, the number of parallel environments, the base port, and the random seed.
  • engine_settings: defines Unity engine parameters such as the resolution, quality level, and time scale.
  • checkpoint_settings: defines parameters for saving checkpoints and resuming runs, such as the run ID, the results directory, and the resume/force flags.

By modifying the parameters in configuration.yaml, the training and inference behavior of ML-Agents can be changed to suit different application scenarios and task requirements.
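
For reference, the configuration.yaml stored under results is a copy of the configuration that was used when the run was launched; a run like this one is typically started with a command along the following lines (the config file path is only a placeholder):

mlagents-learn path/to/PushBlock.yaml --run-id=push_block_test_02 --resume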

configuration.yaml

default_settings: null
behaviors:
  PushBlock:
    trainer_type: ppo
    hyperparameters:
      batch_size: 128
      buffer_size: 2048
      learning_rate: 0.0003
      beta: 0.01
      epsilon: 0.2
      lambd: 0.95
      num_epoch: 3
      learning_rate_schedule: linear
      beta_schedule: linear
      epsilon_schedule: linear
    network_settings:
      normalize: false
      hidden_units: 256
      num_layers: 2
      vis_encode_type: simple
      memory: null
      goal_conditioning_type: hyper
      deterministic: false
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
        network_settings:
          normalize: false
          hidden_units: 128
          num_layers: 2
          vis_encode_type: simple
          memory: null
          goal_conditioning_type: hyper
          deterministic: false
    init_path: null
    keep_checkpoints: 5
    checkpoint_interval: 500000
    max_steps: 2000000
    time_horizon: 64
    summary_freq: 60000
    threaded: false
    self_play: null
    behavioral_cloning: null
env_settings:
  env_path: null
  env_args: null
  base_port: 5005
  num_envs: 1
  num_areas: 1
  seed: -1
  max_lifetime_restarts: 10
  restarts_rate_limit_n: 1
  restarts_rate_limit_period_s: 60
engine_settings:
  width: 84
  height: 84
  quality_level: 5
  time_scale: 20
  target_frame_rate: -1
  capture_frame_rate: 60
  no_graphics: false
environment_parameters: null
checkpoint_settings:
  run_id: push_block_test_02
  initialize_from: null
  load_model: false
  resume: true
  force: false
  train_model: false
  inference: false
  results_dir: results
torch_settings:
  device: null
debug: false

1.4 PushBlock.onnx 

PushBlock.onnx is a model file in Unity ML-Agents Toolkit, which is used to store the trained deep learning model. This file is saved using the ONNX (Open Neural Network Exchange) format, which is designed as one of the standard formats for sharing models between different deep learning frameworks. By using the PushBlock.onnx file, we can use the trained model in Unity to make the agent perform tasks better in the environment.

Netron is recommended for viewing ONNX model structures (install it with pip install netron).

Netron online version: open the following link in a browser: https://lutzroeder.github.io/netro/
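
If netron was installed with pip, it can also be launched from Python; a minimal sketch (the path is an example from this run):

import netron

# Serve the ONNX model in the browser for inspection
netron.start("results/push_block_test_02/PushBlock/PushBlock.onnx")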

Reference blog: https://blog.csdn.net/aaaccc444/article/details/130253172 

Reference blog: Network visualization tool Netron, detailed installation process (Jiang Dabai*'s blog, CSDN)

Original article: https://blog.csdn.net/aaaccc444/article/details/130302452