The relationship between model weights and deep learning training frameworks

Model weight files in different formats

In practice, model parameter files most often come from frameworks such as Caffe, PyTorch, or TensorFlow.

Different teams may use different deep learning training frameworks and save their models with that framework's own weight-saving functions. This lets them reuse the weights later or release them as open source, so that others can build on the pretrained weights or continue training from them.

If a team uses TensorFlow, the weight file is saved in TensorFlow's format.

If they use Caffe, the weight file is saved in Caffe's format.

The same goes for PyTorch.

The main difference between weight files from different training frameworks (such as Caffe and PyTorch) usually lies in how they are stored and organized.

But the weights they store, that is, the layer-to-layer parameters of the neural network, are the same regardless of format.

So a weight file can be converted from one training framework's format to another's.

Conversion of weight file format

For example, suppose a team using Caffe releases its model weights in Caffe format.

If another team using PyTorch wants to use those weights, they only need a weight-file format conversion tool to convert the Caffe-format file into PyTorch format.

Such a tool reads one framework's weight file and writes a new weight file in the other framework's format.

The weights and structure of the underlying neural network are unchanged; only the format of the weight file depends on the framework being used.
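As an illustration only, here is a minimal, hand-written sketch of that idea in Python. It assumes the Caffe-side weights have already been read out into NumPy arrays; `caffe_params` and `name_map` below are made-up placeholders, not the output of any real conversion tool.

```python
import numpy as np
import torch

# Hypothetical example: Caffe weights already extracted into NumPy arrays,
# keyed by Caffe layer/blob name. In practice a Caffe-to-PyTorch conversion
# tool would produce something similar by reading the .caffemodel file.
caffe_params = {
    "conv1_weight": np.random.randn(64, 3, 7, 7).astype(np.float32),
    "conv1_bias": np.random.randn(64).astype(np.float32),
}

# Mapping from Caffe names to the corresponding PyTorch parameter names.
# This mapping is model-specific and must be written by hand or generated
# by the conversion tool.
name_map = {
    "conv1_weight": "conv1.weight",
    "conv1_bias": "conv1.bias",
}

# Rebuild a PyTorch state dict: the numerical values are unchanged,
# only the container format and the parameter names change.
state_dict = {
    pytorch_name: torch.from_numpy(caffe_params[caffe_name])
    for caffe_name, pytorch_name in name_map.items()
}

# Save in PyTorch's native format.
torch.save(state_dict, "converted_weights.pth")
```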

Loading the model weight file

The weight file is usually read when the model loads its weights.

Generally speaking, we specify the path of the weight file in the code, then use the framework's loading function to read the file and copy each weight into the corresponding position in the model. This process varies slightly between deep learning frameworks.
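For instance, a minimal PyTorch sketch of this loading step might look like the following; `SmallNet` and the file path are placeholders for your own model class and weight file.

```python
import torch
import torch.nn as nn

# A small example network; in practice this would be your own model class.
class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(16, 4)

    def forward(self, x):
        return self.fc(x)

model = SmallNet()

# Path to the weight file (placeholder).
weights_path = "model_weights.pth"

# Read the weight file and copy each tensor into the matching model parameter.
state_dict = torch.load(weights_path, map_location="cpu")
model.load_state_dict(state_dict)

# Switch to inference mode once the weights are in place.
model.eval()
```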

Principles and Limitations of the Weight Format Conversion Tool

A weight conversion tool works by reading one framework's weight file, understanding how it is organized, and then reorganizing and saving those weights in the form required by the other framework.

However, note that although neural network weights can be converted between frameworks, not every operation has a one-to-one correspondence across frameworks. For example, certain layers (such as custom layers) or specific operations may exist in one framework but not in another, which can cause problems during weight conversion. In that case, you may need to write some code yourself to implement those operations or layers.
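As a hedged PyTorch sketch of that situation: `ScaleLayer` below stands in for a hypothetical layer that has no direct equivalent in the target framework and is re-implemented by hand, and `strict=False` is used so that keys which do not map one-to-one do not abort the load. The model and file name are placeholders.

```python
import torch
import torch.nn as nn

# Hypothetical custom layer that exists in the source framework but has no
# direct PyTorch equivalent; it has to be re-implemented by hand.
class ScaleLayer(nn.Module):
    def __init__(self, num_features):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(num_features))
        self.bias = nn.Parameter(torch.zeros(num_features))

    def forward(self, x):
        return x * self.weight + self.bias

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3)
        self.scale = ScaleLayer(8)

    def forward(self, x):
        return self.scale(self.conv(x))

model = Net()

# strict=False lets the load succeed even when some converted keys do not
# match one-to-one; the returned object lists what was missing or unexpected.
state_dict = torch.load("converted_weights.pth", map_location="cpu")
missing, unexpected = model.load_state_dict(state_dict, strict=False)
print("missing keys:", missing)
print("unexpected keys:", unexpected)
```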

Formats and suffixes of TensorFlow and PyTorch weight files

  1. TensorFlow
    • TensorFlow typically uses the SavedModel format to save the complete model, including weights, computation graph, and any metadata. This is a directory that contains binary weight files and a .pb file describing the model's computation graph. The SavedModel directory itself has no fixed suffix, but the graph file inside it usually uses the .pb suffix.
    • TensorFlow can also save only the model's weights, typically using the Checkpoint format, which consists of one or more checkpoint files (commonly given a .ckpt prefix and stored as .index and .data files) containing the model's weights.
  2. PyTorch
    • PyTorch usually uses files with the .pth or .pt suffix to save a model's weights. These files use Python's pickle serialization format, which PyTorch uses to store all of the model's parameters.
    • PyTorch can also save the complete model, including both its structure and its weights, again as a .pth or .pt file. Both cases are illustrated in the sketch after this list.
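To make the file names and suffixes above concrete, here is a minimal sketch of the corresponding save calls. It assumes TensorFlow 2.x with tf.keras and a recent PyTorch; the model definitions and output paths are placeholders, and exact defaults can differ between TensorFlow/Keras versions.

```python
import tensorflow as tf
import torch
import torch.nn as nn

# --- TensorFlow ---
tf_model = tf.keras.Sequential([
    tf.keras.Input(shape=(16,)),
    tf.keras.layers.Dense(4),
])

# SavedModel format: a directory containing a saved_model.pb graph file
# plus a variables/ subdirectory with the binary weight files.
tf.saved_model.save(tf_model, "saved_model_dir")

# Checkpoint format: weights only, written as .index / .data files
# under the given prefix.
ckpt = tf.train.Checkpoint(model=tf_model)
ckpt.save("tf_ckpt/weights")

# --- PyTorch ---
torch_model = nn.Linear(16, 4)

# Weights only: the state dict is pickled into a .pth (or .pt) file.
torch.save(torch_model.state_dict(), "weights_only.pth")

# Complete model (structure + weights), also pickled into a .pth/.pt file.
torch.save(torch_model, "full_model.pth")
```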

Origin blog.csdn.net/ahahayaa/article/details/131363156