[Small Ideas] Part 1: Model engineering, vector similarity, early stopping, and a small BERT fine-tuning trick

Background

Work and life keep me busy these days, so I don't have much free time to write long articles. But I am still learning, thinking, and summarizing, and I will post short notes on deep learning algorithms as I go. Since these notes are short, they are easy to write, and I will gradually sync them across platforms. Feel free to drop by and discuss.
When I have more free time, I will collect these short notes into proper blog posts. Comments are welcome. Platforms where these notes are synced:

  1. Zhihu: Pi Gandong
  2. Xiaohongshu (Little Red Book): Kepiziju
  3. CSDN: Kepiziju
  4. Toutiao: Kepiziju
  5. Bilibili: Kepiziju

Model engineering

Deep learning models can be put into production in many ways. You can use the Java DJL library to wrap the model as a Java SDK; use FastAPI (a Python web framework; Flask also works) to wrap the model and expose it as a web service; or package it as a Python SDK for others to use. If you want to protect the source code of a model service, you can bundle it with PyInstaller. For easier deployment, you can build a Docker image. You can also export the model to ONNX for deployment.
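As a minimal sketch of the FastAPI route, the snippet below wraps a model behind a `/predict` endpoint. Here `toy_predict` is a hypothetical stand-in for a real model's inference call, and the FastAPI import is guarded so the same function can also ship as a plain Python SDK:

```python
def toy_predict(text: str) -> dict:
    """Hypothetical inference function standing in for a real model."""
    label = "positive" if "good" in text.lower() else "negative"
    return {"text": text, "label": label}

try:
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()

    class PredictRequest(BaseModel):
        text: str

    @app.post("/predict")
    def predict(req: PredictRequest) -> dict:
        # The web layer is a thin wrapper around the plain Python function.
        return toy_predict(req.text)
except ImportError:
    # FastAPI not installed: the bare function can still be used as an SDK.
    app = None
```

With FastAPI installed, the service would be started with something like `uvicorn my_module:app`; the inference function itself stays framework-free.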

Vector similarity

In deep learning papers, we often see the dot product of two vectors used to measure their similarity. Why does this work?
Recall the formula for cosine similarity: the dot product of the two vectors divided by the product of their norms. Dividing by the norm product mainly serves as normalization, scaling the similarity into the range [-1, 1].
Seen this way, it is easy to understand the bare dot product as an unnormalized similarity measure between two vectors.
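The relationship can be made concrete with a few lines of plain Python: cosine similarity is just the dot product rescaled by the norms.

```python
import math

def dot(u, v):
    """Dot product: an unnormalized, scale-dependent similarity."""
    return sum(a * b for a, b in zip(u, v))

def cosine_similarity(u, v):
    """Dot product normalized by the product of the two norms; range [-1, 1]."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

u, v = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
print(dot(u, v))                 # 28.0 — grows with vector magnitude
print(cosine_similarity(u, v))   # ~1.0 — the vectors are parallel
```

Note that the dot product grows with vector magnitude, while cosine similarity only measures direction, which is why papers that use a raw dot product often rely on the embeddings being roughly norm-controlled.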

[Figure: the cosine similarity formula]

Early stopping

EarlyStopping is often used when training deep learning models. The idea is to monitor some quantity during training and stop once a criterion is met, reducing training time (and, in practice, helping to avoid overfitting). A common strategy: monitor the model's loss on the development set and stop training when the loss has shown essentially no improvement for several consecutive epochs. You can also monitor the F1 score, accuracy, etc.; choose the metric according to the task. Good training frameworks provide early stopping APIs, e.g. PyTorch Lightning.

[Figure: early stopping illustration]

A small BERT fine-tuning trick

When fine-tuning BERT on a downstream task, consider tuning the batch size, learning rate, and number of epochs over the ranges recommended in the original BERT paper:

  1. batch size: 16, 32
  2. learning rate (Adam): 5e-5, 3e-5, 2e-5
  3. number of epochs: 2, 3, 4
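The three ranges above form an 18-point grid that can be swept exhaustively. In the sketch below, `train_and_eval` is a hypothetical placeholder returning a fake deterministic score so the loop runs end to end; in practice it would fine-tune BERT with the given hyperparameters and return the dev-set metric:

```python
import itertools

BATCH_SIZES = [16, 32]
LEARNING_RATES = [5e-5, 3e-5, 2e-5]
EPOCHS = [2, 3, 4]

def train_and_eval(batch_size, lr, epochs):
    """Placeholder: fine-tune with these hyperparameters, return dev F1.
    This fake deterministic score exists only to make the sketch runnable."""
    return 0.80 + 0.001 * epochs + lr * 100 + (0.005 if batch_size == 16 else 0.0)

grid = list(itertools.product(BATCH_SIZES, LEARNING_RATES, EPOCHS))
best = max(grid, key=lambda cfg: train_and_eval(*cfg))
print(f"{len(grid)} configurations; best: batch={best[0]}, lr={best[1]}, epochs={best[2]}")
```

Because fine-tuning runs are short (2-4 epochs), this brute-force sweep is usually affordable for small datasets; for larger ones, a random subset of the grid is a common compromise.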



Origin blog.csdn.net/meiqi0538/article/details/127739612