Multi-task loss optimization

1. Problems faced by multi-task learning optimization

Multi-task learning often exhibits a seesaw phenomenon: when two tasks are learned jointly, one task gets better while the other gets worse. At its core, this comes down to three problems in the training process:

  1. Inconsistent gradient directions across tasks: the same set of shared parameters receives conflicting update directions from different tasks, causing the parameters to oscillate and producing negative transfer between tasks. This usually happens when the tasks differ greatly from one another;
  2. Inconsistent convergence speeds across tasks: different tasks converge at different speeds. Relatively simple tasks converge quickly, while harder tasks converge slowly, so when training stops the harder tasks may still be in a state of underfitting;
  3. Large differences in loss magnitude across tasks: the value ranges of the tasks' losses differ greatly, so the model is dominated by the task with the larger loss. The most common causes are the tasks using different loss functions, or the fitted target values having very different scales.

2. Multi-tasking design

A natural starting point is that the multi-objective loss should be designed to satisfy the following points, so that no single task dominates:

  1. The losses of the tasks should have similar magnitudes, ideally the same (i.e. their value ranges are close). A simple way to align magnitudes is to divide each task's loss by its corresponding initial loss (whether it is a cross-entropy or an L2 loss); see the sketch after this list.
  2. The tasks should learn at similar speeds. The speed can be measured as the ratio of a task's losses at adjacent iterations: the smaller the ratio, the faster the task is learning.
  3. Set explicit weights between tasks.
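
A minimal sketch of these points, assuming PyTorch-style scalar loss tensors (the function and argument names are illustrative, not taken from the referenced articles):

```python
import torch

def combine_task_losses(losses, initial_losses, task_weights=None):
    """Sum per-task losses on a comparable scale (points 1 and 3)."""
    if task_weights is None:
        task_weights = [1.0] * len(losses)
    total = 0.0
    for loss, init, weight in zip(losses, initial_losses, task_weights):
        # Point 1: dividing by the initial loss puts tasks with very
        # different magnitudes (e.g. cross-entropy vs. L2) on a similar scale.
        total = total + weight * loss / init
    return total

def learning_speed(prev_loss, curr_loss):
    """Point 2: ratio of a task's losses at adjacent iterations.

    A smaller ratio means the task is learning faster; comparing the
    ratios across tasks shows whether their learning speeds stay close.
    """
    return curr_loss / (prev_loss + 1e-12)
```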

2.1 How to design each loss weight in multi-task learning

Multi-task loss optimization in AI recommendation (adaptive weighting)

Multi-task learning in deep learning: optimization strategies - ShowMeAI Knowledge Community, Zhihu

Optimization in Multi-task Learning - bzdww

How to balance the multiple losses in deep learning? - Zhihu

Multi-task learning (MTL) models: multi-objective loss optimization strategies - Zhihu

How to balance multi-task model fusion? - Jianshu

On artificial intelligence: multi-task and multi-objective CTR estimation techniques - Fun Zone

The PCGrad method: how to balance the multiple losses in deep learning? - Zhihu

Multi-task learning - [ICLR 2020] PCGrad - Xiaoye Maomao (Zhuo Shoujie)'s blog, CSDN

Task uncertainty: one of the balanced-loss methods in multi-task learning - Algorithms

The author's own method (no paper, for reference only):

Jishi Developer Platform - Computer Vision Algorithm Development Platform

2.2 Improving multi-task learning through gradient optimization

For the problems in the multi-task learning optimization process described above, there is a series of work in the industry that addresses them. Here I introduce four methods that improve multi-task learning by optimizing the gradients.

Specific reference: How should each loss weight be designed in multi-task learning? - Zhihu
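
Before applying any of these methods, it helps to check whether task gradients actually conflict on the shared parameters. A small diagnostic sketch (assuming PyTorch; the helper names are my own, not from the referenced article): a negative cosine similarity between two tasks' gradients is exactly the situation that PCGrad-style methods correct.

```python
import torch
import torch.nn.functional as F

def task_gradient(loss, shared_params):
    """Flattened gradient of one task's loss w.r.t. the shared parameters."""
    grads = torch.autograd.grad(loss, shared_params,
                                retain_graph=True, allow_unused=True)
    return torch.cat([
        (g if g is not None else torch.zeros_like(p)).reshape(-1)
        for g, p in zip(grads, shared_params)
    ])

def gradient_cosine(loss_a, loss_b, shared_params):
    """Cosine similarity between two tasks' gradients on the shared layers."""
    g_a = task_gradient(loss_a, shared_params)
    g_b = task_gradient(loss_b, shared_params)
    # Negative values indicate conflicting update directions (problem 1 above).
    return F.cosine_similarity(g_a, g_b, dim=0)
```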

2.3 Using Uncertainty to Weigh Losses

Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics - cdknight_happy's blog, CSDN

Uncertainty Loss - CharpYu's blog, CSDN
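
A commonly used simplified implementation of this uncertainty weighting (a PyTorch sketch; the class name is mine, and the exact 1/2 factors depend on whether a task is regression or classification):

```python
import torch
import torch.nn as nn

class UncertaintyWeightedLoss(nn.Module):
    """Homoscedastic-uncertainty weighting in the spirit of Kendall et al.

    Each task i gets a learnable s_i = log(sigma_i^2). The combined loss is
    sum_i exp(-s_i) * L_i + s_i: tasks with high uncertainty are automatically
    down-weighted, and the +s_i term keeps sigma_i from growing without bound.
    """
    def __init__(self, num_tasks):
        super().__init__()
        self.log_vars = nn.Parameter(torch.zeros(num_tasks))

    def forward(self, losses):
        total = 0.0
        for i, loss in enumerate(losses):
            precision = torch.exp(-self.log_vars[i])
            total = total + precision * loss + self.log_vars[i]
        return total
```

Note that the log_vars parameter has to be registered with the optimizer alongside the model weights so that the task weights are learned jointly with the network.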

2.4 Multi-task analysis and solutions

Reference articles:

1. Loss optimization of multi-objective model - Zhihu

2. Paper reading: Gradient Surgery for Multi-Task Learning
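
A sketch of the core projection step from the gradient-surgery (PCGrad) paper in reference 2 (it assumes the per-task gradients have already been flattened into vectors, and omits the bookkeeping needed to write the combined gradient back onto the model parameters):

```python
import random
import torch

def pcgrad_combine(task_grads):
    """Project away conflicting gradient components, then sum (PCGrad sketch).

    task_grads: list of flattened, detached gradient tensors, one per task.
    """
    projected = [g.clone() for g in task_grads]
    for i, g_i in enumerate(projected):
        order = [j for j in range(len(task_grads)) if j != i]
        random.shuffle(order)
        for j in order:
            g_j = task_grads[j]
            dot = torch.dot(g_i, g_j)
            if dot < 0:
                # Conflicting directions: remove the component of g_i that
                # points against g_j before moving on to the next task.
                g_i -= dot / (g_j.norm() ** 2 + 1e-12) * g_j
    return torch.stack(projected).sum(dim=0)
```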


Origin: blog.csdn.net/ytusdc/article/details/128511116