【Multi-task Learning】Hands-on Multi-task Learning with Code and a Dataset: A Thorough Understanding of Multi-task Learning


Foreword

The models we discussed before usually focus on a single task, such as predicting the category of an image, and during training we optimize one specific metric. But sometimes we want several predictions from the same input, for example the topic of a news article (politics/sports/entertainment) as well as whether it targets male or female readers.
When we concentrate on optimizing a single metric, we may ignore useful information carried by related tasks, namely the extra signal that comes from training them together. By sharing representations across multiple related tasks, we can make the model generalize better on the original task. This approach is called multi-task learning.


1. Multi-task learning

1.1 Definition

Multi-task learning means completing several predictions simultaneously with a shared representation, i.e. shared feature extraction, while each task head can still focus on the features specific to its own task. In practice, one feature-extraction network combined with multiple loss functions constitutes a multi-task loss. Image localization by itself is a single task; if you also need to know the object's category, it becomes multi-task learning.

1.2 Principle

A multi-task learning model is usually implemented by sharing the hidden layers (the feature-extraction layers) across all tasks, while giving each task its own output layer. The more tasks are learned at once, the more the model is pushed toward a representation that captures all of them, and the smaller the risk of overfitting on the original task.
For the feature extraction of any one task, the other tasks also act as a filter on the extracted features, helping the model focus on the features that really matter.
The model thus learns features that serve as many tasks as possible, and such features tend to generalize well.
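As a minimal schematic of this hard-parameter-sharing idea (my own illustration with made-up layer sizes, not code from this post):

import torch
import torch.nn as nn
import torch.nn.functional as F

# one shared trunk, two task-specific heads (sizes are arbitrary here)
trunk  = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
head_a = nn.Linear(64, 3)   # e.g. a 3-class task
head_b = nn.Linear(64, 4)   # e.g. a 4-class task

x   = torch.randn(8, 32)           # a dummy batch
y_a = torch.randint(0, 3, (8,))    # dummy labels for task A
y_b = torch.randint(0, 4, (8,))    # dummy labels for task B

shared = trunk(x)                  # features shared by all tasks
# the multi-task loss is simply the sum of the per-task losses
loss = F.cross_entropy(head_a(shared), y_a) + F.cross_entropy(head_b(shared), y_b)
loss.backward()                    # gradients flow into both heads and the shared trunk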

2. Multi-task learning code

Simultaneously predict the color and the category of a clothing item.

2.1 A First Look at the Dataset

One branch classifies the kind of clothing (shirt, skirt, jeans, shoes, etc.) given an input image; the other branch classifies the color of that clothing (black, red, blue, etc.).
In total, the dataset consists of 2525 images divided into 7 "color + category" combinations:

black jeans (344 images)
black shoes (358 images)
blue skirt (386 images)
blue jeans (356 images)
blue shirt (369 images)
red skirt (380 images)
red shirt (332 images)

Dataset download link: https://pan.baidu.com/s/1JtKt7KCR2lEqAirjIXzvgg Extraction code: 2kbc





2.2 Preprocessing

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import torchvision
import glob
from torchvision import transforms
from torch.utils import data
from PIL import Image

img_paths = glob.glob(r"F:\multi-output-classification\dataset\*\*.jpg")
img_paths[:5]

(Output: the first five image paths; each image sits in a folder named like black_jeans.)
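As a quick consistency check (my addition), the number of matched files should equal the dataset size given above:

len(img_paths)   # expected: 2525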
The folder name in each path encodes the label, so extract it:

label_names = [img_path.split("\\")[-2] for img_path in img_paths]
label_names[:5]

(Output: the first five label names, e.g. 'black_jeans'.)

label_array = np.array([la.split("_") for la in label_names])
label_array

(Output: a 2525 × 2 array of [color, item] string pairs.)

label_color = label_array[:,0]
label_color

(Output: an array of 2525 color names.)

label_item = label_array[:,1]
label_item


Convert the labels into indices, since PyTorch tensors only hold numbers:

unique_color = np.unique(label_color)
unique_color
unique_item = np.unique(label_item)
unique_item
item_to_idx = dict((v,k) for k, v in enumerate(unique_item))
item_to_idx
color_to_idx = dict((v,k) for k, v in enumerate(unique_color))
color_to_idx
label_item = [item_to_idx.get(k) for k in label_item]
label_color = [color_to_idx.get(k) for k in label_color ]
transform = transforms.Compose([
    transforms.Resize((96,96)),
    transforms.ToTensor(),
])
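With the seven folders from section 2.1, np.unique sorts the names alphabetically, so the two mappings should come out as below (a quick check I added; this is also why the network later has a 3-way color head and a 4-way item head):

print(color_to_idx)  # {'black': 0, 'blue': 1, 'red': 2}                -> 3 color classes
print(item_to_idx)   # {'jeans': 0, 'shirt': 1, 'shoes': 2, 'skirt': 3} -> 4 item classes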

Define a custom dataset:

class Multi_dataset(data.Dataset):
    def __init__(self,imgs_path, label_color, label_item) -> None:
        super().__init__()
        self.imgs_path = imgs_path
        self.label_color = label_color
        self.label_item = label_item
    
    def __getitem__(self, index):
        img_path = self.imgs_path[index]
        pil_img = Image.open(img_path)
        # convert to RGB in case some images are grayscale
        pil_img = pil_img.convert('RGB')
        pil_img = transform(pil_img)
        label_c = self.label_color[index]
        label_i = self.label_item[index]
        return pil_img, (label_c,label_i)
    def __len__(self):
        return len(self.imgs_path)

Instantiate the dataset and split it into training and test sets:

multi_dataset = Multi_dataset(img_paths, label_color, label_item)
count = len(multi_dataset)
count
# split into training and test sets (80/20)
train_count = int(count*0.8)
test_count = count - train_count
train_ds, test_ds = data.random_split(multi_dataset, [train_count, test_count])
len(train_ds), len(test_ds)
BATCHSIZE = 32
train_dl = data.DataLoader(train_ds, batch_size=BATCHSIZE, shuffle=True)
test_dl = data.DataLoader(test_ds, batch_size=BATCHSIZE)

(Output: count is 2525; the split gives 2020 training and 505 test samples.)
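To make sure the DataLoader yields what the training loop will expect, it helps to peek at one batch (my addition):

imgs, (colors, items) = next(iter(train_dl))
imgs.shape, colors.shape, items.shape
# expected: (torch.Size([32, 3, 96, 96]), torch.Size([32]), torch.Size([32]))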

2.3 Network structure design

# define the network
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3)
        self.conv2 = nn.Conv2d(16, 32, 3)
        self.conv3 = nn.Conv2d(32, 64, 3)
        self.fc = nn.Linear(64*10*10, 1024)   # shared fully-connected layer
        self.fc1 = nn.Linear(1024, 3)         # color head: 3 classes
        self.fc2 = nn.Linear(1024, 4)         # item head: 4 classes

    def forward(self, x):
        # feature maps: 3x96x96 -> 16x47x47 -> 32x22x22 -> 64x10x10 after each conv+pool
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)
        x = F.relu(self.conv3(x))
        x = F.max_pool2d(x, 2)
        x = x.view(-1, 64*10*10)
        x = F.relu(self.fc(x))   # shared representation for both heads
        c = self.fc1(x)          # color logits
        i = self.fc2(x)          # item logits
        return c, i
        
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = Net().to(device)
model
Net(
  (conv1): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1))
  (conv2): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1))
  (conv3): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1))
  (fc): Linear(in_features=6400, out_features=1024, bias=True)
  (fc1): Linear(in_features=1024, out_features=3, bias=True)
  (fc2): Linear(in_features=1024, out_features=4, bias=True)
)
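A quick shape check with a dummy batch (my addition) confirms the model returns one logits tensor per task:

dummy = torch.randn(8, 3, 96, 96).to(device)
out_c, out_i = model(dummy)
out_c.shape, out_i.shape   # expected: (torch.Size([8, 3]), torch.Size([8, 4]))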

2.4 Training

loss_fn = nn.CrossEntropyLoss()   # one criterion reused for both the color and the item head
optimizer = torch.optim.Adam(model.parameters(), lr=0.0005)
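Below is a minimal training-loop sketch of my own (not from the original post), assuming the two cross-entropy losses are summed with equal weight; weighting the tasks differently is a common variation:

def fit(epoch, model, train_dl, test_dl):
    model.train()
    running_loss = 0.0
    for x, (y_c, y_i) in train_dl:
        x, y_c, y_i = x.to(device), y_c.to(device), y_i.to(device)
        pred_c, pred_i = model(x)
        # multi-task loss: one cross-entropy term per head, summed
        loss = loss_fn(pred_c, y_c) + loss_fn(pred_i, y_i)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

    # evaluate per-task accuracy on the test split
    model.eval()
    correct_c = correct_i = total = 0
    with torch.no_grad():
        for x, (y_c, y_i) in test_dl:
            x, y_c, y_i = x.to(device), y_c.to(device), y_i.to(device)
            pred_c, pred_i = model(x)
            correct_c += (pred_c.argmax(1) == y_c).sum().item()
            correct_i += (pred_i.argmax(1) == y_i).sum().item()
            total += y_c.size(0)
    print(f"epoch {epoch}: train loss {running_loss/len(train_dl):.4f}, "
          f"color acc {correct_c/total:.3f}, item acc {correct_i/total:.3f}")

EPOCHS = 30
for epoch in range(EPOCHS):
    fit(epoch, model, train_dl, test_dl)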

3. Summary

to be continued
