[Li Hongyi 2022 Machine Learning Spring] hw2_Classification (strong baseline)

Experimental record (acc ≈ Kaggle score)

Experimental configuration:

concat_nframes = 19
batch_size = 2048
num_epoch = 50 
learning_rate = 0.0005
scheduler = lr_scheduler.CosineAnnealingLR(optimizer, T_max=10, eta_min=1e-5)
L2 = 1e-4

nn.BatchNorm1d(2048),
nn.Dropout(0.5),  # I haven't tried training without Dropout; would it overfit? If you run it without Dropout, please let me know the result
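To show how these settings fit together, here is a minimal sketch, assuming the L2 value above is applied as the optimizer's weight_decay, an AdamW optimizer, 39-dim MFCC features per frame, 41 output phoneme classes, and hidden blocks of Linear -> BatchNorm1d -> ReLU -> Dropout; apart from the numbers listed above, these are my assumptions, not part of the original record:

import torch.nn as nn
import torch.optim as optim
import torch.optim.lr_scheduler as lr_scheduler

concat_nframes = 19                 # frames concatenated per sample
input_dim = 39 * concat_nframes     # 39-dim MFCC per frame (assumed)
output_dim = 41                     # number of phoneme classes (assumed)
hidden_dim = 2048
learning_rate = 0.0005
weight_decay = 1e-4                 # the "L2 = 1e-4" above

# one hidden block matching the BatchNorm1d(2048) / Dropout(0.5) lines above
def hidden_block(in_dim, out_dim):
    return nn.Sequential(
        nn.Linear(in_dim, out_dim),
        nn.BatchNorm1d(out_dim),
        nn.ReLU(),
        nn.Dropout(0.5),
    )

model = nn.Sequential(
    hidden_block(input_dim, hidden_dim),
    hidden_block(hidden_dim, hidden_dim),
    hidden_block(hidden_dim, hidden_dim),
    nn.Linear(hidden_dim, output_dim),
)

optimizer = optim.AdamW(model.parameters(), lr=learning_rate, weight_decay=weight_decay)
scheduler = lr_scheduler.CosineAnnealingLR(optimizer, T_max=10, eta_min=1e-5)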

Experimental results:

hidden_layer=6, hidden_dim=1024: 0.733225
hidden_layer=2, hidden_dim=1700: 0.746941
hidden_layer=3, hidden_dim=2048: 0.752439
hidden_layer=6, hidden_dim: input_dim -> 2048 -> 2048 -> 1024 -> 512 -> 256 -> output_dim: 0.753701 (last year's model; without the cosine annealing learning rate it only reaches about 0.70 accuracy)
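A hedged sketch of that best-scoring pyramid architecture, reusing input_dim, output_dim, and the hidden_block helper from the configuration sketch above; only the layer sizes come from the row above, the rest is an assumption:

pyramid_dims = [2048, 2048, 1024, 512, 256]

layers = []
prev_dim = input_dim
for dim in pyramid_dims:
    layers.append(hidden_block(prev_dim, dim))  # Linear -> BatchNorm1d -> ReLU -> Dropout
    prev_dim = dim
layers.append(nn.Linear(prev_dim, output_dim))  # final classification layer

pyramid_model = nn.Sequential(*layers)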

Grading


Takeaways

Cosine annealing learning rate (a powerful trick; it will probably come in handy again in the future for escaping local optima)
Reference: https://www.cnblogs.com/chouxianyu/p/12573673.html

On using the cosine annealing learning rate, some students may ask: why cosine annealing in particular? In Mr. Li Hongyi's words, this is the wisdom handed down by the ancients, so it is right to use it. My own understanding is that with cosine annealing you can intuitively see which learning rates are more appropriate, which is very helpful for choosing good learning rate parameters.

For more learning rate schedulers in torch.optim, please refer to "How to adjust learning rate" in the PyTorch official documentation: https://pytorch.org/docs/stable/optim.html#torch.optim.lr_scheduler.CosineAnnealingWarmRestarts

Simple usage explanation:

import torch.optim.lr_scheduler as lr_scheduler  # import the scheduler module
scheduler = lr_scheduler.CosineAnnealingLR(optimizer, T_max=10, eta_min=1e-5)  # define the scheduler
scheduler.step()  # call once per epoch to update the learning rate
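As a hedged illustration of where scheduler.step() fits in a training loop (reusing model, optimizer, scheduler, and num_epoch from the sketch above; train_loader and the cross-entropy loss are my assumptions, not from the original), printing the learning rate each epoch is what lets you see which learning rates work well:

for epoch in range(num_epoch):
    model.train()
    for features, labels in train_loader:   # train_loader is assumed to exist
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(features), labels)
        loss.backward()
        optimizer.step()

    scheduler.step()  # update the learning rate once per epoch
    # printing the current learning rate makes it easy to see which values work well
    print(f"epoch {epoch + 1}: lr = {scheduler.get_last_lr()[0]:.6f}")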



With last year's strong-baseline code I couldn't get past this year's strong baseline, which puzzled me for a while...
This year's (2022) data preprocessing is less friendly and a bit slow, but the number of concatenated frames can be adjusted, so I used 19 frames in total; last year the handout fixed it at 11 frames (a sketch of frame concatenation follows below).
Then the cosine annealing learning rate lifted the accuracy from about 0.70 to 0.75, lol~
The biggest takeaway: the cosine annealing learning rate.
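For reference, a minimal sketch of what concat_nframes = 19 means, assuming each frame is concatenated with its 9 neighbours on each side and boundary frames are repeated as padding; the function name, the 39-dim feature size, and the padding choice are my assumptions, and the course's own preprocessing code may differ:

import torch

def concat_frames(x: torch.Tensor, concat_n: int = 19) -> torch.Tensor:
    # x: (num_frames, feature_dim) features of one utterance
    # returns: (num_frames, feature_dim * concat_n)
    assert concat_n % 2 == 1, "concat_n should be odd so the centre frame stays aligned"
    half = concat_n // 2
    # repeat the boundary frames so every position has enough neighbours
    padded = torch.cat([x[:1].repeat(half, 1), x, x[-1:].repeat(half, 1)], dim=0)
    # for each offset, take a shifted view and concatenate along the feature axis
    shifted = [padded[i:i + x.size(0)] for i in range(concat_n)]
    return torch.cat(shifted, dim=1)

# example: 100 frames of 39-dim MFCC -> (100, 741) after concatenating 19 frames
features = torch.randn(100, 39)
print(concat_frames(features, concat_n=19).shape)  # torch.Size([100, 741])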

References

Li Hongyi 2022 Machine Learning HW2 Analysis: https://blog.csdn.net/weixin_42369818/article/details/123632053?spm=1001.2014.3001.5502

[LR Scheduler] Cosine Annealing: https://blog.zhujian.life/posts/6eb7f24f.html
