Contents
Fancier optimization
```python
# Vanilla Gradient Descent
while True:
    weights_grad = evaluate_gradient(loss_fun, data, weights)
    weights += -step_size * weights_grad  # step in the direction of the negative gradient
```
For this type of objective function, what does a saddle point mean? It means that at my current point, the loss goes up in some directions and goes down in others, even though the gradient is (nearly) zero; for example, f(x, y) = x^2 - y^2 has a saddle point at the origin.
Nesterov momentum
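As a concrete illustration, here is a minimal sketch of Nesterov momentum on a toy quadratic, where the gradient is evaluated at the "look-ahead" point x + rho * v. The toy loss, variable names, and hyperparameter values are illustrative assumptions, not from the lecture.

```python
import numpy as np

# Toy quadratic bowl f(x) = 0.5 * x^T A x (illustrative assumption)
A = np.diag([1.0, 10.0])
def grad(x):
    return A @ x

x = np.array([1.0, 1.0])
v = np.zeros_like(x)               # velocity
rho, learning_rate = 0.9, 0.01

for _ in range(100):
    dx = grad(x + rho * v)         # gradient at the look-ahead point
    v = rho * v - learning_rate * dx
    x += v
```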
AdaGrad
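A minimal AdaGrad sketch under the same toy setup (the loss, names, and constants are assumptions): squared gradients are accumulated per parameter, so coordinates with consistently large gradients take smaller steps, and the effective step size decays over training.

```python
import numpy as np

A = np.diag([1.0, 10.0])           # same toy quadratic as above
def grad(x):
    return A @ x

x = np.array([1.0, 1.0])
grad_squared = np.zeros_like(x)
learning_rate = 0.1

for _ in range(100):
    dx = grad(x)
    grad_squared += dx * dx        # accumulated sum of squared gradients (never decays)
    x -= learning_rate * dx / (np.sqrt(grad_squared) + 1e-7)
```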
RMSProp
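RMSProp keeps the same per-parameter scaling but replaces AdaGrad's ever-growing sum with an exponential moving average, so the step size no longer decays toward zero. Again a sketch with assumed names and constants:

```python
import numpy as np

A = np.diag([1.0, 10.0])
def grad(x):
    return A @ x

x = np.array([1.0, 1.0])
grad_squared = np.zeros_like(x)
decay_rate, learning_rate = 0.99, 0.01

for _ in range(100):
    dx = grad(x)
    grad_squared = decay_rate * grad_squared + (1 - decay_rate) * dx * dx  # leaky average
    x -= learning_rate * dx / (np.sqrt(grad_squared) + 1e-7)
```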
At the very first time step, we've initialized our second moment to zero. The second-moment decay rate beta2 is typically something like 0.9 or 0.99, very close to one, so after one update the second moment is still very close to zero. When we make our update step and divide by (the square root of) this second moment, we're dividing by a very small number, which can produce a very large step right at the start of training. Adam adds a bias correction term to avoid this problem of taking very large steps early on.
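Here is a sketch of the full Adam update with those bias correction terms (hyperparameter values follow the common defaults; the toy loss and names are assumptions). Note how at t = 1 the correction divides the near-zero moments by 1 - beta, rescaling them to a sensible magnitude.

```python
import numpy as np

A = np.diag([1.0, 10.0])
def grad(x):
    return A @ x

x = np.array([1.0, 1.0])
m = np.zeros_like(x)               # first moment (momentum-like)
v = np.zeros_like(x)               # second moment (RMSProp-like)
beta1, beta2, learning_rate, eps = 0.9, 0.999, 1e-3, 1e-8

for t in range(1, 101):            # t starts at 1 for bias correction
    dx = grad(x)
    m = beta1 * m + (1 - beta1) * dx
    v = beta2 * v + (1 - beta2) * dx * dx
    m_hat = m / (1 - beta1 ** t)   # bias correction: undoes the zero initialization
    v_hat = v / (1 - beta2 ** t)
    x -= learning_rate * m_hat / (np.sqrt(v_hat) + eps)
```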
If you can afford to do full-batch updates, then try out L-BFGS.
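For a full-batch, deterministic objective, here is a sketch using SciPy's L-BFGS implementation (assuming scipy is available; the toy objective is illustrative):

```python
import numpy as np
from scipy.optimize import minimize

A = np.diag([1.0, 10.0])

def f(x):
    return 0.5 * x @ A @ x         # full-batch loss, no minibatch noise

def grad(x):
    return A @ x

result = minimize(f, x0=np.array([1.0, 1.0]), jac=grad, method="L-BFGS-B")
print(result.x)                    # near the minimum at the origin
```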
Regularization
Dropout
More common: Inverted dropout
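A minimal sketch of inverted dropout for one hidden layer (the layer shapes, W1, b1, and p are illustrative assumptions): the mask is divided by the keep probability p at training time, so activations keep the same expected scale and no rescaling is needed at test time.

```python
import numpy as np

p = 0.5                                       # probability of keeping a unit
W1 = np.random.randn(100, 20) * 0.01          # illustrative layer parameters
b1 = np.zeros(100)

def train_step(x):
    h = np.maximum(0, W1 @ x + b1)
    mask = (np.random.rand(*h.shape) < p) / p  # drop units and rescale at train time
    return h * mask

def predict(x):
    return np.maximum(0, W1 @ x + b1)          # test time: no masking, no scaling
```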
Data augmentation
(color jitter)
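A sketch of a training-time augmentation pipeline with color jitter, assuming torchvision is available (the parameter values are illustrative):

```python
from torchvision import transforms

# Color jitter: randomly perturb brightness, contrast, and saturation
# of each training image, so the network sees slightly different colors every epoch.
train_transform = transforms.Compose([
    transforms.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4),
    transforms.ToTensor(),
])
```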