Regularization in Neural Networks


Adding regularization will often help to prevent overfitting (the high-variance problem).

1. Logistic regression

Recall the optimization objective used during training:

$$\min\limits_{w,b}J\left(w,b\right),\quad w\in\mathbb{R}^{n_x},\ b\in\mathbb{R} \tag{1-1}$$

where

$$J\left(w,b\right)=\frac{1}{m}\sum_{i=1}^{m}L\left(\hat y^{(i)},y^{(i)}\right) \tag{1-2}$$

$L_2$ regularization (most commonly used):

$$J\left(w,b\right)=\frac{1}{m}\sum_{i=1}^{m}L\left(\hat y^{(i)},y^{(i)}\right)+\frac{\lambda}{2m}\left\lVert w \right\rVert_2^2 \tag{1-3}$$

where

$$\left\lVert w \right\rVert_2^2=\sum_{j=1}^{n_x}w_j^2=w^Tw \tag{1-4}$$

Why do we regularize just the parameter w? Because w is usually a high-dimensional parameter vector while b is a scalar: almost all of the parameters are in w rather than b.
$L_1$ regularization:

$$J\left(w,b\right)=\frac{1}{m}\sum_{i=1}^{m}L\left(\hat y^{(i)},y^{(i)}\right)+\frac{\lambda}{m}\left\lVert w \right\rVert_1 \tag{1-5}$$

where

$$\left\lVert w \right\rVert_1=\sum_{j=1}^{n_x}\left\lvert w_j \right\rvert \tag{1-6}$$

With $L_1$ regularization, w will end up being sparse; in other words, the w vector will have a lot of zeros in it, which can help compress the model a little.
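As a concrete illustration, here is a minimal NumPy sketch of the regularized costs in (1-3) and (1-5). It is not from the original post; the function name, the `lambd` argument (named to avoid Python's reserved word `lambda`), and the `sigmoid` helper are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_cost(w, b, X, Y, lambd, norm="l2"):
    """Cross-entropy cost with an L2 penalty (1-3) or an L1 penalty (1-5) on w.
    X: (n_x, m) inputs, Y: (1, m) labels, w: (n_x, 1), b: scalar."""
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)                          # predictions y_hat, shape (1, m)
    cross_entropy = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    if norm == "l2":
        penalty = (lambd / (2 * m)) * np.sum(np.square(w))   # (lambda / 2m) * ||w||_2^2
    else:
        penalty = (lambd / m) * np.sum(np.abs(w))            # (lambda / m)  * ||w||_1
    return cross_entropy + penalty
```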

2. Neural network "Frobenius norm"

For a neural network with L layers, the regularized cost function is

$$J\left(w^{[1]},b^{[1]},\dots,w^{[L]},b^{[L]}\right)=\frac{1}{m}\sum_{i=1}^{m}L\left(\hat y^{(i)},y^{(i)}\right)+\frac{\lambda}{2m}\sum_{l=1}^{L}\left\lVert w^{[l]} \right\rVert_F^2 \tag{2-1}$$

where

$$\left\lVert w^{[l]} \right\rVert_F^2=\sum_{i=1}^{n^{[l-1]}}\sum_{j=1}^{n^{[l]}}\left(w_{ij}^{[l]}\right)^2 \tag{2-2}$$
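A minimal sketch of how the penalty term in (2-1)/(2-2) could be computed over all layers; the `parameters` dictionary layout and key names are assumptions, not from the post.

```python
import numpy as np

def frobenius_penalty(parameters, lambd, m, L):
    """Regularization term of (2-1): (lambda / 2m) * sum_l ||w[l]||_F^2, using (2-2)."""
    total = 0.0
    for l in range(1, L + 1):
        W = parameters["w" + str(l)]      # w[l], shape (n[l], n[l-1])
        total += np.sum(np.square(W))     # ||w[l]||_F^2, equation (2-2)
    return (lambd / (2 * m)) * total
```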

$L_2$ regularization is also called weight decay:

$$\begin{aligned} dw^{[l]}&=\left(\text{from backprop}\right)+\frac{\lambda}{m}w^{[l]}\\ w^{[l]}&:=w^{[l]}-\alpha\, dw^{[l]}\\ &=\left(1-\frac{\alpha\lambda}{m}\right)w^{[l]}-\alpha\left(\text{from backprop}\right) \end{aligned}\tag{2-3}$$

This keeps the weights $w$ from becoming too large, which helps prevent overfitting.
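A minimal sketch of the weight-decay update in (2-3); the function and variable names are assumptions.

```python
import numpy as np

def update_with_weight_decay(w, dw_from_backprop, alpha, lambd, m):
    """One gradient step with the L2 term added, as in (2-3)."""
    dw = dw_from_backprop + (lambd / m) * w   # regularization gradient added to the backprop gradient
    return w - alpha * dw                     # equals (1 - alpha*lambd/m) * w - alpha * dw_from_backprop

# Quick check that the two forms in (2-3) agree:
w = np.random.randn(4, 3)
dw_bp = np.random.randn(4, 3)
alpha, lambd, m = 0.1, 0.7, 100
assert np.allclose(update_with_weight_decay(w, dw_bp, alpha, lambd, m),
                   (1 - alpha * lambd / m) * w - alpha * dw_bp)
```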

3. Inverted dropout

For each training example, a different random subset of nodes can be dropped.
Inverted dropout (dropout is applied in both the forward and backward passes):

$$\begin{aligned} d^{[3]}&=np.random.rand(a^{[3]}.shape[0],\,a^{[3]}.shape[1]) < keep\_prob\\ a^{[3]}&=np.multiply(a^{[3]},d^{[3]})\quad \#\ \text{element-wise multiplication } a3*d3\\ a^{[3]}&=a^{[3]}\,/\,keep\_prob\quad \#\ \text{inverted dropout: keeps the expected value of }a^{[3]}\text{ unchanged}\\ z^{[4]}&=w^{[4]}a^{[3]}+b^{[4]} \end{aligned}\tag{3-1}$$

By dividing by keep_prob, the inverted dropout technique ensures that the expected value of a3 remains the same. This also makes test time easier, because there is no scaling problem left to correct: dropout is not applied at test time.
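A runnable sketch of the inverted-dropout step in (3-1), with a `train` flag to show that no dropout (and no rescaling) is applied at test time; the layer shapes, the keep_prob value, and the function name are assumptions.

```python
import numpy as np

np.random.seed(1)
keep_prob = 0.8                               # probability of keeping a unit

a3 = np.random.randn(5, 10)                   # activations of layer 3, shape (n3, m)
w4 = np.random.randn(2, 5)
b4 = np.zeros((2, 1))

def layer4_forward(a3, train=True):
    if train:
        d3 = np.random.rand(a3.shape[0], a3.shape[1]) < keep_prob  # dropout mask
        a3 = np.multiply(a3, d3)              # zero out ~(1 - keep_prob) of the units
        a3 /= keep_prob                       # inverted dropout: keep E[a3] unchanged
        # In the backward pass, da3 would be multiplied by the same mask d3
        # and divided by keep_prob as well.
    # At test time (train=False) no mask and no rescaling are needed.
    return np.dot(w4, a3) + b4                # z4

z4_train = layer4_forward(a3, train=True)
z4_test = layer4_forward(a3, train=False)
```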


Reposted from juejin.im/post/7109128137614721032