Learning representations by back-propagating errors

Other · 2018-09-11 19:42:17

This paper gives a clear explanation of how backpropagation works.

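The essence of backpropagation is the chain rule. It also exploits the structure of the neural network, so that computing the gradients for gradient descent looks like running the network in reverse.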

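Suppose we have a feed-forward neural network in which unit $j$ of layer $i$ has total input $X_{ij}$ and output $Y_{ij}$, where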

$Y_{ij}=\frac{1}{1+e^{-{X_{ij}}}}$

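$X_{ij}=\sum_{l = 1}^{n}{W_{lj}^{(i-1)}Y_{i-1,l}}$

Here $W_{lj}^{(i-1)}$ is the weight from unit $l$ of layer $i-1$ to unit $j$ of layer $i$, and $n$ is the number of units in layer $i-1$.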
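The two definitions above can be sketched as a forward pass through one layer. This is a minimal illustration, not the paper's code: the function names, the `W[l][j]` storage layout, and the weight values are all assumptions made for the example.

```python
import math

def sigmoid(x):
    """Logistic function Y = 1 / (1 + e^{-X})."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(W, y_prev):
    """Compute one layer of the network.

    W[l][j] is (by assumption) the weight from unit l of the previous
    layer to unit j of this layer; y_prev[l] is the previous layer's output.
    Returns (x, y): the total inputs and the sigmoid outputs of this layer.
    """
    n_out = len(W[0])
    x = [sum(W[l][j] * y_prev[l] for l in range(len(y_prev)))
         for j in range(n_out)]
    y = [sigmoid(v) for v in x]
    return x, y

# Hypothetical layer with 2 inputs and 2 units; the weights are made up.
W = [[0.0, 1.0],
     [0.0, -1.0]]
x, y = layer_forward(W, [1.0, 1.0])
```

With these made-up weights both units receive a total input of zero, so both outputs sit at the midpoint of the sigmoid.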

The final loss function is defined as

$L=\frac{1}{2}\sum_{i=1}^{m}{(y^{(i)}_{out}-d^{(i)})^2}$
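As a small sketch of this definition, the loss can be computed from a list of outputs and a list of desired values. The function name is hypothetical, and treating $i$ as an index over the output values being compared is an assumption about the notation.

```python
def squared_error_loss(outputs, targets):
    """L = 1/2 * sum_i (y_out_i - d_i)^2 over paired outputs and targets."""
    return 0.5 * sum((y - d) ** 2 for y, d in zip(outputs, targets))

# Made-up outputs and desired values.
loss = squared_error_loss([0.5, 1.0], [0.0, 1.0])
```

The factor of one half is convenient because it cancels when differentiating: the derivative of $L$ with respect to a single output is simply the difference between that output and its target.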

What gradient descent does is repeatedly adjust the values of W so that the final loss function is minimized.
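The update rule itself is short. The sketch below applies it to a toy one-parameter loss rather than to a network; the function name, the learning rate, and the step count are arbitrary choices made for illustration.

```python
def gradient_descent(grad, w, learning_rate=0.1, steps=100):
    """Repeatedly move w a small step against the gradient of the loss."""
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w

# Toy loss L(w) = 0.5 * (w - 3)^2, whose gradient is w - 3 and whose
# minimum is at w = 3.
w_final = gradient_descent(lambda w: w - 3.0, w=0.0)
```

For a network the same rule is applied to every weight at once, which is why the partial derivative of the loss with respect to each weight is needed.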

The paper notes that gradient descent is relatively easy when there is no hidden layer, but becomes considerably more complicated once hidden layers are present.

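To update W we need the partial derivative of the loss with respect to each weight. Applying the chain rule gives the first step of the decomposition: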

$\frac{\partial L}{\partial W_{ij}^{(k)}}=\frac{\partial L}{\partial Y_{k+1,j}}\frac{\partial Y_{k+1,j}}{\partial W_{ij}^{(k)}}$

For the output layer, the first factor of this decomposition follows directly from the definition of the loss function.

The second factor can be decomposed further with the chain rule:

$\frac{\partial Y_{k+1,j}}{\partial W_{ij}^{(k)}}=\frac{\partial Y_{k+1,j}}{\partial X_{k+1,j}}\frac{\partial X_{k+1,j}}{{\partial W_{ij}^{(k)}}}$

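The first factor in this step is the derivative of the sigmoid function, which is easy to compute: it equals $Y_{k+1,j}(1-Y_{k+1,j})$. The second factor turns out to be simply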

$\frac{\partial X_{k+1,j}}{{\partial W_{ij}^{(k)}}}=Y_{k,i}$
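To make the three chain-rule factors concrete, here is a sketch for a single sigmoid output unit that compares the analytic gradient with a finite-difference estimate. The function names and all numeric values are invented for illustration, and the unit is assumed to sit in the output layer so that the first factor comes straight from the loss.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def unit_loss(w, y_prev, d):
    """Squared-error loss of one sigmoid output unit with incoming weights w."""
    x = sum(wi * yi for wi, yi in zip(w, y_prev))
    return 0.5 * (sigmoid(x) - d) ** 2

def weight_gradient(w, y_prev, d):
    """dL/dw_i as the product of the three chain-rule factors."""
    x = sum(wi * yi for wi, yi in zip(w, y_prev))
    y = sigmoid(x)
    dL_dy = y - d            # from the loss definition
    dy_dx = y * (1.0 - y)    # derivative of the sigmoid
    return [dL_dy * dy_dx * yi for yi in y_prev]  # dx/dw_i = y_prev[i]

# Made-up weights, lower-layer outputs, and target.
w, y_prev, d = [0.2, -0.4], [0.6, 0.9], 1.0
analytic = weight_gradient(w, y_prev, d)

# Central finite differences as an independent check.
eps = 1e-6
numeric = []
for i in range(len(w)):
    w_plus, w_minus = list(w), list(w)
    w_plus[i] += eps
    w_minus[i] -= eps
    numeric.append(
        (unit_loss(w_plus, y_prev, d) - unit_loss(w_minus, y_prev, d)) / (2 * eps)
    )
```

The two lists should agree to many decimal places, which is a common sanity check for hand-derived gradients.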

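Applying the same decomposition recursively, layer by layer from the output back toward the input, yields the gradient for every weight, so all values of W can be updated.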
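The recursion can be sketched for a small network with one hidden layer. This is an illustrative sketch under stated assumptions, not the paper's implementation: there are no bias terms, `W[i][j]` is taken to be the weight from unit `i` of the lower layer to unit `j` of the upper layer, and the network shape and all numbers are made up. A finite-difference check is included to confirm the gradients.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(weights, inputs):
    """Return the outputs of every layer, starting with the inputs."""
    ys = [inputs]
    for W in weights:
        prev = ys[-1]
        xs = [sum(W[i][j] * prev[i] for i in range(len(prev)))
              for j in range(len(W[0]))]
        ys.append([sigmoid(x) for x in xs])
    return ys

def loss(weights, inputs, targets):
    out = forward(weights, inputs)[-1]
    return 0.5 * sum((y - d) ** 2 for y, d in zip(out, targets))

def backprop(weights, inputs, targets):
    """Return dL/dW for every layer, with the same shape as weights."""
    ys = forward(weights, inputs)
    # dL/dY for the output layer comes directly from the loss.
    dL_dy = [y - d for y, d in zip(ys[-1], targets)]
    grads = [None] * len(weights)
    for k in reversed(range(len(weights))):
        W, y_below, y_above = weights[k], ys[k], ys[k + 1]
        # dL/dX_j = dL/dY_j * Y_j * (1 - Y_j)
        dL_dx = [g * y * (1.0 - y) for g, y in zip(dL_dy, y_above)]
        # dL/dW_ij = dL/dX_j * Y_i (output of the lower layer)
        grads[k] = [[dL_dx[j] * y_below[i] for j in range(len(dL_dx))]
                    for i in range(len(y_below))]
        # Recursive step: dL/dY_i for the layer below.
        dL_dy = [sum(W[i][j] * dL_dx[j] for j in range(len(dL_dx)))
                 for i in range(len(y_below))]
    return grads

weights = [
    [[0.1, -0.2, 0.3], [0.4, 0.5, -0.6]],  # 2 inputs -> 3 hidden units
    [[0.7], [-0.8], [0.9]],                # 3 hidden units -> 1 output
]
inputs, targets = [1.0, 0.5], [0.0]
grads = backprop(weights, inputs, targets)

def numeric_grad(k, i, j, eps=1e-6):
    """Central finite-difference estimate of dL/dW[k][i][j]."""
    original = weights[k][i][j]
    weights[k][i][j] = original + eps
    up = loss(weights, inputs, targets)
    weights[k][i][j] = original - eps
    down = loss(weights, inputs, targets)
    weights[k][i][j] = original
    return (up - down) / (2 * eps)

max_error = max(
    abs(grads[k][i][j] - numeric_grad(k, i, j))
    for k in range(len(weights))
    for i in range(len(weights[k]))
    for j in range(len(weights[k][i]))
)
```

The backward loop mirrors the forward loop with the layers visited in reverse order, which is the sense in which backpropagation resembles running the network backwards.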

Reposted from blog.csdn.net/holmosaint/article/details/82193134
