Soft Attention and Hard Attention

Category: Other, 2018-08-11 02:22:40

The following is excerpted from: https://zhuanlan.zhihu.com/p/31547842

2.1 Soft Attention and Hard Attention

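Encoding the input X into a single fixed-length representation gives every word in the sentence the same weight. This is unreasonable: with equal weights the model cannot distinguish important words from unimportant ones, and its performance often degrades as a result.
The Attention Mechanism assigns different weights to different parts of the input X, which achieves a soft form of discrimination between them.
In 2015, Kelvin Xu et al. published the paper "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention", which introduced Attention into Image Caption. When generating the i-th word of the image description, Attention is used to associate that word with the relevant regions of the image.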
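
As a minimal sketch of the contrast described above, the following Python compares an equal-weight average of per-word vectors with a weighted sum whose weights come from a softmax over relevance scores. The vectors, the scores, and the function names are illustrative assumptions and do not come from the original article or the paper.

```python
import math

def softmax(scores):
    """Turn arbitrary real scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sum(weights, vectors):
    """Combine equal-length vectors using one weight per vector."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]

# Hypothetical encoder outputs for a 3-word input (2 dimensions each).
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Equal weights: every word contributes the same amount.
uniform = [1.0 / len(hidden)] * len(hidden)
context_uniform = weighted_sum(uniform, hidden)

# Attention: hypothetical relevance scores favor the second word.
scores = [0.1, 2.0, 0.3]
alpha = softmax(scores)
context_attn = weighted_sum(alpha, hidden)
```

In a real model the scores would be produced by a learned function of the decoder state and each encoder output; here they are fixed by hand only to show how the weights change the result.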

Soft Attention:

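The conventional Attention Mechanism is Soft Attention. Soft Attention is parameterized and therefore differentiable, so it can be embedded in a model and trained directly. Gradients pass through the Attention Mechanism module and back-propagate to the rest of the model.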
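
To illustrate why Soft Attention is differentiable, the sketch below computes a soft context value c = sum_i alpha_i * v_i over scalar values, derives its gradient with respect to the scores analytically (dc/ds_j = alpha_j * (v_j - c)), and checks the result against finite differences. The scalar setup, the numbers, and the function names are assumptions made for illustration only.

```python
import math

def softmax(scores):
    """Normalize scores into attention weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_context(scores, values):
    """Soft attention over scalar values: c = sum_i alpha_i * v_i."""
    alpha = softmax(scores)
    return sum(a * v for a, v in zip(alpha, values))

def soft_context_grad(scores, values):
    """Analytic gradient of c with respect to each score s_j."""
    alpha = softmax(scores)
    c = sum(a * v for a, v in zip(alpha, values))
    return [a * (v - c) for a, v in zip(alpha, values)]

def numeric_grad(scores, values, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    grads = []
    for j in range(len(scores)):
        up = list(scores)
        up[j] += eps
        down = list(scores)
        down[j] -= eps
        diff = soft_context(up, values) - soft_context(down, values)
        grads.append(diff / (2 * eps))
    return grads

scores = [0.5, -1.0, 2.0]   # hypothetical attention scores
values = [1.0, 3.0, -2.0]   # hypothetical scalar encoder outputs
analytic = soft_context_grad(scores, values)
numeric = numeric_grad(scores, values)
```

Because every step is a smooth function of the scores, a deep learning framework can compute this gradient automatically, which is what allows the attention module to be trained end to end with the rest of the model.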

Hard Attention:

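By contrast, Hard Attention is a stochastic process. Instead of taking the entire encoder output as its input, Hard Attention samples a subset of the input-side hidden states with probability Si and computes on that subset only. Because sampling is not differentiable, the gradient of the module has to be estimated with Monte Carlo sampling in order to back-propagate.
Each of the two Attention Mechanisms has its own advantages, but most current research and applications still favor Soft Attention, because it can be differentiated directly and trained with ordinary gradient back-propagation.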
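
The sampling step and the Monte Carlo gradient estimate described for Hard Attention can be sketched as follows. This example assumes a score-function (REINFORCE-style) estimator, which is one common choice and is not spelled out in the text above. It samples a single position per draw, averages v_i * d log(alpha_i) / d s_j over many draws, and compares the result with the exact gradient of the expected output. All names and numbers are hypothetical.

```python
import math
import random

def softmax(scores):
    """Normalize scores into sampling probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sample_hard_attention(alpha, rng):
    """Pick one position index i with probability alpha[i]."""
    return rng.choices(range(len(alpha)), weights=alpha, k=1)[0]

def mc_gradient(scores, values, num_samples, rng):
    """Monte Carlo estimate of d E[v_i] / d s_j.

    For a sampled index i, d log(alpha_i) / d s_j = 1[i == j] - alpha_j,
    so each draw contributes v_i * (1[i == j] - alpha_j).
    """
    alpha = softmax(scores)
    n = len(scores)
    grad = [0.0] * n
    for _ in range(num_samples):
        i = sample_hard_attention(alpha, rng)
        for j in range(n):
            indicator = 1.0 if i == j else 0.0
            grad[j] += values[i] * (indicator - alpha[j])
    return [g / num_samples for g in grad]

rng = random.Random(0)      # fixed seed so the run is repeatable
scores = [0.5, -1.0, 2.0]   # hypothetical attention scores
values = [1.0, 3.0, -2.0]   # hypothetical scalar encoder outputs

alpha = softmax(scores)
estimate = mc_gradient(scores, values, 20000, rng)

# Exact gradient of the expected output, for comparison only.
expected_value = sum(a * v for a, v in zip(alpha, values))
exact = [a * (v - expected_value) for a, v in zip(alpha, values)]
```

The estimate is noisy and only approaches the exact gradient as the number of samples grows, which is one practical reason Soft Attention is often preferred.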

More on attention:

2.2 local / global attention
2.3 Self Attention

There is also a comparatively more theoretical article: https://www.cnblogs.com/taojake-ML/p/6113459.html

Reposted from blog.csdn.net/ccbrid/article/details/79730645