[Deep Learning] A Structured Self-attentive Sentence Embedding

A Structured Self-attentive Sentence Embedding [arXiv:1703.03130]

TLDR; The authors present a new sentence embedding method in which each embedding is a 2-D matrix rather than a single vector. This is done by (1) running the sentence through a bidirectional RNN, (2) learning multiple attention vectors over the RNN states, and (3) encouraging the attention vectors to focus on different parts of the sentence via a penalty term. The model is evaluated on several classification tasks and outperforms other methods.

Key Points

  • The sentence encoder is a bidirectional RNN; each state is the concatenation of the corresponding forward and backward states
  • Attention values are calculated over the encoder states as A = softmax(U * tanh(W * H^T)), where H is the matrix of encoder states. The number of rows in U determines how many attention vectors (r) are learned; see the sketch after this list.
    • This is just basic attention, nothing special here, except that it’s calculated multiple times with independently learned weights
  • A penalization term encourages the rows of the attention matrix to be different from one another:
    • P = ||A*A^T - I||_F^2 (squared Frobenius norm)
  • Learned sentence embeddings can easily be visualized through attention scores
  • The authors empirically show that the penalization term helps and that multiple attention vectors are important
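
To make this concrete, here is a minimal PyTorch sketch of the mechanism: a bidirectional LSTM encoder, the multi-row attention A = softmax(U * tanh(W * H^T)), the matrix embedding M = A*H, and the penalty P = ||A*A^T - I||_F^2. The class name and layer sizes are illustrative assumptions (d_a and r follow the paper's notation), not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StructuredSelfAttention(nn.Module):
    """Sketch of the structured self-attentive sentence encoder.
    d_a and r follow the paper's notation; all sizes are illustrative."""

    def __init__(self, vocab_size, embed_dim=100, hidden_dim=150, d_a=350, r=30):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Bidirectional LSTM: each state is the concatenation of the
        # forward and backward states, so H has 2*hidden_dim features.
        self.rnn = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.W = nn.Linear(2 * hidden_dim, d_a, bias=False)  # W in the formula above
        self.U = nn.Linear(d_a, r, bias=False)               # U; its r rows give r attention vectors

    def forward(self, tokens):                    # tokens: (batch, n) word indices
        H, _ = self.rnn(self.embed(tokens))       # H: (batch, n, 2*hidden_dim)
        # A = softmax(U * tanh(W * H^T)); softmax runs over the n timesteps,
        # so each of the r attention vectors sums to 1 over the sentence.
        A = F.softmax(self.U(torch.tanh(self.W(H))), dim=1).transpose(1, 2)  # (batch, r, n)
        M = A @ H                                 # (batch, r, 2*hidden_dim): the 2-D embedding
        return M, A

def penalty(A):
    """P = ||A*A^T - I||_F^2, pushing the attention rows apart."""
    I = torch.eye(A.size(1), device=A.device)
    diff = A @ A.transpose(1, 2) - I              # (batch, r, r)
    return (diff ** 2).sum(dim=(1, 2)).mean()     # squared Frobenius norm, batch mean
```

During training, penalty(A) would be added to the downstream task loss, weighted by a tunable coefficient. Because each row of A sums to one over the tokens, a single sentence’s A can also be rendered directly as a heatmap, which is how the attention visualizations are produced.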

Thoughts

  • The term “multi-hop attention” is misleading here. There are no multiple hops of attention, just a single hop with multiple attention calculations that are independent of one another. Multiple hops would correspond to taking attention over attention, etc.
  • Nice analysis of the hyperparameter choices for the mechanism
  • I was surprised that the authors subsample the data for some of the datasets; I wonder what the reason for that is.
  • I think this technique can easily be extended to other tasks that use attention. E.g., it would be interesting to apply it to NMT.
  • Overall I think this is a simple but cool technique

Reposted from blog.csdn.net/weixin_40400177/article/details/103605572