An Outline of the TensorFlow NMT Source Code Structure

Category: Other · 2018-06-22 12:14:15

nmt.py
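main() -> run_main(train_fn, inference_fn)
Here train_fn is train() from train.py.
Inside run_main, the flags.inference_input_file argument decides whether the training path or the inference path runs.
For inference, it takes the latest checkpoint and calls inference_fn.
For training, it calls train() from train.py.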
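
The dispatch described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the actual nmt.py code: the Flags class, the find_latest_checkpoint helper, the checkpoint file name, and the simplified function signatures are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Flags:
    """Hypothetical stand-in for the parsed command-line flags."""
    out_dir: str
    inference_input_file: Optional[str] = None


def find_latest_checkpoint(out_dir: str) -> str:
    """Hypothetical helper: pretend the newest checkpoint has this name."""
    return out_dir + "/translate.ckpt-latest"


def run_main(flags: Flags,
             train_fn: Callable[[Flags], str],
             inference_fn: Callable[[str, str], str]) -> str:
    """Run inference if an inference input file is given, otherwise train."""
    if flags.inference_input_file:
        ckpt = find_latest_checkpoint(flags.out_dir)
        return inference_fn(ckpt, flags.inference_input_file)
    return train_fn(flags)


def demo_train(flags: Flags) -> str:
    """Toy train_fn that only reports which path was taken."""
    return "train:" + flags.out_dir


def demo_infer(ckpt: str, input_file: str) -> str:
    """Toy inference_fn that only reports which path was taken."""
    return "infer:" + ckpt + ":" + input_file
```

Calling run_main(Flags("/tmp/model"), demo_train, demo_infer) takes the training path; setting inference_input_file on the flags switches the same call to the inference path.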

train.py
train()
Parses the hyperparameters and picks a model_creator depending on the attention setting.
Uses model_helper and the model_creator to build train_model, eval_model, and infer_model.
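
As a rough sketch of this step, the snippet below picks a model class and then instantiates it once per mode. The placeholder classes, the boolean use_attention argument, and the build_models helper are assumptions for illustration; the real train() works with hyperparameter objects and builds each model in its own graph via model_helper.

```python
from typing import Tuple, Type


class Model:
    """Placeholder for the plain (no-attention) seq2seq model class."""

    def __init__(self, mode: str):
        self.mode = mode


class AttentionModel(Model):
    """Placeholder for the attention-based model class."""


def choose_model_creator(use_attention: bool) -> Type[Model]:
    """Return the class that will later be instantiated for each mode."""
    return AttentionModel if use_attention else Model


def build_models(model_creator: Type[Model]) -> Tuple[Model, ...]:
    """Build one model per mode with the same creator (simplified)."""
    return tuple(model_creator(mode) for mode in ("train", "eval", "infer"))


train_model, eval_model, infer_model = build_models(choose_model_creator(True))
```

Passing the class itself around, rather than an instance, is what lets the same creator be reused for the three separate models.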

model_helper.py
create_train_model
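Builds the file iterators from the input parameters and the training data.
Selects the model to build: usually AttentionModel, which inherits from Model and overrides the _build_decoder_cell method.

The override mainly adds an attention_mechanism and wraps the LSTM cell with tf.contrib.seq2seq.AttentionWrapper; the cell itself comes from create_rnn_cell() in model_helper.py.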

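create_rnn_cell() -> _cell_list() {
Based on num_layers, calls single_cell_fn once per layer to create each cell; the single-cell function used is _single_cell in model_helper.py.

Inside _single_cell:
(1) Depending on the parameters, a different cell type is created; these include BasicLSTMCell, GRUCell, and LayerNormBasicLSTMCell.
(2) DropoutWrapper adds dropout to each layer.
(3) A ResidualWrapper is added if the parameters ask for it.
(4) A DeviceWrapper is added.
}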
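
The wrapping order in steps (1) to (4) can be made concrete with a TensorFlow-free sketch that builds a string describing each layer's cell instead of a real cell. The unit_type keys, the rule that only the top num_residual_layers layers get a residual wrapper, and the round-robin device assignment are assumptions for illustration, not a transcription of model_helper.py.

```python
from typing import List


def single_cell(unit_type: str, dropout: float, residual: bool,
                device: str) -> str:
    """Describe one layer's cell as a nested wrapper expression."""
    cell_types = {
        "lstm": "BasicLSTMCell",
        "gru": "GRUCell",
        "layer_norm_lstm": "LayerNormBasicLSTMCell",
    }
    if unit_type not in cell_types:
        raise ValueError("Unknown unit type: " + unit_type)
    cell = cell_types[unit_type]                   # (1) base cell
    if dropout > 0.0:
        cell = "DropoutWrapper(" + cell + ")"      # (2) dropout
    if residual:
        cell = "ResidualWrapper(" + cell + ")"     # (3) residual connection
    return "DeviceWrapper(" + cell + ", " + device + ")"  # (4) device


def cell_list(unit_type: str, num_layers: int, num_residual_layers: int,
              dropout: float, devices: List[str]) -> List[str]:
    """Create one cell description per layer."""
    cells = []
    for i in range(num_layers):
        # Assumption: only the top num_residual_layers layers are residual.
        residual = i >= num_layers - num_residual_layers
        # Assumption: layers are spread over the devices round-robin.
        device = devices[i % len(devices)]
        cells.append(single_cell(unit_type, dropout, residual, device))
    return cells


for description in cell_list("lstm", 2, 1, 0.2, ["/gpu:0"]):
    print(description)
```

The innermost name is the base cell and each enclosing wrapper corresponds to one of the later steps, so the printed strings read from the inside out in the same order as the list above.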

Main methods of Model:
_build_encoder()
_build_bidirectional_rnn()
_build_decoder_cell()
Model inherits from BaseModel. In BaseModel,
the __init__ method is the actual model builder.
It consists mainly of the following steps:
initialization,
embedding,
projection (output_layer, a Dense layer),
build the train graph: build encoder, build decoder, compute loss (the encoder and decoder builds are implemented by the individual subclasses).
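
The class relationship above follows a template-method pattern: the base constructor fixes the build order and subclasses fill in the encoder and decoder cell. The sketch below records the order of the steps instead of building a real graph. Only _build_encoder and _build_decoder_cell are names taken from the outline; the other helper names (_init_embeddings, _build_output_layer, _build_graph, _build_decoder) and the string placeholders are assumptions for illustration.

```python
class BaseModel:
    """Sketch of a base class whose constructor drives graph construction."""

    def __init__(self):
        self.steps = ["init"]          # records the build order
        self._init_embeddings()
        self._build_output_layer()
        self._build_graph()

    def _init_embeddings(self):
        self.steps.append("embedding")

    def _build_output_layer(self):
        self.steps.append("projection")

    def _build_graph(self):
        self._build_encoder()
        self._build_decoder()
        self.steps.append("loss")

    def _build_encoder(self):
        raise NotImplementedError("subclasses provide the encoder")

    def _build_decoder(self):
        self.steps.append("decoder(" + self._build_decoder_cell() + ")")

    def _build_decoder_cell(self):
        raise NotImplementedError("subclasses provide the decoder cell")


class Model(BaseModel):
    """Plain model: supplies the encoder and a basic decoder cell."""

    def _build_encoder(self):
        self.steps.append("encoder")

    def _build_decoder_cell(self):
        return "rnn_cell"


class AttentionModel(Model):
    """Overrides only the decoder cell, wrapping it with attention."""

    def _build_decoder_cell(self):
        return "AttentionWrapper(" + super()._build_decoder_cell() + ")"


print(AttentionModel().steps)
```

Because AttentionModel changes only _build_decoder_cell, everything else in the build sequence is inherited unchanged from Model and BaseModel.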

Once train.py has finished building the models, it starts the train loop.

Reposted from blog.csdn.net/xiewenbo/article/details/80586644
