T2T Transformer Notes

Copyright notice: this is the blogger's original article and may not be reproduced without permission. https://blog.csdn.net/hellonlp/article/details/78753205


Discussion:

https://www.jiqizhixin.com/articles/2017-06-28-5

https://ricardokleinklein.github.io/2017/11/16/Attention-is-all-you-need.html


1. Differences between the Multi-GPU and Single-GPU configurations

https://github.com/tensorflow/tensor2tensor/issues/124

https://github.com/tensorflow/tensor2tensor/issues/17


2. Multi-GPU runs the same number of steps more slowly than a single GPU

https://github.com/tensorflow/tensor2tensor/issues/146

https://github.com/tensorflow/tensor2tensor/issues/390


3. The batch_size parameter

https://github.com/tensorflow/tensor2tensor/issues/17#issuecomment-310268149

https://github.com/tensorflow/tensor2tensor/issues/415#issue-273498229


4. Data processing

4.1)https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/bin/t2t-datagen

generate_data_for_problem(problem) 


4.2)https://github.com/tensorflow/tensor2tensor/blob/92983eaaa457ec18729b1883ba5ae4a6614bdcb5/tensor2tensor/data_generators/generator_utils.py

generate_files(generator, output_filenames, max_cases=None)

"""Generate cases from a generator and save as TFRecord files.

Generated cases are transformed to tf.Example protos and saved as TFRecords
in sharded files named output_dir/output_name-00..N-of-00..M=num_shards.

Args:
  generator: a generator yielding (string -> int/float/str list) dictionaries.
  output_filenames: List of output file paths.
  max_cases: maximum number of cases to get from the generator; if None
    (default), we use the generator until StopIteration is raised.
"""


Note:

writers[shard].write(sequence_example.SerializeToString()) serializes each example when writing the dataset.
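The sharded-output behavior described in the docstring can be sketched in plain Python. This is a simplified stand-in for what generate_files does, not t2t's actual code: the shard-naming helper and the round-robin distribution loop below are illustrative (the real function serializes each case as a tf.Example and writes it to a TFRecord writer).

```python
def sharded_names(output_name, num_shards):
    # Mimic the output_name-00000-of-00003 style naming scheme.
    return ["%s-%05d-of-%05d" % (output_name, i, num_shards)
            for i in range(num_shards)]

def distribute_cases(generator, num_shards, max_cases=None):
    # Round-robin cases across shards, stopping at max_cases if given,
    # analogous to how generate_files cycles through its writers.
    shards = [[] for _ in range(num_shards)]
    for counter, case in enumerate(generator):
        if max_cases is not None and counter >= max_cases:
            break
        shards[counter % num_shards].append(case)
    return shards
```

For example, distributing three cases over two shards puts two cases in shard 0 and one in shard 1.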


4.3)

https://github.com/tensorflow/tensor2tensor/blob/92983eaaa457ec18729b1883ba5ae4a6614bdcb5/tensor2tensor/data_generators/generator_utils.py

get_or_generate_vocab(data_dir, tmp_dir, vocab_filename, vocab_size, sources)


get_or_generate_vocab_inner(data_dir, vocab_filename, vocab_size, generator)

"""Inner implementation for vocab generators. 


Args: 

data_dir: The base directory where data and vocab files are stored. If None, then do not save the vocab even if it doesn't exist. 

vocab_filename: relative filename where vocab file is stored 

vocab_size: target size of the vocabulary constructed by SubwordTextEncoder 

generator: a generator that produces tokens from the vocabulary 


Returns: A SubwordTextEncoder vocabulary object. """

vocab = text_encoder.SubwordTextEncoder.build_to_target_size( vocab_size, token_counts, 1, 1e3)
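Before build_to_target_size is called, token counts are tallied from the source corpora. A minimal sketch of that counting step (whitespace tokenization here is a simplification; t2t runs its own tokenizer over the files listed in `sources` before feeding counts to SubwordTextEncoder):

```python
from collections import Counter

def count_tokens(lines):
    # Tally how often each token appears across the corpus lines.
    # The resulting mapping plays the role of `token_counts` above.
    token_counts = Counter()
    for line in lines:
        token_counts.update(line.split())
    return token_counts
```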


4.4)https://github.com/tensorflow/tensor2tensor/blob/e3cd447aa605515753ebfc3dbf1a4d4c5ae32425/tensor2tensor/data_generators/text_encoder.py

build_to_target_size(cls, target_size, token_counts, min_val, max_val, num_iterations=4)

"""Builds a SubwordTextEncoder that has `vocab_size` near `target_size`. 


Uses simple recursive binary search to find a minimum token count that most closely matches the `target_size`. 


Args:
  target_size: Desired vocab_size to approximate.
  token_counts: A dictionary of token counts, mapping string to int.
  min_val: An integer; lower bound for the minimum token count.
  max_val: An integer; upper bound for the minimum token count.
  num_iterations: An integer; how many iterations of refinement.

Returns:
  A SubwordTextEncoder instance.

Raises:
  ValueError: If `min_val` is greater than `max_val`. """


An important concept: the minimum token count.

"""Bisection to find the right size."""

# We build iteratively. On each iteration, we segment all the words,
# then count the resulting potential subtokens, keeping the ones
# with high enough counts for our new vocabulary.
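The bisection over the minimum token count can be sketched as follows. This is a toy stand-in, not t2t's implementation: `vocab_size_for` is a hypothetical function that reports the vocabulary size produced at a given count threshold (in the real code, a SubwordTextEncoder is actually built at each candidate threshold). The key fact it relies on: raising the minimum count drops more subtokens, so vocab size shrinks as the threshold grows, which is what makes binary search applicable.

```python
def search_min_count(target_size, vocab_size_for, min_val, max_val,
                     num_iterations=4):
    # Binary-search for the minimum token count whose resulting
    # vocabulary size is closest to target_size.
    if min_val > max_val:
        raise ValueError("min_val must not exceed max_val")
    best = None
    for _ in range(num_iterations):
        mid = (min_val + max_val) // 2
        size = vocab_size_for(mid)
        if best is None or abs(size - target_size) < abs(best[1] - target_size):
            best = (mid, size)
        if size > target_size:
            min_val = mid + 1  # vocab too big: raise the threshold
        else:
            max_val = mid - 1  # vocab too small: lower the threshold
        if min_val > max_val:
            break
    return best[0]
```

With only a few iterations (t2t defaults to 4), the search settles on a threshold whose vocab size is merely near the target, which is why the docstring says "near `target_size`" rather than exactly equal.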


5. Training

https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/layers/common_hparams.py


6. Server

6.1 https://research.googleblog.com/2017/11/latest-innovations-in-tensorflow-serving.html

6.2 https://towardsdatascience.com/how-to-deploy-machine-learning-models-with-tensorflow-part-1-make-your-model-ready-for-serving-776a14ec3198

6.3 http://blog.csdn.net/wangjian1204/article/details/68928656

6.4  https://weiminwang.blog/2017/09/12/introductory-guide-to-tensorflow-serving/

6.5 https://github.com/tensorflow/tensor2tensor/issues/368

6.6 https://github.com/tensorflow/tensor2tensor/issues/349


1) Crash when running the big model
tensorflow.python.framework.errors_impl.InvalidArgumentError: Number of ways to split should evenly divide the split dimension, but got split_dim 0 (size = 4) and num_split 3

Caused by op u'transformer/split', defined at: ...

Reference: https://github.com/tensorflow/tensor2tensor/issues/266

Simply reducing the batch size resolved the problem.
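The error arises because data-parallel training splits each batch across the GPUs, so the size of the split dimension must be evenly divisible by the number of GPUs (here, 4 examples could not be split 3 ways). A quick sanity check one could run before training, mirroring tf.split's requirement (illustrative plain Python, not t2t code):

```python
def check_split(batch_dim, num_gpus):
    # tf.split requires the split dimension to be evenly divisible
    # by the number of pieces; raise the same kind of complaint early.
    if batch_dim % num_gpus != 0:
        raise ValueError(
            "Number of ways to split should evenly divide the split "
            "dimension, but got split_dim size %d and num_split %d"
            % (batch_dim, num_gpus))
    return batch_dim // num_gpus
```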

2) File naming

With the name newsdev2017-zhen-src.pre.bpe.zh, t2t assumes the file is a tar archive and fails with: tarfile.ReadError: file could not be opened successfully
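That error message comes from Python's standard tarfile module, which raises tarfile.ReadError when asked to open a file that is not actually a tar archive. A quick demonstration (writing a throwaway text file to a temp directory; the filename here is just an example):

```python
import os
import tarfile
import tempfile

def try_open_as_tar(path):
    # tarfile.open raises tarfile.ReadError on non-tar files --
    # the same error t2t surfaces when it misclassifies a data file.
    try:
        tarfile.open(path)
        return "ok"
    except tarfile.ReadError:
        return "not a tar file"

tmp = os.path.join(tempfile.mkdtemp(), "sample-src.pre.bpe.zh")
with open(tmp, "w") as f:
    f.write("just plain text\n")
```

Renaming the file so t2t's download/extract logic no longer treats it as an archive avoids the crash.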


