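GitHub repository: https://github.com/tensorflow/models/tree/master/research/im2txt
Use case: given an image, generate a text description of its content, as in the figure below:
Environment: Ubuntu. Make sure the current user has permission to write to the working directory.
1. Download the source: git clone https://github.com/tensorflow/models.git
Enter the im2txt subproject: cd models/research/im2txt
2. Install the dependencies: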
First ensure that you have installed the following required packages:
- Bazel
- Python 2.7
- TensorFlow 1.0 or greater
- NumPy
- Natural Language Toolkit (NLTK):
  - First install NLTK
  - Then install the NLTK data package "punkt"
- Unzip
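Before building, it can help to confirm that the command-line tools are actually on the PATH. The following is only a minimal sketch, assuming a Python 3 interpreter is available to run the helper itself. The function name `missing_tools` and the tool list (collected from the commands used in this walkthrough) are illustrative and not part of im2txt.

```python
import shutil

# Executables invoked somewhere in this walkthrough; adjust to your setup.
REQUIRED_TOOLS = ["git", "bazel", "python2", "pip2", "wget", "unzip"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required command-line tools were found.")
```

This check only covers executables. Python packages such as TensorFlow and NumPy still have to be importable from the Python 2.7 environment.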
1) Bazel: I installed it with the "Installing using binary installer" method.
Reference: https://docs.bazel.build/versions/master/install-ubuntu.html
2) NLTK: sudo pip2 install -U nltk
To install the data package, start python2 and run:
>>> import nltk
>>> nltk.download("punkt")
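To confirm that the "punkt" data really landed where NLTK can find it, a small check like the one below can be run in the same interpreter. This is a sketch, not part of im2txt: the helper name `punkt_available` is made up here, and it assumes the resource is registered under the usual `tokenizers/punkt` path.

```python
def punkt_available():
    """Return True if NLTK is importable and can locate the punkt tokenizer data."""
    try:
        import nltk
    except ImportError:
        # NLTK itself is not installed for this interpreter.
        return False
    try:
        # nltk.data.find raises LookupError when the resource is absent.
        nltk.data.find("tokenizers/punkt")
        return True
    except LookupError:
        return False


if __name__ == "__main__":
    if punkt_available():
        print("punkt is installed")
    else:
        print("punkt is missing; run nltk.download('punkt')")
```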
3. Prepare the training data
3.1 MSCOCO dataset
bazel build //im2txt:download_and_preprocess_mscoco
bazel-bin/im2txt/download_and_preprocess_mscoco "im2txt/data/mscoco"
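The preprocessing run is long, so after it finishes it is worth checking that no training shard is missing. The sketch below assumes the shard naming implied by the training flag used later (`train-?????-of-00256`, with zero-based indices). The helper name `missing_train_shards` is hypothetical.

```python
import glob
import os
import re


def missing_train_shards(data_dir, num_shards=256):
    """Return the shard indices that have no train-XXXXX-of-NNNNN file in data_dir."""
    pattern = os.path.join(data_dir, "train-?????-of-%05d" % num_shards)
    found = set()
    for path in glob.glob(pattern):
        match = re.match(r"train-(\d{5})-of-\d{5}$", os.path.basename(path))
        if match:
            found.add(int(match.group(1)))
    return [index for index in range(num_shards) if index not in found]


if __name__ == "__main__":
    missing = missing_train_shards("im2txt/data/mscoco")
    print("%d of 256 training shards are missing" % len(missing))
```

An empty result only means the files exist. It does not verify their contents.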
When preparation succeeds, the output looks like the figure below:
3.2 Pretrained CNN model: Inception v3 checkpoint
wget "http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz"
tar -xvf "inception_v3_2016_08_28.tar.gz" -C im2txt/data
rm "inception_v3_2016_08_28.tar.gz"
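Since the archive is deleted right after extraction, a quick sanity check that the checkpoint ended up at the path the training step expects can save a failed run later. This is a minimal sketch. The helper name `checkpoint_size_mb` is illustrative, and the path is simply the one passed to `--inception_checkpoint_file` in the training command.

```python
import os

# Path passed to --inception_checkpoint_file in the training step.
CHECKPOINT_PATH = "im2txt/data/inception_v3.ckpt"


def checkpoint_size_mb(path=CHECKPOINT_PATH):
    """Return the file size in MB, or None if the file does not exist."""
    if not os.path.isfile(path):
        return None
    return os.path.getsize(path) / (1024.0 * 1024.0)


if __name__ == "__main__":
    size = checkpoint_size_mb()
    if size is None:
        print("Missing: " + CHECKPOINT_PATH)
    else:
        print("Found %s (%.1f MB)" % (CHECKPOINT_PATH, size))
```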
4. Train the model
bazel build -c opt //im2txt/...
nohup bazel-bin/im2txt/train \
--input_file_pattern="im2txt/data/mscoco/train-?????-of-00256" \
--inception_checkpoint_file="im2txt/data/inception_v3.ckpt" \
--train_dir="im2txt/model/train" \
--train_inception=false \
--number_of_steps=1000000 &
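Because the command above runs in the background under nohup, progress is not visible in the terminal. One rough way to see how far training has come is to read the `checkpoint` index file that TensorFlow normally writes into the training directory. The sketch below assumes that file's usual format (a `model_checkpoint_path: "model.ckpt-<step>"` line); the helper name `latest_step` is made up for illustration.

```python
import os
import re


def latest_step(train_dir):
    """Return the global step of the newest checkpoint in train_dir, or None."""
    index_file = os.path.join(train_dir, "checkpoint")
    if not os.path.isfile(index_file):
        return None
    with open(index_file) as handle:
        content = handle.read()
    # The first field names the most recent checkpoint, e.g. "model.ckpt-12345".
    match = re.search(r'^model_checkpoint_path:\s*"(.+)"', content, re.MULTILINE)
    if not match:
        return None
    step = re.search(r"-(\d+)$", match.group(1))
    return int(step.group(1)) if step else None


if __name__ == "__main__":
    step = latest_step("im2txt/model/train")
    if step is None:
        print("No checkpoint found yet")
    else:
        print("Latest checkpoint is at step %d" % step)
```

Checkpoints are written periodically, so the reported step lags behind the step the training process is currently on.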
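I did not run the evaluation script or the "Fine Tune the Inception v3 Model" step; if you need them, follow the official README. The completed training run is shown in the figure below.
5. Generate captions from an image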
bazel build -c opt //im2txt:run_inference
bazel-bin/im2txt/run_inference \
--checkpoint_path=im2txt/model/train \
--vocab_file=im2txt/data/mscoco/word_counts.txt \
--input_files=im2txt/data/mscoco/raw-data/val2014/COCO_val2014_000000051008.jpg
Result:
Captions for image COCO_val2014_000000051008.jpg:
0) a cat laying on top of a laptop computer . (p=0.010717)
1) a cat laying on top of a laptop keyboard . (p=0.001352)
2) a cat laying on top of a laptop on a bed . (p=0.000764)
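If the captions need to be used by another program, the printed output can be parsed into structured records. The following is a sketch that assumes the output keeps the line format shown above (`<rank>) <caption> (p=<probability>)`). The names `CAPTION_LINE`, `SAMPLE_OUTPUT`, and `parse_captions` are hypothetical helpers, not part of im2txt.

```python
import re

# One caption line: rank, caption text, then the probability in parentheses.
CAPTION_LINE = re.compile(r"^\s*(\d+)\)\s+(.*?)\s+\(p=([0-9.eE+-]+)\)\s*$")

# Output copied from the run above.
SAMPLE_OUTPUT = """Captions for image COCO_val2014_000000051008.jpg:
0) a cat laying on top of a laptop computer . (p=0.010717)
1) a cat laying on top of a laptop keyboard . (p=0.001352)
2) a cat laying on top of a laptop on a bed . (p=0.000764)
"""


def parse_captions(text):
    """Return a list of (rank, caption, probability) tuples found in text."""
    captions = []
    for line in text.splitlines():
        match = CAPTION_LINE.match(line)
        if match:
            rank, caption, prob = match.groups()
            captions.append((int(rank), caption, float(prob)))
    return captions


if __name__ == "__main__":
    best = max(parse_captions(SAMPLE_OUTPUT), key=lambda item: item[2])
    print("Most likely caption: " + best[1])
```

Lines that do not match the pattern, such as the "Captions for image" header, are skipped.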