Google开源库Image Captioning部署记录

版权声明:本文为博主原创文章,未经博主允许不得转载。 https://blog.csdn.net/fjssharpsword/article/details/85244169

Github库:https://github.com/tensorflow/models/tree/master/research/im2txt

场景:给一张图片,描述图片内容。如下图:

环境:ubuntu,记住用户必须对目录有权限。

1、下载开源包:git clone https://github.com/tensorflow/models.git   

      进入Im2txt子项目:cd research/im2txt

2、安装依赖包:

First ensure that you have installed the following required packages:

1)Bazel: 我选择Installing using binary installer方式安装。

     参考网址:https://docs.bazel.build/versions/master/install-ubuntu.html

2)NLTK:sudo pip2 install -U nltk

     数据包安装:进入python2

>>>import nltk
>>> nltk.download("punkt")

3、准备训练数据

3.1  MSCOCO数据集

bazel build //im2txt:download_and_preprocess_mscoco
bazel-bin/im2txt/download_and_preprocess_mscoco "im2txt/data/mscoco"

准备成功如下图:

3.2 已训练CNN模型Inception v3 Checkpoint

wget "http://download.tensorflow.org/models/inception_v3_2016_08_28.tar.gz"
tar -xvf "inception_v3_2016_08_28.tar.gz" -C im2txt/data
rm "inception_v3_2016_08_28.tar.gz"

4、训练模型

bazel build -c opt //im2txt/...

nohup bazel-bin/im2txt/train \
  --input_file_pattern="im2txt/data/mscoco/train-?????-of-00256" \
  --inception_checkpoint_file="im2txt/data/inception_v3.ckpt" \
  --train_dir="im2txt/model/train" \
  --train_inception=false \
  --number_of_steps=1000000 &

evaluation script和Fine Tune the Inception v3 Model我这里没有去运行。有需要可参照官方介绍进行。训练完成如下图。

5、图片生成文本

bazel build -c opt //im2txt:run_inference
bazel-bin/im2txt/run_inference \
  --checkpoint_path=im2txt/model/train \
  --vocab_file=im2txt/data/mscoco/word_counts.txt \
  --input_files=im2txt/data/mscoco/raw-data/val2014/COCO_val2014_000000051008.jpg

 结果:

Captions for image COCO_val2014_000000051008.jpg:
  0) a cat laying on top of a laptop computer . (p=0.010717)
  1) a cat laying on top of a laptop keyboard . (p=0.001352)
  2) a cat laying on top of a laptop on a bed . (p=0.000764)

猜你喜欢

转载自blog.csdn.net/fjssharpsword/article/details/85244169