Recognizing MNIST Images with Caffe on Windows

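This is a beginner's tutorial on using Caffe on Windows. It uses only the prebuilt exe files to train a handwritten-digit model and then recognize digits with it.

The tutorial follows this route:

Set up the Caffe environment -> train the MNIST model -> use the trained model to predict images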

I. Setting up the Caffe environment on Windows

1. Download the prebuilt Caffe

https://github.com/BVLC/caffe/tree/windows

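Pick the build that matches your machine. We download a Release build here, so there is no need to compile Caffe ourselves. Choose the GPU or the CPU build depending on whether you have an NVIDIA graphics card. Since we will also use the Python interface, pick the build that matches your Python version.

I chose "Visual Studio 2015, CUDA 8.0, Python 2.7: Caffe Release".

After the download finishes, unzip it; the directory structure is simple.

The bin folder holds the prebuilt exe files and the libraries they depend on.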

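2. Set up the Python interface

PS: I use the Python that ships with Anaconda2.

Note: unlike on Ubuntu, where you build the Python interface with make pycaffe, this download already contains a compiled pycaffe. You only need to set up protobuf; otherwise importing caffe fails with the following error: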

caffe python error: No module named google.protobuf.internal

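I tried building protobuf-2.6.1 and found that it does not work with this Caffe download, so I switched to protobuf-3.3.0, which worked. Here is the protobuf-3.3.0 package I used.

Link: https://pan.baidu.com/s/1fpIdES3sAP-6uyzr_ByO6w Password: qhyv

After unzipping, go into its python folder, open a cmd prompt in that directory, and run: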

python setup.py build
python setup.py test
python setup.py install

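This completes the protobuf setup.

If no errors come up along the way (note that the build step needs to download files), protobuf is set up. To check, try import google.protobuf.internal and see whether it raises any error.

If that works, importing caffe should work as well. Note, though, that the python folder of Caffe contains a requirements.txt. I assume Anaconda already provides all of those libraries, which is why I ran into no problems; if you use a plain Python installation that lacks them, you may need to install them yourself with pip.

PS: building the protobuf Python interface requires a compiled protoc.exe in the src folder under the protobuf root directory. The protobuf-3.3.0 package I provide includes it; other versions downloaded from the web may not. Also, these steps are for building on Windows and do not apply to Linux.

Finally, open a cmd prompt in the caffe\python directory, start python, and run import caffe to see whether the import succeeds. If no error is reported, pycaffe on Windows is ready to use.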
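
The two import checks described above can also be run as a small script. This is only a sketch: the helper name can_import is my own and not part of Caffe or protobuf, and it assumes the script is started from the caffe\python directory so that the caffe package can be found.

```python
import importlib


def can_import(module_name):
    """Return (True, None) if the module imports, else (False, error text)."""
    try:
        importlib.import_module(module_name)
        return True, None
    except ImportError as exc:
        return False, str(exc)


if __name__ == "__main__":
    # Check protobuf first, because pycaffe depends on it.
    for name in ("google.protobuf.internal", "caffe"):
        ok, err = can_import(name)
        print("{:<26} {}".format(name, "OK" if ok else "FAILED: " + err))
```

If the first line reports FAILED, repeat the protobuf steps before looking at Caffe itself.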

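II. Training the MNIST model

PS: we run MNIST first to verify Caffe. Because we use the prebuilt programs, no other variables need to be configured, but we do need to download the four MNIST data files. Downloading from the official site can be slow, so I share them through Baidu Cloud. (Besides the four data files, the archive also contains some other files I use below.)

Link: https://pan.baidu.com/s/1DuMMufJBfaI8vHKWghv4Hg Password: iqmi

If Baidu Cloud downloads are slow for you, there is an ITHome (IT之家) blog post that explains how to download Baidu Cloud files; I tried it and it works.

1. Download the MNIST data files

After downloading you have four files: train-images.idx3-ubyte, train-labels.idx1-ubyte, t10k-images.idx3-ubyte, and t10k-labels.idx1-ubyte. The scripts below follow my directory structure and use relative paths; adapt them to your own layout. (I created an mnist folder under the Caffe root directory and put all the data files in it.)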
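
Before converting anything, it can help to confirm that the four downloads are intact. The sketch below reads the IDX header of each file: a big-endian magic number (2051 for image files, 2049 for label files) followed by the item count, which is 60000 for the training set and 10000 for the test set. The function names are hypothetical helpers of mine, and the file names are the ones used in this tutorial.

```python
import os
import struct

# (file name, expected magic number, expected item count)
EXPECTED = [
    ("train-images.idx3-ubyte", 2051, 60000),
    ("train-labels.idx1-ubyte", 2049, 60000),
    ("t10k-images.idx3-ubyte", 2051, 10000),
    ("t10k-labels.idx1-ubyte", 2049, 10000),
]


def read_idx_header(path):
    """Return (magic, item_count) from the first 8 bytes of an IDX file."""
    with open(path, "rb") as f:
        magic, count = struct.unpack(">ii", f.read(8))  # big-endian int32
    return magic, count


def check_mnist_files(folder="."):
    """Return a list of problems; an empty list means all four files look fine."""
    problems = []
    for name, magic, count in EXPECTED:
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            problems.append("missing: " + name)
        elif read_idx_header(path) != (magic, count):
            problems.append("unexpected header: " + name)
    return problems
```

Calling check_mnist_files() from inside the mnist folder should return an empty list if the downloads are complete.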

2. Define the network structure and set the hyperparameters

Next, create a file named "lenet_train_test.prototxt" in the mnist folder to define the network structure:

name: "LeNet"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "./mnist_train_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "./mnist_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
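
To see how the layer parameters above fit together, the following sketch traces the blob sizes through the network for a 28x28 input. It is shape arithmetic only and does not call Caffe; the function names are mine. It assumes zero padding, which matches the definition above because no pad value is set.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Output size of a convolution along one spatial axis (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride, pad=0):
    """Output size of a pooling layer along one axis (Caffe rounds up)."""
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1


def lenet_shapes(input_size=28):
    """Return [(layer name, channels, height, width)] for the conv/pool stack."""
    shapes = []
    s = conv_out(input_size, kernel=5)   # conv1: 20 filters, 5x5
    shapes.append(("conv1", 20, s, s))
    s = pool_out(s, kernel=2, stride=2)  # pool1: 2x2 max pooling
    shapes.append(("pool1", 20, s, s))
    s = conv_out(s, kernel=5)            # conv2: 50 filters, 5x5
    shapes.append(("conv2", 50, s, s))
    s = pool_out(s, kernel=2, stride=2)  # pool2: 2x2 max pooling
    shapes.append(("pool2", 50, s, s))
    return shapes


if __name__ == "__main__":
    for name, c, h, w in lenet_shapes():
        print("{:<6} {} x {} x {}".format(name, c, h, w))
```

The last entry, pool2, has 50 x 4 x 4 = 800 values per image, which is what the ip1 layer flattens and maps to its 500 outputs.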

Then create a file named "lenet_solver.prototxt" to define the training hyperparameters:

net: "./lenet_train_test.prototxt"
test_iter: 100
test_interval: 500
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 100
max_iter: 10000
snapshot: 5000
snapshot_prefix: "./"
solver_mode: GPU
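
Two of these settings are easier to read with numbers attached. With lr_policy "inv", Caffe computes the learning rate as base_lr * (1 + gamma * iter) ^ (-power), and test_iter times the TEST batch size gives the number of images evaluated in each test pass. The sketch below plugs in the values from the two prototxt files; the function names are my own.

```python
def inv_lr(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate at a given iteration under the "inv" policy."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


def images_per_test_pass(test_iter=100, test_batch_size=100):
    """Number of test images evaluated each time the test net runs."""
    return test_iter * test_batch_size


if __name__ == "__main__":
    for it in (0, 5000, 10000):
        print("iter {:>5}: lr = {:.6f}".format(it, inv_lr(it)))
    print("images per test pass:", images_per_test_pass())
```

So each test pass covers all 10000 MNIST test images, and the learning rate decays smoothly from 0.01 to a little under 0.006 by the final iteration.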

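3. Convert the data to LMDB

With the whole network defined, we can train on the data we have. However, the raw data files cannot be used as they are; they must first be converted to LMDB, a format Caffe can read.

The prebuilt Caffe ships with a number of tools that make this easy.

We can use convert_mnist_data.exe to convert the four data files to LMDB. The conversion can be written as a Windows batch script that runs with a double click.

Create a file named "convert_mnist_data.bat" in the mnist folder, open it in Notepad, and paste in: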

..\bin\convert_mnist_data.exe  .\train-images.idx3-ubyte .\train-labels.idx1-ubyte .\mnist_train_lmdb  
echo.  
..\bin\convert_mnist_data.exe  .\t10k-images.idx3-ubyte  .\t10k-labels.idx1-ubyte .\mnist_test_lmdb 
pause

Save the file and double-click convert_mnist_data.bat to run it. Two LMDB folders, mnist_train_lmdb and mnist_test_lmdb, are generated.
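
If you prefer to drive the conversion from Python, the batch file can be mirrored roughly as follows. This is a sketch under the same directory assumptions as the bat file (run from the mnist folder, with the exe files in ..\bin); the function names are mine, and it does nothing more than call the same convert_mnist_data.exe with the same three arguments.

```python
import os
import subprocess

# (image file, label file, output LMDB folder), as in convert_mnist_data.bat
JOBS = [
    ("train-images.idx3-ubyte", "train-labels.idx1-ubyte", "mnist_train_lmdb"),
    ("t10k-images.idx3-ubyte", "t10k-labels.idx1-ubyte", "mnist_test_lmdb"),
]


def build_convert_commands(caffe_bin=os.path.join("..", "bin"), data_dir="."):
    """Return one argument list per conversion job."""
    exe = os.path.join(caffe_bin, "convert_mnist_data.exe")
    return [[exe] + [os.path.join(data_dir, p) for p in job] for job in JOBS]


def run_conversion():
    """Run both conversions, stopping early if an input file is missing."""
    for cmd in build_convert_commands():
        missing = [p for p in cmd[1:3] if not os.path.isfile(p)]
        if missing:
            raise IOError("missing input file(s): " + ", ".join(missing))
        subprocess.check_call(cmd)  # raises if the converter exits with an error
```

Checking for the input files first gives a clearer message than letting the converter fail on a missing path.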

4. Train the network and generate the model

We now have everything needed to train on MNIST. Write another batch file to run the training: as before, create a file named "train_mnist.bat" and paste in:

..\bin\caffe.exe train --solver=./lenet_solver.prototxt
pause

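I use relative paths; if your directory structure matches mine, you should not run into problems.

Double-clicking train_mnist.bat now calls caffe.exe to train on our data, using the GPU because lenet_solver.prototxt sets solver_mode: GPU. (With the CPU build, change that line to solver_mode: CPU.)

When training finishes, the test accuracy reaches 0.9906 (the Accuracy layer is included only in the TEST phase). A snapshot is also written every 5000 iterations. All of this is configurable in lenet_solver.prototxt.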
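
The snapshot settings determine which files appear in the mnist folder. Caffe names them by appending _iter_N and an extension to snapshot_prefix, so with the prefix "./" the names start with an underscore. The sketch below lists the names to expect for the solver values used here; the function name is my own.

```python
def snapshot_files(max_iter=10000, snapshot=5000, prefix="./"):
    """List the snapshot files written during one full training run."""
    names = []
    for it in range(snapshot, max_iter + 1, snapshot):
        names.append("{}_iter_{}.caffemodel".format(prefix, it))   # weights
        names.append("{}_iter_{}.solverstate".format(prefix, it))  # solver state
    return names


if __name__ == "__main__":
    for name in snapshot_files():
        print(name)
```

The .caffemodel files hold the learned weights; _iter_10000.caffemodel is the one used for prediction in the next section. The .solverstate files can be passed to caffe.exe train through its --snapshot option to resume an interrupted run.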

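III. Using the trained model to predict images

We now have a model, which holds the weights of each layer of the network. To predict a new image with it, we also need the original network structure, so that each parameter in the model can be mapped to its layer.

The earlier lenet_train_test.prototxt specifies where our datasets are, but prediction does not need that data.

So we create a new lenet.prototxt (in the mnist folder):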

name: "LeNet"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 64 dim: 1 dim: 28 dim: 28 } }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip2"
  top: "prob"
}

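Prediction uses the Python interface of Caffe, that is, Python code calling Caffe. I created a file named "testModel.py" with the following content:

PS: change 'D:\APP\caffe\python' to your own directory. If import cv2 fails, copy cv2.pyd into the Anaconda2\Lib\site-packages directory and try the import again; Python 3 users will need to look for another approach online.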

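import sys

import numpy as np
import cv2

# Make pycaffe importable; change this path to your own caffe\python directory.
sys.path.insert(0, r'D:\APP\caffe\python')
import caffe

# Preprocessing constants. Note that lenet_train_test.prototxt applies only
# scale (no mean subtraction), so set MEAN = 0 to match training exactly.
MEAN = 128
SCALE = 0.00390625  # 1/256, the same value as transform_param.scale
imgPath = sys.argv[1]

# With the CPU-only build, use caffe.set_mode_cpu() and drop set_device.
caffe.set_mode_gpu()
caffe.set_device(0)
net = caffe.Net('lenet.prototxt', '_iter_10000.caffemodel', caffe.TEST)
# Predict one image at a time: batch 1, 1 channel, 28x28 pixels.
net.blobs['data'].reshape(1, 1, 28, 28)

# Load as grayscale; the image is expected to be 28x28 already.
image = cv2.imread(imgPath, cv2.IMREAD_GRAYSCALE).astype(np.float32) - MEAN
image *= SCALE
net.blobs['data'].data[...] = image
output = net.forward()
# output['prob'][0] holds the 10 class probabilities for the single image.
pred_label = np.argmax(output['prob'][0])
print('\nPredicted digit for {} is {}'.format(imgPath, pred_label))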
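
The script prints only the most likely digit. Since the Softmax layer outputs one probability per class, it can be useful to look at the runners-up as well, for example to spot images the model is unsure about. The helper below is a hypothetical addition of mine, written in plain Python so that it works on a list as well as on the NumPy array in output['prob'][0].

```python
def top_k(probs, k=3):
    """Return the k most likely (digit, probability) pairs, best first."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return [(digit, float(p)) for digit, p in ranked[:k]]


if __name__ == "__main__":
    # Made-up probabilities for illustration only, not real model output.
    example = [0.01, 0.02, 0.70, 0.05, 0.02, 0.10, 0.03, 0.03, 0.02, 0.02]
    for digit, p in top_k(example):
        print("digit {}: {:.2f}".format(digit, p))
```

In testModel.py it would be called as top_k(output['prob'][0]) after net.forward().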