Chainer Quick Start: Single-Machine MNIST Training with a Niche AI Framework

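1. A Few Words Up Front

Lately I have been busy with job hunting and my thesis, so I have had little time for this blog or my GitHub. Time flies. I am currently interning at a company whose Japanese client needs Chainer, a relatively uncommon deep learning framework, and that gave me the chance to work with it. I am catching the tail end of 2018 to write this last post of the year and open a new chapter for 2019. ( •̀ ω •́ )YE!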

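2. Getting to Know Chainer

The name Chainer may be unfamiliar, but TensorFlow, Keras, and Caffe probably are not. Chainer is a deep learning framework in the same family. It was released as open source in 2015, and although its GitHub repository is very active, it has not received the industry attention it deserves. That has not hurt the framework's performance: Intel chose Chainer as a development path for AI workloads, with an eye to driving demand for its own chips. Chainer is also widely used in Japan. It exists for a reason, so let's start learning it!
Chainer: A Powerful, Flexible, and Intuitive Framework for Neural Networks (see the official Chainer website).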

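3. Why Use Chainer?

Python-based: Chainer is written in Python, so all of its code can be inspected and customized at runtime, and errors surface as readable Python messages;
Define-by-Run: most deep learning frameworks today follow a Define-and-Run scheme, while Chainer uses Define-by-Run, where the network is defined on the fly at runtime and can change dynamically. In Define-and-Run, structure leads the data: the network must be fully defined before data can be fed through it for training. In Define-by-Run, data leads the structure: the network exists only as the computation applied to the data, and it extends as far as the data flows;
Fully customizable: because Chainer's underlying code is also Python, every class and method can be adapted to new or specialized approaches;
Broad and deep support: Chainer is actively used for most current neural network approaches (CNN, RNN, RL, etc.), new methods are added as they are developed, and it supports a range of hardware, including multi-GPU parallelization.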
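
To make the Define-by-Run idea more concrete, here is a toy sketch in plain Python. It is not Chainer's implementation: the `Var` class and the `forward` function are hypothetical names invented for this illustration. The point it tries to show is that the graph is recorded while ordinary Python code runs, so a runtime value (here `n_steps`) decides how deep the network is.

```python
# Toy illustration of Define-by-Run (NOT Chainer's real implementation).
# Each operation records its inputs when it executes, so the computation
# graph is a by-product of ordinary Python control flow.


class Var(object):
    """A scalar value that remembers how it was computed."""

    def __init__(self, value, parents=()):
        self.value = float(value)
        self.grad = 0.0
        # Each entry is (input Var, local derivative of self w.r.t. that input).
        self.parents = parents

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    def backward(self, upstream=1.0):
        """Accumulate gradients by following every recorded path backwards."""
        self.grad += upstream
        for parent, local_grad in self.parents:
            parent.backward(upstream * local_grad)


def forward(x, w, n_steps):
    """Multiply by w a data-dependent number of times: y = x * w ** n_steps."""
    h = x
    for _ in range(n_steps):  # the loop length is only known at runtime
        h = h * w
    return h


x, w = Var(2.0), Var(3.0)
y = forward(x, w, n_steps=3)
y.backward()
print(y.value, w.grad, x.grad)  # prints 54.0 54.0 27.0
```

Calling `forward` with a different `n_steps` builds a different graph without redefining anything, which is the flexibility the list above attributes to Define-by-Run. The path-following `backward` is chosen for brevity; a real framework would traverse the graph in topological order instead.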

4. How to Install Chainer

4.1 Requirements

Ubuntu 14.04/16.04 LTS (64-bit) or CentOS 7 (64-bit)
Python 2.7.6+, 3.4.3+, 3.5.1+, 3.6.0+
NumPy 1.9, 1.10, 1.11, 1.12, 1.13, 1.14, 1.15
NVIDIA CUDA GPU with Compute Capability 3.0 or higher (for GPU use)
CUDA Toolkit, supported versions: 7.0, 7.5, 8.0, 9.0, 9.1, and 9.2

4.2 Installation

1. Update the system package index

sudo apt-get update -y

2. Install Python. I installed it directly (a 2.7.6+ version); you can also download whichever version you want from the official Python website.

sudo apt-get install python

3. Upgrade pip and setuptools

pip install -U setuptools pip

4. Install Chainer. Replace vvvv with the version you want; to get the latest release instead, omit ==vvvv.

pip install chainer==vvvv --no-cache-dir

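At this point Chainer can run on the CPU. To run it on a GPU, you also need the NVIDIA CUDA / cuDNN environment installed, plus the CuPy package that matches your CUDA version. CuPy is an implementation of a NumPy-compatible multidimensional array on CUDA. Its core is the cupy.ndarray class, which carries most of the functionality and supports the numpy.ndarray interface.

5. Install CuPy (see the CuPy documentation for more details)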

(For CUDA 8.0)
pip install cupy-cuda80
 
(For CUDA 9.0)
pip install cupy-cuda90
 
(For CUDA 9.1)
pip install cupy-cuda91
 
(For CUDA 9.2)
pip install cupy-cuda92
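
Once the packages are installed, a short script can confirm what is actually importable. The sketch below is only one way to do it: it assumes NumPy is present, treats Chainer and CuPy as optional, and falls back to NumPy when CuPy cannot be imported. The names `xp`, `chainer_version`, and `total` are chosen for this example. The same array code runs on either backend, which is the NumPy compatibility described above.

```python
from __future__ import print_function

import numpy as np

# Chainer is optional here: record its version if the import succeeds.
try:
    import chainer
    chainer_version = chainer.__version__
except ImportError:
    chainer_version = None

# Use CuPy as the array module when it is installed, NumPy otherwise.
try:
    import cupy as xp
except ImportError:
    xp = np

print("chainer:", chainer_version)
print("array module:", xp.__name__)

# The same calls work on both backends because CuPy mirrors NumPy's API.
a = xp.arange(6, dtype=xp.float32).reshape(2, 3)
total = float(a.sum())
print("sum of 0..5:", total)  # 15.0 on either backend
```

If the array module is reported as numpy even though CuPy was installed, the CuPy wheel most likely does not match the CUDA version on the machine.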

6. Uninstall Chainer

pip uninstall chainer

5. An MNIST Example with Chainer

Here is the code:

#!/usr/bin/env python
# coding: utf-8

from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
import chainer
import chainer.functions as F
import chainer.links as L
from chainer import Chain, iterators, optimizers, serializers
from chainer.backends.cuda import to_cpu
from chainer.dataset import concat_examples
from chainer.datasets import mnist

train, test = mnist.get_mnist(withlabel=True, ndim=1)
x, t = train[0]
plt.imshow(x.reshape(28, 28), cmap="gray")
plt.savefig("5.png")

batchsize = 128

train_iter = iterators.SerialIterator(train, batchsize)
test_iter = iterators.SerialIterator(test, batchsize, repeat=False, shuffle=False)

# Define the model: a three-layer fully connected network
class MyNetwork(Chain):

    def __init__(self, n_mid_units=100, n_out=10):
        super(MyNetwork, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(None, n_mid_units)
            self.l2 = L.Linear(n_mid_units, n_mid_units)
            self.l3 = L.Linear(n_mid_units, n_out)

    def forward(self, x):
        h = F.relu(self.l1(x))
        h = F.relu(self.l2(h))
        return self.l3(h)


model = MyNetwork()

gpu_id = -1  # Set to 0 if you use GPU
if gpu_id >= 0:
    model.to_gpu(gpu_id)


# Define the optimizer
optimizer = optimizers.MomentumSGD(lr=0.01, momentum=0.9)
optimizer.setup(model)

# Train the model
max_epoch = 20
while train_iter.epoch < max_epoch:

    # ---------- One iteration of the training loop ----------
    train_batch = train_iter.next()
    image_train, target_train = concat_examples(train_batch, gpu_id)

    # Calculate the prediction of the network
    prediction_train = model(image_train)

    # Calculate the loss with softmax_cross_entropy
    loss = F.softmax_cross_entropy(prediction_train, target_train)

    # Calculate the gradients in the network
    model.cleargrads()
    loss.backward()

    # Update all the trainable parameters
    optimizer.update()
    # --------------------- until here ---------------------

    # Check the validation accuracy of prediction after every epoch
    if train_iter.is_new_epoch:  # If this iteration is the final iteration of the current epoch

        # Display the training loss
        print('epoch:{:02d} train_loss:{:.04f} '.format(
            train_iter.epoch, float(to_cpu(loss.data))), end='')

        test_losses = []
        test_accuracies = []
        while True:
            test_batch = test_iter.next()
            image_test, target_test = concat_examples(test_batch, gpu_id)

            # Forward the test data
            prediction_test = model(image_test)

            # Calculate the loss
            loss_test = F.softmax_cross_entropy(prediction_test, target_test)
            test_losses.append(to_cpu(loss_test.data))

            # Calculate the accuracy
            accuracy = F.accuracy(prediction_test, target_test)
            accuracy.to_cpu()
            test_accuracies.append(accuracy.data)

            if test_iter.is_new_epoch:
                # Rewind the iterator so the next evaluation starts from the beginning
                test_iter.reset()
                break

        print('val_loss:{:.04f} val_accuracy:{:.04f}'.format(
            np.mean(test_losses), np.mean(test_accuracies)))

# Save the trained model (the final weights, not a selected best checkpoint)
serializers.save_npz('train_mnist.model', model)

# Use the saved model to predict on new data
model = MyNetwork()
serializers.load_npz('train_mnist.model', model)

x, t = test[0]
plt.imshow(x.reshape(28, 28), cmap='gray')
plt.savefig('7.png')
print('label: ', t)

# Predict: add a batch dimension, then take the class with the highest score
print(x.shape, end=' -> ')
x = x[None, ...]
print(x.shape)

y = model(x)
y = y.data
pred_label = y.argmax(axis=1)
print("predicted label:", pred_label[0])

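The code is not difficult, and I have commented the key parts. To learn more about the Chainer deep learning framework, visit the official Chainer website.
Note: if you have set up Chainer's GPU environment, the same code also runs on a GPU; just change gpu_id = -1 to gpu_id = 0.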
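
The training loop above relies on F.softmax_cross_entropy and F.accuracy. For readers who want to see what those two quantities mean, here is a NumPy-only sketch. It illustrates the math under the default settings (mean over the batch) and is not Chainer's actual code; the helper names and the sample `logits` and `labels` are made up for this example.

```python
import numpy as np


def softmax_cross_entropy(logits, labels):
    """Mean negative log-probability of the true class, from raw scores."""
    # Subtracting the row-wise maximum keeps exp() from overflowing.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the correct class in every row.
    return float(-log_probs[np.arange(len(labels)), labels].mean())


def accuracy(logits, labels):
    """Fraction of rows whose highest score is at the true class."""
    return float((logits.argmax(axis=1) == labels).mean())


# Two samples, three classes: the first is classified correctly, the second is not.
logits = np.array([[2.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0]])
labels = np.array([0, 2])

print(softmax_cross_entropy(logits, labels))
print(accuracy(logits, labels))  # 0.5
```

One consequence worth knowing: when all ten MNIST scores are equal, the loss is log(10), roughly 2.30, so a training loss near that value means the network has not learned anything yet.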