Official Keras Examples Explained 4.6 (cifar10_cnn_capsule.py) - Keras Study Notes 4

Training a simple CNN-Capsule network on the CIFAR10 small images dataset
Annotated code
"""Train a simple CNN-Capsule Network on the CIFAR10 small images dataset.

Without Data Augmentation:
It gets to 75% validation accuracy in 10 epochs,
and 79% after 15 epochs, and overfits after 20 epochs.

With Data Augmentation:
It gets to 75% validation accuracy in 10 epochs,
and 79% after 15 epochs, and 83% after 30 epochs.
In my test, the highest validation accuracy is 83.79% after 50 epochs.

This is a fast implementation, just 20s/epoch with a GTX 1070 GPU.
"""

from __future__ import print_function
from keras import backend as K
from keras.engine.topology import Layer
from keras import activations
from keras import utils
from keras.datasets import cifar10
from keras.models import Model
from keras.layers import *
from keras.preprocessing.image import ImageDataGenerator


# The squashing function.
# We use 0.5 instead of the 1 used in Hinton's paper.
# For an input of norm n the output norm is n^2 / (c + n^2), which is
# always smaller than n, so the vector is shrunk for either constant.
# With c = 0.5 it is shrunk less than with c = 1
# (for example, n = 1 gives about 0.67 instead of 0.5).
def squash(x, axis=-1):
    s_squared_norm = K.sum(K.square(x), axis, keepdims=True) + K.epsilon()
    scale = K.sqrt(s_squared_norm) / (0.5 + s_squared_norm)
    return scale * x


# Define our own softmax function instead of using K.softmax,
# because K.softmax cannot specify the axis.
def softmax(x, axis=-1):
    ex = K.exp(x - K.max(x, axis=axis, keepdims=True))
    return ex / K.sum(ex, axis=axis, keepdims=True)


# Define the margin loss, which is similar to hinge loss.
def margin_loss(y_true, y_pred):
    lamb, margin = 0.5, 0.1
    return y_true * K.square(K.relu(1 - margin - y_pred)) + lamb * (
        1 - y_true) * K.square(K.relu(y_pred - margin))


class Capsule(Layer):
    """A Capsule implementation in pure Keras.

    There are two versions of Capsule.
    One is like a dense layer (share_weights=False, for fixed-shape input),
    and the other is like a time-distributed dense layer
    (share_weights=True, for variable-length input).

    The input shape of Capsule must be (batch_size,
                                        input_num_capsule,
                                        input_dim_capsule
                                       )
    and the output shape is (batch_size,
                             num_capsule,
                             dim_capsule
                            )

    Capsule implementation is from https://github.com/bojone/Capsule/
    Capsule paper: https://arxiv.org/abs/1710.09829
    """

    def __init__(self,
                 num_capsule,
                 dim_capsule,
                 routings=3,
                 share_weights=True,
                 activation='squash',
                 **kwargs):
        super(Capsule, self).__init__(**kwargs)
        self.num_capsule = num_capsule
        self.dim_capsule = dim_capsule
        self.routings = routings
        self.share_weights = share_weights
        if activation == 'squash':
            self.activation = squash
        else:
            self.activation = activations.get(activation)

    def build(self, input_shape):
        input_dim_capsule = input_shape[-1]
        if self.share_weights:
            self.kernel = self.add_weight(
                name='capsule_kernel',
                shape=(1, input_dim_capsule,
                       self.num_capsule * self.dim_capsule),
                initializer='glorot_uniform',
                trainable=True)
        else:
            input_num_capsule = input_shape[-2]
            self.kernel = self.add_weight(
                name='capsule_kernel',
                shape=(input_num_capsule, input_dim_capsule,
                       self.num_capsule * self.dim_capsule),
                initializer='glorot_uniform',
                trainable=True)

    def call(self, inputs):
        """Following the routing algorithm from Hinton's paper,
        but replacing b = b + <u,v> with b = <u,v>.

        This change can improve the feature representation of Capsule.

        However, you can replace
            b = K.batch_dot(o, hat_inputs, [2, 3])
        with
            b += K.batch_dot(o, hat_inputs, [2, 3])
        to realize a standard routing.
        """

        if self.share_weights:
            hat_inputs = K.conv1d(inputs, self.kernel)
        else:
            hat_inputs = K.local_conv1d(inputs, self.kernel, [1], [1])

        batch_size = K.shape(inputs)[0]
        input_num_capsule = K.shape(inputs)[1]
        hat_inputs = K.reshape(hat_inputs,
                               (batch_size, input_num_capsule,
                                self.num_capsule, self.dim_capsule))
        hat_inputs = K.permute_dimensions(hat_inputs, (0, 2, 1, 3))

        b = K.zeros_like(hat_inputs[:, :, :, 0])
        for i in range(self.routings):
            c = softmax(b, 1)
            o = self.activation(K.batch_dot(c, hat_inputs, [2, 2]))
            if i < self.routings - 1:
                b = K.batch_dot(o, hat_inputs, [2, 3])
                if K.backend() == 'theano':
                    o = K.sum(o, axis=1)

        return o

    def compute_output_shape(self, input_shape):
        return (None, self.num_capsule, self.dim_capsule)


batch_size = 128
num_classes = 10
epochs = 100
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
y_train = utils.to_categorical(y_train, num_classes)
y_test = utils.to_categorical(y_test, num_classes)

# A common Conv2D model
input_image = Input(shape=(None, None, 3))
x = Conv2D(64, (3, 3), activation='relu')(input_image)
x = Conv2D(64, (3, 3), activation='relu')(x)
x = AveragePooling2D((2, 2))(x)
x = Conv2D(128, (3, 3), activation='relu')(x)
x = Conv2D(128, (3, 3), activation='relu')(x)


"""Now we reshape it as (batch_size, input_num_capsule, input_dim_capsule)
and then connect a Capsule layer.

The output of the final model is the lengths of 10 capsules, each with dim=16.

The length of a capsule is treated as the probability of its class,
so the problem becomes 10 binary classification problems.
"""

x = Reshape((-1, 128))(x)
capsule = Capsule(10, 16, 3, True)(x)
output = Lambda(lambda z: K.sqrt(K.sum(K.square(z), 2)))(capsule)
model = Model(inputs=input_image, outputs=output)

# We use a margin loss
model.compile(loss=margin_loss, optimizer='adam', metrics=['accuracy'])
model.summary()

# We can compare the performance with or without data augmentation
data_augmentation = True

if not data_augmentation:
    print('Not using data augmentation.')
    model.fit(
        x_train,
        y_train,
        batch_size=batch_size,
        epochs=epochs,
        validation_data=(x_test, y_test),
        shuffle=True)
else:
    print('Using real-time data augmentation.')
    # This will do preprocessing and real-time data augmentation
    # (more training samples via image transforms such as shifts and flips):
    datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images horizontally
        vertical_flip=False)  # randomly flip images vertically

    # Compute quantities required for feature-wise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(x_train)

    # Fit the model on the batches generated by datagen.flow().
    model.fit_generator(
        datagen.flow(x_train, y_train, batch_size=batch_size),
        epochs=epochs,
        validation_data=(x_test, y_test),
        workers=4)
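
The helper functions squash and margin_loss above are short enough to check by hand. The following is a minimal NumPy sketch, not part of the Keras example: squash_np, margin_loss_np, the parameter c, and the sample values are names and numbers introduced here for illustration, and eps stands in for K.epsilon() under the assumption that it has its default value of 1e-7. It prints the output norm of squash for a few input norms with both constants, and evaluates the margin loss on one hand-made prediction.

```python
import numpy as np


def squash_np(x, axis=-1, c=0.5, eps=1e-7):
    """NumPy mirror of squash(); c is the constant in the denominator."""
    s_squared_norm = np.sum(np.square(x), axis=axis, keepdims=True) + eps
    scale = np.sqrt(s_squared_norm) / (c + s_squared_norm)
    return scale * x


def margin_loss_np(y_true, y_pred, lamb=0.5, margin=0.1):
    """NumPy mirror of margin_loss(); returns the per-class loss terms."""
    positive = y_true * np.square(np.maximum(0.0, 1 - margin - y_pred))
    negative = lamb * (1 - y_true) * np.square(
        np.maximum(0.0, y_pred - margin))
    return positive + negative


# Output norm of squash for a few input norms, with c = 0.5 and c = 1.
for n in (0.1, 0.5, 1.0, 5.0):
    v = np.array([n, 0.0])
    out_half = np.linalg.norm(squash_np(v, c=0.5))
    out_one = np.linalg.norm(squash_np(v, c=1.0))
    print(n, round(float(out_half), 4), round(float(out_one), 4))

# Margin loss for one sample whose true class is 0.
y_true = np.array([1.0, 0.0, 0.0])
y_pred = np.array([0.95, 0.05, 0.4])  # hypothetical capsule lengths
loss_terms = margin_loss_np(y_true, y_pred)
print(loss_terms)
```

For the sample prediction only the third class contributes to the loss: its length 0.4 exceeds the margin 0.1, giving 0.5 * 0.3^2 = 0.045.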
Running the code
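Before launching a full training run, it can help to exercise the routing loop from Capsule.call on random data. The sketch below is a NumPy re-implementation under stated assumptions, not the layer itself: route_np, squash_np and softmax_np are illustrative names, the einsum expressions are assumed to match what the two K.batch_dot calls are intended to compute, and the random hat_inputs stand in for the learned projections. The 100 input capsules correspond to 32x32 CIFAR10 images passing through the four unpadded 3x3 convolutions and the 2x2 pooling in the script (32 -> 30 -> 28 -> 14 -> 12 -> 10, so 10 * 10 positions).

```python
import numpy as np


def squash_np(x, axis=-1):
    # Same formula as squash(), with 1e-7 standing in for K.epsilon().
    s = np.sum(np.square(x), axis=axis, keepdims=True) + 1e-7
    return np.sqrt(s) / (0.5 + s) * x


def softmax_np(x, axis=-1):
    ex = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return ex / np.sum(ex, axis=axis, keepdims=True)


def route_np(hat_inputs, routings=3):
    """Routing with b = <u, v> instead of b = b + <u, v>.

    hat_inputs has shape
    (batch, num_capsule, input_num_capsule, dim_capsule).
    """
    b = np.zeros(hat_inputs.shape[:3])
    for i in range(routings):
        # Each input capsule spreads its vote over the output capsules.
        c = softmax_np(b, axis=1)
        # Weighted sum over input capsules, then squash.
        o = squash_np(np.einsum('bji,bjid->bjd', c, hat_inputs))
        if i < routings - 1:
            # Agreement between each output capsule and each prediction.
            b = np.einsum('bjd,bjid->bji', o, hat_inputs)
    return o


rng = np.random.default_rng(0)
# batch of 2, 10 output capsules, 100 input capsules, 16 dimensions
hat_inputs = rng.normal(size=(2, 10, 100, 16))
o = route_np(hat_inputs)
lengths = np.linalg.norm(o, axis=-1)  # what the final Lambda layer computes
print(o.shape, lengths.shape)
print(lengths.min(), lengths.max())
```

Every capsule length stays below 1 because of the squashing step, which is why the script can treat the lengths as per-class scores for the margin loss.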
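Keras documentation
English: https://keras.io/
Chinese: http://keras-cn.readthedocs.io/en/latest/
Example downloads
https://github.com/keras-team/keras
https://github.com/keras-team/keras/tree/master/examples
Full project download
For readers without download points, add QQ 452205574 to access the shared folder.
It includes the code, datasets (images), trained models, library installers, and so on.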