Training a simple CNN-Capsule network on the CIFAR10 small images dataset
Annotated code
"""Train a simple CNN-Capsule Network on the CIFAR10 small images dataset.

Without Data Augmentation:
It reaches 75% validation accuracy in 10 epochs and 79% after 15 epochs,
then overfits after 20 epochs.

With Data Augmentation:
It reaches 75% validation accuracy in 10 epochs, 79% after 15 epochs,
and 83% after 30 epochs.
In my test, the highest validation accuracy is 83.79% after 50 epochs.

This is a fast implementation, just 20s/epoch with a GTX 1070 GPU.
"""

from __future__ import print_function
from keras import backend as K
from keras.engine.topology import Layer
from keras import activations
from keras import utils
from keras.datasets import cifar10
from keras.models import Model
from keras.layers import Input, Conv2D, AveragePooling2D, Reshape, Lambda
from keras.preprocessing.image import ImageDataGenerator


# The squashing function.
# We use 0.5 where Hinton's paper uses 1.
# For an input of norm n the output norm is n^2 / (0.5 + n^2), so the norm
# always shrinks and stays below 1; with 0.5 it shrinks less than with 1.
def squash(x, axis=-1):
    s_squared_norm = K.sum(K.square(x), axis, keepdims=True) + K.epsilon()
    scale = K.sqrt(s_squared_norm) / (0.5 + s_squared_norm)
    return scale * x


# Define our own softmax function instead of K.softmax,
# because K.softmax cannot specify the axis.
def softmax(x, axis=-1):
    ex = K.exp(x - K.max(x, axis=axis, keepdims=True))
    return ex / K.sum(ex, axis=axis, keepdims=True)


# Define the margin loss, which is similar to hinge loss.
def margin_loss(y_true, y_pred):
    lamb, margin = 0.5, 0.1
    return y_true * K.square(K.relu(1 - margin - y_pred)) + lamb * (
        1 - y_true) * K.square(K.relu(y_pred - margin))


class Capsule(Layer):
    """A Capsule implementation in pure Keras.

    There are two versions of Capsule.
    One is like a dense layer (for fixed-shape input),
    and the other is like a time-distributed dense layer
    (for variable-length input).

    The input shape of Capsule must be
    (batch_size, input_num_capsule, input_dim_capsule)
    and the output shape is
    (batch_size, num_capsule, dim_capsule)

    The Capsule implementation is from https://github.com/bojone/Capsule/
    Capsule paper: https://arxiv.org/abs/1710.09829
    """

    def __init__(self,
                 num_capsule,
                 dim_capsule,
                 routings=3,
                 share_weights=True,
                 activation='squash',
                 **kwargs):
        super(Capsule, self).__init__(**kwargs)
        self.num_capsule = num_capsule
        self.dim_capsule = dim_capsule
        self.routings = routings
        self.share_weights = share_weights
        if activation == 'squash':
            self.activation = squash
        else:
            self.activation = activations.get(activation)

    def build(self, input_shape):
        input_dim_capsule = input_shape[-1]
        if self.share_weights:
            # One kernel shared by every input capsule.
            self.kernel = self.add_weight(
                name='capsule_kernel',
                shape=(1, input_dim_capsule,
                       self.num_capsule * self.dim_capsule),
                initializer='glorot_uniform',
                trainable=True)
        else:
            # A separate kernel for each input capsule.
            input_num_capsule = input_shape[-2]
            self.kernel = self.add_weight(
                name='capsule_kernel',
                shape=(input_num_capsule, input_dim_capsule,
                       self.num_capsule * self.dim_capsule),
                initializer='glorot_uniform',
                trainable=True)

    def call(self, inputs):
        """Following the routing algorithm from Hinton's paper,
        but replace b = b + <u,v> with b = <u,v>.

        This change can improve the feature representation of Capsule.

        However, you can replace
            b = K.batch_dot(o, hat_inputs, [2, 3])
        with
            b += K.batch_dot(o, hat_inputs, [2, 3])
        to realize standard routing.
        """

        if self.share_weights:
            hat_inputs = K.conv1d(inputs, self.kernel)
        else:
            hat_inputs = K.local_conv1d(inputs, self.kernel, [1], [1])

        batch_size = K.shape(inputs)[0]
        input_num_capsule = K.shape(inputs)[1]
        hat_inputs = K.reshape(hat_inputs,
                               (batch_size, input_num_capsule,
                                self.num_capsule, self.dim_capsule))
        hat_inputs = K.permute_dimensions(hat_inputs, (0, 2, 1, 3))

        b = K.zeros_like(hat_inputs[:, :, :, 0])
        for i in range(self.routings):
            # Coupling coefficients: softmax over the output capsules.
            c = softmax(b, 1)
            o = self.activation(K.batch_dot(c, hat_inputs, [2, 2]))
            if i < self.routings - 1:
                b = K.batch_dot(o, hat_inputs, [2, 3])
                if K.backend() == 'theano':
                    o = K.sum(o, axis=1)

        return o

    def compute_output_shape(self, input_shape):
        return (None, self.num_capsule, self.dim_capsule)


batch_size = 128
num_classes = 10
epochs = 100
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
y_train = utils.to_categorical(y_train, num_classes)
y_test = utils.to_categorical(y_test, num_classes)

# A common Conv2D model
input_image = Input(shape=(None, None, 3))
x = Conv2D(64, (3, 3), activation='relu')(input_image)
x = Conv2D(64, (3, 3), activation='relu')(x)
x = AveragePooling2D((2, 2))(x)
x = Conv2D(128, (3, 3), activation='relu')(x)
x = Conv2D(128, (3, 3), activation='relu')(x)


"""Now we reshape it to (batch_size, input_num_capsule, input_dim_capsule)
and then connect a Capsule layer.

The output of the final model is the lengths of 10 Capsules, each of dim=16.

The length of a Capsule is the probability,
so the problem becomes ten binary classification problems.
"""

x = Reshape((-1, 128))(x)
capsule = Capsule(10, 16, 3, True)(x)
output = Lambda(lambda z: K.sqrt(K.sum(K.square(z), 2)))(capsule)
model = Model(inputs=input_image, outputs=output)

# We use a margin loss.
model.compile(loss=margin_loss, optimizer='adam', metrics=['accuracy'])
model.summary()

# We can compare the performance with or without data augmentation.
data_augmentation = True

if not data_augmentation:
    print('Not using data augmentation.')
    model.fit(
        x_train,
        y_train,
        batch_size=batch_size,
        epochs=epochs,
        validation_data=(x_test, y_test),
        shuffle=True)
else:
    print('Using real-time data augmentation.')
    # This will do preprocessing and real-time data augmentation
    # (more training samples through shifts, flips and similar transforms):
    datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by dataset std
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA (Zero-phase Component Analysis) whitening
        rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images horizontally
        vertical_flip=False)  # randomly flip images vertically

    # Compute quantities required for feature-wise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(x_train)

    # Fit the model on the batches generated by datagen.flow().
    model.fit_generator(
        datagen.flow(x_train, y_train, batch_size=batch_size),
        epochs=epochs,
        validation_data=(x_test, y_test),
        workers=4)
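To see what the squash function, the routing loop and the margin loss compute without building the Keras model, here is a minimal NumPy sketch. It is not the script's implementation: `EPSILON` stands in for `K.epsilon()`, the `route` helper and the random `inputs` and `kernel` are hypothetical, and the two `einsum` calls are assumed to match what the shared-weight branch intends with `K.batch_dot`. For an input of norm n, squash returns a vector of norm n^2 / (0.5 + n^2), so every capsule length ends up below 1.

```python
import numpy as np

EPSILON = 1e-7  # assumption: stands in for K.epsilon()


def squash(x, axis=-1):
    """Rescale x so that its norm n becomes roughly n^2 / (0.5 + n^2)."""
    s_squared_norm = np.sum(np.square(x), axis=axis, keepdims=True) + EPSILON
    scale = np.sqrt(s_squared_norm) / (0.5 + s_squared_norm)
    return scale * x


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    ex = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return ex / np.sum(ex, axis=axis, keepdims=True)


def margin_loss(y_true, y_pred, lamb=0.5, margin=0.1):
    """Element-wise margin loss with the constants used in the script."""
    positive = y_true * np.square(np.maximum(0.0, 1 - margin - y_pred))
    negative = lamb * (1 - y_true) * np.square(np.maximum(0.0, y_pred - margin))
    return positive + negative


def route(inputs, kernel, num_capsule, dim_capsule, routings=3):
    """Hypothetical helper: shared-weight routing with b = <u, v>.

    inputs: (batch, input_num_capsule, input_dim_capsule)
    kernel: (input_dim_capsule, num_capsule * dim_capsule)
    """
    batch_size, input_num_capsule, _ = inputs.shape
    # Prediction vectors, arranged as (batch, num_capsule, input_num, dim).
    hat_inputs = (inputs @ kernel).reshape(
        batch_size, input_num_capsule, num_capsule, dim_capsule)
    hat_inputs = hat_inputs.transpose(0, 2, 1, 3)

    b = np.zeros(hat_inputs.shape[:3])
    for i in range(routings):
        c = softmax(b, axis=1)  # coupling over the output capsules
        o = squash(np.einsum('bni,bnid->bnd', c, hat_inputs))
        if i < routings - 1:
            # Agreement between each output and each prediction vector.
            b = np.einsum('bnd,bnid->bni', o, hat_inputs)
    return o


rng = np.random.default_rng(0)
inputs = rng.normal(size=(2, 100, 128))               # made-up feature capsules
kernel = rng.normal(scale=0.05, size=(128, 10 * 16))  # made-up weights

capsules = route(inputs, kernel, num_capsule=10, dim_capsule=16)
lengths = np.sqrt(np.sum(np.square(capsules), axis=2))
print(capsules.shape)  # (2, 10, 16)
print(lengths.shape)   # (2, 10)

# One positive class predicted at 0.8 and one negative class predicted at 0.3.
loss = margin_loss(np.array([1.0, 0.0]), np.array([0.8, 0.3]))
print(loss)  # approximately [0.01, 0.02]
```

The lengths play the role of the model's `Lambda` output: one value per class, which is what the margin loss compares against the one-hot labels.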
Running the code
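When the script runs, `model.summary()` reports `None` for the spatial dimensions, because the input is declared as `Input(shape=(None, None, 3))`. For the 32x32 CIFAR10 images the sizes can be worked out by hand. The sketch below does that arithmetic, assuming the Keras defaults of `'valid'` padding and stride 1 for `Conv2D`, and a pooling stride equal to the pool size; `valid_conv` and `pool` are hypothetical helpers, not Keras functions.

```python
def valid_conv(size, kernel=3):
    """Output size of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool(size, window=2):
    """Output size of non-overlapping pooling with 'valid' padding."""
    return size // window


size = 32                # CIFAR10 images are 32x32
size = valid_conv(size)  # Conv2D(64, (3, 3))         -> 30
size = valid_conv(size)  # Conv2D(64, (3, 3))         -> 28
size = pool(size)        # AveragePooling2D((2, 2))   -> 14
size = valid_conv(size)  # Conv2D(128, (3, 3))        -> 12
size = valid_conv(size)  # Conv2D(128, (3, 3))        -> 10

# Reshape((-1, 128)) turns the 10x10x128 feature map into 100 capsules of dim 128.
input_num_capsule = size * size
print(size, input_num_capsule)  # 10 100
```

So the Capsule layer receives 100 input capsules of dimension 128 per image and routes them to 10 output capsules of dimension 16.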
Keras documentation
Chinese: http://keras-cn.readthedocs.io/en/latest/
Example downloads
https://github.com/keras-team/keras
https://github.com/keras-team/keras/tree/master/examples
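Full project download
For readers without download points, add QQ 452205574 to get access to the shared folder.
It includes the code, the dataset (images), the trained model, the library installation files, and more.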