Building a 3D-CNN with TensorFlow

This chapter explains how the 3D convolution and 3D pooling operations are computed, and how to implement a 3D-CNN in TensorFlow. For other topics, see the TensorFlow learning index.

Contents

1. What is a 3D-CNN

2. What is 3D pooling

3. Implementing 3D convolution and 3D pooling in TensorFlow

4. Recognizing MNIST with a 3D-CNN


1. What is a 3D-CNN

Rather than a long textual explanation, this is easiest to grasp from an animation (the gif embedded in the original post is omitted here).

A 2D convolution takes a 2D patch of the input image and a 2D kernel, multiplies them elementwise, and sums the products. A 3D convolution does exactly the same thing with a 3D block of the input volume and a 3D kernel.
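As a concrete illustration, here is a minimal NumPy sketch of a valid 3D convolution computed by hand; the names volume and kernel and their values are made up for this example:

import numpy as np

# Toy 4x4x4 input volume and 2x2x2 kernel (illustrative values only).
volume = np.arange(64, dtype=np.float32).reshape(4, 4, 4)
kernel = np.ones((2, 2, 2), dtype=np.float32)

# Slide the 2x2x2 kernel over the volume, multiplying elementwise and
# summing, exactly as in 2D convolution but with one extra spatial axis.
out = np.zeros((3, 3, 3), dtype=np.float32)
for d in range(3):
    for h in range(3):
        for w in range(3):
            patch = volume[d:d+2, h:h+2, w:w+2]
            out[d, h, w] = np.sum(patch * kernel)

print(out.shape)  # (3, 3, 3), i.e. (4 - 2 + 1) along each axis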

2. What is 3D pooling

3D pooling traverses the volume the same way 3D convolution does: it summarizes each region of a cube with its dominant feature, reducing the dimensionality. With max pooling, the maximum value within each ksize-sized 3D block is taken to represent that block; average pooling works the same way, using the mean instead.
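To make this concrete, here is a minimal NumPy sketch (my own illustration, not from the original post) of 2x2x2 max pooling with stride 2:

import numpy as np

# A toy 4x4x4 volume; 2x2x2 pooling with stride 2 halves every axis.
volume = np.arange(64, dtype=np.float32).reshape(4, 4, 4)

pooled = np.zeros((2, 2, 2), dtype=np.float32)
for d in range(2):
    for h in range(2):
        for w in range(2):
            block = volume[2*d:2*d+2, 2*h:2*h+2, 2*w:2*w+2]
            # Max pooling keeps the largest value in each block;
            # average pooling would use block.mean() instead.
            pooled[d, h, w] = block.max()

print(pooled.shape)  # (2, 2, 2)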

3. Implementing 3D convolution and 3D pooling in TensorFlow

def conv3d(x, filtersize, strides, padding):
    # Kernel shape: [filter_depth, filter_height, filter_width, in_channels, out_channels].
    kernel = tf.Variable(tf.truncated_normal(filtersize, stddev=0.1))
    # Wrap the bias in a Variable so it is trainable (a bare tf.constant would stay fixed).
    bias = tf.Variable(tf.constant(0.1, shape=[filtersize[-1]]))
    return tf.nn.relu(tf.nn.conv3d(x, kernel, strides, padding=padding) + bias)
  • First define a convolution kernel of shape [filter_depth, filter_height, filter_width, in_channels, out_channels] (the order documented for tf.nn.conv3d).
  • Then define a bias of shape [out_channels].
  • Then call tf.nn.conv3d(input, filter, strides, padding, name).
  • One more note: the output size after a 3D convolution is computed the same way as for a 2D convolution; see the earlier TensorFlow CNN article. A standalone shape check follows this list.
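As promised above, a small shape-check sketch (my addition, TensorFlow 1.x API) showing how SAME and VALID padding affect the conv3d output size:

import tensorflow as tf

# Input layout:  [batch, depth, height, width, in_channels]
# Kernel layout: [filter_depth, filter_height, filter_width, in_channels, out_channels]
x = tf.zeros([8, 16, 28, 28, 1])
kernel = tf.zeros([5, 5, 5, 1, 64])

same = tf.nn.conv3d(x, kernel, strides=[1, 1, 1, 1, 1], padding='SAME')
valid = tf.nn.conv3d(x, kernel, strides=[1, 1, 1, 1, 1], padding='VALID')

print(same.get_shape())   # (8, 16, 28, 28, 64): SAME keeps spatial size at stride 1
print(valid.get_shape())  # (8, 12, 24, 24, 64): VALID gives N - F + 1 per axis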

4. Recognizing MNIST with a 3D-CNN

  • Readers may wonder how 2D MNIST images can go through a 3D network: an image whose depth is 1 is simply the degenerate 3D case.
  • The entry tensor of the whole network is therefore [batch_size, width*height*depth*channels].
  • It is then reshaped to [batch_size, width, height, depth, channels], as sketched right after this list.
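A minimal sketch of that reshape (illustrative only; the placeholder name flat is made up):

import tensorflow as tf

# For MNIST: width * height * depth * channels = 28 * 28 * 1 * 1 = 784.
flat = tf.placeholder(tf.float32, shape=[None, 28 * 28 * 1 * 1])
# depth = 1 turns the 2D image into a degenerate 3D volume.
volume = tf.reshape(flat, [-1, 28, 28, 1, 1])
print(volume.get_shape())  # (?, 28, 28, 1, 1)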

The complete code follows:

import tensorflow as tf
import numpy as np
import sys
sys.path.insert(0, r'D:\pycodeLIB\TensorFlow')
import get_Dataset   # the author's local dataset-loading helper

# define some parameters
width = 28
height = 28
depth = 1
channels = 1
classes = 10
epochs = 100
batch_size = 32

x_train, y_train, x_test, y_test = get_Dataset.get_Dataset(name='mnist')

def conv3d(x, filtersize, strides, padding):
    # Kernel shape: [filter_depth, filter_height, filter_width, in_channels, out_channels].
    kernel = tf.Variable(tf.truncated_normal(filtersize, stddev=0.1))
    # Wrap the bias in a Variable so it is trainable (a bare tf.constant would stay fixed).
    bias = tf.Variable(tf.constant(0.1, shape=[filtersize[-1]]))
    return tf.nn.relu(tf.nn.conv3d(x, kernel, strides, padding=padding) + bias)

def max_pool_2x2(x, ksize, strides, padding):
    return tf.nn.max_pool3d(x, ksize=ksize, strides=strides, padding=padding)

def flatten(x):
    # Collapse the three spatial axes and the channel axis into one vector per example.
    shapelist = x.get_shape()[1:]
    length = int(shapelist[0] * shapelist[1] * shapelist[2] * shapelist[3])
    return tf.reshape(x, [-1, length])

def fully_connected(x, neurons, activation=tf.nn.relu):
    length = int(x.get_shape()[1])
    w = tf.Variable(tf.truncated_normal(shape=[length, neurons], stddev=0.1))
    b = tf.Variable(tf.constant(0.1, shape=[neurons]))
    z = tf.nn.xw_plus_b(x, w, b)
    # Allow activation=None so the output layer can return raw logits.
    return activation(z) if activation is not None else z

def build_3DCNN(x):
    # Reshape the flat input vector into a 5D volume: [batch, width, height, depth, channels].
    x = tf.reshape(x, [-1, width, height, depth, channels])
    with tf.variable_scope('3d_cnn'):
        conv1 = conv3d(x, [5, 5, 5, 1, 64], [1, 1, 1, 1, 1], 'SAME')
        pool1 = max_pool_2x2(conv1, [1, 2, 2, 2, 1], [1, 2, 2, 2, 1], 'SAME')
        print(pool1.get_shape())    # (?, 14, 14, 1, 64)
        conv2 = conv3d(pool1, [3, 3, 3, 64, 64], [1, 1, 1, 1, 1], 'SAME')

        conv3 = conv3d(conv2, [3, 3, 3, 64, 32], [1, 1, 1, 1, 1], 'SAME')
        print(conv3.get_shape())    # (?, 14, 14, 1, 32)
        pool3 = max_pool_2x2(conv3, [1, 2, 2, 2, 1], [1, 2, 2, 2, 1], 'SAME')
        print(pool3.get_shape())    # (?, 7, 7, 1, 32)

        fc1 = flatten(pool3)
        fc2 = fully_connected(fc1, 400)
        fc3 = fully_connected(fc2, 200)
        # No ReLU on the output layer: logits must stay unbounded.
        fc4 = fully_connected(fc3, 10, activation=None)

    return fc4

input_image = tf.placeholder(tf.float32, shape=[None, width*height*depth*channels])
labels = tf.placeholder(tf.float32, shape=[None, classes])
pre_labels = tf.placeholder(tf.int64, shape=[None])

logits = build_3DCNN(input_image)

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits))
optimizer = tf.train.AdamOptimizer(1e-4).minimize(loss)

correct_prediction = tf.equal(tf.argmax(logits, 1), pre_labels)
ACC = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    sess.run(tf.global_variables_initializer())

    for epoch in range(epochs):
        print("EPOCH = ", epoch+1)

        for batch in range(len(x_train) // batch_size):
            batch_xs = x_train[batch*batch_size: (batch+1)*batch_size]
            batch_ys = y_train[batch*batch_size: (batch+1)*batch_size]

            feed_dict = {
                input_image: batch_xs,
                labels: batch_ys
            }
            _, Loss = sess.run([optimizer, loss], feed_dict=feed_dict)
        acc = sess.run(ACC, feed_dict={input_image: x_test, pre_labels: y_test})
        print("loss=", Loss, " ACC=", acc)
        print("********************************************")



Results: to keep this post from getting too long, only the loss and accuracy for epochs 1-10, 50-60, and 90-100 are shown below.

EPOCH =  1
loss= 0.043689117  ACC= 0.9712
********************************************
EPOCH =  2
loss= 0.02584479  ACC= 0.9725
********************************************
EPOCH =  3
loss= 0.0064597824  ACC= 0.9839
********************************************
EPOCH =  4
loss= 0.0030666273  ACC= 0.9845
********************************************
EPOCH =  5
loss= 0.0016372185  ACC= 0.9863
********************************************
EPOCH =  6
loss= 0.0089543415  ACC= 0.9861
********************************************
EPOCH =  7
loss= 0.0027057675  ACC= 0.9878
********************************************
EPOCH =  8
loss= 0.00074395677  ACC= 0.9848
********************************************
EPOCH =  9
loss= 0.0017212423  ACC= 0.9894
********************************************
EPOCH =  10
loss= 0.001972218  ACC= 0.9868
********************************************
.
.
.
EPOCH =  50
loss= 0.0  ACC= 0.9921
********************************************
EPOCH =  51
loss= 1.8626446e-08  ACC= 0.9897
********************************************
EPOCH =  52
loss= 0.0  ACC= 0.992
********************************************
EPOCH =  53
loss= 1.8439649e-06  ACC= 0.9915
********************************************
EPOCH =  54
loss= 2.2351738e-08  ACC= 0.9927
********************************************
EPOCH =  55
loss= 4.1350472e-07  ACC= 0.9906
********************************************
EPOCH =  56
loss= 0.0  ACC= 0.9917
********************************************
EPOCH =  57
loss= 1.4901158e-08  ACC= 0.9922
********************************************
EPOCH =  58
loss= 2.2351738e-08  ACC= 0.9915
********************************************
EPOCH =  59
loss= 0.0  ACC= 0.992
********************************************
EPOCH =  60
loss= 3.72529e-09  ACC= 0.9921
********************************************
.
.
.
EPOCH =  90
loss= 0.0  ACC= 0.9932
********************************************
EPOCH =  91
loss= 0.0  ACC= 0.9934
********************************************
EPOCH =  92
loss= 0.0  ACC= 0.9933
********************************************
EPOCH =  93
loss= 0.0  ACC= 0.9934
********************************************
EPOCH =  94
loss= 0.0  ACC= 0.9934
********************************************
EPOCH =  95
loss= 0.0  ACC= 0.9935
********************************************
EPOCH =  96
loss= 0.0  ACC= 0.9935
********************************************
EPOCH =  97
loss= 0.0  ACC= 0.9936
********************************************
EPOCH =  98
loss= 0.0  ACC= 0.9932
********************************************
EPOCH =  99
loss= 0.0  ACC= 0.9935
********************************************
EPOCH =  100
loss= 0.0  ACC= 0.9935
********************************************

Reprinted from blog.csdn.net/Triple_WDF/article/details/103660270