tensorflow--CNN

Contents

  • 1 Convolution Functions
  • 2 Pooling Functions
  • 3 Classification Functions
  • 4 A Simple Example
  • 5 Saving and Restoring the Model

API docs: https://www.tensorflow.org/api_docs/

1 Convolution Functions

1 tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None)

Convolves the 4-D input tensor input with the 4-D kernel filter, mixing the input channels together. filter has shape [filter_height, filter_width, in_channels, out_channels]; in_channels must match the input's channels. Each output channel mixes (sums over) all input channels, and this is repeated out_channels times, so the output has out_channels channels.

import numpy as np
import tensorflow as tf

input_data = tf.Variable(np.random.rand(10,9,9,3), dtype=np.float32)
filter_data = tf.Variable(np.random.rand(2,2,3,2), dtype=np.float32)
y = tf.nn.conv2d(input_data, filter_data, strides=[1,1,1,1], padding="SAME")
print(y.shape)
# (10, 9, 9, 2)

2 tf.nn.depthwise_conv2d(input, filter, strides, padding, name=None)

Unlike conv2d, this does not mix the channels together. The input tensor has shape [batch, in_height, in_width, in_channels] and filter has shape [filter_height, filter_width, in_channels, channel_multiplier]. The function applies channel_multiplier different kernels to each input channel independently and stacks the results, so the output has channel_multiplier * in_channels channels in total.

input_data = tf.Variable(np.random.rand(10,9,9,3), dtype=np.float32)
filter_data = tf.Variable(np.random.rand(2,2,3,2), dtype=np.float32)
y = tf.nn.depthwise_conv2d(input_data, filter_data, strides=[1,1,1,1], padding="SAME")
print(y.shape)
# (10, 9, 9, 6)

3 tf.nn.separable_conv2d(input, depthwise_filter, pointwise_filter, strides, padding, name=None)

Convolution with separated kernels. The input is first convolved with depthwise_filter, with the same effect as depthwise_conv2d, and the result is then convolved with the 1x1 kernel pointwise_filter, whose shape is [1, 1, channel_multiplier * in_channels, out_channels].

input_data = tf.Variable(np.random.rand(10,9,9,3), dtype=np.float32)
depthwise_filter = tf.Variable(np.random.rand(2,2,3,2), dtype=np.float32)
pointwise_filter = tf.Variable(np.random.rand(1,1,6,12), dtype=np.float32)
y = tf.nn.separable_conv2d(input_data, depthwise_filter, pointwise_filter, strides=[1,1,1,1], padding="SAME")
print(y.shape)
# (10, 9, 9, 12)

4 More convolution functions

  • tf.nn.atrous_conv2d: atrous (dilated) convolution
  • tf.nn.conv2d_transpose: transposed convolution, often called deconvolution, although it is really the transpose of convolution rather than its inverse (see the shape check below)
  • tf.nn.conv1d: the 1-D version
  • tf.nn.conv3d: the 3-D version
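
A quick shape check for tf.nn.conv2d_transpose in the same style as the examples above (the shapes here are purely illustrative); note that its filter shape is [height, width, out_channels, in_channels] and that it takes an explicit output_shape:

input_data = tf.Variable(np.random.rand(10,9,9,3), dtype=np.float32)
filter_data = tf.Variable(np.random.rand(2,2,5,3), dtype=np.float32)
y = tf.nn.conv2d_transpose(input_data, filter_data, output_shape=[10,18,18,5], strides=[1,2,2,1], padding="SAME")
print(y.shape)
# (10, 18, 18, 5) -- with SAME padding, the spatial size grows by the stride factor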

2 Pooling Functions

1 tf.nn.avg_pool(value, ksize, strides, padding, name=None)

ksize is the window size in each dimension; avg_pool replaces each window with the average of the values inside it; strides is the step size in each dimension.
The output size in each dimension is computed as follows:

  • SAME mode:
    The edges are zero-padded, so if every stride is 1 the output keeps the input size. Formula: ceil(input_size / stride).
  • VALID mode:
    The edges are not padded; window positions that would run past the edge are dropped. Formula: ceil((input_size - ksize + 1) / stride).

input_data = tf.Variable(np.random.rand(10,5,5,3), dtype=np.float32)
filter_data = tf.Variable(np.random.rand(2,2,3,5), dtype=np.float32)
y = tf.nn.conv2d(input_data, filter_data, strides=[1,1,1,1], padding="SAME")
print(y.shape)
# (10, 5, 5, 5)
output = tf.nn.avg_pool(value=y, ksize=[1,2,2,1], strides=[1,2,2,1], padding="VALID")
print(output.shape)
# (10, 2, 2, 5)
output = tf.nn.avg_pool(value=y, ksize=[1,2,2,1], strides=[1,2,2,1], padding="SAME")
print(output.shape)
# (10, 3, 3, 5)

2 tf.nn.max_pool(value, ksize, strides, padding, name=None)

Same as avg_pool; the only difference is that each window is represented by its maximum value instead of its average.
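
Reusing the tensor y from the avg_pool example above, a quick shape check with the same window and stride:

output = tf.nn.max_pool(value=y, ksize=[1,2,2,1], strides=[1,2,2,1], padding="VALID")
print(output.shape)
# (10, 2, 2, 5)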

3 Classification Functions

1 tf.nn.sigmoid_cross_entropy_with_logits(logits, targets, name=None)

Inputs: logits and targets both have shape [batch_size, num_classes]. The last layer should not apply a sigmoid itself when using this function, because the sigmoid is applied internally.
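
A minimal shape sketch with random values (note that current TF releases take the keyword labels rather than targets):

logits = tf.Variable(np.random.rand(4, 3), dtype=np.float32)
targets = tf.Variable(np.round(np.random.rand(4, 3)), dtype=np.float32)  # 0/1 labels
loss = tf.nn.sigmoid_cross_entropy_with_logits(logits=logits, labels=targets)
print(loss.shape)
# (4, 3) -- one loss per class per sample, so multi-label problems also work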

2 tf.nn.softmax(logits, name=None)

softmax = exp(logits) / reduce_sum(exp(logits), dim)

3 tf.nn.softmax_cross_entropy_with_logits(logits, labels, name=None)

Inputs: logits and labels both have shape [batch_size, num_classes]; the returned loss contains the cross entropy of each sample in the batch.
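
A minimal shape sketch with one-hot labels (random logits, for illustration only):

logits = tf.Variable(np.random.rand(4, 3), dtype=np.float32)
labels = tf.Variable(np.eye(3)[[0, 1, 2, 0]], dtype=np.float32)  # one-hot labels
loss = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels)
print(loss.shape)
# (4,) -- one cross-entropy value per sample in the batch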

4 A Simple Example

Compared on the MNIST dataset against a plain regression baseline (softmax regression). The regression example:

# MNIST download: http://yann.lecun.com/exdb/mnist/
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf

# path where the data is stored
dir_s = "/Users/xiayongtao/Downloads/tensorflow_note/CNN/mnist"
mnist = input_data.read_data_sets(dir_s, one_hot = True)

# Build the regression model
# define the graph nodes
x = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, W) + b

cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

sess = tf.InteractiveSession()
# When building the session with tf.InteractiveSession(),
# we can create the session first and define operations afterwards.
# With tf.Session(), all operations must be defined before the session is built.
tf.global_variables_initializer().run()

# train
for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict = {x:batch_xs, y_:batch_ys})

# evaluation
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
# argmax: returns the index of the maximum of y along axis 0 or 1
# 0 is column-wise: the maximum across rows at each position
# 1 is row-wise: the maximum within each row
# the data here is (batch_size, 10), so axis 1 gives the predicted class
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict = {x:mnist.test.images, y_:mnist.test.labels}))

# accuracy: 0.9202

The convolution example:

# MNIST download: http://yann.lecun.com/exdb/mnist/
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import numpy as np

# path where the data is stored
dir_s = "/Users/xiayongtao/Downloads/tensorflow_note/CNN/mnist"
mnist = input_data.read_data_sets(dir_s, one_hot = True)

train_X, train_Y, test_X, test_Y = mnist.train.images, mnist.train.labels, mnist.test.images, mnist.test.labels

print(train_X.shape)
# (55000, 784)
# For convolution, the data must be reshaped to (batch_size, height, width, channels); MNIST images are grayscale, so channels is 1 (RGB would be 3)
train_X = train_X.reshape(-1, 28, 28, 1)
test_X = test_X.reshape(-1, 28, 28, 1)

# weight initialization helper:
def init_weights(shape):
    return tf.Variable(tf.random_normal(shape, stddev=0.01))

# Define the convolution kernels:
# a 3*3 kernel; the 1 matches the input channels; 32 stacked kernels give 32 output channels
w1 = init_weights([3, 3, 1, 32]) 
w2 = init_weights([3, 3, 32, 64])
w3 = init_weights([3, 3, 64, 128])
# Define the fully connected layers:
# The output spatial size must be computed. It starts at 28; with SAME padding and
# pooling stride 2, each pooling layer halves it, rounding up: 28 -> 14 -> 7 -> 4.
# After the three pooling layers, the 4*4*128 feature map is fully connected to the next layer
w4 = init_weights([128*4*4, 625])
# connect to the classification layer
w5 = init_weights([625, 10])

# convolution layer helper
def conv_layer(in_data, filter_data, pooling_size, pooling_stride, p_keep_conv):
    conv_in = tf.nn.conv2d(in_data, filter_data, strides=[1,1,1,1], padding="SAME")
    activate_conv = tf.nn.relu(conv_in)
    # deep networks usually use relu; sigmoid gradients vanish easily
    pooling = tf.nn.max_pool(activate_conv, ksize=pooling_size, strides=pooling_stride, padding="SAME")
    out = tf.nn.dropout(pooling, p_keep_conv)
    return out

# fully connected layer helper
def fc_layer(in_data, w, p_keep_fc):
    l = tf.matmul(in_data, w)
    l1 = tf.nn.relu(l)
    l2 = tf.nn.dropout(l1, p_keep_fc)
    return l2

X = tf.placeholder(tf.float32, [None, 28, 28, 1])
Y = tf.placeholder(tf.float32, [None, 10])

p_keep_conv = tf.placeholder("float")
p_keep_fc = tf.placeholder("float")
pooling_size = [1, 2, 2, 1]
pooling_stride = [1, 2, 2, 1]

# Define the model:
l1 = conv_layer(X, w1, pooling_size, pooling_stride, p_keep_conv)
l2 = conv_layer(l1, w2, pooling_size, pooling_stride, p_keep_conv)
l3 = conv_layer(l2, w3, pooling_size, pooling_stride, p_keep_conv)
l3 = tf.reshape(l3, [-1, 2048])  # flatten: 128*4*4 = 2048
l4 = fc_layer(l3, w4, p_keep_fc)

output = tf.matmul(l4, w5)

# define the loss
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=output,labels=Y))
# optimization method
train_op = tf.train.RMSPropOptimizer(0.001, 0.9).minimize(cost)
predict_op = tf.argmax(output, 1)

# training
batch_size = 128
test_size = 256

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    for i in range(10):
        training_batch = zip(range(0, len(train_X), batch_size),
                             range(batch_size, len(train_X)+1, batch_size))
        for start, end in training_batch:
            sess.run(train_op, feed_dict={X: train_X[start:end], Y: train_Y[start:end],
                                         p_keep_conv:0.8, p_keep_fc:0.5})
        test_indices = np.arange(len(test_X))
        np.random.shuffle(test_indices)
        test_indices = test_indices[0:test_size]

        print(i, np.mean(np.argmax(test_Y[test_indices], axis=1) ==
                         sess.run(predict_op, feed_dict={X: test_X[test_indices],
                                                         p_keep_conv: 1.0,
                                                         p_keep_fc: 1.0})))

# accuracy after 10 epochs is about 99%

5 Saving and Restoring the Model

Save a model: create a tf.train.Saver() and call saver.save()
Restore a model: saver.restore()
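
A minimal sketch of the save/restore flow (the checkpoint path ./model/model.ckpt is a placeholder):

saver = tf.train.Saver()

# save after training
with tf.Session() as sess:
    tf.global_variables_initializer().run()
    # ... training steps ...
    saver.save(sess, "./model/model.ckpt")

# restore later; restored variables need no re-initialization
with tf.Session() as sess:
    saver.restore(sess, "./model/model.ckpt")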


Reposted from blog.csdn.net/xiayto/article/details/79588430