A Summary of Functions Used to Implement a CNN with TensorFlow

Adding dimensions to an image: np.expand_dims()

Image = np.expand_dims(np.expand_dims(img, 0), -1)

expand_dims(a, axis)
Expand the shape of an array.
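Whatever the number of dimensions of a, the result is determined by axis: a new axis of length 1 is inserted at that position in the shape.
See the blog posts below for examples.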

https://blog.csdn.net/qq_16949707/article/details/53418912
https://blog.csdn.net/qq_35860352/article/details/80463111
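
As a minimal sketch of the call at the top of this section, assuming img is a 2-D grayscale array (the 28x28 size is a hypothetical choice), the two nested calls add a batch axis in front and a channel axis at the end, which gives the [batch, height, width, channels] layout that the convolution functions expect:

```python
import numpy as np

# Hypothetical 28x28 grayscale image; the size is only for illustration.
img = np.zeros((28, 28))

batched = np.expand_dims(img, 0)     # new axis at position 0: (1, 28, 28)
image = np.expand_dims(batched, -1)  # new axis at the end:    (1, 28, 28, 1)

print(img.shape, batched.shape, image.shape)
```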

Convolution: tf.nn.conv2d

Signature
tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, data_format=None, name=None)

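Parameters:
input: the images to convolve, given as a tensor of shape [batch, in_height, in_width, in_channels], where batch is the number of images, in_height and in_width are the image height and width, and in_channels is the number of channels: 1 for grayscale images and 3 for color images. Other values are also valid; for example, when the input is the feature map produced by an earlier convolution layer, in_channels equals that layer's out_channels.
filter: the convolution kernel, also a tensor, of shape [filter_height, filter_width, in_channels, out_channels], where filter_height and filter_width are the kernel height and width, in_channels is the number of image channels and must match in_channels of input, and out_channels is the number of kernels.
strides: the step of the convolution along each dimension of input, a 1-D vector, typically [1, stride, stride, 1]; the first and last entries (the batch and channel dimensions) must be 1.
padding: a string, either "SAME" or "VALID", which controls how the borders are handled. "SAME" pads the border with zeros wherever the kernel would otherwise run past the edge; "VALID" uses no padding.
use_cudnn_on_gpu: a bool, whether to use cuDNN acceleration; defaults to True.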

import tensorflow as tf
# Case 1 (TensorFlow 1.x graph-mode API)
# The input is one 3x3 image with 5 channels; there is a single 1x1 kernel.
# With strides [1, 1, 1, 1] the result is a 3x3 feature map,
# so the output for this one image is a tensor of shape [1, 3, 3, 1].
input = tf.Variable(tf.random_normal([1, 3, 3, 5]))
filter = tf.Variable(tf.random_normal([1, 1, 5, 1]))
op1 = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')
# tf.initialize_all_variables() is deprecated in favor of this call
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print('*' * 20 + ' op1 ' + '*' * 20)
    print(sess.run(op1))

https://blog.csdn.net/zuolixiangfisher/article/details/80528989
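
To make the shape bookkeeping above concrete, here is a rough NumPy sketch. It is not how TensorFlow implements conv2d; conv_output_size and conv2d_valid are hypothetical helper names, and the naive loop only handles the no-padding case. The size formulas follow TensorFlow's documented rules for "SAME" and "VALID".

```python
import math
import numpy as np

def conv_output_size(in_size, filter_size, stride, padding):
    """Output size along one spatial dimension for the two padding modes."""
    if padding == "SAME":
        return math.ceil(in_size / stride)
    if padding == "VALID":
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

def conv2d_valid(images, kernels, stride=1):
    """Naive convolution without padding; images are NHWC, kernels are HWIO."""
    batch, in_h, in_w, in_c = images.shape
    k_h, k_w, k_in_c, out_c = kernels.shape
    assert in_c == k_in_c, "in_channels of input and filter must match"
    out_h = conv_output_size(in_h, k_h, stride, "VALID")
    out_w = conv_output_size(in_w, k_w, stride, "VALID")
    out = np.zeros((batch, out_h, out_w, out_c))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            patch = images[:, top:top + k_h, left:left + k_w, :]
            # Contract over kernel height, kernel width, and in_channels.
            out[:, i, j, :] = np.tensordot(patch, kernels, axes=([1, 2, 3], [0, 1, 2]))
    return out

# Same shapes as case 1: one 3x3 image with 5 channels, one 1x1 kernel.
images = np.random.randn(1, 3, 3, 5)
kernels = np.random.randn(1, 1, 5, 1)
result = conv2d_valid(images, kernels)
print(result.shape)  # (1, 3, 3, 1)
```

With a 1x1 kernel and stride 1, "SAME" and "VALID" give the same output size, which is why the no-padding sketch reproduces the shape from case 1.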

Activation function: tf.nn.relu

It is simple to use: just pass in the output of the previous layer.
For background on ReLU, see the blog post below.

https://blog.csdn.net/cherrylvlei/article/details/53149381
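
What ReLU computes can be sketched in NumPy as an element-wise max(x, 0). The relu helper and the sample values are illustrative assumptions, not taken from the source:

```python
import numpy as np

def relu(x):
    """Element-wise max(x, 0): negative values become 0, the rest pass through."""
    return np.maximum(x, 0)

# Hypothetical layer output with negative and positive entries.
features = np.array([[-2.0, -0.5, 0.0], [0.5, 1.0, 3.0]])
activated = relu(features)
print(activated)
```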

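Flattening arrays: np.reshape()

The plotting code uses np.reshape(input, (1, -1)).
If input has shape (A, B), then reshape(input, (C, D)) must produce an array with the same number of elements as the original, that is, A*B = C*D. When one of the two values is -1, reshape infers it from the total number of elements and the other value. See the blog post below for examples.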

https://blog.csdn.net/weixin_39449570/article/details/78619196
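
A small sketch of the -1 rule, using a hypothetical 3x4 array in place of input:

```python
import numpy as np

# Hypothetical 3x4 array standing in for `input` (12 elements).
data = np.arange(12).reshape(3, 4)

row = np.reshape(data, (1, -1))    # -1 is inferred as 12: shape (1, 12)
pairs = np.reshape(data, (-1, 2))  # -1 is inferred as 6:  shape (6, 2)

print(row.shape, pairs.shape)
```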

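Pooling: tf.nn.max_pool

Signature:
tf.nn.max_pool(value, ksize, strides, padding, name=None)
There are four main arguments, much like convolution:
First argument, value: the input to pool. A pooling layer usually follows a convolution layer, so the input is typically a feature map, again of shape [batch, height, width, channels].
Second argument, ksize: the size of the pooling window, a 4-element vector, usually [1, height, width, 1]; the batch and channels entries are set to 1 because we do not want to pool across those dimensions.
Third argument, strides: as with convolution, the step of the window along each dimension, usually [1, stride, stride, 1].
Fourth argument, padding: as with convolution, either 'VALID' or 'SAME'.
Returns a Tensor of the same type as value, still laid out as [batch, height, width, channels].
See the blog post below for examples.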

https://blog.csdn.net/mao_xiao_feng/article/details/53453926
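
The effect of ksize=[1, 2, 2, 1] with strides=[1, 2, 2, 1] can be imitated in NumPy. This is only a sketch under the assumption that height and width are even; max_pool_2x2 is a hypothetical helper, not a TensorFlow function:

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on an NHWC array (height and width must be even)."""
    batch, height, width, channels = feature_map.shape
    # Split height and width into (blocks, 2) so each 2x2 window gets its own pair of axes.
    windows = feature_map.reshape(batch, height // 2, 2, width // 2, 2, channels)
    return windows.max(axis=(2, 4))

# One 4x4 single-channel feature map holding the values 0..15.
feature_map = np.arange(16, dtype=np.float32).reshape(1, 4, 4, 1)
pooled = max_pool_2x2(feature_map)
print(pooled.shape)        # (1, 2, 2, 1)
print(pooled[0, :, :, 0])  # the maxima of the four 2x2 windows: 5, 7, 13, 15
```

The batch and channel dimensions pass through untouched; only height and width shrink.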

tf.nn.dropout

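Signature:
tf.nn.dropout(x, keep_prob, noise_shape=None, seed=None, name=None)
First argument, x: the input.
Second argument, keep_prob: the probability that each element is kept; kept elements are scaled up by 1/keep_prob and the rest are set to 0. keep_prob is usually created as a placeholder, keep_prob = tf.placeholder(tf.float32), and TensorFlow is given its concrete value when the graph is run, for example keep_prob: 0.5.
Fifth argument, name: a name for the operation.
Dropout is typically applied to fully connected layers.
See the blog post below for an example.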

https://blog.csdn.net/huahuazhu/article/details/73649389
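
The keep-and-rescale behavior can be sketched in NumPy as follows. The dropout helper and the fixed seed are assumptions for illustration; this is not TensorFlow's implementation:

```python
import numpy as np

def dropout(x, keep_prob, rng):
    """Keep each element with probability keep_prob; scale survivors by 1/keep_prob."""
    mask = rng.random(x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)

rng = np.random.default_rng(0)  # fixed seed, only to make the sketch repeatable
activations = np.ones((4, 5))

dropped = dropout(activations, keep_prob=0.5, rng=rng)    # entries are 0.0 or 2.0
unchanged = dropout(activations, keep_prob=1.0, rng=rng)  # nothing is dropped
print(dropped)
```

Feeding keep_prob as 1.0 disables dropout, which is the usual setting when evaluating rather than training.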

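tf.nn.softmax

Usually applied to the output of the last layer of the network.
The last layer's output is the previous layer's output multiplied by the weights plus the bias; it does not need to pass through an activation function before softmax.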
ft = tf.matmul(last_output,weights)+bias
y_cnn = tf.nn.softmax(ft)
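
As a rough NumPy sketch of what softmax does to each row of ft (the sample logits are hypothetical, and subtracting the row maximum is a standard numerical-stability step added here, not something the source mentions):

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp."""
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=1, keepdims=True)

# Hypothetical logits for 2 samples and 3 classes, standing in for `ft`.
ft = np.array([[1.0, 2.0, 3.0], [1.0, 1.0, 1.0]])
y_cnn = softmax(ft)
print(y_cnn)
```

Each row of the result is non-negative and sums to 1, so it can be read as class probabilities.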

Computing the cross-entropy (y_ is assumed to hold the one-hot labels):
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y_cnn), reduction_indices=[1]))
Minimizing it with an optimizer:
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
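
The cross-entropy expression can be checked by hand with NumPy. The labels and probabilities below are made-up values for two samples and three classes:

```python
import numpy as np

# Hypothetical one-hot labels and predicted probabilities (2 samples, 3 classes).
y_ = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
y_cnn = np.array([[0.1, 0.2, 0.7], [0.3, 0.5, 0.2]])

per_sample = -np.sum(y_ * np.log(y_cnn), axis=1)  # sum over classes, one value per sample
cross_entropy = np.mean(per_sample)               # average over the batch
print(per_sample, cross_entropy)
```

Because the labels are one-hot, each per-sample term reduces to minus the log of the probability assigned to the true class.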

tf.reduce_sum and tf.reduce_mean

Mean: tf.reduce_mean(input_tensor, reduction_indices=None, keep_dims=False, name=None), where reduction_indices is the older name for axis
x = tf.constant([[1., 1.], [2., 2.]])
tf.reduce_mean(x) # 1.5
tf.reduce_mean(x, 0) # [1.5, 1.5]
tf.reduce_mean(x, 1) # [1., 2.]

Sum along a dimension: tf.reduce_sum(input_tensor, axis=None, keep_dims=False, name=None)
# 'x' is [[1, 1, 1],
#         [1, 1, 1]]
tf.reduce_sum(x)  # 6
tf.reduce_sum(x, 0)  # [2, 2, 2]
tf.reduce_sum(x, 1)  # [3, 3]
tf.reduce_sum(x, 1, keep_dims=True)  # [[3], [3]]
tf.reduce_sum(x, [0, 1])  # 6

tf.reduce_max is worth mentioning here as well.
Maximum: tf.reduce_max(input_tensor, reduction_indices=None, keep_dims=False, name=None)

https://www.tensorflow.org/api_docs/python/tf/reduce_max
https://blog.csdn.net/qq_32166627/article/details/52734387
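
The same reductions can be tried out quickly in NumPy, whose axis and keepdims arguments play the role of axis (or reduction_indices) and keep_dims above. This is an analogy for experimenting with the axis semantics, not TensorFlow code:

```python
import numpy as np

x = np.array([[1.0, 1.0], [2.0, 2.0]])
print(np.mean(x))          # 1.5
print(np.mean(x, axis=0))  # column means: 1.5 and 1.5
print(np.mean(x, axis=1))  # row means: 1.0 and 2.0
print(np.max(x, axis=1))   # row maxima: 1.0 and 2.0

ones = np.ones((2, 3), dtype=np.int64)
print(np.sum(ones))                         # 6
print(np.sum(ones, axis=0))                 # column sums: 2, 2, 2
print(np.sum(ones, axis=1, keepdims=True))  # shape (2, 1), both entries 3
```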

tf.argmax

tf.argmax( input, axis=None, name=None, dimension=None, output_type=tf.int64)
Returns the index with the largest value across axes of a tensor. The dimension argument is deprecated; use axis instead.

tf.equal

tf.equal( x, y, name=None)
Returns the truth value of (x == y) element-wise.

tf.cast

tf.cast(x, dtype, name=None)
Args:
x: A Tensor or SparseTensor of numeric type. It could be uint8, int8, uint16, int16, int32, int64, float16, float32, float64, complex64, complex128, bfloat16.
dtype: The destination type. The list of supported dtypes is the same as x.
name: A name for the operation (optional).
Returns:
A Tensor or SparseTensor with same shape as x and same type as dtype.

x = tf.constant([1.8, 2.2], dtype=tf.float32)
tf.cast(x, tf.int32)  # [1, 2], dtype=tf.int32
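
tf.argmax, tf.equal, and tf.cast are commonly combined with tf.reduce_mean to measure classification accuracy. The notes above do not show that step, so the following is only a NumPy sketch of the idea, with made-up labels and predictions:

```python
import numpy as np

# Hypothetical one-hot labels and predicted probabilities (4 samples, 3 classes).
y_ = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
y_cnn = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.7, 0.1],
                  [0.6, 0.3, 0.1],
                  [0.1, 0.5, 0.4]])

predicted = np.argmax(y_cnn, axis=1)            # class index per sample, like tf.argmax(y_cnn, 1)
expected = np.argmax(y_, axis=1)                # index of the 1 in each one-hot label
correct = np.equal(predicted, expected)         # boolean per sample, like tf.equal
accuracy = np.mean(correct.astype(np.float32))  # cast to float, then average
print(predicted, expected, accuracy)
```

Three of the four predictions match their labels here, so the accuracy is 0.75.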