Difference between tf.nn.conv1d and tf.layers.conv1d in TensorFlow

When building a one-dimensional convolutional neural network in TensorFlow, you will encounter two functions, tf.nn.conv1d and tf.layers.conv1d. What is the difference between them? Working through a small calculation makes it clear.

1. tf.nn.conv1d. The following is the API documentation for tf.nn.conv1d in TensorFlow:

Computes a 1-D convolution given 3-D input and filter tensors.

Given an input tensor of shape
  [batch, in_width, in_channels]
if data_format is "NHWC", or
  [batch, in_channels, in_width]
if data_format is "NCHW",
and a filter / kernel tensor of shape
[filter_width, in_channels, out_channels], this op reshapes
the arguments to pass them to conv2d to perform the equivalent
convolution operation.

Internally, this op reshapes the input tensors and invokes `tf.nn.conv2d`.
For example, if `data_format` does not start with "NC", a tensor of shape
  [batch, in_width, in_channels]
is reshaped to
  [batch, 1, in_width, in_channels],
and the filter is reshaped to
  [1, filter_width, in_channels, out_channels].
The result is then reshaped back to
  [batch, out_width, out_channels]
(where out_width is a function of the stride and padding as in conv2d) and
returned to the caller.

Args:
  value: A 3D `Tensor`.  Must be of type `float32` or `float64`.
  filters: A 3D `Tensor`.  Must have the same type as `input`.
  stride: An `integer`.  The number of entries by which
    the filter is moved right at each step.
  padding: 'SAME' or 'VALID'
  use_cudnn_on_gpu: An optional `bool`.  Defaults to `True`.
  data_format: An optional `string` from `"NHWC", "NCHW"`.  Defaults
    to `"NHWC"`, the data is stored in the order of
    [batch, in_width, in_channels].  The `"NCHW"` format stores
    data as [batch, in_channels, in_width].
  name: A name for the operation (optional).

Returns:
  A `Tensor`.  Has the same type as input.

Raises:
  ValueError: if `data_format` is invalid.

What does this mean? The parameters of conv1d are as follows (taking the NHWC format as an example, that is, with the channel dimension last):

1. value: According to the documentation, the shape of value is [batch, in_width, in_channels]. batch is the sample dimension (how many samples there are), in_width is the width dimension (the width of each sample), and in_channels is the channel dimension (how many channels each sample has).
You can also read the shape as [batch, number of rows, number of columns] and treat each sample as a two-dimensional array, which is easier to picture.

2. filters: According to the documentation, the shape of filters is [filter_width, in_channels, out_channels]. Following the row/column view of value, filter_width is the number of rows of value covered at each convolution step, in_channels is the number of columns in value (it must match in_channels of value), and out_channels is the number of output channels, that is, the number of convolution kernels.

3. stride: an integer giving the step size, that is, how far the filter moves at each step. TensorFlow describes this as moving to the right; in the row/column view it corresponds to moving downward.

4. padding: 'SAME' or 'VALID', as in conv2d. It determines whether value is padded with zeros at its edges.

5. name: a name for the operation. It can be omitted.
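
The docstring says that out_width depends on the stride and padding as in conv2d. As a rough sketch of that relationship, the helper below applies the usual TensorFlow output-size rules for 'VALID' and 'SAME' padding. The function name conv1d_out_width is hypothetical and is not part of TensorFlow.

```python
import math


def conv1d_out_width(in_width, filter_width, stride, padding):
    """Output width of a 1-D convolution (hypothetical helper, not a TF API)."""
    padding = padding.upper()
    if padding == "VALID":
        # No zero padding: only windows that fit entirely inside the input.
        return math.ceil((in_width - filter_width + 1) / stride)
    if padding == "SAME":
        # Zero padding is added so the width depends only on the stride.
        return math.ceil(in_width / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


print(conv1d_out_width(5, 2, 1, "VALID"))  # 4
print(conv1d_out_width(5, 2, 1, "SAME"))   # 5
print(conv1d_out_width(5, 2, 2, "VALID"))  # 2
print(conv1d_out_width(5, 2, 2, "SAME"))   # 3
```

With in_width=5, filter_width=2, stride=1 and 'VALID' padding this gives 4, which matches the four output values in the example that follows.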

From the parameter list, value is the input data and stride is the step size of the convolution. The parameter that raises the most questions is filters. As noted above, its shape is [filter_width, in_channels, out_channels]: the size of the convolution kernel, the number of input channels, and the number of convolution kernels. The following example, written against the TensorFlow 1.x API, illustrates this:

import tensorflow as tf
import numpy as np


if __name__ == '__main__':
    inputs = tf.constant(np.arange(1, 6, dtype=np.float32), shape=[1, 5, 1])
    w = np.array([1, 2], dtype=np.float32).reshape([2, 1, 1])
    # filter width, filter channels and out channels(number of kernels)
    cov1 = tf.nn.conv1d(inputs, w, stride=1, padding='VALID')
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        out = sess.run(cov1)
        print(out)

The output is: [[[5.],
        [8.],
        [11.],
        [14.]]]
Let's analyze this. The input data is [[[1], [2], [3], [4], [5]]]: five positions with the values 1, 2, 3, 4, 5. The result of the convolution is 5, 8, 11, 14. How does this result arise? Following the convolution calculation, 5 = 1 * 1 + 2 * 2, 8 = 2 * 1 + 3 * 2, 11 = 3 * 1 + 4 * 2, 14 = 4 * 1 + 5 * 2, that is, W1 = 1 and W2 = 2, which are exactly the values we set in filters:

w = np.array([1, 2], dtype=np.float32).reshape([2, 1, 1])

So the filters argument is the convolution kernel matrix itself. In other words, with tf.nn.conv1d we set the kernel values ourselves.
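
The arithmetic above can be reproduced without TensorFlow. The following is a minimal pure-Python sketch of a 'VALID' 1-D convolution over nested lists, using the same [batch, in_width, in_channels] and [filter_width, in_channels, out_channels] layouts. The function conv1d_valid is a hypothetical reference written for illustration, not how TensorFlow implements the op.

```python
def conv1d_valid(value, filters, stride=1):
    """Reference 1-D convolution with 'VALID' padding (illustrative only).

    value:   nested list shaped [batch, in_width, in_channels]
    filters: nested list shaped [filter_width, in_channels, out_channels]
    """
    filter_width = len(filters)
    in_channels = len(filters[0])
    out_channels = len(filters[0][0])
    result = []
    for sample in value:
        in_width = len(sample)
        rows = []
        # Slide the window along the width axis only.
        for start in range(0, in_width - filter_width + 1, stride):
            row = []
            for k in range(out_channels):
                total = 0.0
                # Sum over every position and channel covered by the window.
                for i in range(filter_width):
                    for c in range(in_channels):
                        total += sample[start + i][c] * filters[i][c][k]
                row.append(total)
            rows.append(row)
        result.append(rows)
    return result


inputs = [[[1.0], [2.0], [3.0], [4.0], [5.0]]]  # shape [1, 5, 1]
w = [[[1.0]], [[2.0]]]                          # shape [2, 1, 1]
out = conv1d_valid(inputs, w, stride=1)
print(out)  # [[[5.0], [8.0], [11.0], [14.0]]]
```

Calling conv1d_valid(inputs, w, stride=2) keeps only every second window and returns [[[5.0], [11.0]]].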

2. tf.layers.conv1d. The function is defined as follows:


tf.layers.conv1d(
    inputs,
    filters,
    kernel_size,
    strides=1,
    padding='valid',
    data_format='channels_last',
    dilation_rate=1,
    activation=None,
    use_bias=True,
    kernel_initializer=None,
    bias_initializer=tf.zeros_initializer(),
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
    trainable=True,
    name=None,
    reuse=None
)
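
The most important parameters are inputs, filters and kernel_size, explained below.

inputs: the input tensor, a three-dimensional tensor of shape (None, a, b)

    None: usually the number of samples, that is, batch_size
    a: the number of words or characters in the sentence
    b: the dimension of the word or character vector

filters: the number of filters

kernel_size: the size of the convolution kernel. The kernel is really two-dimensional, but only one dimension needs to be specified because its second dimension always equals the word-vector dimension of the input. For a sentence, the kernel can only move along the word direction, that is, along a single axis.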
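
To make the point about the kernel's implied second dimension concrete, the sketch below computes the variable shapes that a channels_last conv1d layer creates from the input shape, filters and kernel_size. The helper conv1d_layer_shapes and the example sizes (10 tokens, 128-dimensional vectors, 64 filters) are assumptions chosen for illustration.

```python
def conv1d_layer_shapes(input_shape, filters, kernel_size):
    """Variable shapes for a channels_last conv1d layer (hypothetical helper)."""
    _, _, in_channels = input_shape  # (batch, steps, channels)
    # The kernel spans kernel_size steps and ALL input channels.
    kernel_shape = (kernel_size, in_channels, filters)
    bias_shape = (filters,)  # one bias per filter when use_bias=True
    n_params = kernel_size * in_channels * filters + filters
    return kernel_shape, bias_shape, n_params


# Assumed example: sentences of 10 tokens with 128-dimensional word vectors.
kernel_shape, bias_shape, n_params = conv1d_layer_shapes(
    (None, 10, 128), filters=64, kernel_size=3
)
print(kernel_shape, bias_shape, n_params)  # (3, 128, 64) (64,) 24640
```

The kernel shape has the same [filter_width, in_channels, out_channels] layout as the filters argument of tf.nn.conv1d. The difference is that the layer creates the values itself instead of receiving them.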

An example:

import tensorflow as tf
import numpy as np


if __name__ == '__main__':
    inputs = tf.constant(np.arange(1, 6, dtype=np.float32), shape=[1, 5, 1])
    cov2 = tf.layers.conv1d(inputs, filters=1, kernel_size=2, strides=1, padding='VALID')
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        out = sess.run(cov2)
        print(out)

Output result: [[[-1.9953331]
  [-3.5520997]
  [-5.108866]
  [-6.6656327]]]

Your result may differ from mine. This function only sets the size and stride of the convolution kernel and does not set specific kernel values, so the kernel matrix is randomly initialized, and running the program above can produce a different result each time.
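
Even with a random kernel, the output is still the same kind of weighted sum as before. As a rough check, the sketch below recovers the two kernel weights from the first two printed outputs and confirms that they reproduce the remaining two. It assumes the bias is zero, which follows from bias_initializer=tf.zeros_initializer() for an untrained layer. The helper recover_kernel is hypothetical.

```python
def recover_kernel(x, y):
    """Solve y[0] = x[0]*w1 + x[1]*w2 and y[1] = x[1]*w1 + x[2]*w2.

    Uses Cramer's rule on the 2x2 system and assumes a zero bias.
    """
    det = x[0] * x[2] - x[1] * x[1]
    w1 = (y[0] * x[2] - x[1] * y[1]) / det
    w2 = (x[0] * y[1] - x[1] * y[0]) / det
    return w1, w2


x = [1.0, 2.0, 3.0, 4.0, 5.0]
# Output values printed by the tf.layers.conv1d run above.
y = [-1.9953331, -3.5520997, -5.108866, -6.6656327]

w1, w2 = recover_kernel(x, y)
print(round(w1, 4), round(w2, 4))  # approximately -1.1182 -0.4386

# The recovered weights should reproduce every output, not just the first two.
predicted = [x[i] * w1 + x[i + 1] * w2 for i in range(4)]
print(all(abs(p - t) < 1e-5 for p, t in zip(predicted, y)))  # True
```

So the layer applies the same operation as tf.nn.conv1d, only with a kernel it initialized itself. If a fixed kernel is wanted, the kernel_initializer argument in the signature above is where one would be supplied. tf.constant_initializer is one option in TensorFlow 1.x, though the original post does not show this.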

 

References:

https://blog.csdn.net/u011734144/article/details/84066928

https://blog.csdn.net/DaVinciL/article/details/81359245


Origin blog.csdn.net/u013323018/article/details/90444952