The difference between Conv1D and Conv2D

My answer is: when the input channels of Conv2D is 1, there is no real difference between the two; each can be transformed into the other. First, both ultimately call the same back-end code (taking TensorFlow as the example, it can be found in tensorflow_backend.py):

x = tf.nn.convolution(
    input=x,
    filter=kernel,
    dilation_rate=(dilation_rate,),
    strides=(strides,),
    padding=padding,
    data_format=tf_data_format)
The only differences are the filter and input arguments passed in. input needs no explanation, but filter = kernel; what exactly is the kernel?

Let's step into the source code of Conv1D and Conv2D. Both live in layers/convolutional.py, and both inherit from the base class _Conv(Layer). Looking into the _Conv class, we can find the following code:

self.kernel_size = conv_utils.normalize_tuple(kernel_size, rank, 'kernel_size')
# ... intermediate code omitted
input_dim = input_shape[channel_axis]
kernel_shape = self.kernel_size + (input_dim, self.filters)
Assume the input to Conv1D has size (600, 300), the input to Conv2D has size (m, n, 1), and kernel_size is 3 for both.

Stepping into the conv_utils.normalize_tuple function, we can see:

def normalize_tuple(value, n, name):
    """Transforms a single int or iterable of ints into an int tuple.
    # Arguments
        value: The value to validate and convert. Could an int, or any iterable
            of ints.
        n: The size of the tuple to be returned.
        name: The name of the argument being validated, e.g. "strides" or
            "kernel_size". This is only used to format error messages.
    # Returns
        A tuple of n integers.
    # Raises
        ValueError: If something else than an int/long or iterable thereof was
        passed.
    """
    if isinstance(value, int):
        return (value,) * n
    else:
        try:
            value_tuple = tuple(value)
        except TypeError:
            raise ValueError('The `' + name + '` argument must be a tuple of ' +
                             str(n) + ' integers. Received: ' + str(value))
        if len(value_tuple) != n:
            raise ValueError('The `' + name + '` argument must be a tuple of ' +
                             str(n) + ' integers. Received: ' + str(value))
        for single_value in value_tuple:
            try:
                int(single_value)
            except ValueError:
                raise ValueError('The `' + name + '` argument must be a tuple of ' +
                                 str(n) + ' integers. Received: ' + str(value) + ' '
                                 'including element ' + str(single_value) + ' of type' +
                                 ' ' + str(type(single_value)))
        return value_tuple
 

Therefore, the actual kernel_size obtained by the code above is computed according to rank: the rank of Conv1D is 1 and the rank of Conv2D is 2. So for Conv1D the resulting kernel_size is (3,), and for Conv2D it is (3, 3).
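To make this concrete, here is a quick check using a trimmed copy of the normalize_tuple logic shown above (only the branches relevant here, with the error message shortened):

```python
def normalize_tuple(value, n, name):
    """Trimmed copy of the Keras helper shown above."""
    if isinstance(value, int):
        return (value,) * n
    value_tuple = tuple(value)
    if len(value_tuple) != n:
        raise ValueError('The `' + name + '` argument must be a tuple of '
                         + str(n) + ' integers. Received: ' + str(value))
    return value_tuple

print(normalize_tuple(3, 1, 'kernel_size'))         # Conv1D, rank 1 -> (3,)
print(normalize_tuple(3, 2, 'kernel_size'))         # Conv2D, rank 2 -> (3, 3)
print(normalize_tuple((3, 300), 2, 'kernel_size'))  # tuple passed through -> (3, 300)
```

An int is simply repeated rank times, while a tuple of the right length is returned unchanged, which is the behavior the rest of the argument relies on.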


input_dim = input_shape[channel_axis]
kernel_shape = self.kernel_size + (input_dim, self.filters)
Since input_dim is the size of the last dimension of the input (300 for Conv1D, 1 for Conv2D), and assuming both use 64 convolution filters, the actual kernel shape for Conv1D is:

(3, 300, 64)

while the actual kernel shape for Conv2D is:

(3, 3, 1, 64)
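The two shapes follow directly from the kernel_shape = self.kernel_size + (input_dim, self.filters) line, which is plain tuple concatenation; a minimal sketch with the sizes assumed above:

```python
filters = 64

# Conv1D: normalized kernel_size is (3,), last input dimension is 300
kernel_shape_1d = (3,) + (300, filters)
print(kernel_shape_1d)   # (3, 300, 64)

# Conv2D: normalized kernel_size is (3, 3), input has 1 channel
kernel_shape_2d = (3, 3) + (1, filters)
print(kernel_shape_2d)   # (3, 3, 1, 64)
```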

So far we have assumed kernel_size = 3 was passed in. If instead we pass Conv2D a tuple such as (3, 300), then, following the conv_utils.normalize_tuple function, the tuple we passed is returned unchanged as the final kernel_size, i.e. (3, 300). The actual Conv2D kernel shape is then:

(3, 300, 1, 64), which is exactly the Conv1D kernel reshaped; the two are equivalent.

In other words, Conv1D(kernel_size=3) is really Conv2D(kernel_size=(3, 300)); of course, the input must also be reshaped to (600, 300, 1) so that Conv2D can convolve across multiple rows.
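This equivalence can be verified numerically. The following is a minimal NumPy sketch with toy sizes and hand-rolled "valid" convolutions (not the Keras ops themselves): a Conv1D-style kernel of size 3 over a (10, 4) input gives the same outputs as a Conv2D-style (3, 4) kernel over the same input reshaped to (10, 4, 1):

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, dim = 10, 4                      # toy stand-ins for 600 words x 300 dims
x1 = rng.standard_normal((seq_len, dim))  # Conv1D-style input
x2 = x1.reshape(seq_len, dim, 1)          # Conv2D-style input, channels = 1

k1 = rng.standard_normal((3, dim))        # Conv1D kernel, (3, 300) in the article
k2 = k1.reshape(3, dim, 1)                # Conv2D kernel with kernel_size=(3, dim)

# "valid" 1D convolution down the sequence: one output per window of 3 rows
out1 = np.array([np.sum(x1[i:i + 3] * k1) for i in range(seq_len - 2)])

# "valid" 2D convolution with a (3, dim) window: the window spans every column,
# so it can only slide vertically -- exactly the same windows as above
out2 = np.array([np.sum(x2[i:i + 3, :, :] * k2) for i in range(seq_len - 2)])

print(np.allclose(out1, out2))  # True: the two convolutions are identical
```

Because the kernel is as wide as the input, the 2D window degenerates to sliding along one axis, which is precisely what Conv1D does.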

This also explains why Conv1D can be used for natural language processing in Keras. In NLP, suppose a sequence has 600 words and each word vector is 300-dimensional; then one sequence fed into the network has shape (600, 300). When Conv1D convolves over it, the convolution actually performed on the sequence is a (3, 300) convolution, and since each row is one word vector, using Conv1D(kernel_size=3) is equivalent to extracting n-gram features with n = 3 using a neural network. This is why convolutional neural networks are so fast and effective at processing text.
---------------------
Author: hahajinbu
Source: CSDN
Original: https://blog.csdn.net/hahajinbu/article/details/79535172
Copyright: This is the blogger's original article; please attach a link to the post when reproducing it!

Origin: www.cnblogs.com/jfdwd/p/10964094.html