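When processing data, we often need to join multi-dimensional arrays.
The common cases are the following (below, *.shape denotes the size of a tensor):
- Extending along an existing dimension: given a.shape == [32,3,5000] and b.shape == [32,1,5000], we want to merge a and b into a new tensor c with c.shape == [32,4,5000];
- Adding a new column when one array has one dimension fewer: given a.shape == [32,3,5000] and b.shape == [32,5000], we want to merge a and b into a new tensor c with c.shape == [32,4,5000].
In short, the first case can be handled with numpy's concatenate. The second can be handled by first adding a dimension with expand_dims and then proceeding as in the first case. The usage of each function is described below.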
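The two cases above can be sketched as follows. This is only a minimal illustration using zero-filled arrays with the shapes from the list; the variable names (c1, c2, b2, b2_expanded) are chosen here for the example and are not from the original text.

```python
import numpy as np

# Case 1: both arrays are 3-D and differ only along axis 1.
a = np.zeros((32, 3, 5000))
b = np.zeros((32, 1, 5000))
c1 = np.concatenate((a, b), axis=1)
print(c1.shape)  # (32, 4, 5000)

# Case 2: b2 lacks axis 1, so insert it first, then concatenate as in case 1.
b2 = np.zeros((32, 5000))
b2_expanded = np.expand_dims(b2, axis=1)  # shape (32, 1, 5000)
c2 = np.concatenate((a, b2_expanded), axis=1)
print(c2.shape)  # (32, 4, 5000)
```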
1 Usage of concatenate
Function prototype:
- numpy.concatenate((a1, a2, …), axis=0, out=None)
Join a sequence of arrays along an existing axis.
Parameters:
a1, a2, … : sequence of array_like
The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default).
axis : int, optional
The axis along which the arrays will be joined. If axis is None, arrays are flattened before use. Default is 0.
out : ndarray, optional
If provided, the destination to place the result. The shape must be correct, matching that of what concatenate would have returned if no out argument were specified.
Returns:
res : ndarray
The concatenated array.
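Points to note:
- Arrays can only be concatenated if they have the same shape in every dimension except the one being joined. For example, a.shape == [32,3,5000] and b.shape == [32,1,5000] can be concatenated along axis=1, whereas a.shape == [32,3,5000] and b.shape == [32,5000] cannot be concatenated, because they do not have the same number of dimensions.
For example: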
>>> import numpy as np
>>> a = np.array([[1, 2], [3, 4]])
>>> b = np.array([[5, 6]])
>>> np.concatenate((a, b), axis=0)
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> np.concatenate((a, b.T), axis=1)
array([[1, 2, 5],
       [3, 4, 6]])
>>> np.concatenate((a, b), axis=None)
array([1, 2, 3, 4, 5, 6])
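To see the shape requirement in action, the following sketch tries to concatenate the incompatible pair of shapes mentioned in the note above. The flag name failed is only for illustration; the exact wording of the error message may differ between NumPy versions, so it is printed rather than assumed.

```python
import numpy as np

a = np.zeros((32, 3, 5000))
b = np.zeros((32, 5000))  # 2-D: one dimension short of a

try:
    np.concatenate((a, b), axis=1)
    failed = False
except ValueError as exc:
    # NumPy refuses to join arrays with different numbers of dimensions.
    failed = True
    print("concatenate failed:", exc)
```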
2 Usage of column_stack
As the name suggests, this function is used to add new columns.
Function prototype:
- numpy.column_stack(tup)
Stack 1-D arrays as columns into a 2-D array.
Take a sequence of 1-D arrays and stack them as columns to make a single 2-D array. 2-D arrays are stacked as-is, just like with hstack. 1-D arrays are turned into 2-D columns first.
Parameters:
tup : sequence of 1-D or 2-D arrays.
Arrays to stack. All of them must have the same first dimension.
Returns:
stacked : 2-D array
The array formed by stacking the given arrays.
The documentation says that we pass in a tuple and get back a combined 2-D array. That description is rather vague, so let's look at an example first.
>>> a = np.array((1,2,3))
>>> b = np.array((2,3,4))
>>> np.column_stack((a,b))
array([[1, 2],
       [2, 3],
       [3, 4]])
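The documentation above says the input is a tuple of 1-D or 2-D arrays, so let's look at another example, this time with 3-D arrays: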
import numpy as np
a = np.zeros((32,3,1000))
b = np.zeros((32,1,1000))
c = np.zeros((32,1,1000))
print(a.shape)
print(b.shape)
print(c.shape)
d = np.column_stack((a,b,c))
print(d.shape)
The result is as follows:
(32, 3, 1000)
(32, 1, 1000)
(32, 1, 1000)
(32, 5, 1000)
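The two examples suggest that column_stack turns 1-D inputs into columns, while inputs with two or more dimensions are simply joined along axis=1. The sketch below checks this against concatenate; the names cols, same_as, d and e are illustrative only.

```python
import numpy as np

# 1-D inputs: each array becomes one column of a 2-D result.
x = np.array([1, 2, 3])
y = np.array([2, 3, 4])
cols = np.column_stack((x, y))
same_as = np.concatenate((x[:, np.newaxis], y[:, np.newaxis]), axis=1)
print(np.array_equal(cols, same_as))  # True

# Inputs with 2 or more dimensions are joined along axis 1 as they are.
a = np.zeros((32, 3, 1000))
b = np.zeros((32, 1, 1000))
d = np.column_stack((a, b))
e = np.concatenate((a, b), axis=1)
print(d.shape, e.shape)  # (32, 4, 1000) (32, 4, 1000)
```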
3 Usage of vstack
Function prototype:
- numpy.vstack(tup)
Stack arrays in sequence vertically (row wise).
Parameters:
tup : sequence of ndarrays
The arrays must have the same shape along all but the first axis. 1-D arrays must have the same length.
Returns:
stacked : ndarray
The array formed by stacking the given arrays, will be at least 2-D.
Let's go straight to an example!
>>> a = np.array([1, 2, 3])
>>> b = np.array([2, 3, 4])
>>> np.vstack((a,b))
array([[1, 2, 3],
       [2, 3, 4]])
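In other words, this function joins along axis=0, and 1-D inputs are first promoted to 2-D. Here a and b each start with shape (3,) and are treated as shape [1,3], so the result has shape [2,3]. For inputs that are already 2-D, the rows are simply appended: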
>>> a = np.array([[1], [2], [3]])
>>> b = np.array([[2], [3], [4]])
>>> np.vstack((a,b))
array([[1],
       [2],
       [3],
       [2],
       [3],
       [4]])
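As a rough check of the behaviour shown in these examples, the sketch below compares vstack with an explicit concatenate along axis=0. np.atleast_2d does not appear in the text above; it is used here only to make the promotion of 1-D inputs to 2-D explicit.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([2, 3, 4])

stacked = np.vstack((a, b))
# Promote each 1-D array to shape (1, 3), then join along axis 0.
manual = np.concatenate((np.atleast_2d(a), np.atleast_2d(b)), axis=0)

print(stacked.shape)                    # (2, 3)
print(np.array_equal(stacked, manual))  # True
```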
4 Usage of hstack
Function prototype:
- numpy.hstack(tup)
Let's go straight to an example!
>>> a = np.array((1,2,3))
>>> b = np.array((2,3,4))
>>> np.hstack((a,b))
array([1, 2, 3, 2, 3, 4])
>>> a = np.array([[1],[2],[3]])
>>> b = np.array([[2],[3],[4]])
>>> np.hstack((a,b))
array([[1, 2],
       [2, 3],
       [3, 4]])
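For 1-D inputs, this function joins along axis=0: a and b each have shape (3,), and the result has shape (6,). For inputs with two or more dimensions, it joins along axis=1: in the second example a and b each have shape [3,1], and the result has shape [3,2].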
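A short sketch to check which axis hstack joins along, depending on the number of dimensions of its inputs. The array names flat, p, q and wide are illustrative only.

```python
import numpy as np

# 1-D inputs are joined along axis 0, the only axis they have.
a = np.array([1, 2, 3])
b = np.array([2, 3, 4])
flat = np.hstack((a, b))
print(flat.shape)  # (6,)

# Inputs with 2 or more dimensions are joined along axis 1.
p = np.zeros((3, 1))
q = np.zeros((3, 1))
wide = np.hstack((p, q))
print(wide.shape)  # (3, 2)
print(np.array_equal(wide, np.concatenate((p, q), axis=1)))  # True
```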
5 Usage of expand_dims
Function prototype:
- numpy.expand_dims(a, axis)
Expand the shape of an array.
Insert a new axis that will appear at the axis position in the expanded array shape.
Parameters:
a : array_like
Input array.
axis : int
Position in the expanded axes where the new axis is placed.
Returns:
res : ndarray
Output array. The number of dimensions is one greater than that of the input array.
Let's first construct a vector of length 2:
>>> x = np.array([1,2])
>>> x.shape
(2,)
We insert a new axis at position 0 (axis=0), which turns x into a single row.
>>> y = np.expand_dims(x, axis=0)
>>> y
array([[1, 2]])
>>> y.shape
(1, 2)
As we can see, the array's shape is now (1,2).
Next we insert a new axis at position 1 (axis=1), which turns x into a single column.
>>> y = np.expand_dims(x, axis=1) # Equivalent to x[:,np.newaxis]
>>> y
array([[1],
       [2]])
>>> y.shape
(2, 1)
As we can see, the array's shape is now (2,1).
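The comment in the example mentions that expand_dims with axis=1 is equivalent to x[:,np.newaxis]. The sketch below checks that equivalence for both axes; the comparison with reshape is added here for illustration and is not part of the original text.

```python
import numpy as np

x = np.array([1, 2])

row = np.expand_dims(x, axis=0)  # shape (1, 2)
col = np.expand_dims(x, axis=1)  # shape (2, 1)

# Indexing with np.newaxis inserts the same length-1 axis.
print(np.array_equal(row, x[np.newaxis, :]))  # True
print(np.array_equal(col, x[:, np.newaxis]))  # True

# reshape gives the same result when the target shape is known in advance.
print(np.array_equal(col, x.reshape(2, 1)))   # True
```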