The number of dimensions of a NumPy array is called its rank. The rank is the number of axes: a one-dimensional array has rank 1, and a two-dimensional array has rank 2.
In NumPy, each linear array is called an axis, which is the same thing as a dimension. For example, a two-dimensional array can be viewed as a one-dimensional array whose elements are themselves one-dimensional arrays. The first axis (axis 0) corresponds to the outer array, and the second axis (axis 1) corresponds to the arrays nested inside it. The number of axes, the rank, is the number of dimensions of the array. Many functions accept an axis argument. For a two-dimensional array, axis=0 means operating along axis 0 (down the rows), which gives one result per column; axis=1 means operating along axis 1 (across the columns), which gives one result per row.
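As a small illustration of the axis argument, the sketch below sums a 2x3 array along each axis. The array values and variable names are made up for the example.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

# axis=0 collapses the rows, leaving one sum per column
col_sums = a.sum(axis=0)
print(col_sums)  # [5 7 9]

# axis=1 collapses the columns, leaving one sum per row
row_sums = a.sum(axis=1)
print(row_sums)  # [ 6 15]
```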
The most commonly used attributes of the ndarray object are:
Attributes | Description |
---|---|
ndarray.ndim | Rank, that is, the number of axes (dimensions) |
ndarray.shape | The size of the array along each dimension; (n, m) for a matrix with n rows and m columns |
ndarray.size | The total number of elements, equal to the product of the entries of .shape (n*m for a matrix) |
ndarray.dtype | The element type of the ndarray object |
ndarray.itemsize | The size of each element in the ndarray object, in bytes |
ndarray.flags | Memory layout information of the ndarray object |
ndarray.real | The real part of the ndarray elements |
ndarray.imag | The imaginary part of the ndarray elements |
ndarray.data | The buffer containing the actual array elements. Since elements are normally accessed by indexing the array, this attribute is rarely needed. |
- ndarray.ndim returns the number of dimensions of the array, which equals its rank.
- ndarray.shape returns a tuple giving the size of the array along each dimension. The length of this tuple is the number of dimensions, that is, the ndim attribute (rank). For a two-dimensional array, the two entries are the number of rows and the number of columns.
- ndarray.itemsize returns the size of each element in the array, in bytes.
- ndarray.flags returns memory layout information about the ndarray object, such as whether it is C-contiguous or writeable.
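The attributes above can be inspected on any array. Here is a minimal sketch using an arbitrary 2x3x4 array of 32-bit integers; the array itself is only an example.

```python
import numpy as np

a = np.arange(24, dtype=np.int32).reshape(2, 3, 4)

print(a.ndim)      # 3, the number of axes
print(a.shape)     # (2, 3, 4), the size along each axis
print(a.size)      # 24, the product of the shape entries
print(a.dtype)     # int32
print(a.itemsize)  # 4 bytes per int32 element
print(a.flags['C_CONTIGUOUS'])  # True for a freshly reshaped arange
```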
Creating arrays in NumPy
One of the most important features of NumPy is its N-dimensional array object, ndarray, which is a collection of items of the same type. Items in the collection are indexed starting from 0.
The ndarray object is a multidimensional array used to store elements of the same type.
Each element in the ndarray occupies a block of memory of the same size.
The ndarray internally consists of the following:
- A pointer to the data (a block of memory or a memory-mapped file).
- The data type, or dtype, which describes the fixed-size cells of the array.
- A tuple giving the shape of the array, that is, the size of each dimension.
- A tuple of strides, where each integer is the number of bytes to step over to reach the next element along the corresponding dimension.
Within this internal structure, a stride can be negative, which makes the array walk backward through memory, as with slices such as obj[::-1] or obj[:, ::-1].
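Strides, including negative ones, can be read from the strides attribute. The sketch below assumes 8-byte integers (dtype int64) so that the byte counts are predictable.

```python
import numpy as np

a = np.arange(6, dtype=np.int64).reshape(2, 3)

# Moving to the next row skips 3 elements * 8 bytes; the next column is 8 bytes away
print(a.strides)           # (24, 8)

# Reversing an axis only flips the sign of its stride; no data is copied
rev_rows = a[::-1]
rev_cols = a[:, ::-1]
print(rev_rows.strides)    # (-24, 8)
print(rev_cols.strides)    # (24, -8)
```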
To create an ndarray, just call NumPy's array function:
numpy.array(object, dtype = None, copy = True, order = 'K', subok = False, ndmin = 0)
name | description |
---|---|
object | An array or nested sequence |
dtype | The data type of the array elements, optional |
copy | Whether the object is copied, optional |
order | Memory layout of the created array: C is row-major, F is column-major, A uses F when the input is Fortran-contiguous (and not also C-contiguous) and C otherwise, and K (the default) keeps the layout of the input as closely as possible |
subok | If True, subclasses of ndarray are passed through; by default the result is forced to be a base-class array |
ndmin | The minimum number of dimensions of the resulting array |
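A few calls to numpy.array illustrate the parameters in the table. The input lists are arbitrary examples.

```python
import numpy as np

a = np.array([1, 2, 3])               # one-dimensional array
b = np.array([[1, 2], [3, 4]])        # nested lists give two dimensions
c = np.array([1, 2, 3], ndmin=2)      # force at least two dimensions
d = np.array([1, 2, 3], dtype=complex)  # choose the element type explicitly

print(b.ndim)    # 2
print(c.shape)   # (1, 3)
print(d.dtype)   # complex128
```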
numpy.empty
The numpy.empty function creates an uninitialized array with a specified shape and data type (dtype):
numpy.empty(shape, dtype = float, order = 'C')
numpy.zeros
numpy.zeros creates an array of the specified shape, with every element set to 0:
numpy.zeros(shape, dtype = float, order = 'C')
numpy.ones
numpy.ones creates an array of the specified shape, with every element set to 1:
numpy.ones(shape, dtype = None, order = 'C')
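The three constructors can be compared side by side. In this sketch the shapes are arbitrary, and the contents of the numpy.empty result are deliberately not printed because they are whatever happened to be in memory.

```python
import numpy as np

e = np.empty((2, 3))             # uninitialized: values are arbitrary
z = np.zeros((2, 3), dtype=int)  # all zeros, integer type
o = np.ones((2, 2))              # all ones, default float64

print(e.shape)  # (2, 3)
print(z)
print(o.dtype)  # float64
```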
Creating an array from existing data
numpy.asarray
numpy.asarray is similar to numpy.array, but it takes fewer parameters. The main ones are shown below.
numpy.asarray(a, dtype = None, order = None)
parameter | description |
---|---|
a | Input data in any form that can be converted to an array: a list, a list of tuples, a tuple, a tuple of tuples, a tuple of lists, or a multidimensional array |
dtype | Data type, optional |
order | Optional memory layout; the two most common choices are "C" (row-major) and "F" (column-major) |
import numpy as np
x = [[1,2,3],[4,5,6],[7,8,9]]
arrays = np.asarray(x)
print(arrays)
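One practical difference from numpy.array is that numpy.asarray does not copy its input when that input is already an ndarray of a suitable type. A minimal sketch, with made-up variable names:

```python
import numpy as np

arr = np.array([1, 2, 3])

same = np.asarray(arr)   # no copy: the very same object is returned
copied = np.array(arr)   # copy=True by default: a new array is made

arr[0] = 100
print(same is arr)   # True
print(same[0])       # 100, because it shares the data
print(copied[0])     # 1, because the copy is independent
```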
Creating an array from a numerical range
numpy.arange
The numpy.arange function creates an array from a numerical range and returns an ndarray object. Its format is as follows:
numpy.arange(start, stop, step, dtype)
It generates an ndarray from the range given by start and stop and the step size given by step.
parameter | description |
---|---|
start | Start value, default is 0 |
stop | End value (not included) |
step | Step size, default is 1 |
dtype | The data type of the returned ndarray; if not provided, it is inferred from the input arguments |
import numpy as np
x = np.arange(5)
print (x)
[0 1 2 3 4]
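The remaining parameters work as the table describes. The following sketch uses arbitrary ranges to show start, stop, step and dtype together.

```python
import numpy as np

# start=3, stop=8 (excluded), default step of 1
r1 = np.arange(3, 8)
print(r1)  # [3 4 5 6 7]

# start=10, stop=20 (excluded), step=2
r2 = np.arange(10, 20, 2)
print(r2)  # [10 12 14 16 18]

# Explicit dtype: the values are stored as floats
r3 = np.arange(5, dtype=float)
print(r3.dtype)  # float64
```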
NumPy data types
NumPy supports many more data types than Python's built-in types. Most of them correspond to C data types, and some correspond to Python's built-in types. The following table lists the common basic types of NumPy.
name | description |
---|---|
bool_ | Boolean data type (True or False) |
int_ | The default integer type (similar to long, int32 or int64 in C language) |
intc | Same as C's int type, generally int32 or int64 |
intp | Integer type used for indexing (similar to C's ssize_t, generally int32 or int64) |
int8 | Byte (-128 to 127) |
int16 | Integer (-32768 to 32767) |
int32 | Integer (-2147483648 to 2147483647) |
int64 | Integer (-9223372036854775808 to 9223372036854775807) |
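uint8 | Unsigned integer (0 to 255) |
uint16 | Unsigned integer (0 to 65535) |
uint32 | Unsigned integer (0 to 4294967295) |
uint64 | Unsigned integer (0 to 18446744073709551615) |
float_ | Shorthand for float64 (this alias was removed in NumPy 2.0; use float64 instead) |
float16 | Half-precision float: 1 sign bit, 5 exponent bits, 10 mantissa bits |
float32 | Single-precision float: 1 sign bit, 8 exponent bits, 23 mantissa bits |
float64 | Double-precision float: 1 sign bit, 11 exponent bits, 52 mantissa bits |
complex_ | Shorthand for complex128, a 128-bit complex number (this alias was removed in NumPy 2.0; use complex128 instead) |
complex64 | Complex number made of two 32-bit floats (real and imaginary parts) |
complex128 | Complex number made of two 64-bit floats (real and imaginary parts) |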
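The ranges and sizes in the table can be checked from Python. This sketch uses np.iinfo to query the limits of an integer type and astype to convert between types; the sample values are arbitrary.

```python
import numpy as np

small = np.array([1, 2, 3], dtype=np.int8)
print(small.itemsize)   # 1 byte per element

# np.iinfo reports the representable range of an integer type
info = np.iinfo(np.int8)
print(info.min, info.max)   # -128 127

# astype returns a converted copy with the requested element type
as_float = small.astype(np.float32)
print(as_float.dtype, as_float.itemsize)   # float32 4
```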
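NumPy slicing and indexing
The contents of an ndarray object can be accessed and modified by indexing or slicing, just like slicing a list in Python.
An ndarray can be indexed with subscripts from 0 to n-1. A slice object can be built with the built-in slice function by setting its start, stop and step arguments, and it is then used to cut a new array out of the original one.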
import numpy as np
a = np.arange(10)
s = slice(2,7,2) # from index 2 up to (but not including) index 7, in steps of 2
print (a[s])
[2 4 6]
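In the example above, we first create an ndarray object with the arange() function. We then set the start, stop and step arguments to 2, 7 and 2.
We can also slice by separating the slice parameters with colons, as in start:stop:step: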
import numpy as np
a = np.arange(10)
b = a[2:7:2] # from index 2 up to (but not including) index 7, in steps of 2
print(b)
[2 4 6]
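How the colon works: if only one parameter is given, such as [2], the single element at that index is returned. [2:] extracts every item from that index onward. If two parameters are used, such as [2:7], the items between the two indexes are extracted (not including the stop index).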
import numpy as np
a = np.arange(10) # [0 1 2 3 4 5 6 7 8 9]
b = a[5]
print(b)
5
print(a[2:])
[2 3 4 5 6 7 8 9]
print(a[2:5])
[2 3 4]
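One point worth knowing when slicing: unlike a Python list slice, a basic NumPy slice is a view onto the original data rather than a copy. The sketch below illustrates this with arbitrary values and uses copy() when an independent array is wanted.

```python
import numpy as np

a = np.arange(10)

# A basic slice shares memory with the original array
b = a[2:7:2]
b[0] = 99
print(a[2])                    # 99: the change is visible through a
print(np.shares_memory(a, b))  # True

# An explicit copy is independent of the original
c = a[2:7:2].copy()
c[0] = -1
print(a[2])                    # still 99
```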
NumPy advanced indexing
NumPy provides more indexing methods than ordinary Python sequences. In addition to the indexing with integers and slices seen earlier, arrays can be indexed with integer arrays, Boolean arrays, and fancy indexing.
Integer array indexing
The following example gets the elements at positions (0,0), (1,1) and (2,0) in the array.
import numpy as np
x = np.array([[1, 2], [3, 4], [5, 6]])
y = x[[0,1,2], [0,1,0]]
print (y)
[1 4 5]
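The index arrays can themselves be multidimensional, in which case the result takes their shape. As a sketch, the example below picks the four corner elements of an arbitrary 4x3 array; the names rows and cols are chosen only for illustration.

```python
import numpy as np

x = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]])

# Row and column indexes of the four corners, arranged as 2x2 arrays
rows = np.array([[0, 0], [3, 3]])
cols = np.array([[0, 2], [0, 2]])

corners = x[rows, cols]
print(corners)
# [[ 0  2]
#  [ 9 11]]
```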
Boolean index: We can index the target array through a Boolean array. Boolean indexing obtains an array of elements that meet specified conditions through Boolean operations (such as comparison operators). The following example gets elements greater than 5:
import numpy as np
x = np.array([[ 0, 1, 2],[ 3, 4, 5],[ 6, 7, 8],[ 9, 10, 11]])
print ('Our array is:')
print (x)
print ('\n')
# Now print the elements greater than 5
print ('The elements greater than 5 are:')
print (x[x > 5])
Our array is:
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
The elements greater than 5 are:
[ 6  7  8  9 10 11]
The following example uses ~ (the complement operator) to filter out NaN values.
import numpy as np
a = np.array([np.nan, 1,2,np.nan,3,4,5])
print (a[~np.isnan(a)])
[ 1. 2. 3. 4. 5.]
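Boolean conditions can be combined with & and |, with each condition in parentheses, and a Boolean mask can also be used for assignment. A minimal sketch on an arbitrary array:

```python
import numpy as np

x = np.arange(12).reshape(4, 3)

# Elements that are greater than 2 and also even
mask = (x > 2) & (x % 2 == 0)
print(x[mask])   # [ 4  6  8 10]

# Assignment through a mask: zero out everything greater than 5 in a copy
y = x.copy()
y[y > 5] = 0
print(y.sum())   # 15, the sum of 0 through 5
```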
Iterating over NumPy arrays
The NumPy iterator object numpy.nditer provides a flexible way to visit the elements of one or more arrays. Its most basic use is to visit every element of an array. Next we use the arange() function to create a 2x3 array and iterate over it with nditer.
import numpy as np
a = np.arange(6).reshape(2,3)
print ('The original array is:')
print (a)
print ('\n')
print ('Iterating over the elements:')
for x in np.nditer(a):
    print (x, end=", " )
print ('\n')
The original array is:
[[0 1 2]
 [3 4 5]]
Iterating over the elements:
0, 1, 2, 3, 4, 5,
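By default nditer visits elements in the order they are stored in memory, not in row-by-row index order. The sketch below shows the consequence: iterating over a and over its transpose a.T gives the same sequence, because a.T is only a different view of the same memory, whereas a C-order copy of a.T has its own layout and iterates differently.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

from_a = [int(x) for x in np.nditer(a)]
from_t = [int(x) for x in np.nditer(a.T)]
from_copy = [int(x) for x in np.nditer(a.T.copy(order='C'))]

print(from_a)     # [0, 1, 2, 3, 4, 5]
print(from_t)     # [0, 1, 2, 3, 4, 5], same memory, same order
print(from_copy)  # [0, 3, 1, 4, 2, 5], the copy is stored row by row
```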
Controlling the traversal order
- for x in np.nditer(a, order='F'): Fortran order, that is, column-major;
- for x in np.nditer(a.T, order='C'): C order, that is, row-major;
import numpy as np
a = np.arange(0,60,5)
a = a.reshape(3,4)
print ('The original array is:')
print (a)
print ('\n')
print ('The transpose of the original array is:')
b = a.T
print (b)
print ('\n')
print ('Sorted in C-style order:')
c = b.copy(order='C')
print (c)
for x in np.nditer(c):
    print (x, end=", " )
print ('\n')
print ('Sorted in F-style order:')
c = b.copy(order='F')
print (c)
for x in np.nditer(c):
    print (x, end=", " )
The original array is:
[[ 0  5 10 15]
 [20 25 30 35]
 [40 45 50 55]]
The transpose of the original array is:
[[ 0 20 40]
 [ 5 25 45]
 [10 30 50]
 [15 35 55]]
Sorted in C-style order:
[[ 0 20 40]
 [ 5 25 45]
 [10 30 50]
 [15 35 55]]
0, 20, 40, 5, 25, 45, 10, 30, 50, 15, 35, 55,
Sorted in F-style order:
[[ 0 20 40]
 [ 5 25 45]
 [10 30 50]
 [15 35 55]]
0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55,
You can force the nditer object to use a certain order by setting it explicitly:
import numpy as np
a = np.arange(0,60,5)
a = a.reshape(3,4)
print ('The original array is:')
print (a)
print ('\n')
print ('Sorted in C-style order:')
for x in np.nditer(a, order = 'C'):
    print (x, end=", " )
print ('\n')
print ('Sorted in F-style order:')
for x in np.nditer(a, order = 'F'):
    print (x, end=", " )
The original array is:
[[ 0  5 10 15]
 [20 25 30 35]
 [40 45 50 55]]
Sorted in C-style order:
0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55,
Sorted in F-style order:
0, 20, 40, 5, 25, 45, 10, 30, 50, 15, 35, 55,
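As a cross-check on the two forced orders, the sequence produced by nditer with an explicit order can be compared with ravel using the same order. This is only a sketch on the same 3x4 array; ravel is used here as an independent reference and is not part of the example above.

```python
import numpy as np

a = np.arange(0, 60, 5).reshape(3, 4)

c_order = [int(x) for x in np.nditer(a, order='C')]
f_order = [int(x) for x in np.nditer(a, order='F')]

# ravel flattens the array in the requested order
print(c_order == a.ravel(order='C').tolist())  # True
print(f_order == a.ravel(order='F').tolist())  # True
```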