Article Directory
- 1. Introduction to NumPy
- 2. Basics - Arrays
- 3. Create an array
- 3.1 np.array()
- 3.2 np.arange()
- 3.3 np.zeros()
- 3.4 np.zeros_like()
- 3.5 np.ones()
- 3.6 np.ones_like()
- 3.7 np.empty()
- 3.8 np.empty_like()
- 3.9 np.linspace()
- 3.10 np.logspace()
- 3.11 np.indices()
- 3.12 np.random
- 3.13 np.fromfunction()
- 3.14 np.fromfile()
- 3.15 np.identity()
- 3.16 np.eye()
- 3.17 np.mgrid()
- 3.18 np.ogrid()
- 4. Index
- 4.1 Single-element indexing of one-dimensional arrays
- 4.2 Single-element indexing of multidimensional arrays
- 4.3 One-dimensional array slicing and stride indexing
- 4.4 Multidimensional array slicing and stride indexing
- 4.5 Indexing Arrays
- 4.6 Indexing multidimensional arrays
- 4.7 Boolean Indexed Arrays
- 4.8 Combining indexed arrays and slices
- 4.8 ``np.newaxis`` and ``...``
- 4.9 Assignment after indexing
- 5. Iteration
- 6. Array operations
- 7. Array operations
- 8. Broadcast
- 9. General functions
- 10. API functions
1. Introduction to NumPy
NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and a collection of routines for fast operations on arrays, including mathematics, logic, shape manipulation, sorting, selection, input and output, discrete Fourier transforms, basic linear algebra, basic statistical operations, and random simulation.
At the heart of the NumPy package is the ndarray object, which encapsulates n-dimensional arrays of a single data type. Because many array operations run in precompiled C code, they are much faster than equivalent loops over native Python lists. NumPy also supports vectorization and broadcasting, which make code more concise.
This article introduces the basic concepts and usage of NumPy, with the aim of helping readers get started quickly.
2. Basics - Arrays
A native Python list can hold elements of different types at the same time, while the main object of NumPy is a multidimensional array whose elements all have the same type. Dimensions in NumPy are called axes.
NumPy's array class is ndarray. It has more attributes than a native Python list:
- ndarray.ndim
  The number of axes (dimensions) of the array, also known as the rank.
- ndarray.shape
  The dimensions of the array: a tuple of integers giving the size of the array along each dimension. For a matrix with n rows and m columns, shape is (n, m). ndim is the length of the shape tuple.
- ndarray.size
  The total number of elements in the array, equal to the product of the elements of shape.
- ndarray.dtype
  An object describing the type of the elements in the array. Standard Python types can be used, as well as the types provided by NumPy, such as numpy.int32, numpy.int64, and numpy.float64.
- ndarray.itemsize
  The size in bytes of each element of the array. For example, an array of type float64 has itemsize 8 (=64/8), and an array of type int32 has itemsize 4 (=32/8). It is equal to ndarray.dtype.itemsize.
- ndarray.data
  The buffer containing the actual elements of the array. Normally we do not need to use this attribute directly, because the elements can be accessed by indexing.
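The attributes listed above can be inspected on any array. The following is a minimal sketch; the array `a` and its shape are arbitrary choices made only for illustration:

```python
import numpy as np

# A 3x4 array of 64-bit floats, used only to illustrate the attributes
a = np.zeros((3, 4), dtype=np.float64)

print(a.ndim)      # number of axes
print(a.shape)     # size along each axis
print(a.size)      # total number of elements (product of shape)
print(a.dtype)     # element type
print(a.itemsize)  # bytes per element, same as a.dtype.itemsize
```

Here ndim equals len(a.shape), and size equals 3 * 4.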
3. Create an array
Arrays are the core concept in NumPy, so creating them is the first step. There are five general mechanisms for creating arrays:
- Convert from other Python structures (e.g., lists, tuples)
- Use NumPy's native array creation functions (e.g., arange, ones, zeros)
- Read arrays from disk, either in standard or custom formats
- Create an array from raw bytes by using a string or buffer
- Use special library functions (e.g., random)
The following summarizes some commonly used ways of creating arrays (not only creating regular arrays, but also the API provided by NumPy for creating various special arrays):
No. | API | Description |
---|---|---|
1 | np.array() | Create an array from a Python list or tuple |
2 | np.arange() | Create an array of evenly stepped numbers within a range |
3 | np.zeros() | Create an array of all 0s |
4 | np.zeros_like() | Create an array of all 0s with the same shape as the target array |
5 | np.ones() | Create an array of all 1s |
6 | np.ones_like() | Create an array of all 1s with the same shape as the target array |
7 | np.empty() | Create an array of the specified shape without initializing its values |
8 | np.empty_like() | Create an array with the same shape as the target array without initializing its values |
9 | np.linspace() | Create an evenly spaced (arithmetic) sequence |
10 | np.logspace() | Create a geometric sequence |
11 | np.indices() | Create the index arrays of a grid of the specified shape |
12 | np.random.random() | Create a random array |
13 | np.fromfunction() | Create an array from a specified function |
14 | np.fromfile() | Create an array from a specified file |
15 | np.identity() | Create an identity matrix with equal numbers of rows and columns |
16 | np.eye() | Create a two-dimensional matrix with 1s on the diagonal; the numbers of rows and columns may differ, and a diagonal offset can be specified |
17 | np.mgrid | Return a dense multidimensional grid |
18 | np.ogrid | Return a sparse (open) multidimensional grid |
3.1 np.array()
Create a one-dimensional array:
>>> import numpy as np
>>> a = np.array([1,2,3])
>>> a
array([1, 2, 3])
Create a 2D array:
>>> b = np.array([[1,2,3], [4,5,6]])
>>> b
array([[1, 2, 3],
[4, 5, 6]])
When creating an array, manually specify the type of the array:
# Specify the float32 type
>>> c = np.array([1, 2, 3], dtype=np.float32)
>>> c
array([1., 2., 3.], dtype=float32)
NumPy accepts both Python lists and tuples, so either can be passed to array(). The same holds for functions that take a shape parameter: the shape may be given as a tuple or as a list.
3.2 np.arange()
Create a sequence starting at 0 and ending at a certain number, with a default step size of 1:
>>> a = np.arange(3)
>>> a
array([0, 1, 2])
Create an array starting at a certain number and ending at a certain number, specifying a step size and type:
>>> a = np.arange(start=1, stop=8, step=2, dtype=np.int32)
>>> a
array([1, 3, 5, 7])
arange() uses the half-open interval [start, stop): start is included and stop is excluded.
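A small sketch of the half-open interval, using made-up bounds: moving stop from 8 down to 7 drops the value 7 from the result, because stop itself is never included.

```python
import math
import numpy as np

with_seven = np.arange(1, 8, 2)     # 7 < 8, so 7 is included
without_seven = np.arange(1, 7, 2)  # stop=7 is excluded

# For positive integer steps the length is ceil((stop - start) / step)
expected_len = math.ceil((8 - 1) / 2)

print(with_seven, without_seven, expected_len)
```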
3.3 np.zeros()
Create an array of all 0s of the specified shape:
>>> a = np.zeros((3,2))
>>> a
array([[0., 0.],
[0., 0.],
[0., 0.]])
>>> a.dtype
dtype('float64')
zeros() defaults to the float64 type. You can specify another type when creating the array:
>>> a = np.zeros((3,2), dtype=np.int32)
>>> a
array([[0, 0],
[0, 0],
[0, 0]])
>>> a.dtype
dtype('int32')
3.4 np.zeros_like()
Create an array of all zeros of the same size as the destination array:
>>> a = np.array([[1,2,3], [4,5,6]])
>>> a
array([[1, 2, 3],
[4, 5, 6]])
>>> b = np.zeros_like(a)
>>> b
array([[0, 0, 0],
[0, 0, 0]])
3.5 np.ones()
Create an array of all 1s of the specified shape. It is used in the same way as np.zeros(), so the details are not repeated here. The default type is also float64, and you can specify the type with dtype.
3.6 np.ones_like()
Create an array of all 1s with the same shape as the target array. It is used in the same way as np.zeros_like().
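Since no example is given for np.ones() and np.ones_like(), here is a minimal sketch mirroring the np.zeros() examples; the variable names are arbitrary:

```python
import numpy as np

ones_default = np.ones((3, 2))               # float64 by default
ones_int = np.ones((3, 2), dtype=np.int32)   # explicit element type

target = np.array([[1, 2, 3], [4, 5, 6]])
ones_same = np.ones_like(target)             # same shape and dtype as target

print(ones_default.dtype, ones_int.dtype, ones_same.shape)
```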
3.7 np.empty()
Create an array of the specified shape without initializing its values. Because nothing is initialized, empty() is faster, and the array contains whatever happened to be in memory. It is useful when a placeholder array is needed and every element will be overwritten later:
>>> a = np.empty((3, 4))
>>> a
array([[6.23042070e-307, 2.04719289e-306, 6.23057010e-307,
1.24611741e-306],
[1.78019082e-306, 6.23058028e-307, 1.06811422e-306,
3.56043054e-307],
[1.37961641e-306, 8.90071135e-308, 1.78021527e-306,
1.66889876e-307]])
3.8 np.empty_like()
Creates an array of the same size as the destination array, but does not initialize it.
3.9 np.linspace()
Create an evenly spaced sequence (an arithmetic progression) from start to stop with the specified number of elements:
>>> a = np.linspace(2, 9, num=8)
>>> a
array([2., 3., 4., 5., 6., 7., 8., 9.])
By default, linspace() uses the closed interval [start, stop], which includes stop. You can use the endpoint parameter to specify whether stop is included:
>>> a = np.linspace(2, 9, num=7, endpoint=False)
>>> a
array([2., 3., 4., 5., 6., 7., 8.])
linspace() is similar to arange(). The difference is that arange() takes the step size, whereas linspace() takes the total number of elements and computes the step size from it. The computed step size can be obtained through the retstep parameter:
>>> a = np.linspace(2, 9, num=7, endpoint=False, retstep=True)
# With retstep=True, both the array and the step size are returned
>>> a
(array([2., 3., 4., 5., 6., 7., 8.]), 1.0)
# Get the created array
>>> a[0]
array([2., 3., 4., 5., 6., 7., 8.])
# Get the step size
>>> a[1]
1.0
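How the step size is derived can be checked with a short sketch. The interval [0, 1] and num=5 are arbitrary illustrative values: with the endpoint included the step is (stop - start) / (num - 1), and without it the step is (stop - start) / num.

```python
import numpy as np

start, stop, num = 0.0, 1.0, 5

closed, step_closed = np.linspace(start, stop, num=num, retstep=True)
half_open, step_open = np.linspace(start, stop, num=num,
                                   endpoint=False, retstep=True)

print(step_closed, (stop - start) / (num - 1))  # endpoint included
print(step_open, (stop - start) / num)          # endpoint excluded
```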
3.10 np.logspace()
Create a geometric sequence. The default base is 10: the sequence starts at $10^{start}$ and ends at $10^{stop}$, with num elements in total (default 50):
>>> a = np.logspace(1, 5, num=5)
>>> a
array([1.e+01, 1.e+02, 1.e+03, 1.e+04, 1.e+05])
>>> b = np.logspace(1, 5, num=4, dtype=np.int32)
>>> b
array([ 10, 215, 4641, 100000])
The base can be specified with the base parameter, so that the sequence starts at $base^{start}$ and ends at $base^{stop}$. You can also use endpoint to specify whether stop is included:
>>> a = np.logspace(1, 10, base=2, num=10, dtype=np.int32)
>>> a
array([ 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024])
>>> b = np.logspace(1, 5, base=3, num=5, dtype=np.int32)
>>> b
array([ 3, 9, 27, 81, 243])
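One way to read logspace() is as the base raised to an evenly spaced set of exponents. The sketch below checks that reading against linspace() for an illustrative base of 3:

```python
import numpy as np

base = 3.0
via_logspace = np.logspace(1, 5, num=5, base=base)

# The exponents are evenly spaced; the values are base ** exponent
exponents = np.linspace(1, 5, num=5)
via_power = base ** exponents

print(via_logspace)
print(np.allclose(via_logspace, via_power))
```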
3.11 np.indices()
Create the set of index arrays for a grid of the specified shape, one array per dimension. This sounds abstract, but an example makes it easier to understand:
>>> a = np.indices((3,2))
>>> a
array([[[0, 0],
[1, 1],
[2, 2]],
[[0, 1],
[0, 1],
[0, 1]]])
>>> a[0]
array([[0, 0],
[1, 1],
[2, 2]])
>>> a[1]
array([[0, 1],
[0, 1],
[0, 1]])
In the above example, we specified a shape of (3, 2). Imagine a 3x2 matrix; the coordinates of its elements are laid out like this:
$$\begin{bmatrix} a_{0,0} & a_{0,1} \\ a_{1,0} & a_{1,1} \\ a_{2,0} & a_{2,1} \end{bmatrix}$$
The coordinate matrix of the high dimension (the row index) is
$$\begin{bmatrix} 0 & 0 \\ 1 & 1 \\ 2 & 2 \end{bmatrix}$$
The coordinate matrix of the low dimension (the column index) is
$$\begin{bmatrix} 0 & 1 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}$$
Exactly corresponds to the result in the above example. Let's look at another example of a 3D matrix:
>>> a = np.indices((2,3,2))
>>> a[0]
array([[[0, 0],
[0, 0],
[0, 0]],
[[1, 1],
[1, 1],
[1, 1]]])
>>> a[1]
array([[[0, 0],
[1, 1],
[2, 2]],
[[0, 0],
[1, 1],
[2, 2]]])
>>> a[2]
array([[[0, 1],
[0, 1],
[0, 1]],
[[0, 1],
[0, 1],
[0, 1]]])
Similarly, a 2x3x2 three-dimensional matrix can be pictured as the following two two-dimensional matrices stacked one behind the other:
$$\begin{bmatrix} a_{0,0,0} & a_{0,0,1} \\ a_{0,1,0} & a_{0,1,1} \\ a_{0,2,0} & a_{0,2,1} \end{bmatrix} \qquad \begin{bmatrix} a_{1,0,0} & a_{1,0,1} \\ a_{1,1,0} & a_{1,1,1} \\ a_{1,2,0} & a_{1,2,1} \end{bmatrix}$$
The coordinate matrix of the highest dimension is
$$\begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \qquad \begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{bmatrix}$$
The coordinate matrices of the remaining two dimensions follow the same pattern, so they are not spelled out here. The index arrays generated by indices() can be used to extract a sub-array of an array, for example:
>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> b = np.indices((2,3))
>>> b
array([[[0, 0, 0],
[1, 1, 1]],
[[0, 1, 2],
[0, 1, 2]]])
>>> a_sub = a[b[0], b[1]]
>>> a_sub
array([[0, 1, 2],
[4, 5, 6]])
# In terms of slicing, the effect is the same as
>>> a[:2, :3]
array([[0, 1, 2],
[4, 5, 6]])
By default, indices() returns a dense (non-sparse) array. A sparse result can be requested with the sparse parameter:
>>> a = np.indices((3,2), sparse=True)
>>> a
(array([[0],
[1],
[2]]), array([[0, 1]]))
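The sparse form holds the same information as the dense form: broadcasting the sparse pieces against each other reproduces the dense index arrays. A minimal sketch, using np.broadcast_arrays() (not mentioned in the text above) for the expansion:

```python
import numpy as np

dense = np.indices((3, 2))                    # shape (2, 3, 2)
rows, cols = np.indices((3, 2), sparse=True)  # shapes (3, 1) and (1, 2)

# Expanding the sparse pieces by broadcasting gives the dense arrays back
rows_full, cols_full = np.broadcast_arrays(rows, cols)

print(rows.shape, cols.shape)
print(np.array_equal(dense[0], rows_full), np.array_equal(dense[1], cols_full))
```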
3.12 np.random
np.random.random() generates a random array of the specified shape, with values in the range [0, 1):
>>> a = np.random.random((2,3))
>>> a
array([[0.48152382, 0.58393539, 0.61701583],
[0.05104326, 0.23513154, 0.21062412]])
np.random.randint() generates a random integer array of the specified shape; the range of values is passed in as parameters:
# Random integers in [3, 9), array shape 3x4
>>> a = np.random.randint(3, 9, (3,4))
>>> a
array([[6, 3, 3, 4],
[5, 7, 7, 3],
[7, 6, 3, 7]])
np.random.shuffle() randomly shuffles an array in place. The np.random package has many other APIs for generating random arrays; only the two most basic functions are introduced here.
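np.random.shuffle() is mentioned above without an example, so here is a small sketch. The seed value is an arbitrary choice made only so the run is repeatable:

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only for repeatability

a = np.arange(10)
result = np.random.shuffle(a)  # shuffles a in place and returns None

print(a)       # same elements, new order
print(result)  # None
```

Because the shuffle happens in place, the original order is lost unless a copy is made first.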
3.13 np.fromfunction()
Create an array of the specified shape whose values are computed by a function, for example a lambda expression. The function's parameters are the coordinates of the elements:
>>> a = np.fromfunction(lambda i,j: i, (2,2))
>>> a
array([[0., 0.],
[1., 1.]])
# Use the coordinate sum i + j as the initial value of each element
>>> b = np.fromfunction(lambda i,j: i + j, (3,3), dtype=np.int32)
>>> b
array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4]])
The examples above use a lambda; a regular function can also be passed in directly:
>>> def f(x,y):
... return 10*x+y
...
>>> a = np.fromfunction(f, (2,2), dtype=np.int32)
>>> a
array([[ 0, 1],
[10, 11]])
3.14 np.fromfile()
Read data from a file and generate an array:
>>> import tempfile
>>> fname = tempfile.mkstemp()[1]
>>> fname
'C:\\Users\\doudou\\AppData\\Local\\Temp\\tmpmyq7otph'
>>> a = np.arange(12)
>>> a
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
>>> a.tofile(fname)
>>> b = np.fromfile(fname, dtype=np.int32)
>>> b
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
Note that tofile() and fromfile() store and read data in raw binary form, which is not platform-independent. In addition, this format does not preserve the byte order, data type, or shape of the array. For example, if a two-dimensional array is stored, the data read back by fromfile() is one-dimensional, and dtype must be specified to read the data correctly:
>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> a.tofile(fname)
# This format does not store the array's shape, so the data is read back as one-dimensional
>>> b = np.fromfile(fname, dtype=np.int32)
>>> b
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
# This format does not store the data type, so without dtype the data cannot be read back correctly
>>> c = np.fromfile(fname)
>>> c
array([2.12199579e-314, 6.36598737e-314, 1.06099790e-313, 1.48539705e-313,
1.90979621e-313, 2.33419537e-313])
Given these shortcomings of tofile() and fromfile(), the official recommendation is to use save() and load() for storing and reading data:
>>> fname = tempfile.mkstemp()[1]
>>> fname
'C:\\Users\\doudou\\AppData\\Local\\Temp\\tmpynt0tqt3'
>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> np.save(fname, a)
>>> b = np.load(fname + '.npy')
>>> b
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
As you can see, save() and load() correctly restore the shape and data type of the data, which is much more convenient.
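The two approaches can be compared side by side in a platform-neutral sketch. The file names a.bin and a.npy and the use of a temporary directory are illustrative choices, not part of the examples above:

```python
import os
import tempfile
import numpy as np

a = np.arange(12, dtype=np.int32).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmpdir:
    # Raw binary: dtype and shape must be supplied again by the reader
    raw_path = os.path.join(tmpdir, "a.bin")
    a.tofile(raw_path)
    restored_raw = np.fromfile(raw_path, dtype=a.dtype).reshape(a.shape)

    # .npy format: dtype and shape are stored in the file
    npy_path = os.path.join(tmpdir, "a.npy")
    np.save(npy_path, a)
    restored_npy = np.load(npy_path)

print(np.array_equal(a, restored_raw), np.array_equal(a, restored_npy))
```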
3.15 np.identity()
Create an identity matrix with N rows and columns
>>> a = np.identity(4)
>>> a
array([[1., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 1.]])
>>> b = np.identity(2)
>>> b
array([[1., 0.],
[0., 1.]])
3.16 np.eye()
Create a two-dimensional matrix with specified rows and columns. The diagonal elements are all 1. If you do not specify the number of columns, the default number of columns and rows is equal:
>>> a = np.eye(4)
>>> a
array([[1., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 1.]])
>>> b = np.eye(4, 5)
>>> b
array([[1., 0., 0., 0., 0.],
[0., 1., 0., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 0.]])
By setting the parameter k, you can specify the offset of the diagonal of 1s. If k is positive, the diagonal is shifted upwards; if it is negative, it is shifted downwards:
>>> a = np.eye(4, k=1)
>>> a
array([[0., 1., 0., 0.],
[0., 0., 1., 0.],
[0., 0., 0., 1.],
[0., 0., 0., 0.]])
>>> b = np.eye(4, k=2)
>>> b
array([[0., 0., 1., 0.],
[0., 0., 0., 1.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]])
>>> c = np.eye(4, k=-1)
>>> c
array([[0., 0., 0., 0.],
[1., 0., 0., 0.],
[0., 1., 0., 0.],
[0., 0., 1., 0.]])
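The relationship between identity(), eye(), and the offset k can be checked with a short sketch. np.diag() is used here only as an independent way to build an offset diagonal; it is not covered in the text above:

```python
import numpy as np

# With default arguments, eye(N) and identity(N) give the same matrix
same_as_identity = np.array_equal(np.eye(4), np.identity(4))

# An offset diagonal of a 4x4 matrix has 4 - |k| ones
upper = np.eye(4, k=1)
lower = np.eye(4, k=-1)
upper_matches = np.array_equal(upper, np.diag(np.ones(3), k=1))
lower_matches = np.array_equal(lower, np.diag(np.ones(3), k=-1))

print(same_as_identity, upper_matches, lower_matches)
```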
3.17 np.mgrid()
Returns a dense multidimensional grid; the start value, end value, and step of each dimension can be specified. Note that np.mgrid is indexed with square brackets rather than called like a function:
np.mgrid[dim 1, dim 2, dim 3, ...]
Each dimension is written as:
a:b:c
where c is a real number giving the step size; the interval is [a, b), closed on the left and open on the right
or:
a:b:cj
where cj is a complex number whose magnitude gives the number of points; the interval is [a, b], closed on both ends
>>> x, y = np.mgrid[1:3:1, 4:5:2j]
>>> x
array([[1., 1.],
[2., 2.]])
>>> y
array([[4., 5.],
[4., 5.]])
The results returned by np.mgrid and np.indices() are similar, but the values in the arrays returned by np.indices() are consecutive coordinates, whereas the values returned by np.mgrid are the values in the intervals specified by the parameters.
3.18 np.ogrid()
Returns a sparse (open) multidimensional grid. It is used in the same way as np.mgrid and takes the same parameters:
>>> x, y = np.ogrid[1:3:1, 4:5:2j]
>>> x
array([[1.],
[2.]])
>>> y
array([[4., 5.]])
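The sparse grid from np.ogrid carries the same information as the dense grid from np.mgrid; broadcasting expands one into the other. A minimal sketch using the same slices as above, with np.broadcast_arrays() as an assumed helper for the expansion:

```python
import numpy as np

xm, ym = np.mgrid[1:3:1, 4:5:2j]   # dense: both have shape (2, 2)
xo, yo = np.ogrid[1:3:1, 4:5:2j]   # sparse: shapes (2, 1) and (1, 2)

# Broadcasting the sparse pieces reproduces the dense grids
xb, yb = np.broadcast_arrays(xo, yo)

print(xo.shape, yo.shape)
print(np.array_equal(xm, xb), np.array_equal(ym, yb))
print(np.array_equal(xm + ym, xo + yo))  # arithmetic broadcasts automatically
```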
4. Index
Array indexing refers to using square brackets [] to access the values of an array.
4.1 Single-element indexing of one-dimensional arrays
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a[2]
2
>>> a[-2]
8
4.2 Single-element indexing of multidimensional arrays
>>> a = np.arange(10).reshape(2, 5)
>>> a
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
>>> a[1, 3]
8
>>> a[1, -1]
9
>>> a[1][-1]
9
Standard Python lists and tuples do not support the [x, y] form; they only support [x][y], which also works in NumPy.
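The difference between the two forms can be seen in a small sketch; the nested list below is made up for illustration:

```python
import numpy as np

nested = [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
a = np.array(nested)

list_error = None
try:
    nested[1, 3]          # a plain list rejects a tuple index
except TypeError as exc:
    list_error = type(exc).__name__

print(a[1, 3], a[1][3], nested[1][3])  # all three select the same element
print(list_error)
```

In NumPy, a[1][3] first builds the intermediate row a[1] and then indexes it, so a[1, 3] is usually the more direct form.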
4.3 One-dimensional array slicing and stride indexing
For one-dimensional arrays, NumPy slices and strides are indexed in the same way as lists and tuples:
>>> a = np.arange(10)
>>> a[2:5]
array([2, 3, 4])
>>> a[:-7]
array([0, 1, 2])
>>> a[1:7:2]
array([1, 3, 5])
4.4 Multidimensional array slicing and stride indexing
For multidimensional arrays, NumPy supports slice and stride indexing, while lists and tuples do not:
>>> a = np.arange(35).reshape(5,7)
>>> a
array([[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27],
[28, 29, 30, 31, 32, 33, 34]])
>>> a[1:5:2,::3]
array([[ 7, 10, 13],
[21, 24, 27]])
4.5 Indexing Arrays
NumPy supports selecting several elements of an array at once by indexing it with an array or a list (not a tuple).
The index array must be of integer type; negative numbers can be used:
>>> a = np.arange(10)
>>> a[np.array([1, 1, 3, 5, -2])]
array([1, 1, 3, 5, 8])
>>> a[[1, 1, 3, 5, -2]]
array([1, 1, 3, 5, 8])
We can also use multidimensional indexed arrays (note: not indexing multidimensional arrays, but multidimensional indexed arrays - i.e. indexed arrays that are themselves multidimensional):
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a[np.array([[1, 1], [7, -2]])]
array([[1, 1],
[7, 8]])
In the above example, we used a two-dimensional index array to index a one-dimensional array and obtained a two-dimensional array. That is, using an indexed array returns an array with the same shape as the indexed array.
4.6 Indexing multidimensional arrays
Index a multidimensional array using an index array:
>>> a = np.arange(35).reshape(5, 7)
>>> a
array([[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27],
[28, 29, 30, 31, 32, 33, 34]])
>>> a[[2, 4, 4], [3, 3, 4]]
array([17, 31, 32])
In the above example, we used two index arrays, [2, 4, 4] and [3, 3, 4], to index a two-dimensional array. This means that the element positions we want are [2,3], [4,3], and [4,4], so the result is [17, 31, 32].
We can leverage the broadcast mechanism for indexing:
>>> a[[2, 4, 4], 5]
array([19, 33, 33])
Here the element positions we want are [2,5], [4,5], and [4,5], and the result is [19, 33, 33].
When indexing a two-dimensional array with a one-dimensional index array, a new array of selected rows is obtained:
>>> a = np.arange(10).reshape(2, 5)
>>> a
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
>>> a[[0, 0, 1]]
array([[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
More generally, this applies whenever fewer index arrays are supplied than the array has dimensions: the remaining dimensions are kept whole.
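The way index arrays are paired up can be written out explicitly. In this sketch, the loop over zip() is only an illustration of the pairing, not how NumPy implements it:

```python
import numpy as np

a = np.arange(35).reshape(5, 7)
rows = [2, 4, 4]
cols = [3, 3, 4]

picked = a[rows, cols]

# Equivalent element-by-element pairing of row and column indices
manual = np.array([a[r, c] for r, c in zip(rows, cols)])

# A scalar column index is broadcast against the row indices
broadcast_col = a[rows, 5]

print(picked, manual, broadcast_col)
```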
4.7 Boolean Indexed Arrays
With a Boolean index array, elements at True positions are selected and elements at False positions are not:
>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> b = a > 5
>>> b
array([[False, False, False, False],
[False, False, True, True],
[ True, True, True, True]])
>>> a[b]
array([ 6, 7, 8, 9, 10, 11])
When a Boolean index array has fewer dimensions than the array being indexed, it is applied to the leading (highest) dimensions:
>>> a = np.arange(35).reshape(5, 7)
>>> a
array([[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27],
[28, 29, 30, 31, 32, 33, 34]])
>>> a[[True, False, True, False, False]]
array([[ 0, 1, 2, 3, 4, 5, 6],
[14, 15, 16, 17, 18, 19, 20]])
In the above example, the Boolean index array is one-dimensional. When indexing a two-dimensional array, a two-dimensional array composed of the first row and the third row is obtained.
Look at an example of a three-dimensional array again, and index a three-dimensional array through a two-dimensional Boolean index array:
>>> a = np.arange(30).reshape(2, 3, 5)
>>> a
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]],
[[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29]]])
>>> b = np.array([[True, True, False], [False, True, True]])
>>> a[b]
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29]])
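A Boolean mask can also be understood through the integer positions of its True entries. The following sketch uses np.nonzero() (not covered above) to make that connection, and shows the shape of the result when the mask covers only the leading dimensions:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
mask = a > 5

selected = a[mask]
# np.nonzero returns the coordinates of the True entries
same = a[np.nonzero(mask)]

# A 2D mask on a 3D array keeps the last axis whole
cube = np.arange(30).reshape(2, 3, 5)
mask2d = np.array([[True, True, False], [False, True, True]])
rows = cube[mask2d]

print(selected, np.array_equal(selected, same), rows.shape)
```

The result rows has one row per True entry in mask2d, each of length 5.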
4.8 Combining indexed arrays and slices
Slices can be used in indexed arrays:
>>> a = np.arange(35).reshape(5, 7)
>>> a
array([[ 0, 1, 2, 3, 4, 5, 6],
[ 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20],
[21, 22, 23, 24, 25, 26, 27],
[28, 29, 30, 31, 32, 33, 34]])
>>> a[np.array([0, 2, 4]), 1:3]
array([[ 1, 2],
[15, 16],
[29, 30]])
4.8 np.newaxis and ...
np.newaxis: adds a new dimension of length 1. No new elements are created, but the number of dimensions increases:
>>> a=np.arange(5)
>>> a
array([0, 1, 2, 3, 4])
>>> a.shape
(5,)
>>> b = a[:,np.newaxis]
>>> b
array([[0],
[1],
[2],
[3],
[4]])
>>> b.shape
(5, 1)
>>> c = a[np.newaxis,:]
>>> c.shape
(1, 5)
>>> c
array([[0, 1, 2, 3, 4]])
...: stands for as many colons as are needed to produce a complete index. For example, if x is a five-dimensional array:
- x[1,2,...] is equivalent to x[1,2,:,:,:]
- x[...,3] is equivalent to x[:,:,:,:,3]
- x[4,...,5,:] is equivalent to x[4,:,:,5,:]
>>> a = np.arange(15).reshape(5, 3)
>>> a
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11],
[12, 13, 14]])
>>> a[1,...]
array([3, 4, 5])
>>> a[1]
array([3, 4, 5])
>>> a[...,2]
array([ 2, 5, 8, 11, 14])
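The same equivalences can be verified on a three-dimensional array. The shape (2, 3, 4) below is an arbitrary choice for illustration:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)

# ... expands to however many full slices are missing
leading = np.array_equal(x[1, ...], x[1, :, :])
trailing = np.array_equal(x[..., 2], x[:, :, 2])

# ... combines with np.newaxis to append an axis at the end
appended_shape = x[..., np.newaxis].shape

print(leading, trailing, appended_shape)
```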
4.9 Assignment after indexing
An indexed selection can be assigned either a single value, which is broadcast to every selected element, or an array with the same shape as the selection:
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a[2:7] = 1
>>> a
array([0, 1, 1, 1, 1, 1, 1, 7, 8, 9])
>>> a[2:7]=np.arange(5)
>>> a
array([0, 1, 0, 1, 2, 3, 4, 7, 8, 9])
>>> a[2:7] += 10
>>> a
array([ 0, 1, 10, 11, 12, 13, 14, 7, 8, 9])
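When the assigned value has neither the selection's shape nor a shape that can be broadcast to it, NumPy raises an error. A minimal sketch with made-up values:

```python
import numpy as np

a = np.arange(10)

a[2:7] = 1                    # a scalar is broadcast to all 5 positions
a[2:7] = np.arange(5)         # same shape as the selection: (5,)

shape_error = None
try:
    a[2:7] = np.arange(3)     # shape (3,) cannot fill shape (5,)
except ValueError as exc:
    shape_error = type(exc).__name__

print(a, shape_error)
```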
5. Iteration
We can iterate over an array in the following way, starting from the highest dimension by default:
>>> a = np.arange(6).reshape(2, 3)
>>> for x in a:
... print(x)
...
[0 1 2]
[3 4 5]
You can also use the array's flat attribute, an iterator that walks through all the elements as if the array had been flattened to one dimension:
>>> for x in a.flat:
... print(x)
...
0
1
2
3
4
5
6. Array operations
6.1 Changing the size
You can get an array of the desired shape with reshape():
>>> a = np.arange(6)
>>> a
array([0, 1, 2, 3, 4, 5])
>>> a.reshape(2,3)
array([[0, 1, 2],
[3, 4, 5]])
>>> a
array([0, 1, 2, 3, 4, 5])
You can also use resize() to change the shape of the array itself:
>>> a.resize(2,3)
>>> a
array([[0, 1, 2],
[3, 4, 5]])
Note that reshape() only returns a new array of the given shape and does not change the shape of the original array, whereas resize() changes the shape of the array itself.
The shape can also be changed by assigning directly to .shape:
>>> a = np.arange(6)
>>> a.shape = (3,2)
>>> a
array([[0, 1],
[2, 3],
[4, 5]])
For reshape() and .shape, one dimension can be given as -1, and its size will be computed automatically (note: resize() does not support this):
>>> a = np.arange(6)
>>> a.shape = 3, -1
>>> a
array([[0, 1],
[2, 3],
[4, 5]])
>>> a.reshape(2, -1)
array([[0, 1, 2],
[3, 4, 5]])
>>> a.reshape(-1, 2)
array([[0, 1],
[2, 3],
[4, 5]])
6.2 Flattening
Multidimensional arrays can be flattened into one-dimensional arrays with flatten() and ravel(). The difference is that flatten() returns a copy, so modifying the result does not change the original array, whereas ravel() returns a view whenever possible, so modifying the result can change the original array:
>>> a = np.arange(6).reshape(2, 3)
>>> a
array([[0, 1, 2],
[3, 4, 5]])
>>> b = a.flatten()
>>> b
array([0, 1, 2, 3, 4, 5])
# Modifying this flattened array does not affect the original array
>>> b[0] = 100
>>> a
array([[0, 1, 2],
[3, 4, 5]])
>>> c = a.ravel()
>>> c
array([0, 1, 2, 3, 4, 5])
# Modifying this flattened array affects the original array
>>> c[0] = 100
>>> a
array([[100, 1, 2],
[ 3, 4, 5]])
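Whether a flattened result shares memory with the original can be checked directly. This sketch uses np.shares_memory() (not mentioned above); it also shows a case where ravel() has to copy, because the transposed array is not laid out contiguously in row-major order:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

flat_copy = a.flatten()   # always a copy
flat_view = a.ravel()     # a view here, because a is contiguous
flat_of_t = a.T.ravel()   # a copy: a.T is not C-contiguous

print(np.shares_memory(a, flat_copy))  # False
print(np.shares_memory(a, flat_view))  # True
print(np.shares_memory(a, flat_of_t))  # False
```

So ravel() should not be relied on to always return a view.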
6.3 Sorting
Use sort() to sort an array in place. You can specify the axis to sort along; the default is the last axis (the lowest dimension):
>>> a = np.array([[8, 0, 1], [5, 3, 4], [2, 6, 7]])
>>> a
array([[8, 0, 1],
[5, 3, 4],
[2, 6, 7]])
>>> a.sort()
>>> a
array([[0, 1, 8],
[3, 4, 5],
[2, 6, 7]])
>>> a.sort(axis=0)
>>> a
array([[0, 1, 5],
[2, 4, 7],
[3, 6, 8]])
By default, sort() sorts in ascending order. If descending order is required, the values can be negated: -np.sort(-a).
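A sketch of two ways to sort in descending order: the negation trick from the text, and reversing an ascending sort with a [::-1] slice (an alternative not mentioned above). The negation trick assumes a signed numeric type.

```python
import numpy as np

a = np.array([[8, 0, 1], [5, 3, 4], [2, 6, 7]])

desc_negate = -np.sort(-a)          # negate, sort ascending, negate back
desc_reverse = np.sort(a)[:, ::-1]  # sort ascending, then reverse each row

print(desc_negate)
print(np.array_equal(desc_negate, desc_reverse))
print(a)  # np.sort returns a sorted copy; a itself is unchanged
```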
6.4 Transpose
An array can be transposed with the .T attribute or the transpose() function:
>>> a = np.arange(6).reshape(2, 3)
>>> a
array([[0, 1, 2],
[3, 4, 5]])
>>> a.T
array([[0, 3],
[1, 4],
[2, 5]])
>>> a.transpose()
array([[0, 3],
[1, 4],
[2, 5]])
The transpose operations above do not change the array itself; they return a new array object (a view of the same data).
6.5 Merging
Arrays can be merged with np.vstack() and np.hstack():
>>> a = np.arange(4).reshape(2, 2)
>>> a
array([[0, 1],
[2, 3]])
>>> b = np.arange(start=4, stop=8).reshape(2, 2)
>>> b
array([[4, 5],
[6, 7]])
>>> np.vstack((a,b))
array([[0, 1],
[2, 3],
[4, 5],
[6, 7]])
>>> np.hstack((a,b))
array([[0, 1, 4, 5],
[2, 3, 6, 7]])
The functions above always merge vertically or horizontally. You can also use np.concatenate() and specify the axis to merge along (the smaller the axis number, the higher the dimension):
>>> np.concatenate((a,b), axis=0)
array([[0, 1],
[2, 3],
[4, 5],
[6, 7]])
>>> np.concatenate((a,b), axis=1)
array([[0, 1, 4, 5],
[2, 3, 6, 7]])
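For two-dimensional inputs, the correspondence between the stacking functions and np.concatenate() can be checked in a short sketch:

```python
import numpy as np

a = np.arange(4).reshape(2, 2)
b = np.arange(4, 8).reshape(2, 2)

# For 2D arrays, vstack corresponds to axis=0 and hstack to axis=1
v_same = np.array_equal(np.vstack((a, b)), np.concatenate((a, b), axis=0))
h_same = np.array_equal(np.hstack((a, b)), np.concatenate((a, b), axis=1))

print(v_same, h_same)
print(np.vstack((a, b)).shape, np.hstack((a, b)).shape)
```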
6.6 Splitting
Arrays can be split along the horizontal axis with np.hsplit(). Equal or unequal splits are possible:
>>> a = np.arange(9).reshape(3, 3)
>>> a
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
# Split evenly into 3 columns
>>> np.hsplit(a, 3)
[array([[0],
[3],
[6]]), array([[1],
[4],
[7]]), array([[2],
[5],
[8]])]
# Unequal split at column index 1: one part with 1 column, one part with 2 columns
>>> np.hsplit(a, (1,))
[array([[0],
[3],
[6]]), array([[1, 2],
[4, 5],
[7, 8]])]
Arrays can be split along the vertical axis with np.vsplit(), or along a specified axis with np.array_split().
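np.vsplit() and np.array_split() are mentioned above without an example. The sketch below uses a 3x3 array and illustrative split points:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

# Split rows at index 1: the first row, then the remaining two rows
top, rest = np.vsplit(a, (1,))

# array_split allows a count that does not divide the axis evenly:
# 3 columns into 2 parts gives widths 2 and 1
left, right = np.array_split(a, 2, axis=1)

print(top.shape, rest.shape)
print(left.shape, right.shape)
```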
6.7 Copy
NumPy distinguishes three cases: no copy, shallow copy (view), and deep copy. With plain assignment, neither the array object nor its data is copied:
>>> a = np.arange(12)
>>> b = a
>>> id(a)
2258092791568
>>> id(b)
2258092791568
>>> b is a
True
A view, that is, a shallow copy, can be created with view(). The result is a different array object that shares the same data, so changing one affects the other:
>>> a = np.arange(12)
>>>
>>> b = a.view()
>>> id(a)
2257817754800
>>> id(b)
2258092792144
>>> a[0] = 130
>>> b
array([130, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
A complete copy of the array and its data, that is, a deep copy, can be created with copy(). The two arrays do not affect each other:
>>> a = np.arange(12)
>>> b = a.copy()
>>> id(a)
2258092791664
>>> id(b)
2258092791568
>>> a[0] = 130
>>> b
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
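Rather than comparing id() values, the three cases can be told apart with the is operator, the .base attribute, and np.shares_memory(). This is a minimal sketch with arbitrary variable names:

```python
import numpy as np

a = np.arange(12)

alias = a            # no copy: the same object
view = a.view()      # shallow copy: new object, shared data
deep = a.copy()      # deep copy: new object, new data
sliced = a[2:5]      # slicing also returns a view

print(alias is a)                                   # True
print(view is a, view.base is a)                    # False True
print(np.shares_memory(a, deep))                    # False
print(sliced.base is a)                             # True
```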
7. Array operations
Computation in NumPy is efficient and vectorized, which also makes the code easy to read. Basic mathematical operations are those written with mathematical operators; they include the following kinds, all applied element by element:
No. | Operation | Description |
---|---|---|
1 | +, -, *, /, //, % | Basic math operations |
2 | ** | Exponentiation |
3 | +=, -=, *=, /=, //=, %=, **= | Compound assignment operations |
4 | >, >=, <, <=, == | Comparison operations |
7.1 +, -, *, /, //, %
NumPy arrays support direct operations with numbers:
>>> a = np.array([1,2,3])
>>> a+10 # addition
array([11, 12, 13])
>>> a-10 # subtraction
array([-9, -8, -7])
>>> a*10 # multiplication
array([10, 20, 30])
>>> a/10 # division
array([0.1, 0.2, 0.3])
>>> a // 10 # floor division
array([0, 0, 0], dtype=int32)
>>> a%10 # remainder
array([1, 2, 3], dtype=int32)
Operations between arrays and arrays are also supported:
>>> a = np.array([1,2,3])
>>> b = np.array([4,5,6])
>>> a+b
array([5, 7, 9])
>>> a-b
array([-3, -3, -3])
>>> a*b
array([ 4, 10, 18])
>>> a/b
array([0.25, 0.4 , 0.5 ])
>>> a//b
array([0, 0, 0])
>>> a%b
array([1, 2, 3])
Because operations between arrays act on the elements at the same position one by one, the arrays involved must have the same shape (or shapes that are broadcast-compatible, see section 8); otherwise the operation cannot be performed.
7.2 **
The ** operator raises to a power. It works between an array and a number or between two arrays:
>>> a = np.array([1,2,3])
>>> b = np.array([4,5,6])
>>> a**2 # each element of a squared
array([1, 4, 9])
>>> a**b # each element of a raised to the corresponding element of b
array([1, 32, 729])
7.3 +=, -=, *=, /=, //=, %=, **=
NumPy's compound assignment operations work much as they do in most programming languages. The difference from the basic mathematical and exponential operations of the previous two sections is that those operations produce a new array and leave the original elements unchanged, whereas compound assignment operations modify the array's elements in place.
>>> import numpy as np
>>> a = np.array([1,2,3])
>>> a += 1
>>> a
array([2, 3, 4])
>>> a -= 1
>>> a
array([1, 2, 3])
>>> a *= 2
>>> a
array([2, 4, 6])
>>> a **= 2
>>> a
array([4, 16, 36])
In the above example we can see that the array a itself is actually changed.
In addition, /= is not used in the example because it is a special case that needs separate discussion. In principle, once the array a is defined, we should be able to write a /= 2 directly, but doing so raises an error:
>>> a /= 2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
numpy.core._exceptions._UFuncOutputCastingError: Cannot cast ufunc 'divide' output from dtype('float64') to dtype('int32') with casting rule 'same_kind'
That is, /= requires the target array to be of a floating-point type, but we defined it as an integer type. There are two options: define the array as a floating-point type from the start, or use //= for integer (floor) division:
>>> a = np.array([1, 2, 3], dtype=np.float32)
>>> a /= 2
>>> a
array([0.5, 1. , 1.5], dtype=float32)
>>> b = np.array([1, 2, 3])
>>> b //= 2
>>> b
array([0, 1, 1])
7.4 >, >=, <, <=, ==
Comparison operations return an array of Booleans:
>>> a = np.array([1, 2, 3])
>>> a > 1
array([False, True, True])
>>> a >= 1
array([ True, True, True])
>>> a < 1
array([False, False, False])
>>> a <= 1
array([ True, False, False])
>>> a == 1
array([ True, False, False])
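The Boolean arrays produced by comparisons are typically used as masks (see Boolean indexing in section 4.7) or counted. A short sketch with illustrative variable names:

```python
import numpy as np

a = np.array([1, 2, 3])

mask = a > 1                  # Boolean array with the same shape as a
selected = a[mask]            # keep only the elements where mask is True
count = int(mask.sum())       # True counts as 1, False as 0
different = a != 1            # != is also applied element-wise

print(selected, count, different)
```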
8. Broadcast
Broadcasting is a very interesting concept in NumPy. Its main purpose is to allow operations on arrays of different shapes or numbers of dimensions, and it follows its own set of rules.
Two arrays are broadcast compatible if, comparing their dimensions one by one starting from the last (trailing) dimension, each pair of dimensions satisfies one of the following:
- they are the same size, or
- one of them is 1
A dimension that is missing from the shorter shape is treated as 1. For example, both of the following cases are broadcast compatible:
A (3d array): 15 x 3 x 5
B (2d array): 3 x 5
Result (3d array): 15 x 3 x 5
A (4d array): 8 x 1 x 6 x 1
B (3d array): 7 x 1 x 5
Result (4d array): 8 x 7 x 6 x 5
If these conditions are not met, the arrays are not broadcast compatible and a ValueError: operands could not be broadcast together exception is raised:
A (1d array): 3
B (1d array): 4 # trailing dimensions do not match
A (2d array): 2 x 1
B (3d array): 8 x 4 x 3 # second from last dimensions mismatched
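The compatibility rule can be written out as a small helper. This is an illustrative sketch only: broadcast_shape is a hypothetical function written for this article, not part of NumPy, and the result is cross-checked against what NumPy itself produces:

```python
import numpy as np
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    result = []
    # Walk both shapes from the last dimension; a missing dimension counts as 1.
    for m, n in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if m == n or n == 1:
            result.append(m)
        elif m == 1:
            result.append(n)
        else:
            raise ValueError(
                f"shapes {shape_a} and {shape_b} are not broadcast compatible"
            )
    return tuple(reversed(result))


print(broadcast_shape((15, 3, 5), (3, 5)))
print(broadcast_shape((8, 1, 6, 1), (7, 1, 5)))

# Cross-check the second case against NumPy itself
numpy_shape = (np.zeros((8, 1, 6, 1)) + np.zeros((7, 1, 5))).shape
print(numpy_shape)
```

Calling broadcast_shape((3,), (4,)) or broadcast_shape((2, 1), (8, 4, 3)) raises ValueError, matching the two incompatible cases listed above.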
To understand broadcasting, the main thing to grasp is the rule that determines the shape of the resulting array. Once that is clear, the specific operations become much easier to follow:
# Broadcasting with a one-dimensional array
>>> a = np.arange(4)
>>> a + 10
array([10, 11, 12, 13])
# Broadcasting with a two-dimensional array
>>> b = a.reshape(4, 1)
>>> b
array([[0],
[1],
[2],
[3]])
>>> c = np.ones(5)
>>> b + c
array([[1., 1., 1., 1., 1.],
[2., 2., 2., 2., 2.],
[3., 3., 3., 3., 3.],
[4., 4., 4., 4., 4.]])
Earlier we introduced np.newaxis. We can use it to add a dimension to an array and then perform a broadcast calculation:
>>> a = np.array([0.0, 10.0, 20.0, 30.0])
>>> b = np.array([1.0, 2.0, 3.0])
>>> a[:, np.newaxis] + b
array([[ 1., 2., 3.],
[ 11., 12., 13.],
[ 21., 22., 23.],
[ 31., 32., 33.]])
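To make the shapes in this example explicit, the sketch below prints them at each step and checks that reshape produces the same column as np.newaxis (the names column and result are illustrative):

```python
import numpy as np

a = np.array([0.0, 10.0, 20.0, 30.0])   # shape (4,)
b = np.array([1.0, 2.0, 3.0])           # shape (3,)

column = a[:, np.newaxis]               # shape (4, 1)
result = column + b                     # (4, 1) with (3,) gives (4, 3)

print(column.shape, result.shape)

# reshape(4, 1) builds the same column, so the sums are identical
same_as_reshape = np.array_equal(result, a.reshape(4, 1) + b)
print(same_as_reshape)
```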
9. General functions
In addition to using mathematical operators directly, NumPy also provides mathematical functions known as universal functions (ufunc), called "general functions" in this article. They operate element-wise on arrays and produce a new array.
9.1 Unary functions
function | description |
---|---|
np.abs | Absolute value |
np.sqrt | Square root (the result for a negative input is NaN) |
np.square | Square |
np.exp | Exponential (e^x) |
np.log,np.log10,np.log2,np.log1p | Logarithm with base e, base 10, base 2, and the natural logarithm of (1+x) |
np.sign | Sign of each value: values greater than 0 become 1, values equal to 0 become 0, and values less than 0 become -1 |
np.ceil | Round each element up to the nearest integer |
np.floor | Round each element down to the nearest integer |
np.clip | Limit the values in an array to a lower and upper bound |
np.rint,np.round | Return the rounded value |
np.modf | Split each element into its fractional and integral parts, returned as two arrays |
np.isnan | Determine whether each element is NaN |
np.isinf | Determine whether each element is inf |
np.sin,np.cos,np.cosh,np.sinh,np.tan,np.tanh | Trigonometric and hyperbolic functions |
np.arccos,np.arcsin,np.arctan | Inverse trigonometric functions |
np.nonzero | Find the indices of the non-zero elements |
np.cumsum | Cumulative sum |
np.diff | Differences between consecutive elements |
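A brief sketch of a few of the functions in this table, applied to a small hypothetical array x. np.errstate is used only to silence the warning that np.sqrt emits for the negative element:

```python
import numpy as np

x = np.array([-1.5, 0.0, 2.25])

absolute = np.abs(x)
signs = np.sign(x)
clipped = np.clip(x, -1.0, 1.0)       # values limited to the range [-1, 1]

with np.errstate(invalid="ignore"):
    roots = np.sqrt(x)                # NaN for the negative element
is_nan = np.isnan(roots)

fractional, integral = np.modf(x)     # fractional parts first, then integral parts

print(absolute, signs, clipped)
print(roots, is_nan)
print(fractional, integral)
```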
9.2 Binary functions
function | description |
---|---|
np.add | Addition |
np.subtract | Subtraction |
np.negative | Numerical negative, -x (takes a single array; listed here with the arithmetic functions) |
np.multiply | Multiplication |
np.divide | Division |
np.floor_divide | Floor division, equivalent to // |
np.mod | Remainder, equivalent to % |
np.dot | Inner product (dot product) |
np.greater,np.greater_equal,np.less,np.less_equal,np.equal,np.not_equal | Function forms of >, >=, <, <=, ==, != |
np.logical_and | Function form of the logical "and" operation |
np.logical_or | Function form of the logical "or" operation |
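The sketch below checks that several of these functions give the same results as the corresponding operators from section 7. The arrays are the same a and b used earlier; the other names are illustrative:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

add_matches = np.array_equal(np.add(a, b), a + b)
floor_divide_matches = np.array_equal(np.floor_divide(b, a), b // a)
mod_matches = np.array_equal(np.mod(b, a), b % a)
greater_matches = np.array_equal(np.greater(b, 4), b > 4)

in_range = np.logical_and(b > 4, b < 6)   # True only where both conditions hold
dot = np.dot(a, b)                        # 1*4 + 2*5 + 3*6

print(add_matches, floor_divide_matches, mod_matches, greater_matches)
print(in_range, dot)
```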
9.3 Aggregate functions
function | description |
---|---|
np.sum | Compute the sum of the elements |
np.prod | Compute the product of the elements |
np.mean | Compute the mean of the elements |
np.std | Compute the standard deviation of the elements |
np.var | Compute the variance of the elements |
np.min | Find the minimum element |
np.max | Find the maximum element |
np.argmin | Find the index of the minimum value |
np.argmax | Find the index of the maximum value |
np.median | Compute the median of the elements |
np.average | Compute the weighted average of the elements |
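Aggregate functions can reduce the whole array or work along a single axis through the axis parameter. A minimal sketch on a hypothetical 2 x 3 array m:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

total = np.sum(m)                 # all elements
column_sums = np.sum(m, axis=0)   # collapse the rows, one value per column
row_means = np.mean(m, axis=1)    # collapse the columns, one value per row
flat_index_of_max = np.argmax(m)  # index into the flattened array

# Weighted average: (1*3 + 2*2 + 3*1) / (3 + 2 + 1)
weighted = np.average(np.array([1, 2, 3]), weights=[3, 2, 1])

print(total, column_sums, row_means, flat_index_of_max, weighted)
```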
9.4 Boolean Array Functions
function | description |
---|---|
np.all() | Determine whether all elements, or all elements along the specified axis, are True |
np.any() | Determine whether any element, or any element along the specified axis, is True |
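A short sketch of both functions, with and without the axis parameter, on a hypothetical Boolean array flags:

```python
import numpy as np

flags = np.array([[True, False],
                  [True, True]])

all_true = np.all(flags)                # is every element True?
any_true = np.any(flags)                # is at least one element True?
all_per_column = np.all(flags, axis=0)  # check each column separately
any_per_row = np.any(flags, axis=1)     # check each row separately

print(all_true, any_true, all_per_column, any_per_row)
```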
10. API functions
NumPy offers a large and varied set of APIs; for this part, refer to the official API documentation.