NumPy Getting Started Tutorial

1. Introduction to NumPy

NumPy is the fundamental package for scientific computing in Python. It is a library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and a range of routines for fast operations on arrays, including mathematical and logical operations, shape manipulation, sorting, selection, input and output, discrete Fourier transforms, basic linear algebra, basic statistical operations, and random simulation.

At the heart of the NumPy package is the ndarray object, which encapsulates n-dimensional arrays whose elements all share the same data type. Because the core of NumPy is implemented in precompiled C code, operations on an ndarray run much faster than the equivalent loops over native Python sequences. NumPy also supports vectorization and broadcasting, which keep code concise.
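
As a rough illustration of vectorization and broadcasting (a minimal sketch; the variable names are made up for this example), the same element-wise computation can be written with an explicit Python loop or with a single array expression:

```python
import numpy as np

values = [1.0, 2.0, 3.0, 4.0]

# Plain Python: visit the elements one by one
squared_plus_one_loop = [v * v + 1 for v in values]

# NumPy: one vectorized expression; the scalar 1 is broadcast to every element
arr = np.array(values)
squared_plus_one_vec = arr * arr + 1

print(squared_plus_one_loop)   # [2.0, 5.0, 10.0, 17.0]
print(squared_plus_one_vec)    # [ 2.  5. 10. 17.]
```

Both forms compute the same values; the array expression simply leaves the looping to NumPy.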

This article introduces the basic concepts and usage of NumPy so that you can get started quickly.


2. Basics - Arrays

A native Python list can hold elements of different types at the same time, while NumPy's main object is a multidimensional array whose elements all have the same type. Dimensions in NumPy are called axes.

NumPy's array class is ndarray. It has more attributes than a native Python sequence; the most important ones are:

  • ndarray.ndim

    The number of axes (dimensions) of the array, also known as rank.

  • ndarray.shape

    The shape of the array: a tuple of integers giving the size of the array along each dimension. For a matrix with n rows and m columns, shape is (n, m). ndim is the length of the shape tuple.

  • ndarray.size

    The total number of elements in the array, equal to the product of the elements of shape.

  • ndarray.dtype

    An object describing the type of the elements in the array. Standard Python types can be used, as well as the types provided by NumPy, such as numpy.int32, numpy.int64, and numpy.float64.

  • ndarray.itemsize

    The size in bytes of each element. For example, an array of type float64 has itemsize 8 (=64/8), and an array of type int32 has itemsize 4 (=32/8). It is equal to ndarray.dtype.itemsize.

  • ndarray.data

    A buffer containing the actual elements of the array. Normally we do not need to use this attribute directly, because the elements can be accessed by indexing.
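
The attributes listed above can be inspected on a small array. This is only an illustrative sketch; the array `a` and its shape are arbitrary choices for the example:

```python
import numpy as np

a = np.arange(12, dtype=np.float64).reshape(3, 4)

print(a.ndim)      # 2  (two axes)
print(a.shape)     # (3, 4)
print(a.size)      # 12 (= 3 * 4)
print(a.dtype)     # float64
print(a.itemsize)  # 8  (= 64 / 8 bytes per element)
```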

3. Create an array

Arrays are the core concept in NumPy, so creating them is the first step. There are five general mechanisms for creating arrays:

  • Conversion from other Python structures (e.g., lists, tuples)
  • Intrinsic NumPy array creation functions (e.g., arange, ones, zeros)
  • Reading arrays from disk, either in standard or custom formats
  • Creating arrays from raw bytes through the use of strings or buffers
  • Use of special library functions (e.g., random)

The following table summarizes commonly used ways of creating arrays, covering both regular arrays and the functions NumPy provides for various special arrays:

No.  API                  Description
1    np.array()           Create an array from a Python list or tuple
2    np.arange()          Create an array from a range of numbers
3    np.zeros()           Create an array of all 0s
4    np.zeros_like()      Create an array of all 0s with the same shape as the target array
5    np.ones()            Create an array of all 1s
6    np.ones_like()       Create an array of all 1s with the same shape as the target array
7    np.empty()           Create an array of the specified shape without initializing its values
8    np.empty_like()      Create an array with the same shape as the target array without initializing its values
9    np.linspace()        Create an evenly spaced (arithmetic) sequence
10   np.logspace()        Create a geometric sequence (evenly spaced on a log scale)
11   np.indices()         Create a set of index arrays for a grid of the specified shape
12   np.random.random()   Create a random array
13   np.fromfunction()    Create an array from a specified function
14   np.fromfile()        Create an array from a specified file
15   np.identity()        Create an identity matrix (equal number of rows and columns)
16   np.eye()             Create a two-dimensional matrix with 1s on the diagonal; rows and columns may differ, and the diagonal offset can be specified
17   np.mgrid()           Return a dense multidimensional grid
18   np.ogrid()           Return a sparse multidimensional grid

3.1 np.array()

Create a one-dimensional array:

>>> import numpy as np
>>> a = np.array([1,2,3])
>>> a
array([1, 2, 3])

Create a 2D array:

>>> b = np.array([[1,2,3], [4,5,6]])
>>> b
array([[1, 2, 3],
       [4, 5, 6]])

When creating an array, you can specify its element type explicitly:

# specify float32 as the element type
>>> c = np.array([1, 2, 3], dtype=np.float32)
>>> c
array([1., 2., 3.], dtype=float32)

array() accepts both Python lists and tuples, so either can be passed in. The same applies to other functions that take a shape argument, which can be given as a tuple or a list.

3.2 np.arange()

Create a sequence starting at 0 and ending at a certain number, with a default step size of 1:

>>> a = np.arange(3)
>>> a
array([0, 1, 2])

Create an array starting at a certain number and ending at a certain number, specifying a step size and type:

>>> a = np.arange(start=1, stop=8, step=2, dtype=np.int32)
>>> a
array([1, 3, 5, 7])

arange() generates values over the half-open interval [start, stop): start is included and stop is excluded.

3.3 np.zeros()

Create an array of all 0s of the specified shape:

>>> a = np.zeros((3,2))
>>> a
array([[0., 0.],
       [0., 0.],
       [0., 0.]])
>>> a.dtype
dtype('float64')

zeros() uses float64 by default. You can specify another type when creating the array:

>>> a = np.zeros((3,2), dtype=np.int32)
>>> a
array([[0, 0],
       [0, 0],
       [0, 0]])
>>> a.dtype
dtype('int32')

3.4 np.zeros_like()

Create an array of all 0s with the same shape as the target array:

>>> a = np.array([[1,2,3], [4,5,6]])
>>> a
array([[1, 2, 3],
       [4, 5, 6]])
>>> b = np.zeros_like(a)
>>> b
array([[0, 0, 0],
       [0, 0, 0]])

3.5 np.ones()

Create an array of all 1s of the specified shape. It is used in the same way as np.zeros(), so the details are not repeated here. The default type is also float64, and you can specify another type with dtype.

3.6 np.ones_like()

Create an array of all 1s with the same shape as the target array. It is used in the same way as np.zeros_like().
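
For completeness, here is a short sketch of np.ones() and np.ones_like() mirroring the np.zeros() examples above. The name `template` is just a placeholder chosen for this example:

```python
import numpy as np

a = np.ones((2, 3))                    # float64 by default
b = np.ones((2, 3), dtype=np.int32)    # explicit element type

template = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)
c = np.ones_like(template)             # same shape and dtype as `template`

print(a)
print(b.dtype, c.shape, c.dtype)
```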

3.7 np.empty()

Create an array of the specified shape without initializing its values. Because nothing is initialized, empty() can be faster than zeros() or ones(), which makes it useful when you need a placeholder array that will be filled in later:

>>> a = np.empty((3, 4))
>>> a
array([[6.23042070e-307, 2.04719289e-306, 6.23057010e-307,
        1.24611741e-306],
       [1.78019082e-306, 6.23058028e-307, 1.06811422e-306,
        3.56043054e-307],
       [1.37961641e-306, 8.90071135e-308, 1.78021527e-306,
        1.66889876e-307]])

3.8 np.empty_like()

Create an array with the same shape as the target array, but without initializing its values.
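
A minimal sketch of the placeholder pattern with np.empty_like() follows. The fill value 7 and the name `template` are arbitrary choices for illustration; the contents of the array before it is filled are unpredictable:

```python
import numpy as np

template = np.zeros((2, 3), dtype=np.int32)
buf = np.empty_like(template)   # same shape and dtype; contents are arbitrary

# Fill the placeholder before reading from it
buf[:] = 7
print(buf.shape, buf.dtype)
print(buf)
```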

3.9 np.linspace()

Create an evenly spaced array between a start value and a stop value, specifying the number of elements:

>>> a = np.linspace(2, 9, num=8)
>>> a
array([2., 3., 4., 5., 6., 7., 8., 9.])

By default, linspace covers the closed interval [start, stop], so stop is included. The endpoint parameter controls whether stop is included:

>>> a = np.linspace(2, 9, num=7, endpoint=False)
>>> a
array([2., 3., 4., 5., 6., 7., 8.])

linspace is similar to arange. The difference is that arange takes a step size, while linspace takes the total number of elements and computes the step size itself. Pass retstep=True to get the computed step size back:

>>> a = np.linspace(2, 9, num=7, endpoint=False, retstep=True)

# with retstep=True the result is a tuple of (array, step)
>>> a
(array([2., 3., 4., 5., 6., 7., 8.]), 1.0)
# the generated array
>>> a[0]
array([2., 3., 4., 5., 6., 7., 8.])
# the step size
>>> a[1]
1.0

3.10 np.logspace()

Create a geometric sequence (evenly spaced on a log scale). The default base is 10: the sequence starts at $10^{\text{start}}$ and ends at $10^{\text{stop}}$, with num elements in total (default 50):

>>> a = np.logspace(1, 5, num=5)
>>> a
array([1.e+01, 1.e+02, 1.e+03, 1.e+04, 1.e+05])

>>> b = np.logspace(1, 5, num=4, dtype=np.int32)
>>> b
array([    10,    215,   4641, 100000])

The base can be changed with the base parameter, in which case the sequence runs from $\text{base}^{\text{start}}$ to $\text{base}^{\text{stop}}$. As with linspace, the endpoint parameter controls whether stop is included:

>>> a = np.logspace(1, 10, base=2, num=10, dtype=np.int32)
>>> a
array([   2,    4,    8,   16,   32,   64,  128,  256,  512, 1024])

>>> b = np.logspace(1, 5, base=3, num=5, dtype=np.int32)
>>> b
array([  3,   9,  27,  81, 243])
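
One way to read these results (a small check added for illustration, not part of the original examples): np.logspace(start, stop, num) produces the same values as raising the base to the exponents given by np.linspace(start, stop, num).

```python
import numpy as np

exponents = np.linspace(1, 5, num=4)       # evenly spaced exponents from 1 to 5
by_hand = 10.0 ** exponents                # raise the default base 10 to each exponent
by_logspace = np.logspace(1, 5, num=4)

print(np.allclose(by_hand, by_logspace))   # True
```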

3.11 np.indices()

Create a set of index arrays for a grid of the specified shape, one array per dimension, where each array holds the index along that dimension. This sounds abstract, but an example makes it easier to understand:

>>> a = np.indices((3,2))
>>> a
array([[[0, 0],
        [1, 1],
        [2, 2]],

       [[0, 1],
        [0, 1],
        [0, 1]]])

>>> a[0]
array([[0, 0],
       [1, 1],
       [2, 2]])

>>> a[1]
array([[0, 1],
       [0, 1],
       [0, 1]])

In the example above we specified the shape (3, 2). For a 3x2 matrix, the element positions are laid out like this:

$$\begin{bmatrix} a_{0,0} & a_{0,1} \\ a_{1,0} & a_{1,1} \\ a_{2,0} & a_{2,1} \end{bmatrix}$$

The matrix of row indices (the higher dimension) is:

$$\begin{bmatrix} 0 & 0 \\ 1 & 1 \\ 2 & 2 \end{bmatrix}$$

The matrix of column indices (the lower dimension) is:

$$\begin{bmatrix} 0 & 1 \\ 0 & 1 \\ 0 & 1 \end{bmatrix}$$

These correspond exactly to a[0] and a[1] in the example above. Now let's look at a three-dimensional example:

>>> a = np.indices((2,3,2))
>>> a[0]
array([[[0, 0],
        [0, 0],
        [0, 0]],
        
       [[1, 1],
        [1, 1],
        [1, 1]]])
        
>>> a[1]
array([[[0, 0],
        [1, 1],
        [2, 2]],

       [[0, 0],
        [1, 1],
        [2, 2]]])
        
>>> a[2]
array([[[0, 1],
        [0, 1],
        [0, 1]],

       [[0, 1],
        [0, 1],
        [0, 1]]])

Similarly, a 2x3x2 three-dimensional array can be pictured as the following two 3x2 matrices stacked one behind the other:

$$\begin{bmatrix} a_{0,0,0} & a_{0,0,1} \\ a_{0,1,0} & a_{0,1,1} \\ a_{0,2,0} & a_{0,2,1} \end{bmatrix}$$

$$\begin{bmatrix} a_{1,0,0} & a_{1,0,1} \\ a_{1,1,0} & a_{1,1,1} \\ a_{1,2,0} & a_{1,2,1} \end{bmatrix}$$

The index matrices for the highest dimension are:

$$\begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$$

$$\begin{bmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{bmatrix}$$

The index matrices of the remaining two dimensions follow the same pattern, so they are not spelled out here. The index arrays generated by indices can be used to extract a sub-array of an array, for example:

>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
       
>>> b = np.indices((2,3))
>>> b
array([[[0, 0, 0],
        [1, 1, 1]],

       [[0, 1, 2],
        [0, 1, 2]]])
        
>>> a_sub = a[b[0], b[1]]
>>> a_sub
array([[0, 1, 2],
       [4, 5, 6]])

# in terms of slicing, this is equivalent to:
>>> a[:2, :3]
array([[0, 1, 2],
       [4, 5, 6]])

The arrays created by indices are dense by default. Set the sparse parameter to True to get a sparse representation:

>>> a = np.indices((3,2), sparse=True)
>>> a
(array([[0],
       [1],
       [2]]), array([[0, 1]]))

3.12 np.random

np.random.random() generates a random array of the specified shape, with values in the range [0, 1):

>>> a = np.random.random((2,3))
>>> a
array([[0.48152382, 0.58393539, 0.61701583],
       [0.05104326, 0.23513154, 0.21062412]])

np.random.randint() generates a random integer array of the specified shape, with the range of values passed in as parameters:

# random integers in [3, 9), array shape 3x4
>>> a = np.random.randint(3, 9, (3,4))
>>> a
array([[6, 3, 3, 4],
       [5, 7, 7, 3],
       [7, 6, 3, 7]])

np.random.shuffle() randomly shuffles an array in place. The np.random package contains many other functions for generating random arrays; only the two most basic ones are introduced here.
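
Since np.random.shuffle() is mentioned without an example, here is a minimal sketch. The order of the output differs from run to run, so no specific shuffled result is shown:

```python
import numpy as np

a = np.arange(10)
np.random.shuffle(a)      # shuffles in place and returns None

print(a)                  # the same ten values in a random order
print(np.sort(a))         # sorting recovers 0..9
```

For a multidimensional array, shuffle() only reorders along the first axis, so whole rows are moved and the contents of each row stay intact.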

3.13 np.fromfunction()

Create an array of the specified shape and initialize its values by calling a function (for example a lambda expression). The function's parameters are the coordinates of each element:

>>> a = np.fromfunction(lambda i,j: i, (2,2))
>>> a
array([[0., 0.],
       [1., 1.]])

# use i + j as the initial value of each element
>>> b = np.fromfunction(lambda i,j: i + j, (3,3), dtype=np.int32)
>>> b
array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])

The examples above use a lambda; you can also pass in a regular function directly:

>>> def f(x,y):
...     return 10*x+y
...
>>> a = np.fromfunction(f, (2,2), dtype=np.int32)
>>> a
array([[ 0,  1],
       [10, 11]])

3.14 np.fromfile()

Read data from a file and generate an array:

>>> import tempfile
>>> fname = tempfile.mkstemp()[1]
>>> fname
'C:\\Users\\doudou\\AppData\\Local\\Temp\\tmpmyq7otph'

>>> a = np.arange(12)
>>> a
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

>>> a.tofile(fname)
>>> b = np.fromfile(fname, dtype=np.int32)
>>> b
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

Note that tofile and fromfile store and read raw binary data, which is not platform-independent. The file also does not record the byte order, data type, or shape of the array. For example, a stored two-dimensional array is read back by fromfile as one-dimensional, and the correct dtype must be specified to read the data correctly. (The examples here pass dtype=np.int32 because the default integer type on the machine they were run on is int32; on platforms where it is int64, pass dtype=np.int64 instead.)

>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

>>> a.tofile(fname)
# the array's shape is not stored, so the data is read back as one-dimensional
>>> b = np.fromfile(fname, dtype=np.int32)
>>> b
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

# the data type is not stored either, so without dtype the data cannot be read correctly
>>> c = np.fromfile(fname)
>>> c
array([2.12199579e-314, 6.36598737e-314, 1.06099790e-313, 1.48539705e-313,
       1.90979621e-313, 2.33419537e-313])
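
If a raw file written by tofile has to be read back anyway, both the dtype and the shape must be supplied by hand. The following is a sketch under the assumption that the reader already knows them (here int32 and 3x4, chosen for the example):

```python
import os
import tempfile

import numpy as np

a = np.arange(12, dtype=np.int32).reshape(3, 4)

# mkstemp returns an open file descriptor and a path; close the descriptor
fd, fname = tempfile.mkstemp()
os.close(fd)

a.tofile(fname)

# Shape and dtype are not stored in the file, so supply both when reading
restored = np.fromfile(fname, dtype=np.int32).reshape(3, 4)
print(np.array_equal(a, restored))   # True

os.remove(fname)
```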

Because of these drawbacks of tofile and fromfile, the official recommendation is to store and read data with save and load:

>>> fname = tempfile.mkstemp()[1]
>>> fname
'C:\\Users\\doudou\\AppData\\Local\\Temp\\tmpynt0tqt3'

>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

>>> np.save(fname, a)
>>> b = np.load(fname + '.npy')
>>> b
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

As you can see, save and load correctly restore both the structure and the data type of the array, which is much more convenient.

3.15 np.identity()

Create an identity matrix with N rows and N columns:

>>> a = np.identity(4)
>>> a
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

>>> b = np.identity(2)
>>> b
array([[1., 0.],
       [0., 1.]])

3.16 np.eye()

Create a two-dimensional matrix with specified rows and columns. The diagonal elements are all 1. If you do not specify the number of columns, the default number of columns and rows is equal:

>>> a = np.eye(4)
>>> a
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
       
>>> b = np.eye(4, 5)
>>> b
array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.]])

The parameter k sets the offset of the diagonal of 1s. A positive k shifts the diagonal upwards, and a negative k shifts it downwards:

>>> a = np.eye(4, k=1)
>>> a
array([[0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.]])
       
>>> b = np.eye(4, k=2)
>>> b
array([[0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
       
>>> c = np.eye(4, k=-1)
>>> c
array([[0., 0., 0., 0.],
       [1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.]])

3.17 np.mgrid()

Returns a dense multidimensional grid. np.mgrid is indexed with square brackets rather than called like a function, and the start value, stop value, and step of each dimension are given in slice notation:
np.mgrid[dim 1, dim 2, dim 3, ...]

Each dimension is written in one of two forms:

a:b:c
c is the step size, a real number giving the spacing between points; the range is [a, b), closed on the left and open on the right

or:

a:b:cj
cj is a complex number whose magnitude gives the number of points; the range is [a, b], closed on both ends

>>> x, y = np.mgrid[1:3:1, 4:5:2j]
>>> x
array([[1., 1.],
       [2., 2.]])
>>> y
array([[4., 5.],
       [4., 5.]])

np.mgrid and np.indices() return results with a similar structure, but the values returned by np.indices() are consecutive indices, while the values returned by np.mgrid are taken from the intervals specified by the slices.

3.18 np.ogrid()

Returns a sparse multidimensional grid. It is used in the same way as np.mgrid, with the same slice notation:

>>> x, y = np.ogrid[1:3:1, 4:5:2j]
>>> x
array([[1.],
       [2.]])
>>> y
array([[4., 5.]])

4. Index

Array indexing means using square brackets [] to access the values of an array.

4.1 Single-element indexing of one-dimensional arrays

>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a[2]
2
>>> a[-2]
8

4.2 Single-element indexing of multidimensional arrays

>>> a = np.arange(10).reshape(2, 5)
>>> a
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> a[1, 3]
8
>>> a[1, -1]
9
>>> a[1][-1]
9

Standard Python lists and tuples do not support the [x, y] form, only [x][y]. NumPy supports both.
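
The difference can be seen side by side in the following sketch; the names `nested` and `list_error` are introduced only for this example:

```python
import numpy as np

nested = [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9]]
arr = np.array(nested)

print(nested[1][3])   # 8
print(arr[1][3])      # 8
print(arr[1, 3])      # 8

try:
    nested[1, 3]      # a tuple is not a valid list index
except TypeError as exc:
    list_error = exc
    print("list indexing failed:", exc)
```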

4.3 One-dimensional array slicing and stride indexing

For one-dimensional arrays, NumPy slices and strides are indexed in the same way as lists and tuples:

>>> a = np.arange(10)
>>> a[2:5]
array([2, 3, 4])
>>> a[:-7]
array([0, 1, 2])
>>> a[1:7:2]
array([1, 3, 5])

4.4 Multidimensional array slicing and stride indexing

For multidimensional arrays, NumPy supports slice and stride indexing, while lists and tuples do not:

>>> a = np.arange(35).reshape(5,7)
>>> a
array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27],
       [28, 29, 30, 31, 32, 33, 34]])
>>> a[1:5:2,::3]
array([[ 7, 10, 13],
       [21, 24, 27]])

4.5 Indexing Arrays

NumPy can index several elements of an array at once by using an array or a list (but not a tuple) as the index.

The indexed array must be of integer type, negative numbers can be used:

>>> a = np.arange(10)
>>> a[np.array([1, 1, 3, 5, -2])]
array([1, 1, 3, 5, 8])
>>> a[[1, 1, 3, 5, -2]]
array([1, 1, 3, 5, 8])

We can also use multidimensional indexed arrays (note: not indexing multidimensional arrays, but multidimensional indexed arrays - i.e. indexed arrays that are themselves multidimensional):

>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a[np.array([[1, 1], [7, -2]])]
array([[1, 1],
       [7, 8]])

In the above example, we used a two-dimensional index array to index a one-dimensional array and obtained a two-dimensional array. That is, using an indexed array returns an array with the same shape as the indexed array.

4.6 Indexing multidimensional arrays

Index a multidimensional array using an index array:

>>> a = np.arange(35).reshape(5, 7)
>>> a
array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27],
       [28, 29, 30, 31, 32, 33, 34]])
>>> a[[2, 4, 4], [3, 3, 4]]
array([17, 31, 32])

In the example above, we used two index arrays, [2, 4, 4] and [3, 3, 4], to index a two-dimensional array. The elements selected are those at positions [2,3], [4,3], and [4,4], so the result is [17, 31, 32].

We can leverage the broadcast mechanism for indexing:

>>> a[[2, 4, 4], 5]
array([19, 33, 33])

In this way, the element position we want to index is [2,5], [4,5], [4,5], and the result is [19, 33, 33].

When indexing a two-dimensional array with a one-dimensional index array, a new array of selected rows is obtained:

>>> a = np.arange(10).reshape(2, 5)
>>> a
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
>>> a[[0, 0, 1]]
array([[0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

This also works with a multidimensional index array: a single index array always selects along the first dimension, and the remaining dimensions are kept whole.
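
As a small sketch of that last case (the index values are arbitrary), a two-dimensional index array applied to a two-dimensional array yields a result whose shape is the shape of the index array followed by the remaining dimension:

```python
import numpy as np

a = np.arange(10).reshape(2, 5)
idx = np.array([[0, 1], [1, 0]])   # 2-D index array applied to the first axis

picked = a[idx]
print(picked.shape)   # (2, 2, 5): shape of idx followed by the remaining axis
print(picked[0, 1])   # [5 6 7 8 9], i.e. row a[1]
```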

4.7 Boolean Indexed Arrays

A Boolean index array selects the elements where the mask is True and drops those where it is False:

>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
>>> b = a > 5
>>> b
array([[False, False, False, False],
       [False, False,  True,  True],
       [ True,  True,  True,  True]])
>>> a[b]
array([ 6,  7,  8,  9, 10, 11])

When a Boolean index array has fewer dimensions than the array being indexed, it is matched against the leading dimensions:

>>> a = np.arange(35).reshape(5, 7)
>>> a
array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27],
       [28, 29, 30, 31, 32, 33, 34]])

>>> a[[True, False, True, False, False]]
array([[ 0,  1,  2,  3,  4,  5,  6],
       [14, 15, 16, 17, 18, 19, 20]])

In the above example, the Boolean index array is one-dimensional. When indexing a two-dimensional array, a two-dimensional array composed of the first row and the third row is obtained.

Look at an example of a three-dimensional array again, and index a three-dimensional array through a two-dimensional Boolean index array:

>>> a = np.arange(30).reshape(2, 3, 5)
>>> a
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]],

       [[15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29]]])
        
>>> b = np.array([[True, True, False], [False, True, True]])
>>> a[b]
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29]])

4.8 Combining indexed arrays and slices

Index arrays can be combined with slices:

>>> a = np.arange(35).reshape(5, 7)
>>> a
array([[ 0,  1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12, 13],
       [14, 15, 16, 17, 18, 19, 20],
       [21, 22, 23, 24, 25, 26, 27],
       [28, 29, 30, 31, 32, 33, 34]])
>>> a[np.array([0, 2, 4]), 1:3]
array([[ 1,  2],
       [15, 16],
       [29, 30]])

4.9 np.newaxis and ...

np.newaxis adds a new dimension of length 1 to an array. No new elements are created; only the number of dimensions increases:

>>> a=np.arange(5)
>>> a
array([0, 1, 2, 3, 4])
>>> a.shape
(5,)

>>> b = a[:,np.newaxis]
>>> b
array([[0],
       [1],
       [2],
       [3],
       [4]])
>>> b.shape
(5, 1)

>>> c = a[np.newaxis,:]
>>> c.shape
(1, 5)
>>> c
array([[0, 1, 2, 3, 4]])

... (Ellipsis) stands for as many colons as are needed to produce a complete index. For example, if x is an array with 5 dimensions:

  • x[1,2,...] is equivalent to x[1,2,:,:,:]
  • x[...,3] is equivalent to x[:,:,:,:,3]
  • x[4,...,5,:] is equivalent to x[4,:,:,5,:]

>>> a = np.arange(15).reshape(5, 3)
>>> a
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])
>>> a[1,...]
array([3, 4, 5])
>>> a[1]
array([3, 4, 5])
>>> a[...,2]
array([ 2,  5,  8, 11, 14])

4.10 Assignment after indexing

You can assign to an indexed selection either a scalar, which is broadcast to every selected element, or an array with the same shape as the selection:

>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

>>> a[2:7] = 1
>>> a
array([0, 1, 1, 1, 1, 1, 1, 7, 8, 9])

>>> a[2:7]=np.arange(5)
>>> a
array([0, 1, 0, 1, 2, 3, 4, 7, 8, 9])

>>> a[2:7] += 10
>>> a
array([ 0,  1, 10, 11, 12, 13, 14,  7,  8,  9])

5. Iteration

An array can be iterated over as follows. Iteration runs over the first (highest) dimension by default:

>>> a = np.arange(6).reshape(2, 3)
>>> for x in a:
...     print(x)
...
[0 1 2]
[3 4 5]

You can also use the array's flat attribute, an iterator that treats the array as if it were flattened to one dimension, to traverse every element:

>>> for x in a.flat:
...     print(x)
...
0
1
2
3
4
5

6. Array manipulation

6.1 Changing the size

reshape returns an array with the requested shape:

>>> a = np.arange(6)
>>> a
array([0, 1, 2, 3, 4, 5])
>>> a.reshape(2,3)
array([[0, 1, 2],
       [3, 4, 5]])
>>> a
array([0, 1, 2, 3, 4, 5])

resize changes the shape of the array itself:

>>> a.resize(2,3)
>>> a
array([[0, 1, 2],
       [3, 4, 5]])

Note that reshape only returns an array with the requested shape and leaves the shape of the original array unchanged, whereas resize changes the shape of the array itself.

You can also change the shape by assigning directly to .shape:

>>> a = np.arange(6)
>>> a.shape = (3,2)
>>> a
array([[0, 1],
       [2, 3],
       [4, 5]])

With reshape() and .shape, one dimension can be given as -1 and its size is calculated automatically (note: resize() does not support this):

>>> a = np.arange(6)
>>> a.shape = 3, -1
>>> a
array([[0, 1],
       [2, 3],
       [4, 5]])
       
>>> a.reshape(2, -1)
array([[0, 1, 2],
       [3, 4, 5]])
       
>>> a.reshape(-1, 2)
array([[0, 1],
       [2, 3],
       [4, 5]])

6.2 Flattening

A multidimensional array can be flattened into a one-dimensional array with flatten() or ravel(). The difference is that flatten() always returns a copy, so modifying the result does not change the original array, while ravel() returns a view whenever possible, so modifying the result can change the original array:

>>> a = np.arange(6).reshape(2, 3)
>>> a
array([[0, 1, 2],
       [3, 4, 5]])
       
>>> b = a.flatten()
>>> b
array([0, 1, 2, 3, 4, 5])
# modifying this flattened copy does not affect the original array
>>> b[0] = 100
>>> a
array([[0, 1, 2],
       [3, 4, 5]])
       
>>> c = a.ravel()
>>> c
array([0, 1, 2, 3, 4, 5])
# modifying this flattened view affects the original array
>>> c[0] = 100
>>> a
array([[100,   1,   2],
       [  3,   4,   5]])

6.3 Sorting

sort() sorts an array in place. You can specify the axis to sort along; by default the last axis (the lowest dimension) is used:

>>> a = np.array([[8, 0, 1], [5, 3, 4], [2, 6, 7]])
>>> a
array([[8, 0, 1],
       [5, 3, 4],
       [2, 6, 7]])
>>> a.sort()
>>> a
array([[0, 1, 8],
       [3, 4, 5],
       [2, 6, 7]])
>>> a.sort(axis=0)
>>> a
array([[0, 1, 5],
       [2, 4, 7],
       [3, 6, 8]])

sort() sorts in ascending order. For descending order on numeric data, you can negate the values before and after sorting: -np.sort(-a).
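
The negation trick can be compared with an alternative that sorts in ascending order and then reverses each row. This is only a sketch of two possible approaches; the reversal variant is an addition of this example, not something the text above prescribes:

```python
import numpy as np

a = np.array([[8, 0, 1], [5, 3, 4], [2, 6, 7]])

desc_negate = -np.sort(-a)             # negate, sort ascending, negate back
desc_reverse = np.sort(a)[:, ::-1]     # sort ascending, then reverse each row

print(desc_negate)
# [[8 1 0]
#  [5 4 3]
#  [7 6 2]]
print(np.array_equal(desc_negate, desc_reverse))   # True
```

Unlike a.sort(), np.sort() returns a sorted copy, so `a` itself is left untouched here.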

6.4 Transpose

An array can be transposed with the .T attribute or the transpose() method:

>>> a = np.arange(6).reshape(2, 3)
>>> a
array([[0, 1, 2],
       [3, 4, 5]])
>>> a.T
array([[0, 3],
       [1, 4],
       [2, 5]])
>>> a.transpose()
array([[0, 3],
       [1, 4],
       [2, 5]])

Neither form changes the shape of the original array; each returns a new array object that is a transposed view of the same data.
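
A quick way to see that the transposed result shares its data with the original array is np.shares_memory(). The following sketch also shows that writing through the transposed array is visible in the original:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
t = a.T

print(a.shape, t.shape)           # (2, 3) (3, 2)
print(np.shares_memory(a, t))     # True: t is a view onto a's data

t[0, 1] = 99                      # writes through to a[1, 0]
print(a[1, 0])                    # 99
```

Call a.T.copy() when an independent transposed array is needed.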

6.5 Merging

Arrays can be merged with np.vstack() and np.hstack():

>>> a = np.arange(4).reshape(2, 2)
>>> a
array([[0, 1],
       [2, 3]])
>>> b = np.arange(start=4, stop=8).reshape(2, 2)
>>> b
array([[4, 5],
       [6, 7]])
       
>>> np.vstack((a,b))
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7]])
       
>>> np.hstack((a,b))
array([[0, 1, 4, 5],
       [2, 3, 6, 7]])

These functions merge along a fixed direction, vertical or horizontal. np.concatenate() merges along an axis that you specify (the smaller the axis number, the higher the dimension):

>>> np.concatenate((a,b), axis=0)
array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7]])
>>> np.concatenate((a,b), axis=1)
array([[0, 1, 4, 5],
       [2, 3, 6, 7]])

6.6 Splitting

np.hsplit() splits an array along the horizontal axis. Both equal and unequal splits are possible:

>>> a = np.arange(9).reshape(3, 3)
>>> a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
       
# split equally into 3 columns
>>> np.hsplit(a, 3)
[array([[0],
       [3],
       [6]]), array([[1],
       [4],
       [7]]), array([[2],
       [5],
       [8]])]
       
# unequal split: a tuple lists the column indices to split at,
# so (1,) gives one part with 1 column and one part with 2 columns
>>> np.hsplit(a, (1,))
[array([[0],
       [3],
       [6]]), array([[1, 2],
       [4, 5],
       [7, 8]])]

np.vsplit() splits along the vertical axis, and np.array_split() lets you specify the axis to split along.
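
A short sketch of both functions follows; the 4x3 array is an arbitrary choice for the example. Note that np.array_split() also accepts a number of sections that does not divide the axis evenly:

```python
import numpy as np

a = np.arange(12).reshape(4, 3)

top, bottom = np.vsplit(a, 2)              # two blocks of 2 rows each
print(top.shape, bottom.shape)             # (2, 3) (2, 3)

# array_split takes an axis and tolerates sections of unequal size
parts = np.array_split(a, 2, axis=1)       # 3 columns -> 2 columns + 1 column
print([p.shape for p in parts])            # [(4, 2), (4, 1)]
```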

6.7 Copying

NumPy distinguishes between no copy, shallow copy (view), and deep copy. Plain assignment copies neither the array object nor its data:

>>> a = np.arange(12)
>>> b = a
>>> id(a)
2258092791568
>>> id(b)
2258092791568
>>> b is a
True

view() creates a view, that is, a shallow copy. The two are different array objects that share the same data, so changing one affects the other:

>>> a = np.arange(12)
>>> b = a.view()
>>> id(a)
2257817754800
>>> id(b)
2258092792144
>>> a[0] = 130
>>> b
array([130,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11])

copy() creates a complete copy of the array and its data, that is, a deep copy. The two arrays do not affect each other:

>>> a = np.arange(12)
>>> b = a.copy()
>>> id(a)
2258092791664
>>> id(b)
2258092791568
>>> a[0] = 130
>>> b
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

7. Array operations

NumPy computations are efficient and vectorized, which also makes them easy to read. Basic mathematical operations are those written with mathematical symbols; they are applied element by element and fall into the following types:

No.  Operators                       Description
1    +, -, *, /, //, %               Basic math operations
2    **                              Exponentiation
3    +=, -=, *=, /=, //=, %=, **=    Compound assignment operations
4    >, >=, <, <=, ==                Comparison operations

7.1 +, -, *, /, //, %

NumPy arrays support direct operations with numbers:

>>> a = np.array([1,2,3])
>>> a+10    # add
array([11, 12, 13])
>>> a-10    # subtract
array([-9, -8, -7])
>>> a*10    # multiply
array([10, 20, 30])
>>> a/10    # divide
array([0.1, 0.2, 0.3])
>>> a//10   # floor division
array([0, 0, 0])
>>> a%10    # remainder
array([1, 2, 3])

Operations between arrays and arrays are also supported:

>>> a = np.array([1,2,3])
>>> b = np.array([4,5,6])
>>> a+b
array([5, 7, 9])
>>> a-b
array([-3, -3, -3])
>>> a*b
array([ 4, 10, 18])
>>> a/b
array([0.25, 0.4 , 0.5 ])
>>> a//b
array([0, 0, 0])
>>> a%b
array([1, 2, 3])

Because operations between arrays act on the elements at matching positions one by one, the arrays involved must have the same shape (or shapes that are compatible under the broadcasting rules described in section 8); otherwise the operation cannot be performed.

7.2 **

The form **N raises each element to the power N. The exponent can be either a number or an array:

>>> a = np.array([1,2,3])
>>> b = np.array([4,5,6])
>>> a**2    # each element of a squared
array([1, 4, 9])
>>> a**b    # each element of a raised to the matching element of b
array([  1,  32, 729])

7.3 +=, -=, *=, /=, //=, %=, **=

NumPy's compound assignment operators work as in most programming languages. They differ from the basic math and exponent operations of the previous two sections in one respect: those operations produce a new array and leave the operands unchanged, whereas compound assignment modifies the array's elements in place.

>>> import numpy as np
>>> a = np.array([1,2,3])
>>> a += 1
>>> a
array([2, 3, 4])
>>> a -= 1
>>> a
array([1, 2, 3])
>>> a *= 2
>>> a
array([2, 4, 6])
>>> a **= 2
>>> a
array([ 4, 16, 36])

In the example above we can see that the array a itself is changed.

The example does not use /=, which is a special case and needs separate discussion. In principle, after defining the array a we should be able to write a /= 2 directly, but doing so raises an error:

>>> a /= 2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
numpy.core._exceptions._UFuncOutputCastingError: Cannot cast ufunc 'divide' output from dtype('float64') to dtype('int32') with casting rule 'same_kind'

In other words, /= requires the array it acts on to have a floating-point type, but ours was defined with an integer type. There are two ways around this: define the array as floating-point from the start, or use the floor-division operator //= instead:

>>> a = np.array([1, 2, 3], dtype=np.float32)
>>> a /= 2
>>> a
array([0.5, 1. , 1.5], dtype=float32)

>>> b = np.array([1, 2, 3])
>>> b //= 2
>>> b
array([0, 1, 1])
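
If the integer array already exists, two further options are to use plain division, which returns a new floating-point array, or to make a floating-point copy with astype first. The sketch below assumes float64 as the target type; that choice is an assumption, not something the text prescribes.

```python
import numpy as np

a = np.array([1, 2, 3])            # integer dtype

halved = a / 2                     # new float64 array; a is unchanged
a_float = a.astype(np.float64)     # explicit floating-point copy
a_float /= 2                       # in-place division now works

print(halved, a_float, a)
```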

7.4 >, >=, <, <=, ==

Comparison operations are applied element by element and return an array of Booleans:

>>> a = np.array([1, 2, 3])
>>> a > 1
array([False,  True,  True])
>>> a >= 1
array([ True,  True,  True])
>>> a < 1
array([False, False, False])
>>> a <= 1
array([ True, False, False])
>>> a == 1
array([ True, False, False])
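
A Boolean array like these is typically used as a mask. The following sketch shows selecting and counting the matching elements; the variable names are illustrative. It also shows the != operator, which is not listed in the heading above.

```python
import numpy as np

a = np.array([1, 2, 3])
mask = a > 1                 # array([False,  True,  True])

selected = a[mask]           # keeps the elements where mask is True
count = int(mask.sum())      # each True counts as 1
not_one = a != 1             # inequality is also element-wise

print(selected, count, not_one)
```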

8. Broadcast

Broadcasting is one of NumPy's more interesting concepts. It allows arrays of different shapes or numbers of dimensions to be used together in one operation, following a fixed set of rules.

Two arrays are broadcast compatible when their dimensions, compared pairwise starting from the last (trailing) dimension, satisfy one of the following for every pair:

  1. the two sizes are equal, or
  2. one of them is 1

A dimension that is missing from the shorter shape is treated as size 1. The following two examples are broadcast compatible:

A      (3d array):  15 x 3 x 5
B      (2d array):       3 x 5
Result (3d array):  15 x 3 x 5

A      (4d array):  8 x 1 x 6 x 1
B      (3d array):      7 x 1 x 5
Result (4d array):  8 x 7 x 6 x 5

If these conditions are not met, the arrays are not broadcast compatible and the operation raises ValueError: operands could not be broadcast together:

A      (1d array):  3
B      (1d array):  4 # trailing dimensions do not match

A      (2d array):      2 x 1
B      (3d array):  8 x 4 x 3 # second from last dimensions mismatched
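
The compatibility rules can be sketched as a small pure-Python function. The helper broadcast_shape below is hypothetical (it is not part of NumPy) and is written only to mirror the rules above. Its result is compared with the shape NumPy itself reports through np.broadcast.

```python
from itertools import zip_longest

import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: result shape under the rules above, or ValueError."""
    result = []
    # Walk both shapes from the last dimension; a missing dimension counts as 1.
    for x, y in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if x == y or y == 1:
            result.append(x)
        elif x == 1:
            result.append(y)
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} are not compatible")
    return tuple(reversed(result))


ok_1 = broadcast_shape((15, 3, 5), (3, 5))          # (15, 3, 5)
ok_2 = broadcast_shape((8, 1, 6, 1), (7, 1, 5))     # (8, 7, 6, 5)

# Cross-check against NumPy's own broadcasting.
numpy_shape = np.broadcast(np.empty((8, 1, 6, 1)), np.empty((7, 1, 5))).shape

try:
    broadcast_shape((2, 1), (8, 4, 3))
    rejected = False
except ValueError:
    rejected = True

print(ok_1, ok_2, numpy_shape, rejected)
```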

The key to understanding broadcasting is the rule for the shape of the result: each dimension of the result takes the larger of the two sizes being compared. Once that is clear, the specific operations are easier to follow:

# Broadcasting with a one-dimensional array
>>> a = np.arange(4)
>>> a + 10
array([10, 11, 12, 13])

# Broadcasting with a two-dimensional array
>>> b = a.reshape(4, 1)
>>> b
array([[0],
       [1],
       [2],
       [3]])
>>> c = np.ones(5)
>>> b + c
array([[1., 1., 1., 1., 1.],
       [2., 2., 2., 2., 2.],
       [3., 3., 3., 3., 3.],
       [4., 4., 4., 4., 4.]])

We introduced np.newaxis earlier. It can be used to add a dimension to an array so that a broadcast calculation becomes possible:

>>> a = np.array([0.0, 10.0, 20.0, 30.0])
>>> b = np.array([1.0, 2.0, 3.0])
>>> a[:, np.newaxis] + b
array([[  1.,   2.,   3.],
       [ 11.,  12.,  13.],
       [ 21.,  22.,  23.],
       [ 31.,  32.,  33.]])
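
To make the shapes in this example explicit, the sketch below prints them at each step and checks that reshape produces the same column as np.newaxis. The variable names col, total and same are illustrative only.

```python
import numpy as np

a = np.array([0.0, 10.0, 20.0, 30.0])   # shape (4,)
b = np.array([1.0, 2.0, 3.0])           # shape (3,)

col = a[:, np.newaxis]                  # shape (4, 1)
total = col + b                         # (4, 1) with (3,) gives (4, 3)
same = a.reshape(4, 1) + b              # reshape builds the same column

print(col.shape, total.shape, np.array_equal(total, same))
```

Without the extra axis, a + b would compare sizes 4 and 3 directly and fail.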

9. Universal functions

In addition to the arithmetic operators, NumPy provides a set of mathematical functions known as "universal functions" (ufunc). These functions operate element-wise on arrays and produce a new array.

9.1 Unary functions

Function                              Description
np.abs                                Absolute value
np.sqrt                               Square root (the result for a negative number is nan)
np.square                             Square
np.exp                                Exponential (e^x)
np.log, np.log10, np.log2, np.log1p   Logarithm to base e, base 10 and base 2, and natural logarithm of (1+x)
np.sign                               Sign of each value: 1 if greater than 0, 0 if equal to 0, -1 if less than 0
np.ceil                               Round each element up to the nearest integer
np.floor                              Round each element down to the nearest integer
np.clip                               Limit the values in an array to a lower and an upper bound
np.rint, np.round                     Return the rounded value
np.modf                               Split each element into its fractional and integer parts, returned as two arrays
np.isnan                              Test whether each element is nan
np.isinf                              Test whether each element is inf
np.sin, np.cos, np.tan                Trigonometric functions
np.sinh, np.cosh, np.tanh             Hyperbolic functions
np.arccos, np.arcsin, np.arctan       Inverse trigonometric functions
np.nonzero                            Find the non-zero elements and return their indices
np.cumsum                             Cumulative sum
np.diff                               Difference between adjacent elements
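
A few of the functions in this table, applied to a small made-up array. This is a sketch for orientation only; the input values and variable names are assumptions chosen to make the results easy to check by hand.

```python
import numpy as np

x = np.array([-1.5, 0.0, 2.25])

absolute = np.abs(x)          # [1.5, 0., 2.25]
signs = np.sign(x)            # [-1., 0., 1.]
upper = np.ceil(x)            # [-1., 0., 3.]
lower = np.floor(x)           # [-2., 0., 2.]
clipped = np.clip(x, -1, 1)   # [-1., 0., 1.]
frac, whole = np.modf(x)      # frac [-0.5, 0., 0.25], whole [-1., 0., 2.]

steps = np.array([1, 2, 3])
running = np.cumsum(steps)    # [1, 3, 6]
gaps = np.diff(running)       # [2, 3]

print(absolute, signs, upper, lower, clipped, frac, whole, running, gaps)
```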

9.2 Binary functions

Function                       Description
np.add                         Addition
np.subtract                    Subtraction
np.negative                    Negation (-x); takes a single array
np.multiply                    Multiplication
np.divide                      Division
np.floor_divide                Floor division, equivalent to //
np.mod                         Remainder, equivalent to %
np.dot                         Inner product (dot product)
np.greater, np.greater_equal   Function forms of > and >=
np.less, np.less_equal         Function forms of < and <=
np.equal, np.not_equal         Function forms of == and !=
np.logical_and                 Function form of the "and" operator
np.logical_or                  Function form of the "or" operator
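
The function forms give the same results as the corresponding operators. A minimal sketch, reusing the arrays a and b from section 7 (the result names are illustrative):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

added = np.add(a, b)                       # same as a + b
floored = np.floor_divide(b, a)            # same as b // a
in_range = np.logical_and(a > 1, b < 6)    # True where both conditions hold
inner = np.dot(a, b)                       # 1*4 + 2*5 + 3*6

print(added, floored, in_range, inner)
```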

9.3 Aggregate functions

Function     Description
np.sum       Sum of the elements
np.prod      Product of the elements
np.mean      Mean of the elements
np.std       Standard deviation of the elements
np.var       Variance of the elements
np.min       Minimum of the elements
np.max       Maximum of the elements
np.argmin    Index of the minimum value
np.argmax    Index of the maximum value
np.median    Median of the elements
np.average   Weighted average of the elements
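
Most of these functions accept an axis argument that restricts the aggregation to one dimension. The sketch below uses a made-up 2 x 3 array; the weights passed to np.average are arbitrary values chosen for the example.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

total = np.sum(m)                  # all elements: 21
col_sums = np.sum(m, axis=0)       # down each column: [5, 7, 9]
row_means = np.mean(m, axis=1)     # across each row: [2., 5.]
flat_argmax = np.argmax(m)         # index into the flattened array: 5

# Weighted average: (1*3 + 2*2 + 3*1) / (3 + 2 + 1)
weighted = np.average(np.array([1, 2, 3]), weights=[3, 2, 1])

print(total, col_sums, row_means, flat_argmax, weighted)
```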

9.4 Boolean Array Functions

Function   Description
np.all()   Determine whether all elements, or all elements along the specified axis, are True
np.any()   Determine whether any element, or any element along the specified axis, is True
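
These two functions are usually combined with a comparison. A short sketch on a made-up array, with and without the axis argument:

```python
import numpy as np

m = np.array([[1, 0, 3],
              [4, 5, 6]])
positive = m > 0                           # Boolean array, same shape as m

all_positive = np.all(positive)            # False: one element is 0
any_positive = np.any(positive)            # True
by_column = np.all(positive, axis=0)       # [True, False, True]
by_row = np.all(positive, axis=1)          # [False, True]

print(all_positive, any_positive, by_column, by_row)
```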

10. API functions

NumPy's API is rich and very extensive. For this part, refer to the official API documentation.


Origin blog.csdn.net/ZivXu/article/details/128403814