NumPy basic application
NumPy is an open-source Python library for scientific computing that provides fast processing of arrays of arbitrary dimensions and supports common array and matrix operations. For the same numerical task, NumPy code is not only much more concise than native Python code but usually also much faster, often by one to two orders of magnitude. The larger the data, the more obvious the advantage.
The core data type of NumPy is ndarray, which can represent one-dimensional, two-dimensional, and multidimensional arrays. It serves as a fast and flexible container for large amounts of data. The underlying code of NumPy is written in C, so batch operations run in compiled code rather than in the Python interpreter, which eases the limitation of the GIL. An ndarray normally stores its data in a contiguous block of memory, which makes batch operations far more efficient than on a Python list. In addition, ndarray objects provide many more methods for processing data, especially statistical methods, which the native Python list does not have.
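The difference between the two styles can be sketched as follows. This is a minimal illustration, not a benchmark: the array size and the names values, array, list_time, and array_time are assumptions made for this example, and the measured times depend on the machine.

```python
import timeit

import numpy as np

n = 100_000
values = list(range(n))
array = np.arange(n, dtype=np.int64)

# Pure-Python version: square every element with a list comprehension.
squares_list = [v * v for v in values]
# NumPy version: one vectorized expression over the whole array.
squares_array = array * array

# Both approaches compute the same numbers.
print(squares_list[:5], squares_array[:5])

# Rough timing of 20 repetitions each; the ratio varies by machine.
list_time = timeit.timeit(lambda: [v * v for v in values], number=20)
array_time = timeit.timeit(lambda: array * array, number=20)
print(f"list: {list_time:.4f}s, ndarray: {array_time:.4f}s")
```

The vectorized expression needs no explicit loop, which is where both the brevity and most of the speed come from.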
Preparation
-
Start Notebook
jupyter notebook
Tip: Before starting Notebook, it is recommended to install the dependencies used for data analysis, including the three core libraries and related packages: numpy, pandas, matplotlib, openpyxl, and so on. If you use Anaconda, you don't need to install them separately.
-
Import
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Note: If you have started Notebook but have not yet installed the required libraries, you can enter !pip install numpy in a Notebook cell and run it to install NumPy. You can also install several third-party libraries at once by entering %pip install numpy pandas matplotlib in a cell. Note that the code above imports not only NumPy but also the pandas and matplotlib libraries.
Create array objects
There are many ways to create ndarray objects. The following describes how to create one-dimensional, two-dimensional, and multidimensional arrays.
One-dimensional array
-
Method 1: Use the array function to create an array object from a list
code:
array1 = np.array([1, 2, 3, 4, 5])
array1
output:
array([1, 2, 3, 4, 5])
-
Method 2: Use the arange function to create an array object from a specified range of values
code:
array2 = np.arange(0, 20, 2)
array2
output:
array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18])
-
Method 3: Use the linspace function to create an array object of evenly spaced numbers over a specified range
code:
array3 = np.linspace(-5, 5, 101)
array3
output:
array([-5. , -4.9, -4.8, -4.7, -4.6, -4.5, -4.4, -4.3, -4.2, -4.1, -4. , -3.9, -3.8, -3.7, -3.6, -3.5, -3.4, -3.3, -3.2, -3.1, -3. , -2.9, -2.8, -2.7, -2.6, -2.5, -2.4, -2.3, -2.2, -2.1, -2. , -1.9, -1.8, -1.7, -1.6, -1.5, -1.4, -1.3, -1.2, -1.1, -1. , -0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1, 0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. , 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2. , 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3. , 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4. , 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8, 4.9, 5. ])
-
Method 4: Use the functions of the numpy.random module to generate random numbers and create an array object
Generate 10 random decimals in the range $[0, 1)$, code:
array4 = np.random.rand(10)
array4
output:
array([0.45556132, 0.67871326, 0.4552213 , 0.96671509, 0.44086463, 0.72650875, 0.79877188, 0.12153022, 0.24762739, 0.6669852 ])
Generate 10 random integers in the range $[1, 100)$, code:
array5 = np.random.randint(1, 100, 10)
array5
output:
array([29, 97, 87, 47, 39, 19, 71, 32, 79, 34])
Generate 20 random numbers from a normal distribution with $\mu = 50$ and $\sigma = 10$, code:
array6 = np.random.normal(50, 10, 20)
array6
output:
array([55.04155586, 46.43510797, 20.28371158, 62.67884053, 61.23185964, 38.22682148, 53.17126151, 43.54741592, 36.11268017, 40.94086676, 63.27911699, 46.92688903, 37.1593374 , 67.06525656, 67.47269463, 23.37925889, 31.45312239, 48.34532466, 55.09180924, 47.95702787])
Note: There are many other ways to create a one-dimensional array, such as reading from strings, reading from files, and parsing regular expressions. They are not discussed here; interested readers can explore them on their own.
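Two details of the functions above are easy to miss, so here is a small sketch. It assumes nothing beyond NumPy itself; the names by_step, by_count, rng_a, and rng_b and the seed 42 are chosen only for illustration, and np.random.default_rng is an alternative to the np.random.rand and np.random.randint calls used in this article, not what the article itself uses.

```python
import numpy as np

# arange takes a step and excludes the stop value.
by_step = np.arange(0, 1, 0.25)
# linspace takes the number of points and includes the stop value by default.
by_count = np.linspace(0, 1, 5)
print(by_step)    # [0.   0.25 0.5  0.75]
print(by_count)   # [0.   0.25 0.5  0.75 1.  ]

# Two generators created with the same seed produce the same random array.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
sample_a = rng_a.integers(1, 100, 10)   # 10 integers in [1, 100)
sample_b = rng_b.integers(1, 100, 10)
print(np.array_equal(sample_a, sample_b))   # True
```

Seeding is useful when an example must give the same output on every run. The random outputs shown in this article were not seeded, so your results will differ from them.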
Two-dimensional array
-
Method 1: Use the array function to create an array object from a nested list
code:
array7 = np.array([[1, 2, 3], [4, 5, 6]])
array7
output:
array([[1, 2, 3],
       [4, 5, 6]])
-
Method 2: Use the zeros, ones, and full functions to create an array object by specifying the shape of the array
Use the zeros function, code:
array8 = np.zeros((3, 4))
array8
output:
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
Use the ones function, code:
array9 = np.ones((3, 4))
array9
output:
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])
Use the full function, code:
array10 = np.full((3, 4), 10)
array10
output:
array([[10, 10, 10, 10],
       [10, 10, 10, 10],
       [10, 10, 10, 10]])
-
Method 3: Use the eye function to create an identity matrix
code:
array11 = np.eye(4)
array11
output:
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])
-
Method 4: Use reshape to convert a one-dimensional array into a two-dimensional array
code:
array12 = np.array([1, 2, 3, 4, 5, 6]).reshape(2, 3)
array12
output:
array([[1, 2, 3],
       [4, 5, 6]])
Tip: reshape is a method of the ndarray object. When using it, make sure the number of elements after reshaping equals the number of elements before reshaping; otherwise an exception is raised.
-
Method 5: Use the functions of the numpy.random module to generate random numbers and create an array object
Generate a two-dimensional array of 3 rows and 4 columns of random decimals in the range $[0, 1)$, code:
array13 = np.random.rand(3, 4)
array13
output:
array([[0.54017809, 0.46797771, 0.78291445, 0.79501326],
       [0.93973783, 0.21434806, 0.03592874, 0.88838892],
       [0.84130479, 0.3566601 , 0.99935473, 0.26353598]])
Generate a two-dimensional array of 3 rows and 4 columns of random integers in the range $[1, 100)$, code:
array14 = np.random.randint(1, 100, (3, 4))
array14
output:
array([[83, 30, 64, 53],
       [39, 92, 53, 43],
       [43, 48, 91, 72]])
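The element-count rule for reshape mentioned in the tip of Method 4 can be checked with a short sketch. The names flat and three_rows are made up for this example; the -1 argument is a standard reshape feature that the text above does not cover.

```python
import numpy as np

flat = np.arange(1, 13)            # 12 elements

# -1 asks NumPy to infer that dimension from the total element count.
three_rows = flat.reshape(3, -1)
print(three_rows.shape)            # (3, 4)

# 12 elements cannot fill a 5 x 3 array, so reshape raises ValueError.
try:
    flat.reshape(5, 3)
    reshape_failed = False
except ValueError as error:
    reshape_failed = True
    print("reshape failed:", error)
```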
Multidimensional Arrays
-
Create a multidimensional array using random numbers
code:
array15 = np.random.randint(1, 100, (3, 4, 5))
array15
output:
array([[[94, 26, 49, 24, 43], [27, 27, 33, 98, 33], [13, 73, 6, 1, 77], [54, 32, 51, 86, 59]], [[62, 75, 62, 29, 87], [90, 26, 6, 79, 41], [31, 15, 32, 56, 64], [37, 84, 61, 71, 71]], [[45, 24, 78, 77, 41], [75, 37, 4, 74, 93], [ 1, 36, 36, 60, 43], [23, 84, 44, 89, 79]]])
-
Reshape a one-dimensional or two-dimensional array into a multidimensional array
Reshape a one-dimensional array into a multidimensional array, code:
array16 = np.arange(1, 25).reshape((2, 3, 4))
array16
output:
array([[[ 1, 2, 3, 4], [ 5, 6, 7, 8], [ 9, 10, 11, 12]], [[13, 14, 15, 16], [17, 18, 19, 20], [21, 22, 23, 24]]])
Reshape a two-dimensional array into a multidimensional array, code:
array17 = np.random.randint(1, 100, (4, 6)).reshape((4, 3, 2))
array17
output:
array([[[60, 59], [31, 80], [54, 91]], [[67, 4], [ 4, 59], [47, 49]], [[16, 4], [ 5, 71], [80, 53]], [[38, 49], [70, 5], [76, 80]]])
-
Read an image to get the corresponding three-dimensional array
code:
array18 = plt.imread('guido.jpg')
array18
output:
array([[[ 36, 33, 28], [ 36, 33, 28], [ 36, 33, 28], ..., [ 32, 31, 29], [ 32, 31, 27], [ 31, 32, 26]], [[ 37, 34, 29], [ 38, 35, 30], [ 38, 35, 30], ..., [ 31, 30, 28], [ 31, 30, 26], [ 30, 31, 25]], [[ 38, 35, 30], [ 38, 35, 30], [ 38, 35, 30], ..., [ 30, 29, 27], [ 30, 29, 25], [ 29, 30, 25]], ..., [[239, 178, 123], [237, 176, 121], [235, 174, 119], ..., [ 78, 68, 56], [ 75, 67, 54], [ 73, 65, 52]], [[238, 177, 120], [236, 175, 118], [234, 173, 116], ..., [ 82, 70, 58], [ 78, 68, 56], [ 75, 66, 51]], [[238, 176, 119], [236, 175, 118], [234, 173, 116], ..., [ 84, 70, 61], [ 81, 69, 57], [ 79, 67, 53]]], dtype=uint8)
Explanation: The code above reads the image file named guido.jpg in the current path. An image in a computer system is usually made up of rows and columns of pixels, and each pixel is made up of the three primary colors red, green, and blue, so the image can be represented by a three-dimensional array. The image is read with the imread function of the matplotlib library.
Properties of the array object
-
size attribute: the number of elements in the array
code:
array19 = np.arange(1, 100, 2)
array20 = np.random.rand(3, 4)
print(array19.size, array20.size)
output:
50 12
-
shape attribute: the shape of the array
code:
print(array19.shape, array20.shape)
output:
(50,) (3, 4)
-
dtype attribute: the data type of the array elements
code:
print(array19.dtype, array20.dtype)
output:
int64 float64
For the data types available to ndarray elements, refer to the table shown below.
-
ndim attribute: the number of dimensions of the array
code:
print(array19.ndim, array20.ndim)
output:
1 2
-
itemsize attribute: the number of bytes of memory occupied by a single element of the array
code:
array21 = np.arange(1, 100, 2, dtype=np.int8)
print(array19.itemsize, array20.itemsize, array21.itemsize)
output:
8 8 1
Description: When creating an array object with arange, the data type of the elements can be specified through the dtype parameter. Here np.int8 represents an 8-bit signed integer, which occupies only 1 byte of memory and has the value range $[-128, 127]$.
-
nbytes attribute: the number of bytes of memory occupied by all elements of the array
code:
print(array19.nbytes, array20.nbytes, array21.nbytes)
output:
400 96 50
-
flat attribute: an iterator over the elements of the array (flattened to one dimension)
code:
from collections.abc import Iterable
print(isinstance(array20.flat, np.ndarray), isinstance(array20.flat, Iterable))
output:
False True
-
base attribute: the base object of the array (if the array shares the memory of another array)
code:
array22 = array19[:]
print(array22.base is array19, array22.base is array21)
output:
True False
Explanation: The code above uses the slice operation of the array. It is similar to slicing a list in Python, but the details are not exactly the same; this is explained further below. The code shows that the new array object obtained by slicing an ndarray shares its data in memory with the original array object, so the base attribute of array22 is the array object referred to by array19.
Array indexing and slicing
Like lists in Python, NumPy ndarray objects support indexing and slicing. Elements of the array can be read or modified through indexing, and part of the array can be extracted through slicing.
-
Index operation (ordinary index)
One-dimensional array, code:
array23 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(array23[0], array23[array23.size - 1])
print(array23[-array23.size], array23[-1])
output:
1 9
1 9
Two-dimensional array, code:
array24 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(array24[2])
print(array24[0][0], array24[-1][-1])
print(array24[1][1], array24[1, 1])
output:
[7 8 9]
1 9
5 5
code:
array24[1][1] = 10
print(array24)
array24[1] = [10, 11, 12]
print(array24)
output:
[[ 1  2  3]
 [ 4 10  6]
 [ 7  8  9]]
[[ 1  2  3]
 [10 11 12]
 [ 7  8  9]]
-
Slice operation (slice index)
Slicing uses the syntax [start:stop:step]. By specifying the start index, the stop index, and the step (default 1), the specified elements are taken from the array to form a new array. With a positive step, the slice by default begins at the first element and runs through the last element. Because all three values have defaults, each can be omitted, and the second colon can also be omitted when the step is not specified. Slicing a one-dimensional array is very similar to slicing a list in Python and is not repeated here. For slicing a two-dimensional array, refer to the following code, which should be easy to understand.
code:
# Re-create array24, because the assignments above modified its second row
array24 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(array24[:2, 1:])
output:
[[2 3]
 [5 6]]
code:
print(array24[2])
print(array24[2, :])
output:
[7 8 9]
[7 8 9]
code:
print(array24[2:, :])
output:
[[7 8 9]]
code:
print(array24[:, :2])
output:
[[1 2]
 [4 5]
 [7 8]]
code:
print(array24[1, :2])
print(array24[1:2, :2])
output:
[4 5]
[[4 5]]
code:
print(array24[::2, ::2])
output:
[[1 3]
 [7 9]]
code:
print(array24[::-2, ::-2])
output:
[[9 7]
 [3 1]]
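Since the one-dimensional case was skipped above, the following sketch shows how omitted start, stop, and step values behave. The names numbers and grid are chosen for illustration only.

```python
import numpy as np

numbers = np.arange(10)            # [0 1 2 3 4 5 6 7 8 9]

print(numbers[:3])                 # start omitted: [0 1 2]
print(numbers[7:])                 # stop omitted: [7 8 9]
print(numbers[::3])                # only the step given: [0 3 6 9]
print(numbers[::-1][:3])           # a negative step walks backwards: [9 8 7]

grid = numbers[:9].reshape(3, 3)
# One slice per axis, separated by a comma: rows from 1 on, columns 0 and 1.
print(grid[1:, :2].tolist())       # [[3, 4], [6, 7]]
```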
For array indexing and slicing, the following two pictures can help reinforce your understanding. Both pictures come from the book "Python for Data Analysis", a classic textbook in the field of Python data analysis written by Wes McKinney, the author of the pandas library; interested readers can buy and read the original book.
-
Fancy index
Fancy indexing means indexing with an array of integers. The integer array can be a NumPy ndarray or a Python list, and it can contain positive or negative indexes. Note that a tuple used directly as an index is treated as one index per axis rather than as a fancy index.
Fancy indexing of 1D arrays, code:
array25 = np.array([50, 30, 15, 20, 40])
array25[[0, 1, -1]]
output:
array([50, 30, 40])
Fancy indexing of 2D arrays, code:
array26 = np.array([[30, 20, 10], [40, 60, 50], [10, 90, 80]])
# Take the 1st and 3rd rows of the two-dimensional array
array26[[0, 2]]
output:
array([[30, 20, 10],
       [10, 90, 80]])
code:
# Take two elements: row 1, column 2 and row 3, column 3
array26[[0, 2], [1, 2]]
output:
array([20, 80])
code:
# Take two elements: row 1, column 2 and row 3, column 2
array26[[0, 2], 1]
output:
array([20, 90])
-
Boolean index
Boolean indexing selects array elements through an array of Boolean type. The Boolean array can be constructed manually or generated by relational operations.
code:
array27 = np.arange(1, 10)
array27[[True, False, True, True, False, False, False, False, True]]
output:
array([1, 3, 4, 9])
code:
array27 >= 5
output:
array([False, False, False, False, True, True, True, True, True])
code:
# The ~ operator performs logical negation; compare the result with the one above
~(array27 >= 5)
output:
array([ True, True, True, True, False, False, False, False, False])
code:
array27[array27 >= 5]
output:
array([5, 6, 7, 8, 9])
Tip: Although a slicing operation creates a new array object, the new array and the original array share the underlying data. Simply put, whether the data is modified through the new array object or the original one, the same block of data is modified. Fancy indexing and Boolean indexing also create a new array object, but the new array holds a copy of the original elements, so the new array and the original array do not share data. This can also be checked with the base attribute of the array described above. Keep this difference in mind when using them.
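The difference described in the tip can be observed directly. In this sketch the names original, sliced, fancy, and masked are assumptions for the example, and np.shares_memory is used here as one way to check whether two arrays share data.

```python
import numpy as np

original = np.array([10, 20, 30, 40, 50])

sliced = original[1:4]              # slicing: shares data with original
fancy = original[[1, 2, 3]]         # fancy indexing: independent copy
masked = original[original > 15]    # Boolean indexing: independent copy

sliced[0] = -1                      # also changes original[1]
fancy[1] = -2                       # changes only the copy
masked[2] = -3                      # changes only the copy

print(original)                     # [10 -1 30 40 50]
print(np.shares_memory(original, sliced))   # True
print(np.shares_memory(original, fancy))    # False
print(np.shares_memory(original, masked))   # False
```

When an independent array is needed from a slice, calling copy() on the slice avoids accidental changes to the original.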
Case: Processing an image by array slicing
Learning the basics alone can be dry and offer little sense of accomplishment, so here is a case that puts the indexing and slicing operations above to use. As mentioned earlier, an image can be represented by a three-dimensional array, so the image can be processed by operating on that array, as shown below.
Read in the image to create a three-dimensional array object.
guido_image = plt.imread('guido.jpg')
plt.imshow(guido_image)
Perform reverse slice on the 0-axis of the array to realize the vertical flip of the image.
plt.imshow(guido_image[::-1])
Slice the 1-axis of the array in reverse to realize the horizontal flip of the image.
plt.imshow(guido_image[:,::-1])
Cut Guido's head out.
plt.imshow(guido_image[30:350, 90:300])
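The same slices can be tried without the picture file. The sketch below uses a tiny made-up array in place of guido_image and compares the results with np.flipud and np.fliplr, two NumPy helpers that the case above does not use; the name image and its 2 x 3 size are assumptions for illustration.

```python
import numpy as np

# A tiny stand-in "image": 2 rows x 3 columns x 3 color channels.
image = np.arange(18, dtype=np.uint8).reshape(2, 3, 3)

vertical = image[::-1]          # reverse axis 0: flip top to bottom
horizontal = image[:, ::-1]     # reverse axis 1: flip left to right
cropped = image[0:1, 1:3]       # keep row 0 and columns 1 and 2

print(np.array_equal(vertical, np.flipud(image)))     # True
print(np.array_equal(horizontal, np.fliplr(image)))   # True
print(cropped.shape)                                  # (1, 2, 3)
```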
Methods of Array Objects
Statistical methods
The statistical methods mainly include sum(), mean(), std(), var(), min(), max(), argmin(), argmax(), and cumsum(), which compute the sum, mean, standard deviation, variance, minimum, maximum, index of the minimum, index of the maximum, and cumulative sum of the array elements, respectively. Please refer to the following code.
array28 = np.array([1, 2, 3, 4, 5, 5, 4, 3, 2, 1])
print(array28.sum())
print(array28.mean())
print(array28.max())
print(array28.min())
print(array28.std())
print(array28.var())
print(array28.cumsum())
output:
30
3.0
5
1
1.4142135623730951
2.0
[ 1 3 6 10 15 20 24 27 29 30]
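The code above does not call argmin() or argmax(), and it works only on a one-dimensional array. The following sketch fills that gap under the assumption of a small made-up array named scores; the axis parameter shown here is a standard option of these methods that the text does not discuss.

```python
import numpy as np

scores = np.array([[70, 85, 90],
                   [60, 95, 80]])

# Without an axis, argmax and argmin return positions in the flattened array.
print(scores.argmax(), scores.argmin())   # 4 3

# axis=0 aggregates down each column, axis=1 across each row.
print(scores.sum(axis=0))                 # [130 180 170]
print(scores.max(axis=1))                 # [90 95]
print(scores.cumsum(axis=1).tolist())     # [[70, 155, 245], [60, 155, 235]]
```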
Other methods
-
all() / any() methods: determine whether all elements of the array are True / whether any element of the array is True.
-
astype() method: copy the array and convert its elements to the specified type.
-
dump() method: save the array to a file; the load() function in NumPy can then create an array by loading the data from the saved file.
code:
# array31 is defined here so that the example runs on its own
array31 = np.array([[1, 2], [3, 4], [5, 6]])
array31.dump('array31-data')
array32 = np.load('array31-data', allow_pickle=True)
array32
output:
array([[1, 2], [3, 4], [5, 6]])
-
fill() method: fill the array with the specified element.
-
flatten() method: flatten a multidimensional array into a one-dimensional array.
code:
array32.flatten()
output:
array([1, 2, 3, 4, 5, 6])
-
nonzero() method: return the indices of the non-zero elements.
-
round() method: round the elements of the array.
-
sort() method: sort the array in place.
code:
array33 = np.array([35, 96, 12, 78, 66, 54, 40, 82])
array33.sort()
array33
output:
array([12, 35, 40, 54, 66, 78, 82, 96])
-
swapaxes() and transpose() methods: swap the specified axes of the array.
code:
# Specify the two axes to swap; the order does not matter
array32.swapaxes(0, 1)
output:
array([[1, 3, 5],
       [2, 4, 6]])
code:
# For a two-dimensional array, transpose is equivalent to the matrix transpose
array32.transpose()
output:
array([[1, 3, 5], [2, 4, 6]])
-
tolist() method: convert the array into a Python list.
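Several of the methods listed above are described without code. A minimal sketch of them follows, using a made-up array named values and a second array named filled; both names are assumptions for illustration.

```python
import numpy as np

values = np.array([0.0, 1.26, -2.74, 0.0, 3.5])

# all() and any() applied to Boolean arrays built from comparisons.
print((values >= -3).all(), (values > 3).any())   # True True

print(values.nonzero()[0])      # indices of non-zero elements: [1 2 4]
print(values.round(1))          # rounded to one decimal place
print(values.astype(int))       # converts by dropping the fraction: [ 0  1 -2  0  3]
print(values.tolist())          # a plain Python list of floats

filled = np.empty(3, dtype=int)
filled.fill(7)                  # overwrite every element with 7
print(filled)                   # [7 7 7]
```

Note that nonzero() returns a tuple with one index array per dimension, which is why the example takes element 0 of the result.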