Python--Numpy library study notes

ndarray

  • ndarray is an N-dimensional array object, a fast and flexible container for large data sets
  • You can use this array (ndarray) to perform mathematical operations on the entire block of data

Import the NumPy library

import numpy as np

Generate some random data

data = np.random.randn(2, 3)

–numpy.random.randn()

  • The randn function returns a sample or a set of samples with a standard normal distribution
  • The standard normal distribution, also called the z-distribution, is a normal distribution with mean 0 and standard deviation 1, denoted N(0, 1)

--The arguments d0, d1, ..., dn give the size of each dimension; randn(2, 3) returns an array with 2 rows and 3 columns

Arrays can perform mathematical operations

  • Multiply every element in the array by 10
data * 10

  • Add two arrays, each element in the array is added correspondingly
data + data
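
Since the random values differ on every run, here is a minimal sketch of the same two operations on a small fixed array. The array `demo` and its values are made up for illustration.

```python
import numpy as np

# Fixed values instead of random ones, so the results are reproducible.
demo = np.array([[1.0, -2.0, 0.5], [3.0, 0.0, -1.5]])

scaled = demo * 10     # every element is multiplied by 10
doubled = demo + demo  # elements at matching positions are added

print(scaled.tolist())   # [[10.0, -20.0, 5.0], [30.0, 0.0, -15.0]]
print(doubled.tolist())  # [[2.0, -4.0, 1.0], [6.0, 0.0, -3.0]]
```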

An ndarray is a general-purpose multidimensional container for homogeneous data: all elements must be of the same type.
Each array has

  • shape (a tuple representing the size of each dimension)
  • dtype (an object used to describe the data type of an array)

Take a look at the shape and dtype of the data array

data.shape

It can be seen that data is an array with 2 rows and 3 columns

data.dtype

It can be seen that the data type of the data array is 'float64'

Create ndarray

The array function creates an ndarray. It accepts any sequence-like object (including other arrays).
Take the conversion of a list as an example:

data1 = [6, 7.5, 8, 0, 1]
arr1 = np.array(data1)
arr1

You can see that the list data1 is converted to the array arr1

If the list is composed of a set of equal-length lists, the array function will convert it into a multi-dimensional array

data2 = [[1, 2, 3, 4], [5, 6, 7, 8]]
arr2 = np.array(data2)
arr2

It can be seen that the list consisting of two lists is converted into a 2-dimensional array

We can use the attributes ndim and shape to verify

arr2.ndim

-ndim returns a single number: the number of dimensions (axes) of the array

arr2.shape

-Unless told otherwise, np.array infers the most suitable data type for the array it creates
-For arr1 and arr2 in the example above, we can check the resulting data type with the dtype attribute:
if the list contains any decimals, the array is floating point --> 'float64';
if the list is all integers, the array is an integer type --> 'int32' here (the default integer type is platform dependent and is often 'int64')
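
The inference rules above can be checked directly. This is a small sketch that reuses the two lists from the examples; the integer width printed depends on the platform.

```python
import numpy as np

arr1 = np.array([6, 7.5, 8, 0, 1])             # one decimal makes the whole array float
arr2 = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])  # all integers

print(arr1.dtype)             # float64
print(arr2.dtype)             # platform default integer type
print(arr2.ndim, arr2.shape)  # 2 (2, 4)
```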

--The np.zeros() function creates an array of all 0s with a specified length or shape.
If a number 10 is passed in, then a one-dimensional array of 10 '0's will be created by default

np.zeros(10)

The data type of the created array is floating point

np.zeros(10).dtype

Pass in a tuple representing the shape, you can create a multi-dimensional array,
such as creating a two-dimensional array of all '0's with 3 rows and 6 columns

np.zeros((3, 6))

--The np.empty() function creates an uninitialized array of a specified length or shape; you can also pass in a tuple to create a multi-dimensional array,
such as a three-dimensional uninitialized array of shape (2, 3, 2): 2 blocks, each with 3 rows and 2 columns

np.empty((2, 3, 2))

Note: It is not safe to assume that np.empty returns an array of all 0s. In many cases, it returns uninitialized garbage values.
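
A short sketch of the practical consequence: rely on np.zeros when zeros are needed, and treat an np.empty buffer as scratch space that must be written before it is read. The name `buf` is only illustrative.

```python
import numpy as np

zeros = np.zeros((3, 6))   # guaranteed to contain only 0.0
buf = np.empty((2, 3, 2))  # contents are arbitrary until written
buf[:] = 0.0               # fill explicitly before reading any value

print(zeros.shape, buf.shape)  # (3, 6) (2, 3, 2)
```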

np.arange()

--NumPy's arange is the array version of the Python built-in function 'range'.
Passing in a parameter N generates an integer array from 0 to (N-1)

For example, create an integer array of 0-14

np.arange(15)

np.arange(15).dtype

Note: arange with integer arguments returns an integer array. Because NumPy focuses on numerical calculations, many other creation functions (zeros, ones, empty) default to 'float64' (floating-point numbers) when no dtype is specified

The following are some commonly used array creation functions

  • array: Convert the input data to an ndarray (multidimensional array); if dtype is not specified, the most suitable data type is inferred from the source data
  • asarray: Convert the input to an ndarray. The difference from array: when the source is already an ndarray, array makes a copy by default,
    but asarray does not (as long as no dtype conversion is needed)
  • arange: similar to the Python built-in function range, but arange returns an ndarray, while the built-in range returns a range object (a list in Python 2)
  • ones: Create an array of all '1's with the specified shape and dtype; the default dtype is 'float64'
  • ones_like: Take another array as a parameter and create an array of all '1's with the same shape and dtype
  • zeros, zeros_like: similar to ones and ones_like, but creates an array of all '0's
  • empty, empty_like: similar to ones and ones_like, but only allocates memory without filling in any values (the contents are uninitialized garbage values)
  • full: Create an array with the specified shape and dtype in which every element is set to fill_value
  • full_like: Create an array with the same shape and dtype as another array, with every element set to fill_value
  • eye: Given N, create a square N * N identity matrix (1 on the diagonal, 0 elsewhere); the dtype is floating point by default
  • identity: Same result as np.eye(N)
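
A few of the functions in the list, sketched side by side. The variable names are illustrative; the point of interest is that asarray hands back the very same object when the input is already an ndarray of the right type, while array copies.

```python
import numpy as np

src = np.arange(4)

copied = np.array(src)   # array() copies an existing ndarray by default
same = np.asarray(src)   # asarray() returns the input itself when no conversion is needed

ones = np.ones_like(src)       # shape and dtype taken from src
filled = np.full((2, 3), 7.5)  # every element is the fill value
eye = np.eye(3)                # 3 x 3 identity matrix
ident = np.identity(3)         # same result as np.eye(3)

print(same is src, copied is src)  # True False
print(filled.tolist())             # [[7.5, 7.5, 7.5], [7.5, 7.5, 7.5]]
```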

dtype

-A dtype contains the information needed to interpret a chunk of an ndarray's memory as a specific data type

Common Numpy data types

  • int8: Type code: i1 --> signed 8-bit (1 byte) integer
  • uint8: Type code: u1 --> unsigned 8-bit (1 byte) integer
  • int16: Type code: i2 --> signed 16-bit (2 bytes) integer
  • uint16: Type code: u2 --> unsigned 16-bit (2 bytes) integer
  • int32: Type code: i4 --> signed 32-bit (4 bytes) integer
  • uint32: Type code: u4 --> unsigned 32-bit (4 bytes) integer
  • int64: Type code: i8 --> signed 64-bit (8 bytes) integer
  • uint64: Type code: u8 --> unsigned 64-bit (8 bytes) integer
  • float16: Type code: f2 --> half-precision floating-point number
  • float32: Type code: f4 or f --> standard single-precision floating-point number (compatible with C float)
  • float64: Type code: f8 or d --> standard double-precision floating-point number (compatible with C double and Python float objects)
  • float128: Type code: f16 or g --> extended-precision floating-point number (availability is platform dependent)
  • complex64: Type code: c8 --> a complex number represented by two 32-bit floating-point numbers
  • complex128: Type code: c16 --> a complex number represented by two 64-bit floating-point numbers
  • complex256: Type code: c32 --> a complex number represented by two 128-bit floating-point numbers (availability is platform dependent)
  • bool: Type code: ? --> Boolean type storing True and False
  • object: Type code: O --> Python object type
  • string_ (bytes_ in NumPy 2.0 and later): Type code: S --> fixed-length byte string type (1 byte per character); for example, to create a string of length 10, use S10
  • unicode_ (str_ in NumPy 2.0 and later): Type code: U --> fixed-length Unicode type (4 bytes per character); the length is specified in the same way, for example U10
– You can explicitly convert an array from one dtype to another dtype by using the astype method of ndarray.
Suppose there is an integer array arr

arr = np.array([1, 2, 3, 4, 5])

Convert arr to floating point

float_arr = arr.astype(np.float64)

If you convert floating-point numbers to integers, the fractional part is truncated (dropped toward zero, not rounded).
For example, there is a floating-point array arr2

arr2 = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])

Convert to integer

int_arr2 = arr2.astype(np.int32)

You can see that the new array int_arr2 drops the fractional part of each element of the original array
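
To make the truncation concrete, the sketch below converts the same values twice: once directly, and once after rounding to the nearest integer with np.rint. The rounding step is an addition for comparison, not part of the original notes.

```python
import numpy as np

arr2 = np.array([3.7, -1.2, -2.6, 0.5, 12.9, 10.1])

truncated = arr2.astype(np.int32)         # fractional part dropped, toward zero
rounded = np.rint(arr2).astype(np.int32)  # round to nearest first, then convert

print(truncated.tolist())  # [3, -1, -2, 0, 12, 10]
print(rounded.tolist())    # [4, -1, -3, 0, 13, 10]
```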

You can also use astype to convert a string array into a numeric form.
Suppose there is a string array numeric_strings

# np.bytes_ is the current name of np.string_ (the old alias was removed in NumPy 2.0)
numeric_strings = np.array(['1.25', '-9.6', '42'], dtype = np.bytes_)

  • Note: When using the np.string_ type (np.bytes_ in NumPy 2.0 and later), pay attention to the length of the strings: NumPy string data has a fixed size, and input that is too long is truncated without any error
  • If the string --> numeric conversion fails during astype (for example, "one" cannot be converted to the number 1), a ValueError is raised
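
The conversion itself and the two notes above can be sketched as follows. It uses np.bytes_, which exists in both old and new NumPy versions; the names `short` and `error` are illustrative.

```python
import numpy as np

numeric_strings = np.array(['1.25', '-9.6', '42'], dtype=np.bytes_)
floats = numeric_strings.astype(float)  # byte strings parsed as floating-point numbers
print(floats.tolist())                  # [1.25, -9.6, 42.0]

# Fixed-size strings: input longer than the declared length is cut off silently.
short = np.array(['12345'], dtype='S3')
print(short[0])                         # b'123'

# A string that is not a number cannot be converted.
error = None
try:
    np.array(['one'], dtype=np.bytes_).astype(float)
except ValueError as exc:
    error = exc
print(type(error).__name__)             # ValueError
```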

-We can also pass the dtype of an array as a parameter to the astype method.
Suppose there is an integer array int_array and a floating-point array calibers

int_array = np.arange(10)
calibers = np.array([.22, .270, .357, .380, .44, .50], dtype = np.float64)

If you want to convert int_array into the same floating-point array as calibers, you can pass the dtype of calibers into the parameter of the astype method

int_float = int_array.astype(calibers.dtype)
int_float.dtype

-You can also use the type code of the data type to represent dtype

empty_uint32 = np.empty(8, dtype = 'u4')

Numpy array operations

  • Any arithmetic operation between arrays of equal size is applied element-wise

Create a two-dimensional array arr with 2 rows and 3 columns

arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr.shape

Multiplying arrays of equal size multiplies the corresponding elements

arr * arr

Subtracting arrays of equal size subtracts the corresponding elements

arr - arr

Arithmetic operations between an array and a scalar apply the scalar operation to every element

1 / arr

Comparisons between arrays of the same size produce a Boolean array.
Let us create a two-dimensional array arr2 with the same size as arr

arr2 = np.array([[0., 4., 1.], [7., 2., 12.]])

– Compare arr and arr2

arr2 > arr

You can see that a Boolean array of the same size is generated; each element is the result of comparing the corresponding elements of arr2 and arr.
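
For reference, the operations in this section produce the following values for the two arrays defined above (a sketch that simply reruns them and prints the results as nested lists).

```python
import numpy as np

arr = np.array([[1., 2., 3.], [4., 5., 6.]])
arr2 = np.array([[0., 4., 1.], [7., 2., 12.]])

print((arr * arr).tolist())   # [[1.0, 4.0, 9.0], [16.0, 25.0, 36.0]]
print((arr - arr).tolist())   # [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print((1 / arr)[0, 1])        # 0.5
print((arr2 > arr).tolist())  # [[False, True, False], [True, False, True]]
```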

Basic indexing and slicing

One-dimensional array slice

  • On the surface, one-dimensional array slicing is similar to Python's list slicing function

Create a 0-9 integer array arr

arr = np.arange(10)

  • Take the sixth element in arr
arr[5]

  • Take the 6th to 8th elements in arr; the slice includes the start index and excludes the end index
arr[5: 8]

  • You can assign a value to a slice, and the source array is modified in place.
    For example, assign 12 to the 6th through 8th elements of arr
arr[5: 8] = 12

As you can see, arr is modified in place
-So the most important difference between array slices and Python list slices is that an array slice is a view of the original data: the data is not copied, and any modification to the view is reflected in the source array

For example, create a slice of arr_slice

arr_slice = arr[5: 8]

  • Note: If we modify the value of arr_slice, the change will be reflected in the original array arr

For example, assign the second number of arr_slice to '12345'

arr_slice[1] = 12345

As you can see, the data of the source array is also modified

– The slice [:] selects all the values in the array.
For example, we assign 64 to every element in arr_slice

arr_slice[:] = 64

It can be seen that each element in arr_slice is assigned a value of '64'

  • The reason this differs from native Python slicing is that NumPy is designed to handle large data sets. Copying the data on every slice would cost a lot of performance and memory.
  • Of course, if you want to get a slice copy of the ndarray (array) instead of the view, you can add a copy() method after the slice

For example, we want to get a slice copy of the 6th to 8th elements of arr and assign it with '6'

arr3 = arr[5: 8].copy()
arr3[:] = 6

As you can see, operating on the copy of the array slice does not affect the source array
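
The difference between a view, a Python list slice, and an explicit copy can be sketched on a fresh array. The list `lst` and the name `independent` are introduced here only for comparison.

```python
import numpy as np

arr = np.arange(10)
arr_slice = arr[5:8]    # a view: shares memory with arr
arr_slice[1] = 12345
print(arr.tolist())     # [0, 1, 2, 3, 4, 5, 12345, 7, 8, 9]

lst = list(range(10))
lst_slice = lst[5:8]    # a Python list slice is a new list
lst_slice[1] = 12345
print(lst[6])           # 6, the original list is unchanged

independent = arr[5:8].copy()  # explicit copy: no shared memory
independent[:] = 6
print(arr[5:8].tolist())       # [5, 12345, 7]
```

np.shares_memory(arr, arr_slice) is one way to check whether two arrays refer to the same data.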

Two-dimensional array

-In a two-dimensional array, the element at each index is no longer a scalar, as in a one-dimensional array, but a one-dimensional array

For example, create a two-dimensional array arr2d

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

Take the third element of arr2d

arr2d[2]

The result is a one-dimensional array

– If you want to get a single scalar element, you can index one level at a time, or pass a comma-separated list of indices.
For example, I want the third scalar of the first one-dimensional array in the two-dimensional array

arr2d[0, 2]

Indexing level by level with arr2d[0][2] gives the same element

Multidimensional Arrays

– The principle of multi-dimensional arrays is similar to that of two-dimensional arrays

We create a 2 * 2 * 3 three-dimensional array arr3d

arr3d = np.array([[[1, 2, 3], [4, 5, 6]],
                 [[7, 8, 9], [10, 11, 12]]])

  • If you index arr3d with a single scalar, what is returned is a two-dimensional array, one dimension lower

For example, I want to get the first two-dimensional array of arr3d

arr3d[0]

As you can see, what is returned is a 2 * 3 two-dimensional array.
Each additional index reduces the dimension of the result by one, until a scalar value is returned.

– Both scalar values and arrays can be assigned to arr3d[0]

Before showing the assignment, we first save a copy of arr3d[0] as old_values so that the source array can be restored

old_values = arr3d[0].copy()

Then, we assign 42 to the first two-dimensional array of arr3d

arr3d[0] = 42

As you can see, each element in the first two-dimensional array obtained by slicing is assigned a value of 42

We use the original copy to restore the source array

arr3d[0] = old_values
  • Note that unless you copy explicitly, a slice is a view of the source array, and modifying the slice changes the source array

Slice index

– The slicing syntax of ndarray is similar to that of one-dimensional objects such as Python lists.
Looking at the earlier one-dimensional array arr, we take the second to sixth elements of arr

arr[1:6]

For a 3 * 3 two-dimensional array arr2d, the slice selects elements along one axis, the 0th axis is the row, and the 1st axis is the column.
We select the first two rows of the two-dimensional array

arr2d[:2]

It can be seen here that arr2d[:2] is shorthand for arr2d[0:2], which selects the first two rows of arr2d (start included, end excluded)

-It is also possible to pass in multiple slices at once. For example, to select everything from the second column onward in the first two rows of arr2d

arr2d[ :2, 1: ]

  • By mixing integer indexes and slices, you can get low-dimensional slices.
    For example, I want to select the first two columns of the second row.
arr2d[1,  :2]

We get a one-dimensional array, one dimension lower than arr2d

  • Note: A single colon indicates that the entire axis is selected

  • Of course, assigning to a slice expression assigns to the whole selection: the slice is a view of the source array, so the source array changes too
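
The slice expressions from this section, collected into one sketch on the same 3 * 3 array, with the final assignment showing that the source array is modified.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(arr2d[:2, 1:].tolist())        # [[2, 3], [5, 6]]
print(arr2d[1, :2].tolist())         # [4, 5]  (integer index + slice: one dimension lower)
print(arr2d[:, :1].tolist())         # [[1], [4], [7]]  (a lone colon keeps the whole axis)
print(arr2d[0][2] == arr2d[0, 2])    # True

arr2d[:2, 1:] = 0                    # assignment is applied to the whole selection
print(arr2d.tolist())                # [[1, 0, 0], [4, 0, 0], [7, 8, 9]]
```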

Boolean index

– Suppose, we have an array data for storing data and an array names for storing names (containing duplicates)

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])

We use the randn function in numpy.random to generate a 7-row, 4-column array data of standard normal random values

data = np.random.randn(7, 4)

  • Suppose each name in the names array corresponds to a row of data.
    We want to select all rows corresponding to the name 'Bob'

– Let's take a look at which names in names are 'Bob'

names == 'Bob'

  • Here, you must use '==' rather than '=': a single '=' would rebind names to the string 'Bob' instead of comparing
  • A one-dimensional Boolean array is generated: positions where the element is 'Bob' are True, the others are False

-Next, we pass the returned boolean array as an index into data

data[names == 'Bob']

The rows of data corresponding to 'Bob' have been selected

  • Note: The length of the Boolean array must be the same as the length of the indexed axis. If the length is inconsistent, an error will occur

-Of course, you can also index the other axes at the same time. For example, adding a column slice selects the third column onward from the rows of data corresponding to 'Bob'

data[names == 'Bob', 2: ]

– To select everything except 'Bob', you can either use the inequality operator '!=' or negate the condition with '~'

data[names != 'Bob']

As a result, the remaining rows except the row corresponding to Bob are selected

– We often use the '~' operator to invert a condition.
For example, we first assign the Boolean array for 'Bob' to a variable cond

cond = names == 'Bob'

Then negate cond with the ~ operator and use the result to index data

data[~cond]

As a result, the remaining rows except the row corresponding to Bob are also selected

-To combine several conditions, use Boolean arithmetic operators such as & (and) and | (or); the Python keywords and and or do not work with Boolean arrays.
For example, I want to select the rows of data corresponding to both 'Bob' and 'Will'

mask = (names == 'Bob') | (names == 'Will')

data[mask]

As a result, the rows corresponding to Bob and Will are selected

  • Note: Selecting data from an array by Boolean indexing always creates a copy of the data, never a view, even if the returned array is unchanged
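
A deterministic sketch of Boolean selection. It replaces the random data with np.arange(28).reshape(7, 4) (an assumption made so the rows are easy to recognise) and checks that the selected rows do not share memory with the source.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.arange(28).reshape(7, 4)   # stand-in for the random data

bob_rows = data[names == 'Bob']      # rows 0 and 3
print(bob_rows.tolist())             # [[0, 1, 2, 3], [12, 13, 14, 15]]

mask = (names == 'Bob') | (names == 'Will')
print(data[mask].shape)              # (4, 4)
print(data[~(names == 'Bob')].shape) # (5, 4)

bob_rows[:] = -1                     # modifies only the selected copy
print(data[0].tolist())              # [0, 1, 2, 3]
```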

– We often set values through Boolean arrays.
For example, set all negative values in data to 0

data[data < 0] = 0

-You can also set the value of an entire row or column through a one-dimensional Boolean array

data[names != 'Joe'] = 7

As you can see, every row whose name is not 'Joe' is assigned the value 7
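
Both assignments, sketched on a tiny made-up array with three names so the effect of each mask is easy to follow.

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will'])
data = np.array([[1., -2.], [-3., 4.], [5., -6.]])

data[data < 0] = 0          # element-wise mask: negative values become 0
print(data.tolist())        # [[1.0, 0.0], [0.0, 4.0], [5.0, 0.0]]

data[names != 'Joe'] = 7    # row mask: every row not named 'Joe' becomes 7
print(data.tolist())        # [[7.0, 7.0], [0.0, 4.0], [7.0, 7.0]]
```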

Fancy Index

  • Fancy indexing refers to: indexing with integer arrays

– Suppose there is an 8 * 4 array arr

arr = np.empty((8, 4))

np.empty only allocates space, so the array contains uninitialized garbage values

-Now we fill this array with a for loop

for i in range(8):
    arr[i] = i

To select a subset of rows in a particular order, pass in a list (or ndarray) of integers giving the desired order

arr[[4, 3, 0, 6]]
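
A sketch of what the selection returns. It also shows negative indices, which count rows from the end, and checks that fancy indexing, unlike slicing, copies the data. The names `picked` and `from_end` are illustrative.

```python
import numpy as np

arr = np.empty((8, 4))
for i in range(8):
    arr[i] = i                   # row i is filled with the value i

picked = arr[[4, 3, 0, 6]]       # rows 4, 3, 0, 6 in that order
print(picked[:, 0].tolist())     # [4.0, 3.0, 0.0, 6.0]

from_end = arr[[-3, -5, -7]]     # negative indices select rows from the end
print(from_end[:, 0].tolist())   # [5.0, 3.0, 1.0]

print(np.shares_memory(arr, picked))  # False
```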

Array transposition and axis conversion

  • Transpose is a special form of array reshaping
  • Transpose does not perform any copy operations, and returns a view of the source data
  • The array has a special attribute T for transpose

Create a two-dimensional array with 3 rows and 5 columns

arr = np.arange(15).reshape((3, 5))

Transpose with attribute T

arr.T

  • Matrix calculations (such as computing the inner product of a matrix with its transpose) often need a transpose
  • To compute it, we can use the dot function in numpy
  • dot() returns the dot product of two arrays; for two-dimensional arrays this is matrix multiplication

– We create a two-dimensional array and calculate the inner product of it and its transpose

arr = np.random.randn(6, 3)

Calculate inner product

np.dot(arr, arr.T)
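
With a 6 * 3 input, the product arr · arr.T is a 6 * 6 symmetric matrix. The sketch below uses a fixed array in place of the random one (an assumption for reproducibility) and also shows the equivalent @ operator.

```python
import numpy as np

arr = np.arange(18.0).reshape(6, 3)  # deterministic stand-in for the random array

gram = np.dot(arr, arr.T)            # (6, 3) times (3, 6) gives (6, 6)
print(gram.shape)                    # (6, 6)
print(np.allclose(gram, gram.T))     # True, the result is symmetric
print(np.allclose(gram, arr @ arr.T))  # True, @ is matrix multiplication as well
print(np.shares_memory(arr, arr.T))  # True, .T is a view
```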

-To transpose a three-dimensional array, one more concept is needed: the transpose method takes a tuple of axis numbers that gives the new order of the axes.
For example, create a 2 * 2 * 4 three-dimensional array
For example, create a 2 * 2 * 4 three-dimensional array

arr = np.arange(16).reshape((2, 2, 4))

  • Number the three axes of the three-dimensional array 0, 1, 2. I picture them as the length (0), width (1), and height (2) of a cuboid
  • The transpose below is like rotating the cuboid 90 degrees horizontally, so that length and width are interchanged: the original length becomes the width and the width becomes the length. The axes of the transposed cuboid therefore correspond to the original axes 1 (width), 0 (length), 2 (height)
arr.transpose((1, 0, 2))

– Let’s talk about the transpose function in numpy. The parameters in transpose can be understood as the axis labels of the array

  • For one-dimensional arrays, numpy.transpose() has no effect because there is only one axis
  • For a two-dimensional array, transpose is the ordinary matrix transpose: the axis order changes from (0, 1) to (1, 0)
  • For a three-dimensional array, transpose reorders the axes according to the tuple you pass in; for example, (1, 0, 2) swaps the first two axes and leaves the third in place

-There is also a swapaxes method in ndarray, which can exchange axes.
For example, I want to exchange the second and third axes of a three-dimensional array

arr.swapaxes(1, 2)

  • The swapaxes method is a convenient way to transpose when only two axes need to be exchanged
  • Like transpose, swapaxes does not copy: it returns a view of the source data
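
One way to read an axis permutation is through element positions: after transpose((1, 0, 2)), the element at [i, j, k] is the original element at [j, i, k]. A small sketch on the same 2 * 2 * 4 array (the names `t` and `s` are illustrative):

```python
import numpy as np

arr = np.arange(16).reshape((2, 2, 4))

t = arr.transpose((1, 0, 2))   # swap axes 0 and 1, keep axis 2
print(t.shape)                 # (2, 2, 4)
print(t[0, 1].tolist())        # [8, 9, 10, 11], the same as arr[1, 0]

s = arr.swapaxes(1, 2)         # exchange the second and third axes
print(s.shape)                 # (2, 4, 2)
print(s[0, 3, 1] == arr[0, 1, 3])  # True

print(arr.T.shape)             # (4, 2, 2), .T reverses the axis order
print(np.shares_memory(arr, t), np.shares_memory(arr, s))  # True True
```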

Origin blog.csdn.net/Baby1601tree/article/details/96840882