[Python data analysis] numpy basics

Array properties

The difference between ndim and shape

ndim: the number of axes, that is, the number of nesting levels counted from the outside in (axis = 0 is the outermost level).
shape: the number of elements along each axis, returned as a tuple whose length is ndim. For example, (2, 2, 3) means that axis 0 has 2 elements, axis 1 has 2 elements, and the innermost axis 2 has 3 elements, so ndim = 3.
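
A minimal sketch of the relationship between ndim and shape, using a hypothetical array named arr built just for illustration:

```python
import numpy as np

# A 3-level nested array: 2 blocks, each holding 2 rows of 3 values.
arr = np.arange(12).reshape(2, 2, 3)

print(arr.ndim)        # 3
print(arr.shape)       # (2, 2, 3)
print(len(arr.shape))  # 3, always equal to arr.ndim
```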

Create array

1. General creation

numpy.array(object, dtype = None, copy = True, order = None, subok = False, ndmin = 0)

numpy.empty(shape, dtype = float, order = 'C') creates an uninitialized array (its contents are arbitrary, not zeros)
numpy.zeros(shape, dtype = float, order = 'C') creates an array filled with zeros
numpy.ones(shape, dtype = None, order = 'C') creates an array filled with ones
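
The creation functions above can be tried as follows. This is only a sketch; the variable names and shapes are chosen for illustration.

```python
import numpy as np

a = np.array([1, 2, 3], dtype=float, ndmin=2)  # ndmin forces at least 2 axes
z = np.zeros((2, 3))                           # float zeros by default
o = np.ones((2, 3), dtype=int)                 # integer ones
e = np.empty((2, 3))                           # uninitialized: values are arbitrary

print(a.shape)   # (1, 3)
print(z.dtype)   # float64
print(o)
print(e.shape)   # (2, 3); only the shape is meaningful here
```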

2. Create an array from an existing array

numpy.asarray(a, dtype = None, order = None) converts the input a to an array; a can take many forms, such as a list, a list of tuples, a tuple, and so on.
numpy.frombuffer(buffer, dtype = float, count = -1, offset = 0) interprets a buffer object passed as the buffer parameter as a one-dimensional array
numpy.fromiter(iterable, dtype, count = -1) creates a one-dimensional array by consuming an iterable object as a stream
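
A short sketch of these three functions. The inputs (a list of tuples, a bytes literal, a generator) are assumptions chosen for illustration.

```python
import numpy as np

# asarray: convert a list of tuples into a 2-D array.
a = np.asarray([(1, 2, 3), (4, 5, 6)])

# frombuffer: reinterpret raw bytes; dtype 'S1' treats each byte as a 1-character string.
b = np.frombuffer(b"Hello", dtype="S1")

# fromiter: consume an iterator; the dtype argument is required.
c = np.fromiter((i * i for i in range(5)), dtype=int)

print(a.shape)  # (2, 3)
print(b)
print(c)        # [ 0  1  4  9 16]
```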

3. Create an array from a range of values

numpy.arange(start, stop, step, dtype) creates an array of values in the range from start to stop (stop excluded), spaced by step
np.linspace(start, stop, num = 50, endpoint = True, retstep = False, dtype = None) creates a one-dimensional arithmetic sequence (evenly spaced values); num is the number of generated elements
np.logspace(start, stop, num = 50, endpoint = True, base = 10.0, dtype = None) creates a geometric sequence; base is the base of the logarithm, so the values run from base**start to base**stop
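
As a rough illustration of how the three range functions differ (the argument values are arbitrary):

```python
import numpy as np

r = np.arange(0, 10, 2)                     # stop is excluded
lin = np.linspace(0, 1, num=5)              # endpoint is included by default
log = np.logspace(0, 3, num=4, base=10.0)   # 10**0, 10**1, 10**2, 10**3

print(r)    # [0 2 4 6 8]
print(lin)  # values 0, 0.25, 0.5, 0.75, 1
print(log)  # values 1, 10, 100, 1000
```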

numpy slicing and indexing

1. slice() slices by index

import numpy as np
a = np.arange(10)
s = slice(2, 7, 2)   # start at index 2, stop before index 7, step 2
print(a[s])
[2 4 6]

2. [start:stop:step]
b = a[2:7:2]
Explanation of the colon: with a single index, such as [2], the single element at that index is returned. With [2:], all items from that index onward are extracted. With two parameters, such as [2:7], the items between the two indexes are extracted (excluding the stop index).

3. Ellipsis (...): an ellipsis expands to as many full slices as are needed to make the length of the selection tuple equal to the number of dimensions of the array. If the ellipsis is used in the row position, the result is an array containing the selected elements from every row.

import numpy as np
 
a = np.array([[1,2,3],[3,4,5],[4,5,6]])  
print(a[..., 1])   # elements of the 2nd column
print(a[1, ...])   # elements of the 2nd row
print(a[..., 1:])  # the 2nd column and all remaining columns

# Output
[2 4 5]

[3 4 5]

[[2 3]
 [4 5]
 [5 6]]

Advanced Index

Integer array index

import numpy as np 
 
x = np.array([[1,  2],  [3,  4],  [5,  6]]) 
y = x[[0,1,2],  [0,1,0]]  
print (y)
# Output: the elements at positions (0,0), (1,1) and (2,0)
[1 4 5]

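A slice (:) or an ellipsis (...) can be combined with an index array

a = np.array([[1,2,3], [4,5,6],[7,8,9]])
c = a[1:3,[1,2]]
print(c)
# Output: the slice 1:3 excludes its end index, while the index array [1,2] selects exactly columns 1 and 2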
[[5 6]
 [8 9]]

Boolean index
Boolean operations (such as comparison operators) produce a mask that selects the elements meeting a specified condition, for example x[x > 5].
a[~np.isnan(a)] filters out NaN values
a[np.iscomplex(a)] extracts the complex numbers from a
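
The three Boolean-index expressions above might be exercised like this. The sample arrays are made up for illustration.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)
big = x[x > 5]                   # the result is always 1-D
print(big)                       # [ 6  7  8  9 10 11]

a = np.array([np.nan, 1.0, 2.0, np.nan, 3.0])
clean = a[~np.isnan(a)]          # keep only the non-NaN entries
print(clean)                     # [1. 2. 3.]

c = np.array([1, 2 + 6j, 5, 3.5 + 5j])
only_complex = c[np.iscomplex(c)]  # entries with a non-zero imaginary part
print(only_complex)
```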

Fancy indexing
Fancy indexing refers to indexing with integer arrays. Unlike slicing, it always copies the data into a new array. If the target is a multi-dimensional array, each index selects the corresponding row.

x = np.arange(32).reshape((8, 4))
print(x[[4, 2, 1, 7]])  # row indices

x = np.arange(32).reshape((8, 4))
print(x[np.ix_([1, 5, 7, 2], [0, 3, 1, 2])])  # pass multiple index arrays (requires np.ix_)
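
To make the copy-versus-view distinction concrete, here is a small sketch. The names rows, block and sub are hypothetical, and np.shares_memory is used only to check whether two arrays overlap in memory.

```python
import numpy as np

x = np.arange(32).reshape((8, 4))

rows = x[[4, 2, 1, 7]]       # fancy indexing: a new array (copy)
rows[0, 0] = -1              # does not touch x
print(x[4, 0])               # 16

block = x[1:3]               # slicing: a view of x
block[0, 0] = -1             # writes through to x
print(x[1, 0])               # -1

sub = x[np.ix_([1, 5], [0, 3])]  # rows 1 and 5 crossed with columns 0 and 3
print(sub.shape)                  # (2, 2)
```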

NumPy Broadcast (Broadcast)

When the two arrays in an operation have different shapes (a.shape ≠ b.shape), the broadcast mechanism is triggered.

Broadcasting rules:

All input arrays are aligned with the array that has the longest shape; missing leading dimensions are filled in by prepending 1.
The shape of the output array is the maximum along each dimension of the input shapes.
An input array can take part in the calculation if, in every dimension, its length equals that of the output array or is 1; otherwise an error is raised.
When an input array has length 1 in some dimension, its single set of values is reused at every step along that dimension.

import numpy as np

a = np.array([[ 0, 0, 0],
           [10,10,10],
           [20,20,20],
           [30,30,30]])
b = np.array([1,2,3])
bb = np.tile(b, (4, 1))  # repeat b along each dimension: 4 times vertically, once horizontally
print(a + bb)
# Result
[[ 1  2  3]
 [11 12 13]
 [21 22 23]
 [31 32 33]]

The tile() function repeats the original matrix horizontally and vertically.

np.tile(mat, (1, 4))  # the columns are repeated to four times the original
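
The tile() example builds the stretched array by hand. As a sketch of what broadcasting does automatically, the same sum can be written directly as a + b, and incompatible shapes raise an error:

```python
import numpy as np

a = np.array([[ 0,  0,  0],
              [10, 10, 10],
              [20, 20, 20],
              [30, 30, 30]])
b = np.array([1, 2, 3])

# b has shape (3,); it is treated as (1, 3) and stretched to (4, 3).
direct = a + b
tiled = a + np.tile(b, (4, 1))
print(np.array_equal(direct, tiled))  # True

# Shapes (4, 3) and (4,) cannot be aligned, so this raises ValueError.
try:
    a + np.array([1, 2, 3, 4])
    failed = False
except ValueError as exc:
    failed = True
    print("broadcast failed:", exc)
```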

Iterating over NumPy arrays: numpy.nditer
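
A minimal sketch of numpy.nditer, assuming a small 2 x 3 array. The op_flags argument shown here is one way to allow writing through the iterator.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# Default iteration follows the memory layout (row-major for this array).
values = [int(v) for v in np.nditer(a)]
print(values)  # [0, 1, 2, 3, 4, 5]

# op_flags=['readwrite'] allows in-place modification of each element.
with np.nditer(a, op_flags=['readwrite']) as it:
    for v in it:
        v[...] = 2 * v
print(a)
```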

Numpy array operations

Modify array shape

numpy.reshape(arr, newshape, order='C') changes the shape without changing the data

numpy.ndarray.flat is an iterator over the elements of the array

ndarray.flatten(order = 'C') returns a flattened copy of the array; changes made to the copy do not affect the original array. order: 'C' by row, 'F' by column, 'A' original order, 'K' the order in which elements appear in memory.

a = [[0 1 2 3]
 [4 5 6 7]]
a.flatten() = [0 1 2 3 4 5 6 7]

numpy.ravel(a, order = 'C') returns the flattened array elements. It returns a view whenever possible, so modifying the result may affect the original array.
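
The difference between flatten (copy) and ravel (view when possible) can be checked with a sketch like the following; the array a is contiguous here, which is why ravel can return a view.

```python
import numpy as np

a = np.arange(8).reshape(2, 4)

f = a.flatten()      # always a copy
f[0] = 100
print(a[0, 0])       # 0: the original is untouched

r = np.ravel(a)      # a view here, because a is contiguous
r[0] = 100
print(a[0, 0])       # 100: the change is visible in a

by_column = a.flatten(order='F')  # column-by-column order
print(by_column)
```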

Flip array

numpy.transpose(arr, axes) permutes the dimensions of the array; without axes it behaves like numpy.ndarray.T

The numpy.rollaxis(arr, axis, start) function rolls a specific axis backward until it reaches a given position
axis: the axis to roll backward; the relative order of the other axes does not change
start: defaults to zero, which means a complete roll; otherwise the axis is rolled until it lies before this position.

a = np.arange(8).reshape(2,2,2)
b = np.rollaxis(a, 2)     # roll axis 2 to axis 0 (width to depth)
c = np.rollaxis(a, 2, 1)  # roll axis 2 to axis 1 (width to height)
Original array:
[[[0 1]
  [2 3]]

 [[4 5]
  [6 7]]]
b:
[[[0 2]
  [4 6]]
 [[1 3]
  [5 7]]]
c:
[[[0 2]
  [1 3]]

 [[4 6]
  [5 7]]]

numpy.swapaxes(arr, axis1, axis2) swaps two axes of the array
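
Since these functions only rearrange axes, comparing shapes is an easy way to see their effect. A sketch with an arbitrary (2, 3, 4) array:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

t = np.transpose(a)               # axes reversed, same as a.T
p = np.transpose(a, (1, 0, 2))    # explicit permutation of the axes
s = np.swapaxes(a, 0, 2)          # exchange axis 0 and axis 2
r = np.rollaxis(a, 2)             # move axis 2 to the front

print(t.shape)  # (4, 3, 2)
print(p.shape)  # (3, 2, 4)
print(s.shape)  # (4, 3, 2)
print(r.shape)  # (4, 2, 3)
```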

Modify array dimensions

numpy.broadcast(arr1, arr2) produces an object that mimics broadcasting; the returned object encapsulates the result of broadcasting one array against the other.

numpy.broadcast_to(array, shape, subok) broadcasts an array to a new shape and returns a read-only view of the original array

numpy.expand_dims(arr, axis) expands the array shape by inserting a new axis at the specified position

numpy.squeeze(arr, axis) removes axes of length 1 from the shape of the given array
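
A brief sketch of the dimension-changing functions, using a three-element vector as a stand-in input:

```python
import numpy as np

x = np.array([1, 2, 3])

col = np.expand_dims(x, axis=1)      # shape (3,) -> (3, 1)
back = np.squeeze(col, axis=1)       # shape (3, 1) -> (3,)
wide = np.broadcast_to(x, (4, 3))    # read-only view with shape (4, 3)
b = np.broadcast(col, x)             # (3, 1) against (3,) gives (3, 3)

print(col.shape, back.shape, wide.shape, b.shape)
```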

Join array

numpy.concatenate((a1, a2, ...), axis) joins two or more arrays along an existing axis; the arrays must have the same shape except along that axis

numpy.stack(arrays, axis) stacks a sequence of arrays of the same shape along a new axis
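
The key difference is that concatenate keeps the number of dimensions while stack adds one. A sketch with two 2 x 2 arrays:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

rows = np.concatenate((a, b), axis=0)   # existing axis 0 grows: (4, 2)
cols = np.concatenate((a, b), axis=1)   # existing axis 1 grows: (2, 4)
stacked = np.stack((a, b), axis=0)      # new leading axis: (2, 2, 2)

print(rows.shape, cols.shape, stacked.shape)
```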

Split array

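numpy.split(ary, indices_or_sections, axis) splits an array into sub-arrays. Note: if indices_or_sections is an integer, the array is divided into that many equal parts; if it is an array, its entries are the positions at which to split along the axis (each piece includes its start position and excludes its end position)

numpy.hsplit splits the array horizontally (column-wise)

numpy.vsplit splits the array along the vertical axis (row-wise)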
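
A sketch of both forms of indices_or_sections, plus hsplit and vsplit on a 4 x 4 array. All names are illustrative.

```python
import numpy as np

a = np.arange(9)
equal = np.split(a, 3)          # three equal parts
at = np.split(a, [4, 7])        # a[:4], a[4:7], a[7:]
print([p.tolist() for p in equal])  # [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
print([p.tolist() for p in at])     # [[0, 1, 2, 3], [4, 5, 6], [7, 8]]

m = np.arange(16).reshape(4, 4)
left, right = np.hsplit(m, 2)   # split the columns: two (4, 2) arrays
top, bottom = np.vsplit(m, 2)   # split the rows: two (2, 4) arrays
print(left.shape, top.shape)
```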

Add and delete array elements

numpy.resize(arr, shape) returns a new array with the specified shape. If the new array is larger than the original, it is filled with repeated copies of the elements of the original array

numpy.append(arr, values, axis = None) appends values to the end of the array. When axis is given, values must have the same shape as arr except along that axis. When axis is None (the default), both arrays are flattened first and the result is always a one-dimensional array!

numpy.insert(arr, obj, values, axis) inserts values before the index obj. If the axis parameter is not passed, the input array is flattened before insertion

numpy.delete(arr, obj, axis) returns a new array with the specified sub-array removed from the input array. As with the insert() function, if no axis parameter is provided, the input array is flattened.

numpy.unique(arr, return_index, return_inverse, return_counts) removes duplicate elements from the array and returns the sorted unique values
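
The flattening behaviour when axis is omitted is easy to miss, so here is a small sketch with a 2 x 3 array (the values are arbitrary):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

flat_app = np.append(a, [7, 8, 9])             # no axis: flattened, 1-D result
row_app = np.append(a, [[7, 8, 9]], axis=0)    # axis=0: adds a row, shape (3, 3)
ins = np.insert(a, 1, 0)                       # no axis: flattened, 0 inserted before index 1
dele = np.delete(a, 1, axis=1)                 # remove column 1
grown = np.resize(a, (3, 3))                   # larger shape: elements repeat

u, counts = np.unique(np.array([5, 2, 6, 2, 7, 5]), return_counts=True)

print(flat_app)   # [1 2 3 4 5 6 7 8 9]
print(ins)        # [1 0 2 3 4 5 6]
print(u, counts)  # [2 5 6 7] [2 2 1 1]
```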

NumPy bit arithmetic

bitwise_and performs a bitwise AND on array elements
bitwise_or performs a bitwise OR on array elements
invert performs bitwise inversion (NOT)
left_shift shifts the bits of the binary representation to the left
right_shift shifts the bits of the binary representation to the right
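
A worked sketch of the bit functions on small integers. The values 13 and 17 are arbitrary; their binary forms are shown in the comments.

```python
import numpy as np

a, b = 13, 17                      # 0b01101 and 0b10001

and_ = np.bitwise_and(a, b)        # 0b00001 = 1
or_ = np.bitwise_or(a, b)          # 0b11101 = 29
inv = np.invert(np.array([13], dtype=np.uint8))  # 255 - 13 = 242 for uint8
left = np.left_shift(10, 2)        # 10 * 4 = 40
right = np.right_shift(40, 2)      # 40 // 4 = 10

print(and_, or_, inv, left, right)
```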

numpy function operation

String functions (in the numpy.char module)

add() concatenates the string elements of two arrays element by element
multiply() returns each string element repeated a given number of times
center() centers each string within a field of a given width
capitalize() converts the first letter of the string to uppercase
title() converts the first letter of each word of the string to uppercase
lower() converts the array elements to lowercase
upper() converts the array elements to uppercase
split() splits each string at the specified delimiter and returns a list of the parts
splitlines() returns the list of lines in each element, split at line breaks
strip() removes specific characters from the beginning and end of each element
join() joins the characters of each element using the specified delimiter
replace() replaces all occurrences of a substring in each string with a new string
decode() calls bytes.decode on each array element
encode() calls str.encode on each array element
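
A few of these functions in action, as a sketch. The sample strings are made up, and only a subset of the list is shown.

```python
import numpy as np

names = np.array(['hello world', 'numpy char'])

joined = np.char.add(np.array(['hello']), np.array([' xyz']))   # element-wise concatenation
tripled = np.char.multiply(np.array(['ab']), 3)                 # repeat each string 3 times
titled = np.char.title(names)                                   # capitalize every word
upper = np.char.upper(names)
swapped = np.char.replace(names, 'o', '0')                      # replace every 'o'
parts = np.char.split(names)                                    # split on whitespace

print(joined, tripled)
print(titled)
print(parts)
```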

Mathematical functions

sin(), cos(), tan() trigonometric functions (angles in radians)
numpy.around(a, decimals) returns the values rounded to the given number of decimals
numpy.floor() rounds down
numpy.ceil() rounds up
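
A sketch of the rounding functions and of sin() with degrees converted to radians. np.deg2rad is not mentioned above; it is used here only for the conversion.

```python
import numpy as np

angles = np.array([0, 30, 90])
sines = np.sin(np.deg2rad(angles))    # approximately 0, 0.5, 1

v = np.array([-1.7, 1.5, -0.2, 0.6, 10.0])
rounded = np.around(v)    # nearest integer; exact halves go to the even neighbour
floors = np.floor(v)      # largest integer <= each value
ceils = np.ceil(v)        # smallest integer >= each value

print(sines)
print(rounded, floors, ceils)
```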

Arithmetic function

add(), subtract(), multiply() and divide() perform element-wise addition, subtraction, multiplication and division
numpy.reciprocal() returns the element-wise reciprocal of its argument
numpy.power() power operation
numpy.mod() remainder
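
The arithmetic functions all work element by element. A sketch with two small arrays chosen for illustration:

```python
import numpy as np

a = np.array([10, 20, 30])
b = np.array([3, 4, 5])

total = np.add(a, b)            # [13 24 35]
diff = np.subtract(a, b)        # [ 7 16 25]
prod = np.multiply(a, b)        # [ 30  80 150]
quot = np.divide(a, b)          # true division, float result
recip = np.reciprocal(np.array([0.25, 2.0]))  # use floats: integer input truncates
squares = np.power(a, 2)        # [100 400 900]
rem = np.mod(a, b)              # [1 0 0]

print(total, diff, prod, quot, recip, squares, rem)
```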

NumPy statistical functions

numpy.amin() and numpy.amax() compute the minimum and maximum values of the elements in the array along the specified axis
numpy.ptp() computes the range (maximum minus minimum)
numpy.percentile(a, q, axis) computes the q-th percentile, where q is a number between 0 and 100
numpy.median() median
numpy.mean() arithmetic mean: the sum of the elements along the axis divided by the number of elements
numpy.average() weighted average; if no axis is specified, the array is flattened first
numpy.std() standard deviation
numpy.var() variance
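
A sketch of the statistical functions on a 3 x 3 sample array. The weights passed to average are an arbitrary illustration.

```python
import numpy as np

a = np.array([[3, 7, 5],
              [8, 4, 3],
              [2, 4, 9]])

row_min = np.amin(a, axis=1)     # minimum of each row: [3 3 2]
col_max = np.amax(a, axis=0)     # maximum of each column: [8 7 9]
spread = np.ptp(a)               # 9 - 2 = 7
p50 = np.percentile(a, 50)       # 50th percentile equals the median: 4.0
med = np.median(a)               # 4.0
avg = np.mean(a)                 # 45 / 9 = 5.0

w_avg = np.average([1, 2, 3, 4], weights=[4, 3, 2, 1])  # 20 / 10 = 2.0
var = np.var([1, 2, 3, 4])       # 1.25
std = np.std([1, 2, 3, 4])       # square root of the variance

print(row_min, col_max, spread, p50, med, avg, w_avg, var, std)
```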

NumPy sorting and conditional filtering functions

numpy.sort(a, axis, kind, order) returns a sorted copy of the array; kind defaults to 'quicksort' (quick sort)
numpy.argsort() returns the indices that would sort the array values from smallest to largest
numpy.lexsort() sorts by multiple sequences and returns the indices; the key with the highest priority is placed last
numpy.sort_complex(a) sorts complex numbers by real part first, then imaginary part
numpy.partition(a, kth[, axis, kind, order]) partial sort (partition) around the kth element
numpy.argpartition(a, kth[, axis, kind, order]) returns the indices that would partition the array along the specified axis; the kind keyword selects the algorithm
numpy.argmax() and numpy.argmin() return the indices of the largest and smallest elements along the given axis
numpy.nonzero() returns the indices of the non-zero elements in the input array
numpy.where() returns the indices of the elements that satisfy a given condition
numpy.extract() extracts the elements that satisfy a given condition from the array
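
A sketch covering most of the functions listed above. The surname and first-name tuples follow the kind of example commonly used for lexsort and are illustrative data only.

```python
import numpy as np

a = np.array([3, 1, 2])
sorted_a = np.sort(a)            # [1 2 3]
order = np.argsort(a)            # [1 2 0]

# lexsort: the LAST key (surnames) is the primary sort key.
surnames = ('Hertz', 'Galilei', 'Hertz')
first = ('Heinrich', 'Galileo', 'Gustav')
lex = np.lexsort((first, surnames))   # [1 2 0]

# partition: the element at index kth ends up in its sorted position.
part = np.partition(np.array([3, 4, 2, 1]), 2)

hi, lo = np.argmax(a), np.argmin(a)          # 0 and 1
nz = np.nonzero(np.array([0, 5, 0, 7]))[0]   # [1 3]
idx = np.where(a > 1)[0]                     # [0 2]
picked = np.extract(a > 1, a)                # [3 2]

print(sorted_a, order, lex, part[2], hi, lo, nz, idx, picked)
```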

Byte swap

The numpy.ndarray.byteswap() function swaps the byte order of each element of the ndarray, converting between the little-endian and big-endian representations.
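
A sketch with 16-bit integers, where swapping the two bytes of each element is easy to follow by hand (hex values in the comments):

```python
import numpy as np

a = np.array([1, 256, 8755], dtype=np.int16)   # 0x0001, 0x0100, 0x2233
swapped = a.byteswap()                         # returns a new array; a is unchanged

print(swapped)   # values 256, 1, 13090 (0x0100, 0x0001, 0x3322)
```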

NumPy copy and view

View: modifying a view changes the original data. Views generally arise in the following cases:

1. A numpy slicing operation returns a view of the original data.
2. Calling the view() function of an ndarray produces a view.

Copy: modifying a copy leaves the original data unchanged. Copies generally arise in the following cases:

1. Slicing a Python sequence (such as a list), or calling copy.deepcopy().
2. Calling the copy() function of an ndarray produces a copy.
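
The four cases can be checked side by side. This is a sketch; the variable names are illustrative.

```python
import numpy as np

a = np.arange(6)

v = a[1:4]          # numpy slice: a view
v[0] = 99
print(a[1])         # 99: the original changed

w = a.view()        # explicit view
w[0] = 7
print(a[0])         # 7: the original changed

c = a.copy()        # explicit copy
c[2] = -1
print(a[2])         # 2: the original is unchanged

lst = [1, 2, 3]
s = lst[:]          # slicing a Python list makes a new list
s[0] = 0
print(lst[0])       # 1: the original list is unchanged
```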

NumPy matrix library (Matrix) numpy.matlib

numpy.matlib.empty (shape, dtype, order) returns a new matrix
numpy.matlib.zeros ()
numpy.matlib.ones ()
numpy.matlib.eye ()
numpy.matlib.identity () unit matrix
numpy.matlib.rand()
A matrix is always two-dimensional, whereas an ndarray is an n-dimensional array. The two objects are interchangeable.

NumPy linear algebra

dot computes the dot product of two arrays (the inner product for 1-D arrays, matrix multiplication for 2-D arrays)
vdot computes the dot product of two vectors (multi-dimensional inputs are flattened first)
inner computes the inner product of two arrays
matmul computes the matrix product of two arrays
linalg.det() computes the determinant of an array
linalg.solve() solves a linear matrix equation
linalg.inv() computes the multiplicative inverse of a matrix
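
A worked sketch of these functions on small matrices; the numbers are arbitrary and chosen so that the results are easy to verify by hand.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[11, 12], [13, 14]])

d = np.dot(a, b)          # matrix product: [[37 40] [85 92]]
v = np.vdot(a, b)         # flattened: 1*11 + 2*12 + 3*13 + 4*14 = 130
i = np.inner(np.array([1, 2, 3]), np.array([0, 1, 0]))  # 2
m = np.matmul(a, b)       # same as dot for 2-D inputs

det = np.linalg.det(a)    # 1*4 - 2*3 = -2 (up to rounding)

# Solve 3x + y = 9 and x + 2y = 8; the solution is x = 2, y = 3.
sol = np.linalg.solve(np.array([[3, 1], [1, 2]]), np.array([9, 8]))

inv = np.linalg.inv(a)    # inv @ a is approximately the identity matrix

print(d, v, i, det, sol)
```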

NumPy IO

numpy.save(file, arr, allow_pickle = True, fix_imports = True) saves an array to a binary file with the .npy extension.
numpy.savez() saves multiple arrays to a file with the .npz extension.
np.loadtxt(FILENAME, dtype = int, delimiter = ' ') reads an array from a text file
np.savetxt(FILENAME, a, fmt = "%d", delimiter = ",") writes an array to a text file
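
A round-trip sketch of these functions. It writes into a temporary directory so nothing is left behind; the file names and the keyword names first and second are made up for illustration, and np.load (not listed above) is used to read the binary files back.

```python
import os
import tempfile

import numpy as np

a = np.arange(6).reshape(2, 3)

with tempfile.TemporaryDirectory() as d:
    # Single array in binary .npy format.
    npy_path = os.path.join(d, "a.npy")
    np.save(npy_path, a)
    loaded = np.load(npy_path)

    # Several arrays in one .npz archive, accessed by keyword name.
    npz_path = os.path.join(d, "arrays.npz")
    np.savez(npz_path, first=a, second=a * 2)
    with np.load(npz_path) as data:
        second = data["second"]

    # Plain text, comma separated.
    txt_path = os.path.join(d, "a.txt")
    np.savetxt(txt_path, a, fmt="%d", delimiter=",")
    from_txt = np.loadtxt(txt_path, dtype=int, delimiter=",")

print(loaded)
print(second)
print(from_txt)
```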

Gray gray
