Data Analysis with Python - Numpy Basics: Array and Vector Computing

NumPy provides:

  • ndarray, a fast, space-efficient multidimensional array with vectorized arithmetic and sophisticated broadcasting capabilities
  • Standard math functions that operate on entire arrays of data without writing for-loops
  • Tools for reading and writing array data on disk and for working with memory-mapped files
  • Linear algebra, random number generation, and Fourier transform functions
  • Tools for integrating code written in languages such as C and C++

1. ndarray: a multidimensional array object

1. Creating an ndarray

# One-dimensional
In [5]: data = [1,2,3]
In [6]: import numpy as np
In [7]: arr1 = np.array(data)
In [8]: arr1
Out[8]: array([1, 2, 3])
# Two-dimensional
In [11]: data2 = [[1,2,3],[4,5,6]]
In [12]: arr2 = np.array(data2)
In [13]: arr2
Out[13]:
array([[1, 2, 3],
       [4, 5, 6]])
# Inspect the array's shape and dtype
In [15]: arr2.shape
Out[15]: (2, 3)
In [16]: arr2.dtype
Out[16]: dtype('int32')

Array creation functions
array() converts input data (a list, tuple, or other sequence) to an ndarray
arange() is similar to the Python built-in function range(), but returns an ndarray instead of a list or range object
ones, zeros create an array of all 1s/0s; the shape is passed as a tuple, such as np.ones((2,3))
ones_like, zeros_like create an array of all 1s/0s with the same shape as the array passed in
empty, empty_like allocate a new array without initializing its values, so the contents are arbitrary
eye, identity create a square N x N identity matrix (1s on the diagonal, 0s elsewhere)
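
The creation functions listed above can be tried out as follows. This is a minimal sketch; the variable names are illustrative only.

```python
import numpy as np

# arange returns an ndarray, not a Python range object
a = np.arange(5)                 # array([0, 1, 2, 3, 4])

# The shape is passed as a single tuple argument
ones = np.ones((2, 3))           # 2x3 array filled with 1.0
zeros = np.zeros((2, 3))         # 2x3 array filled with 0.0

# *_like functions take their shape (and dtype) from an existing array
ones_like_a = np.ones_like(a)    # array([1, 1, 1, 1, 1])

# empty only allocates memory; never rely on its contents
scratch = np.empty((2, 2))

# eye and identity both build an N x N identity matrix
ident = np.eye(3)
```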

2. Operations between arrays and scalars

In [36]: arr2
Out[36]:
array([[1, 2, 3],
       [4, 5, 6]])
In [37]: arr3
Out[37]:
array([[11, 12, 13],
       [14, 15, 16]])
# Addition
In [38]: arr2+arr3
Out[38]:
array([[12, 14, 16],
       [18, 20, 22]])
# Multiplication (element-wise)
In [39]: arr2*arr3
Out[39]:
array([[11, 24, 39],
       [56, 75, 96]])
# Subtraction
In [40]: arr3-arr2
Out[40]:
array([[10, 10, 10],
       [10, 10, 10]])
# Division
In [41]: arr3/arr2
Out[41]:
array([[11.        ,  6.        ,  4.33333333],
       [ 3.5       ,  3.        ,  2.66666667]])
# Squaring
In [42]: arr2**2
Out[42]:
array([[ 1,  4,  9],
       [16, 25, 36]], dtype=int32)
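
The session above combines two arrays of the same shape; a scalar operand is applied to every element in the same way. The sketch below illustrates this. Note that arr3 is never defined in the session, so building it as arr2 + 10 is an assumption that happens to reproduce the values printed above.

```python
import numpy as np

arr2 = np.array([[1, 2, 3], [4, 5, 6]])

# Assumption: arr3 is not defined in the session; arr2 + 10 matches its printed values
arr3 = arr2 + 10

doubled = arr2 * 2      # every element multiplied by the scalar 2
inverse = 1 / arr2      # true division always produces a float array
product = arr2 * arr3   # element-wise product, not a matrix product
```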

3. Indexing and Slicing

index:

arr2d[0,0] or arr2d[0][0]
arr3d[0,0,0] or arr3d[0][0][0]

slices: use the : marker

arr2d[:2,:2]
arr3d[:2,:2]

First, distinguish array slicing from list slicing.
An array slice is a view on the original array, so modifying the slice modifies the source; a list slice is a copy of the data.
If you need a copy of a slice instead of a view on the source array, call arr[5:8].copy()

# List slicing
>>> l1 = list(range(10))
>>> l1
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> l2 = l1[5:8]
>>> l2
[5, 6, 7]
>>> l2[0]=15
>>> l2
[15, 6, 7]
>>> l1
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# Array slicing
In [50]: arr = np.arange(10)

In [51]: arr
Out[51]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [52]: arr_slice = arr[5:8]

In [53]: arr_slice
Out[53]: array([5, 6, 7])

In [54]: arr_slice[0]=15

In [55]: arr_slice
Out[55]: array([15,  6,  7])

In [56]: arr
Out[56]: array([ 0,  1,  2,  3,  4, 15,  6,  7,  8,  9])
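
For contrast, a minimal sketch of the copy() variant mentioned earlier: the copy is independent of the source array, while a plain slice shares its memory. np.shares_memory is used here only to make the difference visible.

```python
import numpy as np

arr = np.arange(10)

arr_copy = arr[5:8].copy()   # independent copy of the slice
arr_copy[0] = 15             # arr is left untouched

arr_view = arr[5:8]          # plain slice: a view on arr

copy_shares = np.shares_memory(arr, arr_copy)   # False
view_shares = np.shares_memory(arr, arr_view)   # True
```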

# Slicing a two-dimensional array
In [95]: arr2d
Out[95]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [96]: arr2d[:2]
Out[96]:
array([[1, 2, 3],
       [4, 5, 6]])

Multiple slices can be passed at once, one per axis

In [97]: arr2d[:2,:1]
Out[97]:
array([[1],
       [4]])

In [98]: arr2d[:2,:2]
Out[98]:
array([[1, 2],
       [4, 5]])

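# Three-dimensional (arr3d here is a nested Python list, as the output shows, so the indices are chained)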
In [83]: arr3d
Out[83]: [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]

In [84]: arr3d[1]
Out[84]: [[7, 8, 9], [10, 11, 12]]

In [85]: arr3d[1][1]
Out[85]: [10, 11, 12]

In [86]: arr3d[1][1][1]
Out[86]: 11

In [87]: arr3d[1][1][2]
Out[87]: 12

boolean index

# [True, False, True] selects rows 0 and 2
In [121]: arr2d[[True,False,True]]
Out[121]:
array([[1, 2, 3],
       [7, 8, 9]])

In [122]: arr2d[[True,False,True],2]
Out[122]: array([3, 9])
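
In practice the boolean mask usually comes from a comparison rather than a hand-written list. A minimal sketch, assuming the same 3x3 arr2d as above; the conditions are chosen only for illustration.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# A comparison on the first column yields one boolean per row
row_mask = arr2d[:, 0] != 4          # array([ True, False,  True])
rows = arr2d[row_mask]               # rows 0 and 2

# A mask with the full shape of the array returns a flat 1-D result
large = arr2d[arr2d > 5]             # array([6, 7, 8, 9])

# A mask can also select the elements to assign to
clipped = arr2d.copy()
clipped[clipped > 5] = 5
```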

fancy index

# Like the boolean index above, this also selects rows 0 and 2
In [132]: arr2d[[0,2]]
Out[132]:
array([[1, 2, 3],
       [7, 8, 9]])

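# Things to watch for with fancy indexing

Fancy indexing, unlike slicing, always copies data into a new array. Also note that passing two index lists pairs them element by element, so the first expression below selects elements (0,0) and (2,2) rather than a 2x2 block; indexing the rows first and then the columns selects the block.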

In [136]: arr2d[[0,2],[0,2]]
Out[136]: array([1, 9])

In [137]: arr2d[[0,2]][:,[0,2]]
Out[137]:
array([[1, 3],
       [7, 9]])
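
Both points can be checked directly. The sketch below assumes the same 3x3 arr2d; np.ix_ is an alternative way to select the rectangular block in one step and is not used in the session above.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Fancy indexing returns a copy: writing to it does not change arr2d
picked = arr2d[[0, 2]]
picked[0, 0] = 100

# Two index lists are paired element-wise: (0, 0) and (2, 2)
pairs = arr2d[[0, 2], [0, 2]]            # array([1, 9])

# np.ix_ builds an open mesh, so rows 0/2 and columns 0/2 form a block
block = arr2d[np.ix_([0, 2], [0, 2])]    # [[1, 3], [7, 9]]
```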
       

Array transpose and axis swap

Transpose is a special form of reshape that returns a view of the source data without copying.

In [142]: arr2d.T
Out[142]:
array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])
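
A short sketch of the view behaviour and of axis swapping on a higher-dimensional array. The 2x3x4 array named cube is a hypothetical example, not taken from the session above.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

t = arr2d.T
is_view = np.shares_memory(arr2d, t)     # True: no data was copied

# Hypothetical 3-D array to illustrate axis swapping
cube = np.arange(24).reshape(2, 3, 4)
swapped = cube.swapaxes(1, 2)            # shape (2, 4, 3)
reordered = cube.transpose(2, 0, 1)      # shape (4, 2, 3)
```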

4. Functions that operate on elements of an array

Functions that operate on the elements of a single array

  • abs computes the absolute value of each element
  • sqrt computes the square root of each element
  • square computes the square of each element
  • exp computes e raised to the power of each element
  • log/log10/log2/log1p natural, base-10, and base-2 logarithms; log1p is log(1+x)
  • sign computes the sign of each element: 1 (positive), 0 (zero), or -1 (negative)
  • ceil computes the smallest integer greater than or equal to each element
  • floor computes the largest integer less than or equal to each element
  • rint rounds each element to the nearest integer
  • modf returns the fractional and integer parts of each element as two separate arrays
  • isnan returns a boolean array indicating whether each element is NaN (not a number)
  • isfinite/isinf return boolean arrays indicating whether each element is finite or infinite
  • cos/sin/tan trigonometric functions
  • arccos/arccosh/arcsin inverse trigonometric and hyperbolic functions
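
A few of the single-array functions above applied to small sample arrays. The input values are arbitrary and chosen only so that the results are easy to check by hand.

```python
import numpy as np

x = np.array([-1.5, 0.0, 2.25, 4.0])

absolute = np.abs(x)             # [1.5, 0., 2.25, 4.]
roots = np.sqrt(absolute)        # square root of each element
signs = np.sign(x)               # [-1., 0., 1., 1.]
upper = np.ceil(x)               # [-1., 0., 3., 4.]
lower = np.floor(x)              # [-2., 0., 2., 4.]
frac, whole = np.modf(x)         # fractional and integer parts

y = np.array([1.0, np.nan, np.inf])
nan_mask = np.isnan(y)           # [False, True, False]
finite_mask = np.isfinite(y)     # [True, False, False]
inf_mask = np.isinf(y)           # [False, False, True]
```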

Functions that operate on the elements of two arrays

  • add adds corresponding elements of the two arrays
  • subtract subtracts the elements of the second array from those of the first
  • multiply multiplies corresponding elements of the two arrays
  • divide/floor_divide division, or division with the remainder discarded
  • power(a,b) raises each element of a to the power given by the corresponding element of b
  • mod computes the element-wise remainder of division
  • copysign copies the sign of each element in the second array onto the corresponding value in the first array
  • > >= < <= == != compare corresponding elements and return a boolean array
  • logical_and/logical_or/logical_xor element-wise logical operations
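
The two-array functions can be sketched the same way. The sample values below are arbitrary; negative numbers are included to show how floor_divide and mod treat signs.

```python
import numpy as np

a = np.array([7, -8, 9])
b = np.array([2, 3, -4])

total = np.add(a, b)                         # [9, -5, 5]
quotient = np.floor_divide(a, b)             # [3, -3, -3] (rounds toward -infinity)
remainder = np.mod(a, b)                     # [1, 1, -3] (takes the sign of the divisor)
squares = np.power(a, np.array([2, 2, 2]))   # [49, 64, 81]
resigned = np.copysign(a, b)                 # [7., 8., -9.]
greater = a > b                              # [True, False, True]
both_positive = np.logical_and(a > 0, b > 0) # [True, False, False]
```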

5. Data processing using arrays

Vectorized array expressions replace explicit loops and are usually both shorter and faster

Ternary operation

In [6]: xarr = np.array([1.1,1.2,1.3,1.4,1.5])

In [7]: yarr = np.array([2.1,2.2,2.3,2.4,2.5])

In [8]: cond = np.array([True,False,True,True,False])

In [9]: result = [x if c else y for x, y, c in zip(xarr, yarr, cond)]

In [10]: result
Out[10]: [1.1, 2.2, 1.3, 1.4, 2.5]

# np.where is typically used to build a new array from existing ones

In [11]: result2 = np.where(cond,xarr,yarr)

In [12]: result2
Out[12]: array([1.1, 2.2, 1.3, 1.4, 2.5])
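
The second and third arguments of np.where may also be scalars, which is handy for replacing values based on a condition. A minimal sketch with an arbitrary sample array:

```python
import numpy as np

arr = np.array([[-1.5, 2.0], [0.5, -3.0]])

# Scalars for both branches: 1 where positive, -1 elsewhere
signs = np.where(arr > 0, 1, -1)

# Mix a scalar and an array: zero out negatives, keep the rest
clipped = np.where(arr < 0, 0.0, arr)
```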

Mathematical and Statistical Methods

These methods can be called either as instance methods, arr2d.sum(), or via the top-level function, np.sum(arr2d)

  • sum computes the sum of all elements
  • mean computes the mean of all elements
  • std/var compute the standard deviation and variance
  • min/max minimum and maximum
  • argmin/argmax indices of the minimum and maximum elements
  • cumsum cumulative sum of the elements
  • cumprod cumulative product of the elements
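
Most of these methods also accept an axis argument to aggregate along rows or columns instead of over the whole array. A minimal sketch using the same 3x3 arr2d as in the earlier sessions:

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

total = arr2d.sum()               # 45, over all elements
col_sums = arr2d.sum(axis=0)      # [12, 15, 18], one value per column
row_means = arr2d.mean(axis=1)    # [2., 5., 8.], one value per row
running = arr2d.cumsum(axis=1)    # cumulative sum along each row
flat_argmax = arr2d.argmax()      # 8, index into the flattened array
```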

Methods for Boolean Arrays

# True counts as 1 in the sum
In [24]: (arr2d<4).sum()
Out[24]: 3

In [25]: cond
Out[25]: array([ True, False,  True,  True, False])

In [26]: cond.any()
Out[26]: True

In [27]: cond.all()
Out[27]: False

sort

  • np.sort() returns a sorted copy and leaves the source array unchanged
  • arr2d.sort() sorts the source array in place
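
The difference between the two can be sketched as follows; the sample arrays are arbitrary.

```python
import numpy as np

arr = np.array([3, 1, 2])

sorted_copy = np.sort(arr)     # new array; arr keeps its original order
before = arr.tolist()          # [3, 1, 2]

result = arr.sort()            # sorts in place and returns None
after = arr.tolist()           # [1, 2, 3]

# For multidimensional arrays, sort along a chosen axis
m = np.array([[3, 1, 2], [9, 7, 8]])
m.sort(axis=1)                 # each row sorted independently
```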

5. Input and output for array files

save and load arrays on disk in binary form

  • np.save()
  • np.load()

read and write text files

  • np.loadtxt()
  • np.savetxt()
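
A round trip through both formats might look like the sketch below. The file names and the use of a temporary directory are assumptions made so the example cleans up after itself.

```python
import os
import tempfile

import numpy as np

arr = np.arange(6).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    # Binary .npy format: preserves shape and dtype exactly
    npy_path = os.path.join(tmp, "arr.npy")
    np.save(npy_path, arr)
    from_binary = np.load(npy_path)

    # Text format: human-readable; loadtxt returns floats by default
    txt_path = os.path.join(tmp, "arr.txt")
    np.savetxt(txt_path, arr, delimiter=",")
    from_text = np.loadtxt(txt_path, delimiter=",")
```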

6. Linear algebra: functions not in the top-level namespace are in numpy.linalg

  • Note: arr.T transposes an array
  • np.dot(arr1,arr2) computes the product of two matrices
  • np.diag returns the diagonal elements of a square matrix, or converts a one-dimensional array into a square matrix with those values on the diagonal
  • trace computes the sum of the diagonal elements
  • det computes the determinant of a square matrix
  • eig computes the eigenvalues and eigenvectors of a square matrix
  • inv computes the inverse of a square matrix
  • pinv computes the pseudo-inverse of a matrix
  • qr computes the QR decomposition
  • svd computes the singular value decomposition
  • solve solves the linear system Ax=b for x, where A is a square matrix
  • lstsq computes the least-squares solution of Ax=b
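
A small worked example with a 2x2 system. The matrix A and vector b are arbitrary values picked so that the results can be verified by hand.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = np.linalg.solve(A, b)      # solves A x = b; here x is [0.8, 1.4]
A_inv = np.linalg.inv(A)
det = np.linalg.det(A)         # 2*3 - 1*1 = 5
identity = np.dot(A, A_inv)    # approximately the 2x2 identity matrix

diagonal = np.diag(A)          # [2., 3.]
trace = np.trace(A)            # 5.0
```

Results such as det and identity are subject to floating-point rounding, so compare them with np.isclose or np.allclose rather than ==.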

7. Random number generation: numpy.random complements Python's built-in random module

  • seed seeds the random number generator
  • permutation returns a random permutation of a sequence, or a permuted range
  • shuffle randomly permutes a sequence in place
  • rand draws samples from a uniform distribution over [0, 1)
  • randint draws random integers from a given low-to-high range
  • randn draws samples from a standard normal distribution (mean 0, standard deviation 1)
  • binomial draws samples from a binomial distribution
  • normal draws samples from a normal (Gaussian) distribution
  • beta draws samples from a beta distribution
  • chisquare draws samples from a chi-square distribution
  • gamma draws samples from a gamma distribution
  • uniform draws samples from a uniform distribution, over [0, 1) by default
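
A minimal sketch of several of these functions. The seed value and sizes are arbitrary; the drawn values depend on the seed and NumPy version, so none are shown.

```python
import numpy as np

np.random.seed(42)                            # fix the seed for reproducibility

u = np.random.rand(3)                         # 3 uniform samples in [0, 1)
n = np.random.randn(2, 2)                     # 2x2 standard normal samples
ints = np.random.randint(0, 10, size=5)       # 5 integers in [0, 10)
perm = np.random.permutation(5)               # shuffled copy of arange(5)

deck = np.arange(5)
np.random.shuffle(deck)                       # shuffles deck in place

heights = np.random.normal(loc=10.0, scale=2.0, size=1000)  # mean 10, std 2
```

Newer NumPy releases also offer np.random.default_rng() as a generator-based interface; the functions shown here remain available.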
