Python NumPy Basics Tutorial
This article is a basic tutorial on NumPy for Python. All examples use Python 3.x.
What is NumPy
NumPy = Numerical + Python. It is the core library for scientific computing in Python and processes multidimensional arrays efficiently. Because many of its underlying functions are written in C, its operations run considerably faster than equivalent pure-Python code.
Basics
ndarray
NumPy's main object is the homogeneous multidimensional array, ndarray. It is a general-purpose multidimensional container in which all elements must be of the same type, and elements are indexed by a tuple of non-negative integers. With this object you can perform mathematical operations on whole blocks of data using the same syntax as operations between scalars. In NumPy, dimensions are called axes, and the number of axes is the rank.
Common ndarray attributes:
- ndarray.shape: the size of the array along each dimension, as a tuple of integers
- ndarray.dtype: an object describing the type of the array's elements
- ndarray.ndim: the number of axes of the array
- ndarray.size: the total number of elements in the array
- ndarray.itemsize: the size in bytes of each array element
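The attributes listed above can be inspected on any array. The following is a minimal sketch; the array m and its int64 dtype are illustrative choices, not part of the original text.

```python
import numpy as np

# A hypothetical 2 x 3 array of 64-bit integers, used only for illustration
m = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

print(m.shape)     # (2, 3): size of each dimension
print(m.dtype)     # int64: element type
print(m.ndim)      # 2: number of axes
print(m.size)      # 6: total number of elements
print(m.itemsize)  # 8: bytes per element
```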
Creating an array
There are generally five ways to create an array:
1. Converting from Python structures (list, tuple, etc.)
The easiest way to create an array is to use the array function, which accepts any sequence-like object and produces a new NumPy array (ndarray) containing the data passed in.
A simple example:
import numpy as np
a = np.array([1, 2, 3])
print(a)
print(a.dtype)
print(a.shape)
2. Using NumPy's native array creation functions (arange, ones, zeros, etc.)
For example:
b = np.zeros(10)
c = np.ones((1, 2))
3. Reading arrays from disk
Use the np.load function to read data.
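As a rough sketch of reading from disk, the array can first be written with np.save and then read back with np.load. The file name example.npy and the temporary directory are assumptions made for illustration only.

```python
import os
import tempfile

import numpy as np

original = np.arange(6).reshape(2, 3)

# Write to a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.npy")  # hypothetical file name
    np.save(path, original)   # store the array in NumPy's .npy format
    restored = np.load(path)  # read it back into memory

print(restored)
```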
4. Creating arrays from raw bytes, using strings or buffers
5. Using special library functions (random, etc.)
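The last two ways are not shown in the original text. A minimal sketch, assuming np.frombuffer for raw bytes and NumPy's random Generator for random data (the byte string and the seed are arbitrary choices):

```python
import numpy as np

# Way 4: interpret raw bytes as an array of unsigned 8-bit integers
raw = b"\x01\x02\x03\x04"
from_bytes = np.frombuffer(raw, dtype=np.uint8)
print(from_bytes)  # [1 2 3 4]

# Way 5: draw values from NumPy's random module (seed chosen arbitrarily)
rng = np.random.default_rng(0)
samples = rng.random((2, 3))  # uniform floats in [0, 1)
print(samples.shape)  # (2, 3)
```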
Indexing and slicing
Basic Operation
On the surface, indexing a one-dimensional array looks and works much like indexing a Python list.
When you assign a scalar value to a slice, the value is automatically propagated (broadcast) to the entire selection. The most important difference from Python lists is that a NumPy array slice is a view of the original data: the data is not copied, and any change you make is applied directly to the source array. This is because NumPy was designed from the start to handle large data, and copying data back and forth would cause significant performance problems. If you want a copy of the data, you need to call the .copy() method explicitly.
For example:
arr = np.arange(10)
print(arr)
print(arr[0])
print(arr[1:6])
arr_slice = arr[1:6]
arr_slice[1:3] = 5
print(arr_slice)
print(arr)
arr_copy = arr[1:6].copy()
arr_copy[1:3] = 6
print(arr_copy)
print(arr)
For a multidimensional array, the element at each index is no longer a scalar but an array with one fewer dimension. To access individual elements, pass a comma-separated list of indices. Other operations work the same way as for one-dimensional arrays.
For example:
arr_2d = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(arr_2d[0])
print(arr_2d[0, 1])
arr_2d_slice = arr_2d[1]
print(arr_2d_slice)
arr_2d_slice[0] = 1
print(arr_2d_slice)
print(arr_2d)
Slice indexing
ndarray slice syntax is similar to that of Python lists. Higher-dimensional objects offer more options: you can slice along one or more axes, and you can mix slices with integer indices (which reduces the number of dimensions).
For example:
arr_test = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(arr_test[:2])
print(arr_test[:2, 1:])
print(arr_test[1, :1])
arr_slice_test = arr_test[:2, 1:]
arr_slice_test[0] = 0
print(arr_slice_test)
print(arr_test)
Boolean indexing
With Boolean indexing, we can quickly retrieve the elements of an array that satisfy a given condition. When filtering or processing large amounts of data, this capability makes programs much more convenient to write.
Here is a simple example:
In [1]: import numpy as np
In [2]: x = np.array([[0, 1], [2, 3], [3, 4]])
In [3]: x
Out[3]:
array([[0, 1],
       [2, 3],
       [3, 4]])
In [4]: x > 2
Out[4]:
array([[False, False],
       [False,  True],
       [ True,  True]])
In [5]: x[x > 2] = 0
In [6]: x
Out[6]:
array([[0, 1],
       [2, 0],
       [0, 0]])
Boolean arrays can also be combined with statistical methods to count or test their True values. There are three common methods:
- sum(): counts the number of True values
- any(): tests whether one or more values in the array are True
- all(): checks whether all values in the array are True
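The three methods above can be sketched on the same kind of Boolean mask. The variable name mask is introduced here for illustration.

```python
import numpy as np

x = np.array([[0, 1], [2, 3], [3, 4]])
mask = x > 2  # Boolean array, True where the condition holds

print(mask.sum())  # 3: number of True values
print(mask.any())  # True: at least one element is greater than 2
print(mask.all())  # False: not every element is greater than 2
print(x[mask])     # [3 3 4]: the selected elements
```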
Fancy indexing
Fancy indexing is a NumPy term that refers to indexing with arrays of integers.
Fancy indexing uses the values of the index array as subscripts along an axis of the target array. With a one-dimensional integer array as the index, if the target is a one-dimensional array, the result is the elements at the corresponding positions; if the target is a two-dimensional array, the result is the corresponding rows.
Unlike slicing, fancy indexing always copies the data into a new array.
For example:
In [1]: import numpy as np
In [2]: array = np.empty((4, 3))
In [3]: for i in range(4):
   ...:     array[i] = i
   ...:
In [4]: array
Out[4]:
array([[0., 0., 0.],
       [1., 1., 1.],
       [2., 2., 2.],
       [3., 3., 3.]])
In [5]: array[[1, 3]]
Out[5]:
array([[1., 1., 1.],
       [3., 3., 3.]])
In [6]: array[[-1, -3]]
Out[6]:
array([[3., 3., 3.],
       [1., 1., 1.]])
In [7]: array[np.ix_([3, 0],[2, 1])]
Out[7]:
array([[3., 3.],
       [0., 0.]])
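The difference between a fancy-indexed copy and a slice view can be checked directly. This is a small sketch using np.shares_memory; the array base and the values written into it are illustrative.

```python
import numpy as np

base = np.arange(12).reshape(4, 3)

fancy = base[[1, 3]]  # fancy indexing: rows 1 and 3 copied into a new array
sliced = base[1:3]    # slicing: a view onto rows 1 and 2

fancy[0, 0] = -1      # changes only the copy
sliced[0, 1] = -2     # writes through to base

print(np.shares_memory(base, fancy))   # False
print(np.shares_memory(base, sliced))  # True
print(base[1])                         # [ 3 -2  5]
```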
Shape operations
Shape conversion
This section describes several common ways to change the shape of an array:
reshape(): gives the array a new shape without changing the original data
flat: an iterator over the array's elements, which lets you process each element in turn
flatten(): returns a flattened copy of the array, so changes to the copy do not affect the original array. The call has the form .flatten(order='C'), where order='C' flattens by rows, 'F' flattens by columns, 'A' flattens by columns if the array is Fortran-contiguous in memory and by rows otherwise, and 'K' flattens in the order the elements occur in memory.
ravel(): flattens the array, by default in C-style order, and returns a view where possible, so changes to the result can affect the original array.
In its function form, np.ravel(a, order='C'), ravel receives two parameters: the array to flatten and the order.
For example:
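arr = np.arange(12)
print(arr)
arr1 = arr.reshape(3, 4)
# Iterating over a 2-D array yields its rows
for item in arr1:
    print(item)
# .flat iterates over every individual element
for item in arr1.flat:
    print(item)
print(arr1.flatten())
print(arr1.flatten(order="K"))
# flatten() returns a copy, so arr is unchanged
arr.flatten()[10] = 0
print(arr)
print(arr.ravel())
# ravel() returns a view here, so arr is modified
arr.ravel()[10] = 0
print(arr)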
Transposing and swapping axes
Several common methods:
- ndarray.T: transposes the array
- transpose: permutes the dimensions of the array
- rollaxis: rolls the specified axis backwards
- swapaxes: interchanges two axes of the array
Transposing is a special form of reshaping that returns a view of the source data. For a simple transpose you can use .T; the transpose and swapaxes methods can also be used.
For example:
arr = np.arange(12).reshape((2, 2, 3))
print(arr)
print(arr.T)
print(arr.transpose((1, 0, 2)))
print(arr.swapaxes(1, 2))
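Printing the shapes makes it easier to see what each method does to the axes. The sketch below reuses the same array and also calls np.rollaxis, which the example above does not exercise.

```python
import numpy as np

arr = np.arange(12).reshape((2, 2, 3))

print(arr.T.shape)                     # (3, 2, 2): axes reversed
print(arr.transpose((1, 0, 2)).shape)  # (2, 2, 3): first two axes exchanged
print(arr.swapaxes(1, 2).shape)        # (2, 3, 2): last two axes exchanged
print(np.rollaxis(arr, 2).shape)       # (3, 2, 2): axis 2 rolled to the front

# The transpose is a view, so it shares memory with the source array
print(np.shares_memory(arr, arr.T))    # True
```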
Universal functions: fast element-wise array functions
A universal function (ufunc) is a function that performs element-wise operations on the data in an ndarray. Ufuncs can be divided into unary and binary ufuncs.
Unary ufuncs
A unary ufunc can be seen as a simple element-wise transformation, such as sqrt and cos. For example:
arr = np.arange(10)
print(np.sqrt(arr))
print(np.square(arr))
Binary ufuncs
A binary ufunc takes two arrays and returns a single result array, such as add and mod. For example:
arr1 = np.arange(10)
arr2 = np.arange(10)
print(np.add(arr1, arr2))
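As a further sketch of the ufuncs named above, mod and cos behave the same way; np.maximum is added here as one more binary ufunc, and the input values are arbitrary.

```python
import numpy as np

a = np.array([7, 8, 9])
b = np.array([2, 3, 4])

print(np.mod(a, b))         # [1 2 1]: element-wise remainder
print(np.maximum(a, b))     # [7 8 9]: element-wise maximum
print(np.cos(np.zeros(3)))  # [1. 1. 1.]: a unary ufunc applied to each element
```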
For more functions, see the official documentation; they are not covered in detail here.
Array Operations
Basic operations
In NumPy, you can perform mathematical operations on ndarrays using the same syntax as operations between scalars, applied to whole blocks of data. In an operation between an array and a scalar, the scalar is applied to every element of the array.
For example:
i = np.array([[1, 2], [3, 4]])
j = np.array([[5, 6], [7, 8]])
print(i + j)
print(i - j)
print(i - 1)
print(i * j)
print(i / j)
Note that the multiplication above is element-wise and is different from matrix multiplication. For matrix multiplication, use:
i = np.array([[1, 2], [3, 4]])
j = np.array([[5, 6], [7, 8]])
print(j.dot(i))
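To make the distinction concrete, the sketch below compares the element-wise product with the matrix product. The @ operator is used here as an alternative spelling of matrix multiplication; note that the order of the operands matters.

```python
import numpy as np

i = np.array([[1, 2], [3, 4]])
j = np.array([[5, 6], [7, 8]])

print(i * j)     # element-wise product, values [[5, 12], [21, 32]]
print(i @ j)     # matrix product i times j, values [[19, 22], [43, 50]]
print(j.dot(i))  # matrix product j times i, values [[23, 34], [31, 46]]
```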
In addition, NumPy provides the following commonly used statistical methods:
- min(): the minimum value of the array
- max(): the maximum value of the array
- sum(): the sum of the array elements
- cumsum(): the cumulative sum of the elements along an axis, returned as an array of the intermediate results
- cumprod(): the cumulative product of all the elements
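A short sketch of these methods on a small array follows; the array values and the use of the axis argument are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.min(), arr.max(), arr.sum())  # 1 6 21
print(arr.sum(axis=0))     # [5 7 9]: column sums
print(arr.cumsum())        # [ 1  3  6 10 15 21]: running sum over all elements
print(arr.cumsum(axis=1))  # running sum within each row
print(arr.cumprod())       # running product over all elements
```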
Array expressions
Writing data-processing tasks as array expressions is convenient and efficient. For example, suppose we want to evaluate the function sqrt(x^2 + y^2) on a regular grid of values. The np.meshgrid function accepts two one-dimensional arrays and produces two two-dimensional matrices:
In [1]: import numpy as np
In [2]: points = np.arange(-5, 5, 0.01)
In [3]: x, y = np.meshgrid(points, points)
In [4]: x
Out[4]:
array([[-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       ...,
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99],
       [-5.  , -4.99, -4.98, ...,  4.97,  4.98,  4.99]])
In [5]: z = np.sqrt(x ** 2 + y ** 2)
In [6]: z
Out[6]:
array([[7.07106781, 7.06400028, 7.05693985, ..., 7.04988652, 7.05693985,
        7.06400028],
       [7.06400028, 7.05692568, 7.04985815, ..., 7.04279774, 7.04985815,
        7.05692568],
       [7.05693985, 7.04985815, 7.04278354, ..., 7.03571603, 7.04278354,
        7.04985815],
       ...,
       [7.04988652, 7.04279774, 7.03571603, ..., 7.0286414 , 7.03571603,
        7.04279774],
       [7.05693985, 7.04985815, 7.04278354, ..., 7.03571603, 7.04278354,
        7.04985815],
       [7.06400028, 7.05692568, 7.04985815, ..., 7.04279774, 7.04985815,
        7.05692568]])
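The same computation on a much smaller grid shows more clearly what meshgrid produces. This is a sketch with a hypothetical three-point grid instead of the 1000-point one above.

```python
import numpy as np

points = np.array([-1.0, 0.0, 1.0])
x, y = np.meshgrid(points, points)

print(x)  # every row is a copy of points
print(y)  # every column is a copy of points

# Evaluate sqrt(x^2 + y^2) at all nine grid points in one expression
z = np.sqrt(x ** 2 + y ** 2)
print(z[1, 1])  # 0.0: the value at the grid point (0, 0)
```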
Conditional filtering
Several common filtering functions:
- where(): returns the indices of the elements of the input array that satisfy a given condition
- argmax() and argmin(): return the indices of the maximum and minimum elements, respectively, along the given axis
- nonzero(): returns the indices of the non-zero elements of the input array
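The functions above can be sketched on a small one-dimensional array. The values are arbitrary, and the three-argument form of np.where shown at the end selects values rather than returning indices.

```python
import numpy as np

arr = np.array([0, 3, 0, 7, 5])

print(np.where(arr > 2)[0])      # [1 3 4]: indices where the condition holds
print(arr.argmax())              # 3: index of the maximum
print(arr.argmin())              # 0: index of the first minimum
print(np.nonzero(arr)[0])        # [1 3 4]: indices of the non-zero elements
print(np.where(arr > 4, arr, 0)) # [0 0 0 7 5]: keep values above 4, else 0
```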
Example
Next, we use NumPy array operations to simulate a random walk.
First, implement a simple random walk of 1000 steps, starting from 0, with each step chosen at random as 1 or -1. Then determine the first time (number of steps) at which the walk reaches a certain distance from the origin (here, 8):
import numpy as np
nsteps = 1000
draws = np.random.randint(0, 2, size=nsteps)
steps = np.where(draws > 0, 1, -1)
# Cumulative sum of the steps
walk = steps.cumsum()
# Index of the first step at which the walk is 8 or more away from the origin
walk_8 = (np.abs(walk) >= 8).argmax()
print(walk_8)
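One caveat with the argmax approach is that it returns 0 when the condition is never True, which is indistinguishable from a crossing at index 0. The sketch below wraps the lookup in a hypothetical helper, first_crossing, and uses a fixed step sequence so the result is reproducible.

```python
import numpy as np


def first_crossing(walk, threshold):
    """Return the index where |walk| first reaches threshold, or None."""
    reached = np.abs(walk) >= threshold
    if not reached.any():
        return None  # argmax alone would misleadingly return 0 here
    return int(reached.argmax())


steps = np.array([1, 1, -1, 1, 1])  # a fixed, hand-written step sequence
walk = steps.cumsum()               # [1 2 1 2 3]

print(first_crossing(walk, 3))   # 4
print(first_crossing(walk, 10))  # None
```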