Table of Contents
1>Create a new IPython session
2>Use Python's NumPy package to process data
3>Load an external data set in Python
4>Use Matplotlib for data visualization
1>Create a new IPython session
Change to the opencv-machine-learning folder:
cd C:\Users\Kannyi\opencv-machine-learning
Activate the created conda environment:
activate py38
Open a new IPython session:
ipython
2>Use Python's NumPy package to process data
Import the NumPy module and check its version:
import numpy
numpy.__version__
Import the NumPy module under the alias np and check its version:
import numpy as np
np.__version__
Use list() to create a list of integers. range(x) produces all integers from 0 to x-1:
int_list=list(range(10))
int_list
# Result: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Use a list comprehension to iterate over the elements of int_list and apply str() to each one, creating a list of strings:
str_list=[str(i) for i in int_list]
str_list
# Result: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
Multiplying a list by 2 repeats all of its elements twice; it does not double them:
int_list*2
# Result: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
To multiply every element by 2, first convert the list to a NumPy array:
int_arr=np.array(int_list)
int_arr
# Result: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
int_arr*2
# Result: array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])
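The difference between the two `*` operators can be checked side by side. The following is a minimal sketch; the names `doubled_list`, `doubled_comp`, and `doubled_arr` are illustrative and do not appear in the original session.

```python
import numpy as np

int_list = list(range(10))

# For a plain list, * means repetition: the result has twice as many elements.
doubled_list = int_list * 2

# A list comprehension is the pure-Python way to double each element.
doubled_comp = [2 * i for i in int_list]

# For a NumPy array, * is applied element by element.
doubled_arr = np.array(int_list) * 2

print(len(doubled_list))     # 20
print(doubled_comp)
print(doubled_arr.tolist())  # same values as doubled_comp
```

The array version avoids the explicit loop, which is the main reason to convert lists to arrays before doing arithmetic on them.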
Each NumPy array also has the following attributes:
- size : the number of elements in the array.
- ndim : the number of dimensions.
- dtype : the data type of the array elements.
- shape : the size of each dimension.
print("int_arr size:", int_arr.size)
# Result: int_arr size: 10
print("int_arr ndim:", int_arr.ndim)
# Result: int_arr ndim: 1
print("int_arr dtype:", int_arr.dtype)
# Result: int_arr dtype: int32 (platform-dependent; int64 on most Linux and macOS systems)
print("int_arr shape:", int_arr.shape)
# Result: int_arr shape: (10,)
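The same four attributes become more informative for an array with more than one dimension. The sketch below uses a hypothetical 2×3 array named `grid`, with an explicitly chosen dtype so that the output does not depend on the platform's default integer size.

```python
import numpy as np

# Hypothetical 2-D example: 2 rows, 3 columns, explicit 64-bit integer dtype.
grid = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

print("grid size:", grid.size)    # 6 elements in total
print("grid ndim:", grid.ndim)    # 2 dimensions
print("grid dtype:", grid.dtype)  # int64, as requested
print("grid shape:", grid.shape)  # (2, 3): rows, then columns
```

Note that size is always the product of the entries in shape, and ndim is the length of shape.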
Access a single array element by index:
int_arr[3]
# Result: 3
Use a negative index to count from the end of the array:
int_arr[-1]
# Result: 9
int_arr[-2]
# Result: 8
Slice the elements at indices 2 through 4 (the stop index 5 is excluded):
int_arr[2:5]
# Result: array([2, 3, 4])
Slice the elements at indices 0 through 4:
int_arr[:5]
# Result: array([0, 1, 2, 3, 4])
Slice every second element of the array, starting at index 0:
int_arr[::2]
# Result: array([0, 2, 4, 6, 8])
Slice the elements of the array in reverse order:
int_arr[::-1]
# Result: array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
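All of the slices above are special cases of the general form start:stop:step. The following sketch combines the three parts in one slice and checks how a negative index relates to a positive one; the variable names are illustrative only.

```python
import numpy as np

int_arr = np.arange(10)  # same contents as np.array(list(range(10)))

# A negative index -k refers to the same element as index len(arr) - k.
last = int_arr[-1]
same_as_last = int_arr[len(int_arr) - 1]

# start=1, stop=8 (excluded), step=2 selects indices 1, 3, 5, 7.
odd_positions = int_arr[1:8:2]

# Omitted parts fall back to defaults: the whole array, traversed backwards.
reversed_arr = int_arr[::-1]

print(last, same_as_last)
print(odd_positions)
print(reversed_arr)
```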
Create a two-dimensional array with 3 rows and 5 columns:
arr_2d=np.zeros((3,5))
arr_2d
'''
Result:
array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])
'''
All elements are initialized to 0. If no data type is specified, NumPy uses floating point (float64) by default.
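The default data type can be confirmed, and overridden, through the dtype argument. This is a small sketch; `default_zeros` and `int_zeros` are names chosen here for illustration.

```python
import numpy as np

default_zeros = np.zeros((3, 5))              # no dtype given
int_zeros = np.zeros((3, 5), dtype=np.int64)  # dtype requested explicitly

print(default_zeros.dtype)  # float64
print(int_zeros.dtype)      # int64
print(int_zeros[0])         # integers are printed without a trailing dot
```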
Create a 3×2×4 three-dimensional array with all elements initialized to 1:
arr_float_3d=np.ones((3, 2, 4))
arr_float_3d
'''
Result:
array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.]]])
'''
Get the first two-dimensional array in arr_float_3d by slicing the array:
arr_float_3d[0, :, :]
'''
Result:
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.]])
'''
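Fixing an index along one axis removes that axis from the result, so the shape of a slice shows which axis was fixed. The sketch below illustrates this on the same 3×2×4 array; the names `first_block`, `short_form`, and `first_column` are illustrative.

```python
import numpy as np

arr_float_3d = np.ones((3, 2, 4))

first_block = arr_float_3d[0, :, :]   # fix the first axis: a 2x4 array remains
short_form = arr_float_3d[0]          # trailing full slices can be omitted
first_column = arr_float_3d[:, :, 0]  # fix the last axis instead: a 3x2 array remains

print(first_block.shape)   # (2, 4)
print(short_form.shape)    # (2, 4)
print(first_column.shape)  # (3, 2)
```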
Create a 3×2×4 three-dimensional array of unsigned 8-bit integers by passing dtype=np.uint8, then multiply every element by 255:
arr_uint_3d=np.ones((3, 2, 4), dtype=np.uint8)*255
arr_uint_3d
'''
Result:
array([[[255, 255, 255, 255],
        [255, 255, 255, 255]],

       [[255, 255, 255, 255],
        [255, 255, 255, 255]],

       [[255, 255, 255, 255],
        [255, 255, 255, 255]]], dtype=uint8)
'''
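As an alternative to multiplying an array of ones, np.full creates an array filled with a given value in one step. The sketch below is only a comparison of the two approaches, not the method used in the original session; `via_ones` and `via_full` are illustrative names.

```python
import numpy as np

via_ones = np.ones((3, 2, 4), dtype=np.uint8) * 255
via_full = np.full((3, 2, 4), 255, dtype=np.uint8)

print(np.array_equal(via_ones, via_full))  # True: same values
print(via_ones.dtype, via_full.dtype)      # both uint8
print(np.iinfo(np.uint8).max)              # 255, the largest value uint8 can hold
```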
3>Load an external data set in Python
Download the MNIST dataset of handwritten digits (0 to 9):
from sklearn import datasets
# as_frame=False returns NumPy arrays; recent scikit-learn versions return pandas objects by default
mnist_data=datasets.fetch_openml("mnist_784", as_frame=False)
x=mnist_data["data"]
y=mnist_data["target"]
mnist_data.data.shape
# Result: (70000, 784)
mnist_data.target.shape
# Result: (70000,)
List the distinct target values. Note that the labels are strings, not integers:
import numpy as np
np.unique(mnist_data.target)
# Result: array(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], dtype=object)
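Because the targets come back as strings in an object array, they are usually converted to integers before further processing. The following sketch shows one way to do that; to avoid the download it uses a small hypothetical array, `labels`, as a stand-in for mnist_data.target.

```python
import numpy as np

# Stand-in for mnist_data.target: an object array of digit strings.
labels = np.array(['5', '0', '4', '1', '9'], dtype=object)

# astype converts each string to an integer; uint8 is enough for the values 0 to 9.
labels_int = labels.astype(np.uint8)

print(labels_int)
print(labels_int.dtype)
print(np.unique(labels_int))  # distinct values, sorted numerically
```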
4>Use Matplotlib for data visualization
Import the Matplotlib module under the alias mpl:
import matplotlib as mpl
Import the matplotlib.pyplot module under the alias plt:
import matplotlib.pyplot as plt
Make plots appear automatically with the IPython magic command:
%matplotlib
Alternatively, show plots manually (not needed if %matplotlib was called above):
plt.show()
Create 100 evenly spaced sample points from 0 to 10 on the x-axis:
import numpy as np
x=np.linspace(0, 10, 100)
Use NumPy's sin function to compute the sine of every x value, and draw the result with plt's plot function:
plt.plot(x, np.sin(x))
This draws the sine curve over the interval from 0 to 10.
Save the figure to a file. A relative file name such as Figure1.png is written to the current working directory (C:\Users\Kannyi here):
plt.savefig('Figure1.png')
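Outside an interactive IPython session, the same steps can be collected into a script. The sketch below is one possible arrangement under a few assumptions that are not in the original: it selects the non-interactive Agg backend so no window is required, uses the object-oriented plt.subplots interface, and writes to a temporary directory (`out_path` is a hypothetical location) instead of the working directory.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

# 100 evenly spaced points from 0 to 10, both endpoints included.
x = np.linspace(0, 10, 100)

fig, ax = plt.subplots()
ax.plot(x, np.sin(x))
ax.set_xlabel("x")
ax.set_ylabel("sin(x)")

# Hypothetical output location: a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "Figure1.png")
fig.savefig(out_path)
plt.close(fig)

print(os.path.exists(out_path))
```

Passing an absolute path to savefig, as done here, removes any dependence on the current working directory.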
Import scikit-learn's datasets module:
from sklearn import datasets
Load the handwritten digits dataset bundled with scikit-learn:
digits=datasets.load_digits()
digits provides the pixel data in two different fields:
- data : the pixels of each image flattened into a single vector of 64 values.
- images : each image keeps its 8×8 spatial arrangement. To draw a single image, images is the more convenient field.
print(digits.data.shape)
# Result: (1797, 64)
print(digits.images.shape)
# Result: (1797, 8, 8)
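The two shapes are related by a reshape: flattening each 8×8 image row by row gives a vector of 64 values. The sketch below illustrates the idea with a hypothetical stand-in array, `images`, holding five made-up 8×8 images, so that it runs without scikit-learn; with the real dataset, digits.images would take its place.

```python
import numpy as np

# Stand-in for digits.images: 5 hypothetical 8x8 images with distinct pixel values.
images = np.arange(5 * 8 * 8, dtype=np.float64).reshape(5, 8, 8)

# Flatten each image into one row of 64 values (one row per image).
data = images.reshape(len(images), -1)

print(images.shape)  # (5, 8, 8)
print(data.shape)    # (5, 64)

# Row-major flattening: pixel (r, c) of an image lands at position r * 8 + c.
print(data[0, 1 * 8 + 3] == images[0, 1, 3])  # True

# The flattening can be undone to recover the spatial layout.
restored = data.reshape(-1, 8, 8)
```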
Use NumPy's array slicing to obtain an image from the data set:
img=digits.images[0, :, :]
This selects the first of the 1797 images, an 8×8 array of 64 pixels.
Use the imshow function in plt to draw this image:
plt.imshow(img, cmap='gray')
The cmap parameter specifies the colormap. For grayscale images, the gray colormap is the natural choice.
This displays the image in grayscale.
Use plt's subplot function to plot the first ten images, one example of each digit:
for image_index in range(10):
    # subplot indices start at 1, image indices at 0
    subplot_index=image_index+1
    plt.subplot(2, 5, subplot_index)
    plt.imshow(digits.images[image_index, :, :], cmap='gray')
subplot takes the number of rows, the number of columns, and the index of the current subplot.
image_index starts from 0, while subplot_index starts from 1.
subplot selects the position in the grid, which can be thought of as providing a canvas; imshow then draws the image on that canvas.
This produces a 2×5 grid showing the ten images.
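An alternative way to build the same grid is plt.subplots, which creates all axes at once and avoids the manual index arithmetic. The sketch below is an illustration rather than the original method. It assumes the non-interactive Agg backend and uses random 8×8 arrays (`images`, a hypothetical stand-in for digits.images) so that it runs without scikit-learn.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

# Stand-in for digits.images[:10]: ten random 8x8 images with values 0 to 16.
rng = np.random.default_rng(0)
images = rng.integers(0, 17, size=(10, 8, 8))

# axes is a 2x5 array of Axes objects; axes.flat visits them row by row.
fig, axes = plt.subplots(2, 5)
for image_index, ax in enumerate(axes.flat):
    ax.imshow(images[image_index], cmap='gray')
    ax.set_title(str(image_index + 1))  # the index plt.subplot would have used
    ax.axis('off')

n_axes = len(fig.axes)
print(axes.shape, n_axes)
plt.close(fig)
```

Iterating over axes.flat follows the same left-to-right, top-to-bottom order as the subplot index, so image 0 still lands in the top-left cell.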