Use OpenCV and Python to process data

Table of Contents

1>Create a new IPython session

2>Use Python's NumPy package to process data

3>Load an external data set in Python

4>Use Matplotlib for data visualization


1>Create a new IPython session

Change to the opencv-machine-learning folder:

cd C:\Users\Kannyi\opencv-machine-learning

Activate the conda environment created earlier (recent conda versions use conda activate py38 instead):

activate py38

Open a new IPython session:

ipython

2>Use Python's NumPy package to process data

Import the NumPy module and check its version:

import numpy
numpy.__version__

Import NumPy under the conventional alias np and check its version:

import numpy as np
np.__version__

Use list() to create a list of integers. range(x) yields all integers from 0 to x-1:

int_list=list(range(10))
int_list
# Result: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Iterate over all elements of int_list with a list comprehension and apply str() to each one to build a list of strings:

str_list=[str(i) for i in int_list]
str_list
# Result: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

Multiplying a Python list by 2 does not double its elements; it repeats the whole list twice:

int_list*2
# Result: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

To multiply every element by 2, first convert the list to a NumPy array; arithmetic on an array is applied element-wise:

int_arr=np.array(int_list)
int_arr
# Result: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

int_arr*2
# Result: array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])
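
The contrast between the two multiplications can be checked in a few lines. This is only an illustrative sketch; the names repeated, doubled_list and doubled_arr are not from the original text.

```python
import numpy as np

int_list = list(range(10))

# Multiplying a list concatenates copies of it.
repeated = int_list * 2

# Doubling each element of a plain list needs an explicit comprehension.
doubled_list = [2 * i for i in int_list]

# A NumPy array applies the multiplication element-wise.
doubled_arr = np.array(int_list) * 2

print(len(repeated))                          # 20
print(doubled_list == doubled_arr.tolist())   # True
```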

Each NumPy array also has the following attributes:

  • size : the number of elements in the array.
  • ndim : the number of dimensions.
  • dtype : the data type of the array elements.
  • shape : the size of each dimension.

print("int_arr size:", int_arr.size)
# Result: int_arr size: 10

print("int_arr ndim:", int_arr.ndim)
# Result: int_arr ndim: 1

print("int_arr dtype:", int_arr.dtype)
# Result: int_arr dtype: int32 (platform-dependent; int64 on many systems)

print("int_arr shape:", int_arr.shape)
# Result: int_arr shape: (10,)

Access a single array element by index:

int_arr[3]
# Result: 3

Negative indices count from the end of the array:

int_arr[-1]
# Result: 9

int_arr[-2]
# Result: 8

Slice the elements at indices 2 through 4 (the stop index 5 is excluded):

int_arr[2:5]
# Result: array([2, 3, 4])

Slice the elements at indices 0 through 4:

int_arr[:5]
# Result: array([0, 1, 2, 3, 4])

Take every second element, starting at index 0:

int_arr[::2]
# Result: array([0, 2, 4, 6, 8])

Reverse the array by slicing with a step of -1:

int_arr[::-1]
# Result: array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
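
All of the slices above are special cases of the general form arr[start:stop:step]. As a small sketch (the variable names here are illustrative and not part of the original session), the three parts can be combined freely:

```python
import numpy as np

int_arr = np.arange(10)

# General form: arr[start:stop:step]; the stop index is exclusive.
odd_idx = int_arr[1:8:2]     # indices 1, 3, 5, 7
last_three = int_arr[-3:]    # the last three elements
rev_part = int_arr[7:2:-1]   # indices 7 down to 3

print(odd_idx)      # [1 3 5 7]
print(last_three)   # [7 8 9]
print(rev_part)     # [7 6 5 4 3]
```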

Create a two-dimensional array with 3 rows and 5 columns:

arr_2d=np.zeros((3,5))
arr_2d
'''
Result:
array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])
'''

All elements are initialized to 0. If no data type is specified, NumPy uses a floating-point type (float64) by default.

Create a 3×2×4 three-dimensional array with all elements initialized to 1:
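
The default type can be confirmed, and overridden, through the dtype argument. A minimal sketch, with np.int32 chosen here only as an example type:

```python
import numpy as np

default_zeros = np.zeros((3, 5))               # no dtype given
int_zeros = np.zeros((3, 5), dtype=np.int32)   # explicit integer type

print(default_zeros.dtype)   # float64
print(int_zeros.dtype)       # int32
```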

arr_float_3d=np.ones((3, 2, 4))
arr_float_3d
'''
Result:
array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.]]])
'''

Get the first two-dimensional array in arr_float_3d by slicing the array:

arr_float_3d[0, :, :]
'''
Result:
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.]])
'''

Create a 3×2×4 array of unsigned 8-bit integers by passing dtype=np.uint8, and multiply every element by 255:

arr_uint_3d=np.ones((3, 2, 4), dtype=np.uint8)*255
arr_uint_3d
'''
Result:
array([[[255, 255, 255, 255],
        [255, 255, 255, 255]],

       [[255, 255, 255, 255],
        [255, 255, 255, 255]],

       [[255, 255, 255, 255],
        [255, 255, 255, 255]]], dtype=uint8)
'''
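
The value 255 is the largest number an unsigned 8-bit integer can hold, which is why it is the usual upper bound for 8-bit image pixels. The following sketch checks this with np.iinfo; it is an illustration added here, not a step from the original session.

```python
import numpy as np

# Query the representable range of the uint8 type.
info = np.iinfo(np.uint8)
print(info.min, info.max)    # 0 255

# The product stays in uint8 because 255 still fits in that type.
arr_uint_3d = np.ones((3, 2, 4), dtype=np.uint8) * 255
print(arr_uint_3d.dtype)     # uint8
print(arr_uint_3d.max())     # 255
```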

3>Load an external data set in Python

Download the MNIST dataset of handwritten digits (0 to 9):

from sklearn import datasets
mnist_data=datasets.fetch_openml("mnist_784")
x=mnist_data["data"]
y=mnist_data["target"]
mnist_data.data.shape
# Result: (70000, 784)

mnist_data.target.shape
# Result: (70000,)

List the unique target values:

import numpy as np
np.unique(mnist_data.target)
# Result: array(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], dtype=object)
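
Note that the targets shown above are strings, not numbers. If numeric labels are needed, one option is to cast them with astype. The sketch below uses a small hand-made array as a hypothetical stand-in for mnist_data.target so that it runs without downloading anything.

```python
import numpy as np

# Hypothetical stand-in for mnist_data.target: string labels with dtype=object.
target = np.array(['5', '0', '4', '1', '9'], dtype=object)

# Cast the string labels to unsigned 8-bit integers.
labels = target.astype(np.uint8)

print(labels)         # [5 0 4 1 9]
print(labels.dtype)   # uint8
```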

4>Use Matplotlib for data visualization

Import the Matplotlib module under the alias mpl:

import matplotlib as mpl

Import the matplotlib.pyplot module under the alias plt:

import matplotlib.pyplot as plt

Use the %matplotlib magic command so that plots appear automatically:

%matplotlib

Alternatively, display a plot manually by calling plt.show() (usually unnecessary once %matplotlib has been run):

plt.show()

Create 100 evenly spaced sample points from 0 to 10 for the x-axis:

import numpy as np
x=np.linspace(0, 10, 100)

Use NumPy's sin function to compute the sine of every x value, and draw the result with plt's plot function:

plt.plot(x, np.sin(x))

This produces a plot of the sine curve over the interval from 0 to 10.

Save the figure to a file. A relative file name such as Figure1.png is written to the current working directory (C:\Users\Kannyi in the author's session):

plt.savefig('Figure1.png')
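
To control where the file ends up, pass a full path to savefig. The sketch below assumes the non-interactive Agg backend so that it runs without a display, and it writes to a temporary directory; both choices, and the name out_path, are assumptions made for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x))

# Hypothetical output location: a temporary directory instead of the cwd.
out_path = os.path.join(tempfile.mkdtemp(), "Figure1.png")
fig.savefig(out_path)
plt.close(fig)

print(os.path.exists(out_path))   # True
```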

Import scikit-learn's datasets module:

from sklearn import datasets

Load the digits dataset:

digits=datasets.load_digits()

digits has two different data fields:

  • data : all pixels of each image are flattened into one long vector.
  • images : each image keeps its 8×8 spatial arrangement. To draw a single image, images is the more convenient field.

print(digits.data.shape)
# Result: (1797, 64)

print(digits.images.shape)
# Result: (1797, 8, 8)
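
The two shapes are related by a simple reshape: each 8×8 image corresponds to one 64-element row. The sketch below demonstrates this with a small synthetic array as a hypothetical stand-in for digits.images, so it runs without scikit-learn.

```python
import numpy as np

# Hypothetical stand-in for digits.images: 5 images of 8x8 pixels.
images = np.arange(5 * 8 * 8, dtype=np.float64).reshape(5, 8, 8)

# Flatten each image into a 64-element row, as in the data field.
data = images.reshape(len(images), -1)
print(data.shape)    # (5, 64)

# Reshaping one row back recovers the original 8x8 image.
restored = data[0].reshape(8, 8)
print(np.array_equal(restored, images[0]))   # True
```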

Use NumPy's array slicing to obtain an image from the data set:

img=digits.images[0, :, :]

Here we take the first of the 1797 images; it is an 8×8 array, i.e. 8×8=64 pixels.

Use the imshow function in plt to draw this image:

plt.imshow(img, cmap='gray')

The cmap parameter specifies the color map. For grayscale images, the gray color map is the natural choice.

This displays the first digit image in grayscale.

We can use plt's subplot function to plot several example digits in one figure:

for image_index in range(10):
    # subplot indices start at 1, image indices at 0
    subplot_index=image_index+1
    plt.subplot(2, 5, subplot_index)
    plt.imshow(digits.images[image_index, :, :], cmap='gray')

subplot takes the number of rows, the number of columns, and the index of the current subplot.

Note that image_index starts at 0, whereas subplot_index starts at 1.

subplot selects the position in the grid to draw into, much like choosing a canvas; imshow then draws the image on that canvas.

This displays the first ten digit images in a 2×5 grid.
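
An alternative to calling plt.subplot inside the loop is plt.subplots, which creates the whole grid at once and returns an array of axes, so no 1-based index is needed. The following is only a sketch: it assumes the Agg backend and uses random 8×8 arrays as a stand-in for digits.images.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for digits.images: ten random 8x8 images.
rng = np.random.default_rng(0)
images = rng.random((10, 8, 8))

# Create a 2x5 grid in one call; axes is a 2-D array of Axes objects.
fig, axes = plt.subplots(2, 5)
for ax, img in zip(axes.flat, images):
    ax.imshow(img, cmap='gray')
    ax.axis('off')  # hide ticks so the images are easier to see

print(axes.shape)   # (2, 5)
```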

Origin blog.csdn.net/Kannyi/article/details/112412993