CS131 Study Note #0

1. Getting Started with Numpy

Image recognition is, at its core, a series of matrix operations, and Python's numpy library performs exactly such operations, so learning numpy is a necessary first step before working with images.

The package is usually imported with import numpy as np.

1.1 General matrix creation methods

np.array() cannot be called with no arguments; it needs the data to build the array from
The general way to create an array: y = np.array([[1,2,3,4,5], [6,7,8,9,10]])
Read its size: y.shape
Create a zero matrix: np.zeros((3,3))  # creates a 3*3 matrix of zeros
Create an identity matrix: identity = np.identity(3)
Create an all-ones matrix: ones = np.ones((2,2))
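The creation calls above can be collected into one short script. This is only a minimal sketch; the variable name zeros is introduced here for illustration and does not appear in the notes.

```python
import numpy as np

y = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
zeros = np.zeros((3, 3))    # the shape is passed as a single tuple argument
identity = np.identity(3)   # 3*3 matrix with ones on the diagonal
ones = np.ones((2, 2))

print(y.shape)        # (2, 5): 2 rows, 5 columns
print(zeros.dtype)    # float64, the default dtype of zeros/ones/identity
print(identity)
```

Note that the function is np.zeros (with an s) and that it takes the shape as a tuple, so np.zeros(3, 3) would not work.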

1.2 The use of Broadcasting and np.mean

import numpy as np
#To shift the mean of every row of a matrix to 0:
matrix = 10*np.random.rand(4,5)
row_means = matrix.mean(axis = 1).reshape((4,1))
matrix = matrix - row_means
print(matrix)
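#axis not set: averages all m*n entries and returns a single number
#axis = 0: collapses the rows, giving the mean of each column
#axis = 1: collapses the columns, giving the mean of each row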
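To make the axis rule concrete, here is a small sketch on a fixed 2*3 matrix instead of random data. The keepdims=True argument is a swapped-in alternative to the reshape((4,1)) step used above; both give a column vector that broadcasts across each row.

```python
import numpy as np

m = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

total_mean = m.mean()         # 3.5, a single number
col_means = m.mean(axis=0)    # shape (3,): 2.5, 3.5, 4.5
row_means = m.mean(axis=1)    # shape (2,): 2.0, 5.0

# keepdims=True keeps the result at shape (2, 1), so it broadcasts against (2, 3)
centered = m - m.mean(axis=1, keepdims=True)
print(centered)               # every row becomes -1, 0, 1
print(centered.mean(axis=1))  # both row means are now 0
```

Without the (2, 1) shape, subtracting a shape (2,) vector from a (2, 3) matrix would fail, because broadcasting aligns shapes from the last dimension.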

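1.3 Using numpy.random

Using numpy.random.randint

#Takes three parameters: low, high and size. high defaults to None; if only low is given, the range is [0, low). If high is given, the range is [low, high).
#Returns random integers from the half-open interval [low, high).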
>>> np.random.randint(2, size=10)
array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0])

>>> np.random.randint(1, size=10)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

>>> np.random.randint(5, size=(2, 4))
array([[4, 0, 2, 1],
       [3, 2, 2, 0]])

Using numpy.random.rand

#Returns one or more random samples drawn from the uniform distribution over [0, 1); the value 1 is excluded.
>>> np.random.rand(3,2)
array([[ 0.14022471,  0.96360618],  
       [ 0.37601032,  0.25528411],  
       [ 0.49313049,  0.94909878]])

Using numpy.random.randn

#randn returns one or more samples drawn from the standard normal distribution.
>>> np.random.randn(2,4)
array([[ 0.27795239, -2.57882503,  0.3817649 ,  1.42367345],
       [-1.16724625, -0.22408299,  0.63006614, -0.41714538]])
#The standard normal distribution, also called the u distribution, is the normal distribution with mean 0 and standard deviation 1, written N(0, 1).
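The ranges described above can be checked empirically. The sketch below is illustrative only: the sample sizes are arbitrary, and np.random.seed(0) is an assumption added here so that repeated runs give the same numbers; it is not part of the original notes.

```python
import numpy as np

np.random.seed(0)  # assumption: fix the seed for reproducibility

ints = np.random.randint(2, 5, size=1000)   # integers in [2, 5), so 2, 3 or 4
unif = np.random.rand(1000)                 # floats in [0, 1)
norm = np.random.randn(100000)              # standard normal samples

print(ints.min(), ints.max())               # never below 2, never above 4
print(unif.min() >= 0, unif.max() < 1)      # True True
print(norm.mean(), norm.std())              # close to 0 and 1
```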

1.4 Using boolean masks

Basic comparisons

import numpy as np
array = np.array(range(20)).reshape((4,5))  # 4*5 matrix holding the integers 0 to 19
print(array)

output = array > 10
output
#out:
array([[False, False, False, False, False],
       [False, False, False, False, False],
       [False,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True]])

array[output]
#out:
array([11, 12, 13, 14, 15, 16, 17, 18, 19])

#several conditions can be combined
mask = (array < 5) | (array > 15)
#mask = array < 5 | array > 15  # wrong: | binds tighter than < and >, so the parentheses are required
mask
#out:
array([[ True,  True,  True,  True,  True],
       [False, False, False, False, False],
       [False, False, False, False, False],
       [False,  True,  True,  True,  True]])

Practical use

#Given a matrix, change all of the negative values to zero
matrix = 2*np.random.rand(5, 5) - 1  # random matrix, uniformly distributed over [-1, 1)
### SOLUTION ###
mask = matrix < 0
print(mask)
matrix[mask] = 0  # set every element where mask is True to 0
print(matrix)
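The same clipping can also be written without modifying the matrix in place. The sketch below uses a small fixed matrix; np.where and np.maximum are swapped-in alternatives to the mask assignment above, not the method used in the notes.

```python
import numpy as np

m = np.array([[-1.5, 2.0],
              [3.0, -0.5]])

clipped = np.where(m < 0, 0, m)    # new array: 0 where m < 0, otherwise m
also_clipped = np.maximum(m, 0)    # element-wise maximum with 0, same result

# combine conditions with & (and) or | (or); each comparison needs parentheses
inside = (m > -1) & (m < 2.5)

print(clipped)     # negatives replaced by 0
print(m)           # the original matrix is unchanged
print(inside)      # True only for 2.0 and -0.5
```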

1.5 Using reshape

#when you reshape, by default the new array is filled row by row
x = np.linspace(1, 12, 6)
print(x)
#[ 1.   3.2  5.4  7.6  9.8 12. ]

x = x.reshape((3,2)) #does not reshape in place!
print(x)
#[[ 1.   3.2]
# [ 5.4  7.6]
# [ 9.8 12. ]]

print(x.reshape(-1))  # -1 means this dimension is inferred automatically from the array's size
[ 1.   3.2  5.4  7.6  9.8 12. ]

print(x.reshape(2,-1))
[[ 1.   3.2  5.4]
 [ 7.6  9.8 12. ]]
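One detail worth checking alongside the examples above: reshape does not change the original array's shape, but for a contiguous array like this one the result shares its data with the original. A minimal sketch, using np.arange(6) as stand-in data:

```python
import numpy as np

x = np.arange(6)           # 0, 1, 2, 3, 4, 5
a = x.reshape((2, 3))      # filled row by row: first row 0 1 2, second row 3 4 5
b = x.reshape((-1, 2))     # -1 is inferred as 3, because 6 / 2 = 3

print(x.shape, a.shape, b.shape)   # (6,) (2, 3) (3, 2)

# the reshaped array here is a view on the same data, not a copy
a[0, 0] = 99
print(x[0])                        # 99
print(np.shares_memory(x, a))      # True
```

If an independent array is needed, call .copy() on the reshaped result.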

1.6 Deep copies in numpy

Assigning an array with '=' in numpy does not copy any data; it only binds a second name to the same array object (the same address). For example:

array = np.linspace(1, 10, 10)
array
#out
#array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

dup = array
dup
#out
#array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

array[0] = 100
dup
#out
#array([100.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9.,  10.])

print(id(array))
print(id(dup))
#out
#120645422176
#120645422176

As shown, after assignment with '=' array and dup point to the same address, so modifying one also changes the other. To avoid this, use one of numpy's copy methods to make an independent (deep) copy.

#using copy
import copy
array = np.linspace(1, 10, 10)
dup = copy.deepcopy(array)
#this can also be written as dup = np.copy(array) or dup = array.copy()
print(id(array))
print(id(dup))
array[0] = 100
dup

120649253152
120664256640
array([ 1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

The wrong way: using the slicing syntax [:]

#slicing
array = np.linspace(1, 10, 10)
dup = array[:]
print(id(array))
print(id(dup))
array[0] = 100
dup

2552119240816
2552119240336
[100.   2.   3.   4.   5.   6.   7.   8.   9.  10.]

Although the two ids are different, the values of dup and array still change together: slicing returns a view, that is, a new array object that shares the same underlying data.
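Because id() only compares the Python objects, it cannot tell a view from a real copy. A more reliable check is np.shares_memory, sketched below; the names alias and view are introduced here for illustration only.

```python
import numpy as np

array = np.linspace(1, 10, 10)
alias = array           # same object under a second name
view = array[:]         # new array object, same underlying data
dup = array.copy()      # new array object with its own data

print(alias is array)                   # True
print(view is array)                    # False
print(np.shares_memory(array, view))    # True: the slice is only a view
print(np.shares_memory(array, dup))     # False: copy() is independent

array[0] = 100
print(alias[0], view[0], dup[0])        # 100.0 100.0 1.0
```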

2. Getting started with Pyplot

2.1 pyplots

import numpy as np
import matplotlib.pyplot as plt

x = np.arange(10)**2
print(x)
plt.plot(x)
plt.show()

This draws the values of x as a line plot against their indices.

Of course, many details can also be added:

plt.figure(figsize = (15,15))
plt.plot(x)
plt.title("This is a graph")
plt.xlabel("this is the x label")
plt.ylabel("this is the y label")
plt.show()


2.2 Scatter plot

x = np.concatenate((np.linspace(1, 5, 10).reshape(10, 1), np.ones(10).reshape(10, 1)), axis = 1)
print(x)
y = x[:,0].copy() + 2*np.random.rand(10) - 0.5
print(y)
plt.scatter(x[:,0], y)  # scatter plot
plt.show()

3. Image reading

3.1 Basic Composition of Pictures

An RGB image is composed of three color layers, so it can be represented by a matrix of shape (h, w, 3). Here h and w are the height and width of the picture and 3 is the number of basic color channels. The numbers stored in each channel's matrix are the grayscale values of that color of light, and pixels mixed from the three grayscale values are stitched together into a full-color image.

The grayscale value here is not a literal "black and white" value; it is the brightness of one particular color. For example, one layer of the picture, of shape (400, 300, 1), is the red channel matrix, and it stores the red grayscale values.

Each color channel stores its own grayscale values, and combining the values of the three channels, the way the three primary colors are mixed, produces the desired color at each position in the picture.

Take any point in the picture: when it is displayed, its red grayscale value goes into the R channel, its green value into the G channel, and its blue value into the B channel, and the three values are mixed like paints on a palette to produce the corresponding color.

In short, channels hold the different colors (there are also special channels, such as the alpha channel, which stores transparency information), and the grayscale value is the brightness of one color.
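The (h, w, 3) layout can be illustrated with a tiny hand-made image. This is a sketch under the assumption that pixel values are floats in [0, 1]; the 2*3 size and the chosen colors are arbitrary.

```python
import numpy as np

h, w = 2, 3
img = np.zeros((h, w, 3))       # all black: every channel is 0
img[:, :, 0] = 1.0              # set the R channel everywhere: pure red
img[0, 0] = [1.0, 1.0, 0.0]     # red + green at the top-left pixel: yellow

red = img[:, :, 0]              # the red channel alone is an (h, w) matrix
print(img.shape, red.shape)     # (2, 3, 3) (2, 3)
print(img[0, 0])                # the three channel values of one pixel
```

Passing such an array to plt.imshow would display it as a color image, with each pixel's color mixed from its three channel values.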

3.2 Code implementation of image reading

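import numpy as np
import matplotlib.pyplot as plt
from skimage import io

def display(img):
    plt.figure(figsize = (5,5))
    plt.imshow(img)  # show the image
    plt.axis('off')  # hide the axes
    plt.show()

def load(image_path):
    out = io.imread(image_path)
    # read the image; the second parameter (as_gray) defaults to False, and True loads it as grayscale
    out = out.astype(np.float64) / 255  # scale pixel values from [0, 255] to [0, 1]
    return out

img = load('image1.jpg')
display(img)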

def rgb_exclusion(image, channel):
    out = image.copy()
    if channel == 'R':
        out[:, :, 0] = 0
    elif channel == 'G':
        out[:, :, 1] = 0
    elif channel == 'B':
        out[:, :, 2] = 0
    return out  # a copy of the image with one of the RGB channels switched off
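To try the function without an image file, a synthetic array can stand in for image1.jpg. The version below is a compact variant written for this sketch, not the notes' exact code: it looks the channel index up in a dictionary, so unlike the if/elif version it raises a KeyError for an unknown channel name instead of returning an unchanged copy.

```python
import numpy as np

def rgb_exclusion(image, channel):
    """Return a copy of image with the 'R', 'G' or 'B' channel set to 0."""
    index = {'R': 0, 'G': 1, 'B': 2}[channel]
    out = image.copy()      # work on a copy so the input is left untouched
    out[:, :, index] = 0
    return out

img = np.ones((4, 4, 3))    # synthetic all-white image
no_red = rgb_exclusion(img, 'R')

print(no_red[0, 0])         # red is gone, green and blue remain
print(img[0, 0])            # the original image is unchanged
```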

Note: scikit-image is an image processing package built on scipy that handles images as numpy arrays. It is a very good digital image processing tool and worth studying further. The table below lists its main submodules for reference.

Submodule	Main functions
io	Reading, saving and displaying pictures or videos
data	Test pictures and sample data
color	Color space conversion
filters	Image enhancement, edge detection, rank filters, automatic thresholding, etc.
draw	Basic drawing on numpy arrays, including lines, rectangles, circles and text
transform	Geometric and other transformations, such as rotation, stretching and the Radon transform
morphology	Morphological operations, such as opening, closing and skeleton extraction
exposure	Image intensity adjustment, such as brightness adjustment and histogram equalization
feature	Feature detection and extraction
measure	Measurement of image properties, such as similarity or contours
segmentation	Image segmentation
restoration	Image restoration
util	General-purpose utility functions

References

https://zhuanlan.zhihu.com/p/360220467

https://www.jianshu.com/p/be7af337ffcd

4. Linear Algebra

4.1 Solving linear equations:

For example, say we wanted to solve the linear system
$A x = b$

A = np.array([[1, 1], [2, 1]])
b = np.array([[1], [0]])
#This function takes parameters A, b, and returns x such that Ax =b. 
x = np.linalg.solve(A, b)
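A quick way to confirm the result is to substitute it back into the system. For this A and b the equations are x1 + x2 = 1 and 2*x1 + x2 = 0, which give x1 = -1 and x2 = 2. A minimal check:

```python
import numpy as np

A = np.array([[1, 1], [2, 1]])
b = np.array([[1], [0]])
x = np.linalg.solve(A, b)

print(x)                        # x1 = -1, x2 = 2, as a 2*1 column
print(np.allclose(A @ x, b))    # True: the solution satisfies Ax = b
```

np.allclose is used instead of == because the solution is computed in floating point.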

4.2 Finding the line of best fit:

Linear regression finds the “line of best fit” by minimizing the residual sum of squares.

If we have n datapoints $\{(x_1, y_1), \dots, (x_n, y_n)\}$, the objective function takes the form $\sum_{i=1}^n (y_i - f(x_i))^2$, where $f(x_i) = \theta_0 + \theta_1 x_{i,1} + \dots + \theta_d x_{i,d}$ for a datapoint $x_i$ with d features.

It turns out that the parameters minimizing this loss function are given by the closed-form solution $\theta = (X^T X)^{-1} X^T y$, where each row of X holds one datapoint together with a constant 1 for the intercept.

For this algorithm we recall the method of least squares in linear algebra:

For the error $E(x)=||b-Ax||^2$, find x that minimizes E, where A has full column rank and p is the projection of b onto the column space of A.

By the Pythagorean theorem:

$||Ax-p||^2+||b-p||^2=||b-Ax||^2$

Hence, for any x:

$||b-Ax||^2 \geq ||b-p||^2$

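Therefore, E is minimized if and only if x is taken such that $A x = p$. Since $b - p$ is orthogonal to the column space of A, this is equivalent to the normal equations $A^TA x = A^T b$. Since A has full column rank, $A^TA$ is invertible and the equation has a unique solution: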

$\hat{x} = (A^TA)^{-1}A^Tb$
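The derivation can be checked numerically on a small example. The three datapoints below are made up for illustration; the sketch compares the closed-form solution with np.linalg.lstsq and verifies that the residual $b - A\hat{x}$ is orthogonal to the columns of A.

```python
import numpy as np

# three points (1, 1), (2, 2), (3, 2); the second column of ones is the intercept
A = np.array([[1.0, 1.0],
              [2.0, 1.0],
              [3.0, 1.0]])
b = np.array([1.0, 2.0, 2.0])

x_hat = np.linalg.inv(A.T @ A) @ A.T @ b          # closed-form solution
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]    # built-in least squares

p = A @ x_hat            # projection of b onto the column space of A
residual = b - p

print(x_hat)                              # slope 0.5, intercept 2/3
print(np.allclose(x_hat, x_lstsq))        # True
print(np.allclose(A.T @ residual, 0))     # True: residual is orthogonal to A's columns
```

In practice np.linalg.lstsq is usually preferred over forming the inverse explicitly, since it is more robust when $A^TA$ is close to singular.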

Next, we try this out in Python.

First, generate some points:

x = np.concatenate((np.linspace(1, 5, 10).reshape(10, 1), np.ones(10).reshape(10, 1)), axis = 1)  # axis=1 joins the two columns side by side
print(x)
y = x[:,0].copy() + 2*np.random.rand(10) - 0.5
print(y)
plt.scatter(x[:,0], y)
plt.show()


Find the coefficients $\theta$:

theta = np.linalg.lstsq(x, y, rcond=None)[0]
#least-squares solution using the built-in function
print(theta)

[0.72037691 1.55604653]

or:

theta = np.linalg.inv(x.T.dot(x)).dot(x.T).dot(y)
#least-squares solution using the closed-form formula
print(theta)

This gives the same result: [0.72037691 1.55604653]

Finally, draw the line:

plt.scatter(x[:,0], y)
plt.plot(x[:,0], x[:,0]*theta[0] + theta[1])  # theta[0] is the slope, theta[1] the intercept
plt.show()
