CS131 Study Note #0
1. Getting Started with Numpy
Image recognition is, at its core, matrix computation, and Python's numpy library is the standard tool for such operations, so learning numpy is a necessary first step before working with images.
By convention, the numpy package is imported with:
import numpy as np
1.1 general matrix creation method
- An array has a fixed size: unlike a Python list, you cannot start from an empty array and grow it in place
- The general way to create an array:
y = np.array([[1,2,3,4,5], [6,7,8,9,10]])
- Read size:
y.shape
- Create a zero matrix:
np.zeros((3,3))  # create a 3x3 matrix of zeros
- Create an identity matrix:
identity = np.identity(3)
- Create an all-ones matrix:
ones = np.ones((2,2))
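The creation calls above can be collected into one runnable sketch. The last three lines are an added illustration, not part of the original note: they show that np.append returns a new array instead of growing the existing one.

```python
import numpy as np

y = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])  # 2x5 array from nested lists
zeros = np.zeros((3, 3))       # the shape is passed as one tuple
identity = np.identity(3)      # 3x3 identity matrix
ones = np.ones((2, 2))         # 2x2 matrix of ones
print(y.shape, zeros.shape, identity.shape, ones.shape)

# Arrays have a fixed size: np.append builds a new array and leaves `a` unchanged.
a = np.array([1, 2, 3])
b = np.append(a, 4)
print(a.shape, b.shape)   # (3,) (4,)
```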
1.2 The use of Broadcasting and np.mean
import numpy as np
# To shift every row of a matrix so that its mean becomes 0:
matrix = 10*np.random.rand(4,5)
row_means = matrix.mean(axis = 1).reshape((4,1))
matrix = matrix - row_means
print(matrix)
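# axis not set: average all m*n entries and return a single scalar
# axis = 0: collapse the rows, giving the mean of each column
# axis = 1: collapse the columns, giving the mean of each row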
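To make the axis rules concrete, here is a small sketch on a fixed 2x3 matrix. The keepdims=True argument is an alternative to the reshape used in the note, not something the note itself uses: it keeps the reduced axis with length 1 so the result broadcasts directly.

```python
import numpy as np

m = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # shape (2, 3)

total_mean = m.mean()             # scalar: mean of all 6 entries
col_means = m.mean(axis=0)        # shape (3,): one mean per column
row_means = m.mean(axis=1)        # shape (2,): one mean per row

# keepdims=True gives shape (2, 1), which broadcasts against (2, 3).
centered = m - m.mean(axis=1, keepdims=True)

print(total_mean)   # 3.5
print(col_means)    # [2.5 3.5 4.5]
print(row_means)    # [2. 5.]
print(centered.mean(axis=1))   # every row mean is now 0
```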
1.3 numpy.random uses
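- numpy.random.randint uses
# Takes three parameters: low, high, and size. high defaults to None; if only low is given, the range is [0, low). If high is given, the range is [low, high).
# Returns random integers from the half-open interval [low, high).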
>>> np.random.randint(2, size=10)
array([1, 0, 0, 0, 1, 1, 0, 0, 1, 0])
>>> np.random.randint(1, size=10)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
>>> np.random.randint(5, size=(2, 4))
array([[4, 0, 2, 1],
[3, 2, 2, 0]])
- numpy.random.rand uses
# Returns one sample or an array of samples drawn uniformly from [0, 1); the value 1 is excluded.
>>> np.random.rand(3,2)
array([[ 0.14022471, 0.96360618],
[ 0.37601032, 0.25528411],
[ 0.49313049, 0.94909878]])
- numpy.random.randn uses
# randn returns one sample or an array of samples drawn from the standard normal distribution.
>>> np.random.randn(2,4)
array([[ 0.27795239, -2.57882503, 0.3817649 , 1.42367345],
[-1.16724625, -0.22408299, 0.63006614, -0.41714538]])
# The standard normal distribution, also called the u distribution, is the normal distribution with mean 0 and standard deviation 1, written N(0,1).
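The stated ranges and distribution can be checked empirically. This is only a sketch: the seed and sample sizes are arbitrary choices made for repeatability, and the randn statistics are only approximately 0 and 1.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed so the run is repeatable

ints = np.random.randint(2, 5, size=1000)   # integers in [2, 5): only 2, 3, 4
unif = np.random.rand(1000)                 # floats in [0, 1)
norm = np.random.randn(100000)              # samples from N(0, 1)

print(ints.min(), ints.max())               # never below 2, never 5 or above
print(unif.min() >= 0.0, unif.max() < 1.0)  # True True
print(norm.mean(), norm.std())              # close to 0 and 1
```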
1.4 boolean masks use
- basic judgment
import numpy as np
array = np.array(range(20)).reshape((4,5))  # 4x5 matrix holding the values 0 to 19
print(array)
output = array > 10
output
#out:
array([[False, False, False, False, False],
[False, False, False, False, False],
[False, True, True, True, True],
[ True, True, True, True, True]])
array[output]
#out:
array([11, 12, 13, 14, 15, 16, 17, 18, 19])
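# Several conditions can be combined
mask = (array < 5) | (array > 15)
# The parentheses are required: | binds tighter than < and >, so the
# unparenthesized form mask = array < 5 | array > 15 raises a ValueError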
mask
#out:
array([[ True, True, True, True, True],
[False, False, False, False, False],
[False, False, False, False, False],
[False, True, True, True, True]])
- practical use
#Given a matrix, change all of the negative values to zero
matrix = 2*np.random.rand(5, 5) - 1  # random matrix, uniform on [-1, 1)
### SOLUTION ###
mask = matrix < 0
print(mask)
matrix[mask] = 0  # set every entry where mask is True to 0
print(matrix)
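For comparison, the same result can be obtained without modifying the input. The sketch below uses a small fixed matrix; np.maximum and np.where are alternatives added here, not methods from the original note.

```python
import numpy as np

matrix = np.array([[-0.5, 0.2],
                   [0.7, -0.1]])

# In-place style from the note, applied to a copy so `matrix` stays intact.
masked = matrix.copy()
masked[masked < 0] = 0

# Non-mutating alternatives that return a new array.
clipped = np.maximum(matrix, 0)             # elementwise max with 0
selected = np.where(matrix < 0, 0, matrix)  # 0 where negative, else the original value

print(masked)
print(np.array_equal(masked, clipped), np.array_equal(masked, selected))  # True True
```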
1.5 reshape usage
#when you reshape, by default the new array is filled row by row
x = np.linspace(1, 12, 6)
print(x)
#[ 1. 3.2 5.4 7.6 9.8 12. ]
x = x.reshape((3,2)) #does not reshape in place!
print(x)
#[[ 1. 3.2]
# [ 5.4 7.6]
# [ 9.8 12. ]]
print(x.reshape(-1))  # -1 tells numpy to infer that dimension automatically
#[ 1. 3.2 5.4 7.6 9.8 12. ]
print(x.reshape(2,-1))
#[[ 1. 3.2 5.4]
# [ 7.6 9.8 12. ]]
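A short sketch of the fill order and of the -1 placeholder, using integers so the layout is easy to read. The order='F' call is included only for contrast and is not used in the note. The last lines show a related point: reshape leaves x's own shape alone, but for a contiguous array like this one the result shares x's data.

```python
import numpy as np

x = np.arange(6)                    # [0 1 2 3 4 5]
a = x.reshape((3, 2))               # default: filled row by row
b = x.reshape((3, 2), order='F')    # filled column by column, for contrast
print(a)
print(b)

# -1 asks numpy to infer that dimension from the total size.
print(x.reshape(2, -1).shape)       # (2, 3)

# x keeps its shape, but `a` is a view on the same data.
a[0, 0] = 99
print(x.shape, x[0])                # (6,) 99
```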
1.6 numpy deep copy
Assigning an array to a new name with '=' in numpy does not copy any data; the new name simply refers to the same array object. For example:
array = np.linspace(1, 10, 10)
array
#out
#array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
dup = array
dup
#out
#array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
array[0] = 100
dup
#out
#array([100., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
print(id(array))
print(id(dup))
#out
#120645422176
#120645422176
After assigning with '=', array and dup have the same id, meaning both names refer to the same object, so modifying one also changes the other. To avoid this, we make a deep copy.
#using copy
import copy
array = np.linspace(1, 10, 10)
dup = copy.deepcopy(array)
# this can also be written as dup = np.copy(array) or dup = array.copy()
print(id(array))
print(id(dup))
array[0] = 100
dup
#out
#120649253152
#120664256640
#array([ 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.])
The wrong way: use slicing syntax [:]
#slicing
array = np.linspace(1, 10, 10)
dup = array[:]
print(id(array))
print(id(dup))
array[0] = 100
dup
#out
#2552119240816
#2552119240336
#[100. 2. 3. 4. 5. 6. 7. 8. 9. 10.]
Although the ids are different, the values of dup and array still change together: slicing returns a view, a new array object that shares the same underlying data as the original.
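Comparing id values does not reveal whether two arrays share data. One way to check directly is np.shares_memory, which is not used in the original note; the sketch below puts the three cases side by side.

```python
import numpy as np

array = np.linspace(1, 10, 10)

alias = array          # same object, nothing is copied
view = array[:]        # new array object, same underlying data
copied = array.copy()  # new object with its own data

print(alias is array)                    # True
print(view is array)                     # False
print(np.shares_memory(array, view))     # True
print(np.shares_memory(array, copied))   # False

array[0] = 100
print(alias[0], view[0], copied[0])      # 100.0 100.0 1.0
```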
2. Getting started with Pyplot
2.1 pyplots
import matplotlib.pyplot as plt
x = np.arange(10)**2
print(x)
plt.plot(x)
plt.show()
The output is a line plot of x (the squares 0, 1, 4, ..., 81) against the indices 0 to 9.
Of course, many details can also be added:
plt.figure(figsize = (15,15))
plt.plot(x)
plt.title("This is a graph")
plt.xlabel("this is the x label")
plt.ylabel("this is the y label")
plt.show()
2.2 Scatter plot
x = np.concatenate((np.linspace(1, 5, 10).reshape(10, 1), np.ones(10).reshape(10, 1)), axis = 1)
print(x)
y = x[:,0].copy() + 2*np.random.rand(10) - 0.5
print(y)
plt.scatter(x[:,0], y)  # scatter plot
plt.show()
3. Image reading
3.1 Basic Composition of Pictures
A color image is made up of three color layers: red, green, and blue (RGB). We can represent an image as a matrix of shape (h, w, 3), where h and w are the height and width of the picture and 3 is the number of color channels. The numbers stored in each channel's matrix are the grayscale values of that color of light, and pixels built from three such values combine into a full-color image.
The grayscale value here is not a literal "black and white" value; it is the brightness of one particular color. For example, if the layer of shape (400, 300, 1) is the red channel matrix, it stores the red grayscale values.
Each channel stores its own grayscale values, and mixing the three primary colors in the proportions given by those values produces any desired color in the picture.
To display a given pixel, its red grayscale value goes into the R channel, its green value into the G channel, and its blue value into the B channel; the three values are then blended, much like mixing paints, to produce the pixel's color.
In short, a channel holds the layer for one color (there are also special channels, such as the alpha channel, which stores transparency information), and a grayscale value is the brightness of that color.
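The (h, w, 3) layout described above can be sketched with a tiny synthetic image, so no image file is needed. The 2x3 size and the colors are arbitrary choices for illustration, with values scaled to [0, 1].

```python
import numpy as np

h, w = 2, 3
img = np.zeros((h, w, 3))      # (height, width, channels)

img[:, :, 0] = 1.0             # full red intensity at every pixel
img[0, 0] = [1.0, 1.0, 0.0]    # red plus green at one pixel gives yellow

red = img[:, :, 0]             # one intensity value per pixel
green = img[:, :, 1]

print(img.shape)     # (2, 3, 3)
print(red.shape)     # (2, 3)
print(img[0, 0])     # [1. 1. 0.]
print(green.sum())   # 1.0
```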
3.2 Code implementation of image reading
from skimage import io

def display(img):
    plt.figure(figsize = (5,5))
    plt.imshow(img)  # show the image
    plt.axis('off')  # hide the axes
    plt.show()

def load(image_path):
    # Read the image; the second parameter (as_gray) defaults to False, and True loads a grayscale image
    out = io.imread(image_path)
    # Convert to float64 and scale the pixel values to [0, 1]
    out = out.astype(np.float64) / 255
    return out

img = load('image1.jpg')
display(img)

def rgb_exclusion(image, channel):
    # Turn off one of the RGB channels by setting it to 0 in a copy
    out = image.copy()
    if channel == 'R':
        out[:, :, 0] = 0
    elif channel == 'G':
        out[:, :, 1] = 0
    elif channel == 'B':
        out[:, :, 2] = 0
    return out
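The channel exclusion can be tried without an image file. The sketch below is a compact variant of rgb_exclusion, not the note's exact code: it looks up the channel index in a dictionary, so an unknown channel name raises a KeyError instead of silently returning an unchanged copy. The gray test image is made up for illustration.

```python
import numpy as np

def rgb_exclusion(image, channel):
    """Return a copy of image with channel 'R', 'G' or 'B' set to 0."""
    index = {'R': 0, 'G': 1, 'B': 2}[channel]
    out = image.copy()
    out[:, :, index] = 0
    return out

img = np.full((2, 2, 3), 0.5)     # uniform gray test image
no_red = rgb_exclusion(img, 'R')

print(no_red[0, 0])   # red is now 0, green and blue are still 0.5
print(img[0, 0])      # the input image is unchanged
```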
Note: scikit-image is an image processing package built on scipy that handles images as numpy arrays. It is a very good digital image processing tool and worth studying further. The table below lists its submodules for reference.
Submodule | Main functions |
---|---|
io | Reading, saving, and displaying images or videos |
data | Test images and sample data |
color | Color space conversion |
filters | Image enhancement, edge detection, rank filters, automatic thresholding, etc. |
draw | Basic shape drawing on numpy arrays, including lines, rectangles, circles, and text |
transform | Geometric and other transforms, such as rotation, stretching, and the Radon transform |
morphology | Morphological operations, such as opening, closing, and skeleton extraction |
exposure | Image intensity adjustment, such as brightness adjustment and histogram equalization |
feature | Feature detection and extraction |
measure | Measurement of image properties, such as similarity or contours |
segmentation | Image segmentation |
restoration | Image restoration |
util | General-purpose utility functions |
reference
https://zhuanlan.zhihu.com/p/360220467
https://www.jianshu.com/p/be7af337ffcd
4. Linear Algebra
4.1 Solving linear equations:
For example, say we wanted to solve the linear system
$Ax = b$
A = np.array([[1, 1], [2, 1]])
b = np.array([[1], [0]])
#This function takes parameters A, b, and returns x such that Ax =b.
x = np.linalg.solve(A, b)
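A quick way to confirm the result is to substitute it back into the system. For this A and b the equations are x1 + x2 = 1 and 2*x1 + x2 = 0, whose solution is x1 = -1, x2 = 2.

```python
import numpy as np

A = np.array([[1, 1], [2, 1]])
b = np.array([[1], [0]])

x = np.linalg.solve(A, b)       # column vector of shape (2, 1)
print(x)                        # approximately [[-1.], [2.]]
print(np.allclose(A @ x, b))    # True: A x reproduces b
```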
4.2 Find the best fit line (best fit):
Linear regression finds the “line of best fit” by minimizing the residual sum of squares.
If we have n datapoints $\{(x_1, y_1), ... ,(x_n, y_n)\}$, the objective function takes the form $loss(X) = \Sigma_{i = 1}^n (y_i - f(x_i))^2$, where $f(x_i) = \theta_0 + \theta_1 x_1 + ... + \theta_n x_n$.
It turns out that the parameters minimizing the loss function are given by the closed-form solution $\theta = (X^T X)^{-1} X^T y$.
For this algorithm we recall the method of least squares in linear algebra:
For the error $E(x) = ||b - Ax||^2$, find x to minimize E, where A has full column rank and p is the projection of b onto the column space of A.
By the Pythagorean theorem (b - p is orthogonal to the column space of A, which contains Ax - p):
$||Ax - p||^2 + ||b - p||^2 = ||b - Ax||^2$
For any x:
$||b - Ax||^2 \geq ||b - p||^2$
Therefore, E is minimized if and only if x is chosen such that $Ax = p$. Since A has full column rank, this equation has a unique solution:
$\hat{x} = (A^T A)^{-1} A^T b$
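The derivation can be checked numerically. In the sketch below, the random 6x2 matrix A, the vector b, and the seed are arbitrary illustrative choices. It solves the normal equations with np.linalg.solve, which is equivalent to applying the inverse in the formula, then verifies the orthogonality of b - p and the Pythagorean identity for an arbitrary x.

```python
import numpy as np

np.random.seed(0)                 # arbitrary seed
A = np.random.randn(6, 2)         # tall random matrix, full column rank in practice
b = np.random.randn(6)

x_hat = np.linalg.solve(A.T @ A, A.T @ b)   # solves (A^T A) x = A^T b
p = A @ x_hat                               # projection of b onto the column space of A

# b - p is orthogonal to every column of A.
print(np.allclose(A.T @ (b - p), 0))        # True

# Pythagorean identity for an arbitrary x.
x = np.random.randn(2)
lhs = np.sum((A @ x - p) ** 2) + np.sum((b - p) ** 2)
rhs = np.sum((b - A @ x) ** 2)
print(np.isclose(lhs, rhs))                 # True
print(rhs >= np.sum((b - p) ** 2))          # True: no x does better than x_hat
```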
Next we use Python to try this in practice.
First, generate some points:
x = np.concatenate((np.linspace(1, 5, 10).reshape(10, 1), np.ones(10).reshape(10, 1)), axis = 1)  # axis=1 concatenates column-wise (side by side)
print(x)
y = x[:,0].copy() + 2*np.random.rand(10) - 0.5
print(y)
plt.scatter(x[:,0], y)
plt.show()
Find the coefficients $\theta$:
theta = np.linalg.lstsq(x, y, rcond=None)[0]
# least-squares solve using the built-in function
print(theta)
[0.72037691 1.55604653]
or:
theta = np.linalg.inv(x.T.dot(x)).dot(x.T).dot(y)
# solve the least-squares problem with the closed-form formula
print(theta)
Got the same result: [0.72037691 1.55604653]
Finally draw the line:
plt.scatter(x[:,0], y)
plt.plot(x[:,0], x[:,0]*theta[0] + theta[1])
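The agreement between the two solvers can be checked in code instead of by eye. The sketch below regenerates data the same way with a fixed seed (an arbitrary choice, so the numbers differ from those printed above) and adds np.polyfit as a third cross-check that the original note does not use. For degree 1 it returns [slope, intercept], matching the column order of x.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed for repeatability
x = np.column_stack((np.linspace(1, 5, 10), np.ones(10)))  # columns: [x value, 1]
y = x[:, 0] + 2 * np.random.rand(10) - 0.5

theta_lstsq = np.linalg.lstsq(x, y, rcond=None)[0]
theta_formula = np.linalg.inv(x.T @ x) @ x.T @ y
theta_polyfit = np.polyfit(x[:, 0], y, 1)                  # [slope, intercept]

print(np.allclose(theta_lstsq, theta_formula))   # True
print(np.allclose(theta_lstsq, theta_polyfit))   # True

# With an intercept column, the residuals of the fitted line sum to about 0.
residuals = y - x @ theta_lstsq
print(abs(residuals.sum()) < 1e-9)               # True
```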