Deep learning: computer vision tasks

Table of contents

computer vision tasks

1. K nearest neighbor algorithm

2. Score function

3. The role of loss function

4. The overall forward propagation process

5. Backpropagation calculation method

computer vision tasks

The machine learning process:

  1. Data collection

  2. Feature engineering

  3. Modeling

  4. Evaluation and application

Computer vision:

Image representation: to a computer, an image is a three-dimensional array, where each pixel value ranges from 0 to 255.

Challenges in computer vision: changes in illumination and viewing angle, shape deformation, partial occlusion, and blending with the background.

1. K nearest neighbor algorithm

The k-nearest neighbor (KNN) classification algorithm is theoretically mature and one of the simplest machine learning algorithms. Its idea: in the feature space, if most of the k samples nearest to a given sample (i.e., its nearest neighbors in feature space) belong to a certain category, then that sample also belongs to this category.

KNN calculation process:

  1. Compute the distance from every point in the known-label dataset to the current point

  2. Sort by distance

  3. Select the K points closest to the current point

  4. Determine the frequency of each category among these K points

  5. Return the most frequent category among the K points as the prediction for the current point
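The five steps above can be sketched in Python with NumPy. This is an illustrative sketch on toy 2-D data (the data and function name are my own, not from the original post):

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # 1. Compute the distance from every known point to the current point
    dists = np.sqrt(((X_train - x) ** 2).sum(axis=1))
    # 2-3. Sort by distance and take the k closest points
    nearest = np.argsort(dists)[:k]
    # 4-5. Count category frequencies and return the most frequent one
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.8]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.2, 0.1])))  # class 0
print(knn_predict(X_train, y_train, np.array([5.1, 5.1])))  # class 1
```

Note that KNN has no training phase at all: the whole dataset is kept, and all work happens at prediction time.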

Database sample: CIFAR-10

Introduction to the database:

10 class labels, 50,000 training images, 10,000 test images, each of size 32×32

Computing the distance between two images is essentially matrix subtraction: subtract corresponding pixels and aggregate the differences.
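For instance, the L1 distance sums the absolute pixel-wise differences (tiny 2×2 "images" here for illustration):

```python
import numpy as np

# L1 distance between two "images": elementwise matrix subtraction,
# absolute value, then a sum over all pixels.
img1 = np.array([[56, 32], [10, 200]], dtype=np.int32)
img2 = np.array([[10, 20], [24, 170]], dtype=np.int32)
l1 = np.abs(img1 - img2).sum()
print(l1)  # 46 + 12 + 14 + 30 = 102
```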

Limitations of k-nearest neighbors: it is poorly suited to image classification, because the background tends to dominate the distance, while what we care about is the subject (the main object).

2. Score function

The score function computes a category score for each input. A score alone cannot tell us how good the classification is; a loss function is needed to evaluate the classification quality.

Linear function: a mapping from input to output

f(x, W) = Wx

The score function describes the score under given conditions and is commonly used for scoring and evaluation. It usually consists of multiple parameters, each representing an influencing factor; the final score is a weighted combination of these parameters.
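A minimal sketch of the linear score function f(x, W) = Wx: each row of W scores one category, and a bias b is typically added. The numbers below are illustrative, not from the original post:

```python
import numpy as np

num_classes, num_pixels = 3, 4          # e.g. 3 categories, a 4-pixel "image"
W = np.array([[0.2, -0.5, 0.1, 2.0],
              [1.5, 1.3, 2.1, 0.0],
              [0.0, 0.25, 0.2, -0.3]])  # one row of weights per category
x = np.array([56.0, 231.0, 24.0, 2.0])  # flattened pixel values
b = np.array([1.1, 3.2, -1.2])          # per-category bias

scores = W @ x + b                       # one score per category
print(scores)  # [-96.8, 437.9, 60.75]
```

The category with the highest score (here, the second one) is the predicted class.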

3. The role of loss function

The loss function is a function that maps random events or their related random variables to non-negative real numbers.

In machine learning, the loss function measures the gap between the model's predictions and the true values, and smaller is usually better. For example, in regression problems you can use mean squared error (MSE) or mean absolute error (MAE) as the loss function; in classification problems you can use cross entropy (CrossEntropy), binary cross entropy (BCELoss), and so on.
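These three losses can each be computed in a couple of lines (the sample predictions and targets are made up for illustration):

```python
import numpy as np

# Regression example: targets vs. predictions
y_true = np.array([3.0, -0.5, 2.0])
y_pred = np.array([2.5, 0.0, 2.0])

mse = ((y_true - y_pred) ** 2).mean()   # mean squared error
mae = np.abs(y_true - y_pred).mean()    # mean absolute error

# Classification example: true class is 0, p holds predicted probabilities
p = np.array([0.7, 0.2, 0.1])
cross_entropy = -np.log(p[0])           # cross entropy for the true class

print(mse, mae, cross_entropy)
```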

The weight matrix is obtained as the result of optimization.

The role of the neural network is to handle the problem at hand through appropriate weight matrices Wi.


What differs between tasks is mainly the choice of loss function.

There are actually many loss functions, and what we need is a function form that is closest to reality.

Loss function :

The 1 here is a margin term, essentially a rough approximate estimate.
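The loss in question is presumably the multiclass SVM (hinge) loss, where the 1 is the margin added to each incorrect class score:

```latex
L_i = \sum_{j \neq y_i} \max\left(0,\; s_j - s_{y_i} + 1\right)
```

Here $s_j$ is the score of class $j$ and $y_i$ is the correct class of sample $i$; a term contributes to the loss only when a wrong class scores within the margin of the correct one.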


Although the two models have the same loss value, model A focuses on a local region while model B considers the input as a whole; the two emphases differ, yet the loss values come out exactly the same.

Loss function = data loss + regularization penalty (λR(W))

We always want the model not to be overly complex: an overfitted model is of little use.
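The regularization penalty is what breaks the tie between models A and B above. A minimal sketch with an L2 penalty R(W) = Σ W² (the weight matrices and λ are made up for illustration):

```python
import numpy as np

def total_loss(data_loss, W, lam=0.1):
    """Loss = data loss + lam * R(W), with L2 penalty R(W) = sum of W^2."""
    return data_loss + lam * np.sum(W ** 2)

W_simple = np.array([[0.25, 0.25, 0.25, 0.25]])  # spreads weight evenly
W_spiky  = np.array([[1.0, 0.0, 0.0, 0.0]])      # relies on a single feature

# Both produce the same data loss on x = [1, 1, 1, 1] (W @ x = 1 either way),
# but the regularization penalty prefers the simpler, spread-out weights:
print(total_loss(1.0, W_simple))  # 1.0 + 0.1 * 0.25 = 1.025
print(total_loss(1.0, W_spiky))   # 1.0 + 0.1 * 1.0  = 1.1
```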


4. The overall forward propagation process

The forward propagation algorithm, as the name suggests, proceeds from front to back: inputs are passed through the network layer by layer to produce the outputs.

Softmax classifier

Now we get a score for each input, but a probability would be far more useful!

How do we convert a score into a probability value?

This has something in common with mathematical modeling: dividing by a suitable normalizing quantity often yields a probability value.


Normalize the scores, then compute the loss value.
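The Softmax classifier does exactly this: exponentiate the scores, divide by their sum to normalize, and take the negative log-probability of the correct class as the loss. A minimal sketch (the score values are illustrative):

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into probabilities (shift by max for numerical stability)."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

scores = np.array([3.2, 5.1, -1.7])   # class scores from the score function
probs = softmax(scores)               # normalized: sums to 1
loss = -np.log(probs[0])              # cross-entropy loss if class 0 is correct
print(probs, loss)
```

Exponentiation keeps every probability positive, and the division makes them sum to 1, so the scores become a valid probability distribution.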

Forward propagation:


5. Backpropagation calculation method

As an example:


Its function is f(x, y, z) = (x + y)z; with the intermediate variable q = x + y, we have f = q·z.


The values we want: the partial derivatives of f with respect to x, y, and z (∂f/∂x, ∂f/∂y, ∂f/∂z).

This is the chain rule from calculus: the gradient is propagated backward step by step.


The green values are the results of the forward propagation computed earlier; the red values carry the gradient backward into the previous layer during backpropagation.
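The chain-rule computation for f(x, y, z) = (x + y)z can be written out directly (assuming the commonly used example values x = -2, y = 5, z = -4; the original figure's numbers may differ):

```python
# f(x, y, z) = (x + y) * z
x, y, z = -2.0, 5.0, -4.0

# Forward pass (the "green" values)
q = x + y            # q = 3
f = q * z            # f = -12

# Backward pass (the "red" values), via the chain rule
df_dq = z            # ∂f/∂q = z
df_dz = q            # ∂f/∂z = q
df_dx = df_dq * 1.0  # ∂f/∂x = ∂f/∂q · ∂q/∂x = z · 1
df_dy = df_dq * 1.0  # ∂f/∂y = ∂f/∂q · ∂q/∂y = z · 1

print(f, df_dx, df_dy, df_dz)  # -12.0 -4.0 -4.0 3.0
```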

The backpropagation algorithm, or BP algorithm for short, is a learning algorithm for multi-layer neural networks based on gradient descent. The input-output relationship of a BP network is essentially a mapping: a BP network with n inputs and m outputs implements a continuous, highly non-linear mapping from n-dimensional Euclidean space to a finite region of m-dimensional Euclidean space. Its information-processing power comes from composing simple non-linear functions many times, which gives it a strong capacity to approximate functions; this is the basis for applying the BP algorithm.
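At its smallest, "BP based on gradient descent" is just: forward pass, chain-rule gradient, weight update. A one-parameter sketch (all values made up for illustration):

```python
# Tiny one-parameter "network" f(x) = w * x with squared-error loss
# L = (w*x - t)^2, trained for a single gradient-descent step.
w, x, t, lr = 0.5, 2.0, 3.0, 0.1

pred = w * x                 # forward pass
loss = (pred - t) ** 2       # loss = (1.0 - 3.0)^2 = 4.0

grad_w = 2 * (pred - t) * x  # backward pass: dL/dw via the chain rule
w -= lr * grad_w             # gradient descent update: 0.5 - 0.1*(-8) = 1.3

print(loss, w)  # 4.0 1.3
```

A real BP network repeats exactly this over many layers and many steps, with matrices W_i in place of the single scalar w.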

 


Origin blog.csdn.net/Williamtym/article/details/132131895