"Vernacular Deep Learning and Tensorflow" Reading Notes 01

Table of Contents

Chapter 1 What is Machine Learning

1.1 Clustering

1.2 Regression

1.3 Classification


While getting started with deep learning, I have always relied on video tutorials. In fact, this is not a sound approach. Most teaching videos on deep learning are presented by well-known PhDs, and they are still hard to follow: the gap in background is simply too big, so it is difficult to learn much from them. Books and papers are the right way to learn, but most papers are written by leading researchers abroad, which demands a certain level of English. And as an engineering student (I do not plan to go into academia), I also need to combine reading with hands-on practice to get twice the result with half the effort.

I am currently a first-year graduate student. I used to work with Java, but there is no way around it: artificial intelligence is the trend of this era, and whichever industry I end up working in, I will need to understand it. To briefly relate the current buzzwords: big data > artificial intelligence > machine learning > deep learning.

So this month I started reading "Vernacular Deep Learning and Tensorflow", and I am sharing my reading notes on this blog, both as my own study record and for everyone else. The PDF of the book is at this link: https://pan.baidu.com/s/1-6HWhjMif4Klb_rigXqP4g (extraction code: w80n). I spent a long time choosing this book; I would say it is the easiest one to get started with. Other books, such as "Principles of Deep Learning and Tensorflow", are actually good but not suited to beginners. Since this is also my first time through the book, I will read it twice and write up my notes on the second pass, so that the quality of the posts is higher.

Chapter 1 What is Machine Learning

To understand machine learning, first think back to the undergraduate courses on C / Java / C++ and on data structures and algorithms. Remember the first program you wrote as an undergraduate, hello world? It is shown below. This code is very simple, and it is not machine learning: every step is executed exactly as I wrote it. In other words, the machine goes through no learning process. With that understood, the next example illustrates what machine learning is.

public class HelloWorld {
    // Every step is spelled out by the programmer; nothing is learned.
    public static void main(String[] args) {
        System.out.println("HELLO WORLD!");
    }
}

Requirement: build a spam classification system that divides mail into ordinary mail and spam.

Non-machine-learning approach: write if / else code to do the job. For example, if the email contains wording such as "recharge 8888 to win the jackpot", it is judged to be spam.
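To make the hand-written approach concrete, here is a minimal sketch in Java. The class name, the phrase list and the sample messages are my own assumptions for illustration; the point is only that the rule is fixed by the programmer and never changes unless someone edits the code.

```java
public class RuleBasedSpamFilter {
    // Hand-written rule: the phrase list is fixed by the programmer, not learned.
    private static final String[] SPAM_PHRASES = {
        "recharge 8888 to win the jackpot"
    };

    public static boolean isSpam(String mail) {
        String text = mail.toLowerCase();
        for (String phrase : SPAM_PHRASES) {
            if (text.contains(phrase)) {
                return true;   // if the mail contains a listed phrase: spam
            }
        }
        return false;          // else: ordinary mail
    }

    public static void main(String[] args) {
        System.out.println(isSpam("Recharge 8888 to win the jackpot today!"));
        System.out.println(isSpam("Meeting moved to 3 pm"));
    }
}
```

Any spam that uses wording outside the list slips through, which is exactly the weakness the machine learning approach tries to address.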

Machine learning approach: let us call this system a classifier for now, which is a more fitting name. We give it a large amount (10 GB or more) of spam, each message containing characteristic wording, and we tell the classifier that these messages are spam. These messages are called training samples. The classifier works out the rules for classifying spam from the characteristics of these messages; this process is called training. How the training is done involves many different algorithms, but most of them are methods of statistical induction. Once the rules have been worked out, we give the classifier some more "ordinary mail" and "spam" to check whether it classifies them correctly. This process is called validation / testing.
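The train-then-test flow can be sketched roughly as follows. This is a toy illustration under my own assumptions, not the algorithm from the book: the "rule" it learns is simply the set of words that appear in spam samples but never in ordinary samples, and the class name and sample messages are made up. Note that, unlike the description above, this sketch also uses labeled ordinary mail during training.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TinySpamClassifier {
    // The learned "rule": words seen in spam samples but never in ordinary samples.
    private final Set<String> spamWords = new HashSet<>();

    // Training: derive the rule from labeled samples instead of hand-coding it.
    public void train(List<String> spamSamples, List<String> ordinarySamples) {
        Set<String> ordinaryWords = new HashSet<>();
        for (String mail : ordinarySamples) {
            ordinaryWords.addAll(tokenize(mail));
        }
        for (String mail : spamSamples) {
            for (String word : tokenize(mail)) {
                if (!ordinaryWords.contains(word)) {
                    spamWords.add(word);
                }
            }
        }
    }

    public boolean isSpam(String mail) {
        for (String word : tokenize(mail)) {
            if (spamWords.contains(word)) {
                return true;
            }
        }
        return false;
    }

    // Testing: fraction of held-out labeled mails that are classified correctly.
    public double accuracy(List<String> mails, List<Boolean> isSpamLabels) {
        int correct = 0;
        for (int i = 0; i < mails.size(); i++) {
            if (isSpam(mails.get(i)) == isSpamLabels.get(i)) {
                correct++;
            }
        }
        return (double) correct / mails.size();
    }

    // Lower-case the mail and split it into a set of words.
    private static Set<String> tokenize(String mail) {
        Set<String> words = new HashSet<>();
        for (String word : mail.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        TinySpamClassifier classifier = new TinySpamClassifier();
        classifier.train(
            Arrays.asList("recharge 8888 to win the jackpot", "win a free prize now"),
            Arrays.asList("meeting at 3 pm to review the report", "lunch now?"));
        double acc = classifier.accuracy(
            Arrays.asList("win the jackpot today", "review the report at lunch"),
            Arrays.asList(true, false));
        System.out.println("test accuracy = " + acc);
    }
}
```

A real classifier would use far more data and a proper statistical model; the sketch only shows where training ends and testing begins.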

Looking back at the above: the training data we supply tells the machine that this is spam. This kind of machine learning is called supervised learning. There is another kind called unsupervised learning, in which the training data is not labeled. Taking the same example, we give both "ordinary mail" and "spam" as training samples without telling the machine which kind each message is. After training, the machine separates ordinary mail from spam on its own. Of course, a few emails may end up in the wrong group; these mistakes are called errors.

1.1 Clustering

Clustering is a typical form of "unsupervised learning": the process of grouping a collection of physical or abstract objects into classes made up of similar objects. Clustering is actually not complicated. Here is a simple example. As shown in the figure below, when you were a kid the teacher told you: this is a monkey. Later, at the zoo, you saw thousands of monkeys, each one different. Watching Journey to the West, you saw even more monkeys, and those could even talk. You still put all of them in the category "monkey", even though none of them is identical to the one the teacher showed you. This process is clustering. In other words, we complete the clustering process without even noticing.

Humans are born with this ability to generalize and summarize. We can group things we perceive as similar into one class. The members may differ from one another, but we have a "limit" in our minds: as long as something falls within this limit, small differences in its features do no harm, and it still counts as that kind of thing. That is why we put the monkeys in the zoo in the same class as the monkeys in our childhood books, but never classify a snake as a monkey.

The most commonly used clustering algorithm is k-means. The basic idea is to use the distance between vectors, either the Euclidean distance or the Manhattan distance in space, to judge whether they belong to the same category. As shown in the figure below, A and B are the types of monkeys we already know, and C is a new monkey. Which category does C belong to? Judging by distance, C should be in the same category as B, because it is closer to B. This is the basic idea of the clustering algorithm.
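The distance comparison described above can be sketched in a few lines of Java. The coordinates of A, B and C are hypothetical values I picked so that C lies near B. Also note that this shows only the "assign to the nearest category" step; full k-means additionally recomputes the category centers and repeats the assignment until it stabilizes.

```java
public class NearestCategory {
    // Straight-line distance between two points.
    public static double euclidean(double[] p, double[] q) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            double d = p[i] - q[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Sum of the absolute differences along each axis.
    public static double manhattan(double[] p, double[] q) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            sum += Math.abs(p[i] - q[i]);
        }
        return sum;
    }

    // Index of the center closest to the point, using Euclidean distance.
    public static int nearest(double[] point, double[][] centers) {
        int best = 0;
        for (int i = 1; i < centers.length; i++) {
            if (euclidean(point, centers[i]) < euclidean(point, centers[best])) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] a = {1.0, 1.0};   // known category A (hypothetical position)
        double[] b = {5.0, 5.0};   // known category B (hypothetical position)
        double[] c = {4.0, 6.0};   // the new monkey C
        String[] names = {"A", "B"};
        int index = nearest(c, new double[][] {a, b});
        System.out.println("C joins category " + names[index]);
        System.out.println("Manhattan distance C-A: " + manhattan(c, a));
        System.out.println("Manhattan distance C-B: " + manhattan(c, b));
    }
}
```

With these made-up points both distance measures agree that C is closer to B; on other data the two measures can disagree, so the choice of distance is part of the design.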

1.2 Regression

Regression is a process of "inferring the cause from the effect"; it is an inductive way of thinking. Once we have a large number of samples, we work backwards to guess what relationship exists among them. This is the process of regression. In the field of machine learning there are two main types of regression: linear regression and nonlinear regression.

In linear regression, when observing and generalizing from the samples, we assume a linear relationship between the input vector and the final function value, and then write the relationship as

y = wx + b

Here w and x are 1 × n and n × 1 matrices, respectively, and wx is the inner product of the two. As a concrete example, take a patient's body indices: if the body indices and the blood glucose value are linearly related, then we can write

y = w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + b

The unknowns in this equation are w1 to w5 and b. We need to solve for these values, and they should be the most suitable ones. What does "suitable" mean? It means that the blood glucose value this equation computes from these w and b is close to the real blood glucose value. So how far apart are they? This gap is measured by the Loss function. Looking at this function: wx + b is the predicted value and y is the real value, and the summation sign in front adds up the gap over all samples. The smaller the Loss, the better.
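As a rough sketch of the prediction wx + b and the Loss described above: I assume here that the gap for each sample is the absolute difference |wx + b - y| (the squared difference is another common choice), and every number in the example is made up purely so the arithmetic is easy to check.

```java
public class LinearRegressionLoss {
    // Predicted value: inner product of w and x, plus b.
    public static double predict(double[] w, double[] x, double b) {
        double sum = b;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * x[i];
        }
        return sum;
    }

    // Loss: sum over all samples of |predicted - real| (assumed form of the gap).
    public static double loss(double[] w, double b, double[][] xs, double[] ys) {
        double total = 0.0;
        for (int i = 0; i < xs.length; i++) {
            total += Math.abs(predict(w, xs[i], b) - ys[i]);
        }
        return total;
    }

    public static void main(String[] args) {
        double[] w = {1.0, 2.0, 3.0, 4.0, 5.0};   // hypothetical w1..w5
        double b = 10.0;                           // hypothetical b
        double[][] xs = {
            {1.0, 1.0, 1.0, 1.0, 1.0},             // predicted value 25
            {0.0, 0.0, 0.0, 0.0, 2.0},             // predicted value 20
        };
        double[] ys = {20.0, 23.0};                // made-up "real" values
        // |25 - 20| + |20 - 23| = 8
        System.out.println("Loss = " + loss(w, b, xs, ys));
    }
}
```

Training then amounts to searching for the w and b that make this number as small as possible.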

The other type of regression problem is nonlinear regression. A representative algorithm is logistic regression, which looks more like a solution to a classification problem. The difference from the above is this: linear regression computes a specific value for y. There y represents blood glucose, which might be 80 or 70 or so. In logistic regression, by contrast, the value of y is generally 0 or 1. As shown in the following figure, the linear expression wx + b is in fact placed in the exponent of e in the denominator:

y = 1 / (1 + e^-(wx + b))

Let z = wx + b, and you get the following:

y = 1 / (1 + e^-z)

The graph of this function looks like this

Analyzing this graph shows what logistic regression is doing. The horizontal axis is z and the vertical axis is y. As z goes from negative infinity to positive infinity, y only varies between 0 and 1. This value of y can be understood as "yes" (close to 1) or "no" (close to 0).
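The behavior described above is easy to check numerically. The sketch below uses the standard logistic function y = 1 / (1 + e^-z); the 0.5 cut-off for turning y into "yes" or "no" is a common convention that I am assuming here, not something stated in the notes.

```java
public class LogisticFunction {
    // y = 1 / (1 + e^-z): maps any real z into the open interval (0, 1).
    public static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Turn y into a 0 / 1 answer with an assumed threshold of 0.5.
    public static int classify(double z) {
        return sigmoid(z) >= 0.5 ? 1 : 0;
    }

    public static void main(String[] args) {
        double[] zs = {-10.0, -1.0, 0.0, 1.0, 10.0};
        for (double z : zs) {
            System.out.println("z = " + z + "  y = " + sigmoid(z)
                + "  class = " + classify(z));
        }
    }
}
```

Large negative z gives y close to 0, large positive z gives y close to 1, and z = 0 sits exactly in the middle at 0.5.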

1.3 Classification

Classification is the most widely used kind of algorithm in machine learning. A classification algorithm is usually called a "classifier". It works like a black box: we drop a sample in at the entrance, and the exit returns the category of that sample. For example, we put a picture of a monkey in at the entrance, and the word "monkey" comes out at the exit.

A freshly initialized classifier cannot do this. Only after we feed it many labeled pictures and it generalizes from them by itself does it gain this ability.

The classification problem involves several concepts, explained here with the example from the book. Suppose the training set has 1000 samples, that is, 1000 pictures: 200 are cats, 200 are dogs and 600 are rabbits, 3 categories in total. We label these pictures manually:

Cat ---- "0"

Dog ---- "1"

Rabbit ---- "2"

The book mentions one-hot encoding here. Briefly: the final result of the classifier tells us which category a picture belongs to, and with one-hot encoding the output looks like this:

(0, 1); "Cat": 1

(1, 0); "Dog": 0

(2, 0); "Rabbit": 0
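As a minimal sketch of the output listed above: given a label number and the number of categories, one-hot encoding produces a vector with a single 1 at the position of the label. The class and method names are my own; the labels 0, 1, 2 follow the cat / dog / rabbit numbering used in these notes.

```java
import java.util.Arrays;

public class OneHotDemo {
    // Vector of length numClasses with a 1 at index label and 0 elsewhere.
    public static int[] oneHot(int label, int numClasses) {
        int[] vector = new int[numClasses];   // Java initializes every entry to 0
        vector[label] = 1;
        return vector;
    }

    public static void main(String[] args) {
        String[] names = {"Cat", "Dog", "Rabbit"};
        for (int label = 0; label < names.length; label++) {
            System.out.println(names[label] + " -> "
                + Arrays.toString(oneHot(label, names.length)));
        }
        // The first line printed is: Cat -> [1, 0, 0]
    }
}
```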

In case that is unclear: cat, dog and rabbit correspond to 0, 1 and 2, and the values that follow them are 1, 0 and 0. This means the result is a cat, because the cat entry is 1 and the others are 0.

To continue: of the 200 cat pictures, 180 are correctly identified as cats and the remaining 20 are misjudged as dogs; all 200 dog pictures are correctly identified; of the 600 rabbit pictures, 550 are correctly identified, 30 are misjudged as cats and 20 as dogs. Mistakes like these are normal in machine learning.
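The counts in this example can be laid out as a table of actual class against predicted class (commonly called a confusion matrix, a term these notes do not use). The sketch below, with names of my own choosing, computes two ratios from that table: the share of a class's pictures that were found, and the share of pictures given a label that really belong to it. These are the recall and precision discussed next.

```java
public class RecallPrecision {
    // Rows: actual class, columns: predicted class, in the order cat, dog, rabbit.
    public static final int[][] COUNTS = {
        {180,  20,   0},   // 200 cat pictures
        {  0, 200,   0},   // 200 dog pictures
        { 30,  20, 550},   // 600 rabbit pictures
    };

    // Recall: of all pictures that really are class c, the share labeled c.
    public static double recall(int[][] m, int c) {
        int rowSum = 0;
        for (int predicted = 0; predicted < m[c].length; predicted++) {
            rowSum += m[c][predicted];
        }
        return (double) m[c][c] / rowSum;
    }

    // Precision: of all pictures labeled c, the share that really are class c.
    public static double precision(int[][] m, int c) {
        int colSum = 0;
        for (int actual = 0; actual < m.length; actual++) {
            colSum += m[actual][c];
        }
        return (double) m[c][c] / colSum;
    }

    public static void main(String[] args) {
        System.out.printf("cat recall    = %.3f%n", recall(COUNTS, 0));    // 180 / 200
        System.out.printf("dog precision = %.3f%n", precision(COUNTS, 1)); // 200 / 240
    }
}
```

Reading the table by rows gives recall and reading it by columns gives precision, which is why the two numbers for the same class can differ.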

Now two concepts come into play: recall and precision. Cats: 180 of the 200 cat pictures are identified correctly, so the recall for cats is 180/200 = 90%. Dogs: 240 pictures are retrieved as dogs, of which 200 really are dogs and 40 are misjudged, so the precision for dogs is 200/240 = 83.3%.

The following is the general routine of machine learning. Machine learning problems are mostly classification or regression, and the general steps for solving this type of problem are shown in the following figure:


 

 

 


Origin blog.csdn.net/Haidaiya/article/details/85632794