We often come across terms such as SVM (support vector machine), Bayes, and k-nearest neighbors. These are all classifiers. Look any of them up and you are met with a wall of mathematical formulas, which is immediately discouraging for those of us who are not strong in math. Below is a brief account of my own understanding.
The textbook definition: in machine learning, a classifier uses training data whose categories have already been labeled to decide which category a new observation belongs to.
What does that mean? Let's start with a simple method, k-nearest neighbors, which is the KNN algorithm.
The principle is very simple: take a point, find the K points nearest to it, and predict that the point belongs to whichever category is most common among those neighbors.
We need to find the category of the blue star, which belongs either to the red class or to the green class. The K in KNN refers to the K nearest neighbors that vote on a given point: the category with the most votes among those K neighbors is taken as the category of the point. In this example we set K to 3, so we draw a circle around the blue star that encloses its K = 3 nearest points.
We see that the blue star's three nearest neighbors are all in the red class, so we can say that the blue star belongs to the red class.
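The blue star example can be sketched in a few lines of Python. This is only a minimal illustration of the voting idea, not a reference implementation: the function name `knn_predict`, the use of Euclidean distance, and the coordinates are all assumptions made up for this sketch.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Distance from the query to every training point (Euclidean distance is an assumption).
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance and keep the labels of the k closest points.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among the k neighbors wins the vote.
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up coordinates: three red points and three green points.
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
train_labels = ["red", "red", "red", "green", "green", "green"]

blue_star = (2.0, 2.0)
prediction = knn_predict(train_points, train_labels, blue_star, k=3)
print(prediction)  # prints "red"
```

With these made-up points, the three nearest neighbors of the blue star are all red, so the vote is unanimous. When the neighbors disagree, the majority label is returned, and a different K can give a different answer.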
That is the KNN algorithm, and through it we can understand the concept of classification.
1. Classification assigns unknown data to categories based on existing data, which means we need a database of existing data.
2. That existing database, which the classification algorithms work from, is what we call the training sample; we need to know the category of every item in it.
3. How is an unknown sample classified? Starting from the known training samples and their features, a mathematical formula is used to assign the unknown sample to a category.
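The three points above apply to any classifier, not only KNN. As a rough sketch, here is the same workflow with a different formula swapped in: a nearest-centroid rule, which assigns an unknown sample to the category whose average training point is closest. The rule itself, the helper names `fit_centroids` and `classify`, and the data are assumptions chosen for illustration; they are not taken from the text above.

```python
import math
from collections import defaultdict


def fit_centroids(samples, labels):
    """Summarize the labeled training samples as one mean point per category."""
    grouped = defaultdict(list)
    for sample, label in zip(samples, labels):
        grouped[label].append(sample)
    # Average each coordinate separately to get the centroid of each category.
    return {
        label: tuple(sum(coords) / len(coords) for coords in zip(*points))
        for label, points in grouped.items()
    }


def classify(centroids, unknown):
    """Assign the unknown sample to the category with the nearest centroid."""
    return min(centroids, key=lambda label: math.dist(centroids[label], unknown))


# Points 1 and 2: existing data with known categories (made-up training samples).
samples = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (6.0, 6.0), (7.0, 6.0), (6.0, 7.0)]
labels = ["red", "red", "red", "green", "green", "green"]

# Point 3: use the features of the training samples to classify an unknown sample.
centroids = fit_centroids(samples, labels)
result = classify(centroids, (2.0, 2.0))
print(result)  # prints "red"
```

The structure is the same as with KNN: labeled training samples go in, and a distance-based formula decides the category of the new point. Only the formula differs.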
With the basic concept of a classifier clear, refer to the following blog post to go deeper: