Course address: http://www.auto-mooc.com/mooc/detail?mooc_id=BA91C867A68E92651FBF224828ECAE6E&major_id=E1007D8658541BD264785AA3709ADA25
This is a note!
1.0 Basic data algorithms
1.1 Clustering algorithms
Class: a set of similar elements.
Classification
The categories are defined in advance and their number is fixed. Labels are assigned according to some criteria, and samples are then classified by those labels.
Clustering
There are no predefined categories, and the number of classes is not known in advance. No manual annotation or pre-trained classifier is needed; the categories are generated automatically during the clustering process.
K-means clustering algorithm
Steps:
1. Randomly choose the initial centroids (Fig. b).
2. Compute the distance from each sample to every centroid.
3. Assign each sample to the cluster of its nearest centroid (Fig. c).
4. Recompute the centroid of each cluster (Fig. d).
5. Go back to step 2 and repeat until the assignments stop changing.
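The steps above can be sketched in plain Python. This is only a minimal illustration, not the course's implementation: the function names (kmeans, squared_distance, mean_point), the max_iter and seed parameters, and the choice to stop when the assignments no longer change are all my own assumptions.

```python
import random


def squared_distance(p, q):
    # Squared Euclidean distance; enough for comparing which centroid is nearer.
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    # Coordinate-wise mean of a non-empty list of points.
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(samples, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1: pick k distinct samples as the initial centroids.
    centroids = rng.sample(samples, k)
    labels = None
    for _ in range(max_iter):
        # Steps 2-3: assign every sample to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(s, centroids[j]))
            for s in samples
        ]
        # Step 5: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: recompute each centroid as the mean of its cluster.
        for j in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = mean_point(members)
    return centroids, labels
```

For example, kmeans([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], k=2) separates the three points near the origin from the three points near (10, 10). Which cluster gets index 0 depends on the random initial centroids.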
SOM clustering
The difference between KNN and K-means
Reference: https://www.tuicool.com/articles/qamYZv
The KNN algorithm works like this:
In the figure above, the data set is already labeled: one class is the blue squares and the other is the red triangles. The green circle is the point we want to classify.
If K = 3, the three points nearest to the green point are two red triangles and one blue square. These three points vote, so the green point is classified as a red triangle.
If K = 5, the five points nearest to the green point are two red triangles and three blue squares. These five points vote, so the green point is classified as a blue square.
As we can see, KNN is essentially a statistics-based method. In fact, many machine learning algorithms are based on statistics.
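The voting described above can be sketched in a few lines of Python. The function name knn_classify and the coordinates below are made up for illustration (the original figure gives no numbers); they are only arranged so that the K = 3 and K = 5 neighbourhoods match the description.

```python
from collections import Counter


def knn_classify(labeled_points, query, k):
    # Sort the labeled points by squared Euclidean distance to the query point.
    by_distance = sorted(
        labeled_points,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    # The k nearest points each cast one vote for their own label.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical coordinates, with the green point at the origin.
labeled_points = [
    ((1.0, 0.0), "red triangle"),
    ((0.0, 1.2), "red triangle"),
    ((-1.5, 0.0), "blue square"),
    ((0.0, -2.0), "blue square"),
    ((2.2, 0.0), "blue square"),
    ((3.0, 3.0), "red triangle"),
]
green = (0.0, 0.0)

print(knn_classify(labeled_points, green, k=3))  # red triangle
print(knn_classify(labeled_points, green, k=5))  # blue square
```

Note the contrast with K-means: KNN needs labeled data and does no training loop, while K-means starts from unlabeled data and iterates to find the clusters.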
Clustering performance metrics
Distance calculation:
Mahalanobis distance (what is this? I will need to learn it when working on radar clustering)
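As a placeholder until I study it properly: the Mahalanobis distance of a point x from a distribution with mean m and covariance matrix S is sqrt((x - m)^T S^-1 (x - m)), so it is a Euclidean distance rescaled by the spread of the data. The sketch below is limited to 2-D points so that the matrix inverse can be written by hand; the function names covariance_2d and mahalanobis_2d are my own, and using the sample covariance (dividing by n - 1) is an assumption.

```python
import math


def covariance_2d(points):
    # Mean and sample covariance matrix (divide by n - 1) of 2-D points.
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


def mahalanobis_2d(p, mean, cov):
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    # Inverse of a 2x2 matrix: (1 / det) * [[syy, -sxy], [-sxy, sxx]].
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)
```

With the identity matrix as covariance this reduces to the ordinary Euclidean distance. A larger variance along one axis makes distances along that axis count for less.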
1.2 Dimensionality reduction algorithms
Covariance matrix (what is this? To learn later)
1.3 Regression algorithms