(1) Algorithms
1. Supervised learning: linear regression, logistic regression, neural networks, SVM.
Linear regression (in the formulas, the x_0^(i) term in the third line is actually 1 and can be omitted)
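As a rough illustration of the linear regression item, here is a minimal batch gradient descent sketch in plain Python. The function names, learning rate, iteration count, and toy data are assumptions made for this example, not taken from the notes. Each input row starts with x_0 = 1, so theta[0] plays the role of the intercept.

```python
# Batch gradient descent for linear regression (illustrative sketch).
# Every row of X starts with x_0 = 1, so theta[0] acts as the intercept.

def predict(theta, x):
    """Hypothesis h(x) = theta_0 * x_0 + theta_1 * x_1 + ..."""
    return sum(t * xi for t, xi in zip(theta, x))

def gradient_descent(X, y, alpha=0.1, iterations=2000):
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(iterations):
        errors = [predict(theta, X[i]) - y[i] for i in range(m)]
        # Simultaneous update of every theta_j.
        theta = [
            theta[j] - alpha / m * sum(errors[i] * X[i][j] for i in range(m))
            for j in range(n)
        ]
    return theta

# Toy data generated from y = 1 + 2x (hypothetical values).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = gradient_descent(X, y)
```

On this noise-free toy data, theta should approach the intercept 1 and slope 2 that generated it.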
Logistic regression
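A possible sketch of logistic regression: the sigmoid hypothesis, the cross-entropy cost, and one gradient descent step. The helper names, the learning rate, and the toy data are assumptions for illustration only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x):
    """Estimated probability that y = 1 for input x (x[0] is the constant 1)."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def cost(theta, X, y):
    """Average cross-entropy cost over the training set."""
    total = 0.0
    for xi, yi in zip(X, y):
        h = hypothesis(theta, xi)
        total += -yi * math.log(h) - (1 - yi) * math.log(1 - h)
    return total / len(X)

def gradient_step(theta, X, y, alpha):
    """One simultaneous gradient descent update of all parameters."""
    m = len(X)
    errors = [hypothesis(theta, xi) - yi for xi, yi in zip(X, y)]
    return [
        theta[j] - alpha / m * sum(e * xi[j] for e, xi in zip(errors, X))
        for j in range(len(theta))
    ]

# Toy one-feature data (hypothetical values).
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.0]]
y = [0, 0, 1, 1]
theta = [0.0, 0.0]
initial_cost = cost(theta, X, y)
for _ in range(500):
    theta = gradient_step(theta, X, y, alpha=0.1)
final_cost = cost(theta, X, y)
```

With all parameters at zero, every prediction is 0.5, so the initial cost is ln 2; the cost should fall as the updates proceed.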
Neural network (you only write the forward propagation; the framework computes backpropagation automatically)
SVM
2. Unsupervised learning: clustering (K-means), dimensionality reduction (PCA)
K-means
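K-means alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. The sketch below is one way to write that loop; the fixed starting centroids, the iteration count, and the sample points are assumptions chosen to keep the example deterministic.

```python
def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def k_means(points, centroids, iterations=10):
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for each point.
        assignments = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its points.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[k])
        centroids = new_centroids
    return centroids, assignments

# Two well-separated groups of points (hypothetical values).
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, assignments = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

In practice the starting centroids are usually picked at random from the data and the algorithm is run several times; fixed starting values are used here only for reproducibility.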
PCA
3. Anomaly detection
4. Recommender systems
(2) Strategy
1. Bias and variance, regularization
Training error minus the best human-level error measures the bias (underfitting); cross-validation error minus training error measures the variance (overfitting);
Regularization addresses the variance problem; do not regularize θ_0;
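The two ideas above can be sketched as follows: a helper that computes the bias and variance gaps from three error rates, and a regularized gradient step that leaves theta_0 out of the penalty. The function names, the linear hypothesis, and all numbers are assumptions made for illustration.

```python
def diagnose(human_error, train_error, cv_error):
    """Return (bias, variance) as gaps between error rates."""
    bias = train_error - human_error    # large gap suggests underfitting
    variance = cv_error - train_error   # large gap suggests overfitting
    return bias, variance

def regularized_gradient_step(theta, X, y, alpha, lam):
    """One gradient step for L2-regularized linear regression.

    x[0] is assumed to be the constant 1, so theta[0] is the intercept
    and is excluded from the regularization term.
    """
    m = len(X)
    errors = [sum(t * v for t, v in zip(theta, x)) - target
              for x, target in zip(X, y)]
    new_theta = []
    for j in range(len(theta)):
        grad = sum(e * x[j] for e, x in zip(errors, X)) / m
        if j > 0:  # theta_0 is not regularized
            grad += lam / m * theta[j]
        new_theta.append(theta[j] - alpha * grad)
    return new_theta

# Hypothetical error rates: 1% human, 8% training, 10% cross-validation.
bias, variance = diagnose(0.01, 0.08, 0.10)

# The same step with and without regularization (hypothetical data).
X, y, theta = [[1.0, 2.0]], [0.0], [1.0, 1.0]
plain = regularized_gradient_step(theta, X, y, alpha=0.1, lam=0.0)
shrunk = regularized_gradient_step(theta, X, y, alpha=0.1, lam=1.0)
```

In this example the bias gap (about 0.07) is larger than the variance gap (about 0.02), which would point toward reducing bias first. The two gradient steps give the same theta_0, and only theta_1 is pulled further toward zero by the penalty.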
2. Learning curve
It shows how bias and variance evolve over the whole training process, so it gives a more complete picture.
3. Error Analysis
Find which cause accounts for the largest share of the errors; that is the best place to spend time.
4. Evaluation Method
Use a single evaluation metric where possible; accuracy is not suitable for skewed classes, so judge with precision and recall instead.
Precision takes the point of view of the predictions (how many of the samples predicted positive are actually positive); recall takes the point of view of the samples (how many of the actual positive samples were predicted positive).
F1 = 2PR / (P + R), where P is precision and R is recall
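A short sketch of how precision, recall, and F1 could be computed from binary labels. The function name and the sample labels are assumptions for illustration; returning 0.0 when a denominator is zero is one common convention, not something the notes specify.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Precision: of everything predicted positive, how much is truly positive.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of everything truly positive, how much was predicted positive.
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

# Skewed toy labels: 3 positives out of 8 samples (hypothetical values).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Here there are 2 true positives, 1 false positive, and 1 false negative, so precision, recall, and F1 all come out to 2/3.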
5. Splitting the data set
The training set is used to train the model, the cross-validation set is used for model selection / parameter tuning, and the test set is used for the final evaluation.
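One way to produce the three sets is to shuffle once and slice. The 60/20/20 proportions, the fixed seed, and the function name below are assumptions for this sketch; the notes do not state a ratio.

```python
import random

def split_dataset(samples, train_frac=0.6, cv_frac=0.2, seed=0):
    """Shuffle the samples and split them into train / cross-validation / test."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = round(len(shuffled) * train_frac)
    n_cv = round(len(shuffled) * cv_frac)
    train = shuffled[:n_train]
    cv = shuffled[n_train:n_train + n_cv]
    test = shuffled[n_train + n_cv:]       # whatever remains
    return train, cv, test

# 100 dummy samples, represented by their indices.
train, cv, test = split_dataset(range(100))
```

The three slices do not overlap, so the test set is never seen during training or parameter tuning.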
6. Ceiling analysis
Assume, one step at a time, that each step's output is perfectly correct and measure how much the overall accuracy improves; the step that gives the largest improvement is the best place to spend time right away.
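The bookkeeping behind this analysis can be sketched as below. The stage names and accuracy figures are made up for illustration; each figure stands for the overall accuracy measured after that stage, and every earlier stage, has been replaced with ground-truth output.

```python
def ceiling_analysis(baseline_accuracy, accuracy_with_perfect_stage):
    """Return the stage with the largest gain, plus all per-stage gains.

    accuracy_with_perfect_stage is an ordered list of (stage, accuracy)
    pairs, in pipeline order.
    """
    gains = []
    previous = baseline_accuracy
    for stage, accuracy in accuracy_with_perfect_stage:
        gains.append((stage, accuracy - previous))
        previous = accuracy
    best = max(gains, key=lambda g: g[1])
    return best, gains

# Hypothetical measurements for a three-stage pipeline.
best, gains = ceiling_analysis(
    0.72,
    [("detection", 0.89), ("segmentation", 0.90), ("recognition", 1.00)],
)
```

With these made-up numbers, perfecting detection lifts accuracy by about 0.17, far more than segmentation (about 0.01), so detection would be the stage to work on first.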
(3) Application
1. OCR
Detection, segmentation, and recognition; nowadays the segmentation step is often skipped and the sequence is recognized directly.
2. Large-scale machine learning
Use mini-batch training and parallel computing.
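A minimal sketch of mini-batch gradient descent, using linear regression as the model. The batch size, learning rate, epoch count, seed, and toy data are all assumptions for illustration. Each update uses only a small batch, which is also what makes the per-batch sums easy to distribute across workers.

```python
import random

def minibatches(X, y, batch_size, rng):
    """Yield shuffled (inputs, targets) batches covering the data once."""
    indices = list(range(len(X)))
    rng.shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch = indices[start:start + batch_size]
        yield [X[i] for i in batch], [y[i] for i in batch]

def minibatch_gradient_descent(X, y, alpha=0.1, batch_size=2,
                               epochs=1000, seed=0):
    rng = random.Random(seed)
    theta = [0.0] * len(X[0])
    for _ in range(epochs):
        for xb, yb in minibatches(X, y, batch_size, rng):
            errors = [sum(t * v for t, v in zip(theta, x)) - target
                      for x, target in zip(xb, yb)]
            # Update from this batch only, not the full training set.
            theta = [
                theta[j] - alpha / len(xb)
                * sum(e * x[j] for e, x in zip(errors, xb))
                for j in range(len(theta))
            ]
    return theta

# Toy data generated from y = 1 + 2x (hypothetical values).
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
theta = minibatch_gradient_descent(X, y)
```

On this noise-free data the parameters should settle close to intercept 1 and slope 2; on real data the mini-batch updates would fluctuate around the minimum rather than converge exactly.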