Before modeling, the data is split into:
- Training set (training data): used to train (fit) the model
- Validation set (validation data): used during the training phase to check the quality of the model
- Test set (testing data): used after training is finished to evaluate the quality of the final model
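
The three-way split can be sketched in a few lines of plain Python. This is a minimal illustration, not a prescribed method: the function name `split_dataset`, the 70/15/15 proportions, and the fixed seed are all assumptions chosen for the example.

```python
import random

def split_dataset(samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle samples and split them into training, validation, and test sets.

    The fractions and seed are illustrative defaults, not recommended values.
    """
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test set")
    shuffled = list(samples)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for a repeatable split
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # everything left over
    return train, val, test

data = list(range(100))
train_set, val_set, test_set = split_dataset(data)
print(len(train_set), len(val_set), len(test_set))
```

Shuffling before slicing keeps each subset representative when the original data is ordered, and every sample ends up in exactly one of the three sets.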
Training methods of machine learning
- Supervised learning: the training data set is labeled
- Unsupervised learning: the data set is unlabeled, as in clustering
- Semi-supervised learning: a combination of supervised and unsupervised learning. It addresses training and classification problems where only a small amount of the data is labeled and most of it is unlabeled
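
As a rough illustration of the semi-supervised case, the sketch below uses self-training (pseudo-labeling), which is only one of several semi-supervised approaches and is chosen here as an assumption. A nearest-centroid classifier is fit on the few labeled points, used to label the unlabeled points, and then refit on everything. The helper names `centroids` and `predict` and the toy data are hypothetical.

```python
def centroids(points, labels):
    """Supervised step: compute the mean point of each labeled class."""
    sums, counts = {}, {}
    for (x, y), label in zip(points, labels):
        sx, sy = sums.get(label, (0.0, 0.0))
        sums[label] = (sx + x, sy + y)
        counts[label] = counts.get(label, 0) + 1
    return {c: (sx / counts[c], sy / counts[c]) for c, (sx, sy) in sums.items()}

def predict(model, point):
    """Assign the class whose centroid is closest to the point."""
    x, y = point
    return min(model, key=lambda c: (model[c][0] - x) ** 2 + (model[c][1] - y) ** 2)

# Toy data: a small labeled set and a larger unlabeled pool.
labeled_points = [(0.0, 0.0), (10.0, 10.0)]
labels = ["A", "B"]
unlabeled_points = [(1.0, 0.5), (0.5, 1.5), (9.0, 9.5), (10.5, 8.5)]

# 1. Train on the labeled data only.
model = centroids(labeled_points, labels)
# 2. Use that model to assign pseudo-labels to the unlabeled data.
pseudo_labels = [predict(model, p) for p in unlabeled_points]
# 3. Retrain on labeled and pseudo-labeled data together.
model = centroids(labeled_points + unlabeled_points, labels + pseudo_labels)

print(pseudo_labels)
print(predict(model, (2.0, 2.0)))
```

Step 1 on its own is plain supervised learning; the unlabeled pool only contributes through the pseudo-labels produced in step 2.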
Common applications
- Regression (predicting future trends from historical data)
- Classification (image recognition, spam classification, text classification); classification is generally supervised learning on labeled data
- Clustering (classification without labels): samples with similar properties are grouped into one class
Regression: the predicted value is continuous (for example, a rate)
Classification: the predicted value is a category, and the categories are known in advance (a sample is either class A or class B)
Clustering: the predicted value is also a category, but the categories are unknown in advance and the data has no labels
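
The difference between the three tasks shows up in what each one returns. The sketch below is a toy illustration on one-dimensional data; the function names (`fit_line`, `nearest_neighbor_label`, `two_means`), the specific algorithms (least squares, 1-nearest-neighbor, k-means with k=2), and the data are assumptions picked for brevity.

```python
from statistics import mean

def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def nearest_neighbor_label(xs, labels, query):
    """Classification: return the known label of the closest training value."""
    return min(zip(xs, labels), key=lambda pair: abs(pair[0] - query))[1]

def two_means(values, iterations=10):
    """Clustering: 1-D k-means with k=2; no labels are used at any point."""
    c0, c1 = min(values), max(values)  # initial cluster centers
    g0, g1 = list(values), []
    for _ in range(iterations):
        g0 = [v for v in values if abs(v - c0) <= abs(v - c1)]
        g1 = [v for v in values if abs(v - c0) > abs(v - c1)]
        if not g1:                     # all values fell into one group
            break
        c0, c1 = mean(g0), mean(g1)    # move centers to the group means
    return g0, g1

# Regression: the output is a continuous number.
slope, intercept = fit_line([1, 2, 3, 4], [2.0, 4.0, 6.0, 8.0])
predicted_value = slope * 5 + intercept

# Classification: the output is one of the known categories.
predicted_class = nearest_neighbor_label([1.0, 2.0, 8.0, 9.0], ["A", "A", "B", "B"], 7.5)

# Clustering: the groups are discovered from the data and carry no names.
group_low, group_high = two_means([1.0, 1.5, 2.0, 8.0, 8.5, 9.0])

print(predicted_value, predicted_class, group_low, group_high)
```

Note that the clustering result is just two unnamed groups; deciding what each group means is left to whoever inspects them, which is the practical consequence of having no labels.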