General approach to machine learning problems

# Define the problem
1 Import the libraries and load the data
2 Do an initial pass over the data set to trim it down
3 Set up visualizations of the data set
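As a minimal sketch of the loading step, the snippet below parses a small CSV into typed rows using only the standard library. The column names and the inline sample are made up for illustration; a real project would read from a file and might use a dataframe library instead.

```python
import csv
import io

# Hypothetical inline sample; in practice this would come from a file
# opened with open(path, newline="").
RAW_CSV = """sepal_length,sepal_width,label
5.1,3.5,setosa
4.9,3.0,setosa
6.3,3.3,virginica
5.8,2.7,virginica
"""

def load_rows(text):
    """Parse CSV text into a list of dicts, converting numeric columns to float."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append({
            "sepal_length": float(record["sepal_length"]),
            "sepal_width": float(record["sepal_width"]),
            "label": record["label"],
        })
    return rows

rows = load_rows(RAW_CSV)
print(len(rows), "rows; columns:", list(rows[0]))
```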
# Understand the data
Examine the data through descriptive statistics and data visualization
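The descriptive-statistics half of this step could look roughly like the sketch below. The values and labels are invented toy data, and `describe` is a hypothetical helper, not a library function. Plotting is left out.

```python
import statistics
from collections import Counter

# Hypothetical toy data: one numeric feature and a class label per row.
values = [5.1, 4.9, 6.3, 5.8, 5.0, 6.1]
labels = ["a", "a", "b", "b", "a", "b"]

def describe(xs):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(xs),
        "mean": statistics.mean(xs),
        "stdev": statistics.stdev(xs),  # sample standard deviation
        "min": min(xs),
        "median": statistics.median(xs),
        "max": max(xs),
    }

summary = describe(values)
class_counts = Counter(labels)  # how many rows fall in each class
print(summary)
print(class_counts)
```

Checking the class counts early is useful because a strongly imbalanced data set makes plain accuracy a misleading metric later on.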
# Prepare the data
Preprocess the data so that it better exposes the structure of the problem and the relationship between inputs and outputs
1 Data cleaning: filter out erroneous and duplicate records
2 Feature selection: remove redundant feature attributes and add new ones
3 Data transformation: rescale the data or adjust its distribution so that the data is better exposed to the algorithms
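The cleaning and rescaling steps can be sketched as follows. This assumes rows are tuples of numbers with `None` marking a bad reading. `clean` and `min_max_scale` are hypothetical helpers written for this example, and min-max scaling is just one of several possible transformations.

```python
# Hypothetical raw rows: (feature_1, feature_2); None marks an invalid reading.
raw = [(1.0, 200.0), (2.0, 400.0), (2.0, 400.0), (None, 300.0), (3.0, 600.0)]

def clean(rows):
    """Drop rows with missing values and exact duplicates, keeping first occurrences."""
    seen, out = set(), []
    for row in rows:
        if any(v is None for v in row) or row in seen:
            continue
        seen.add(row)
        out.append(row)
    return out

def min_max_scale(rows):
    """Rescale each column to the range [0, 1]."""
    scaled_cols = []
    for col in zip(*rows):
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0  # avoid division by zero for constant columns
        scaled_cols.append([(v - lo) / span for v in col])
    return [tuple(r) for r in zip(*scaled_cols)]

cleaned = clean(raw)
scaled = min_max_scale(cleaned)
print(scaled)
```

In a real pipeline the minimum and maximum should be computed on the training data only and then reused on validation and test data, otherwise information leaks from the held-out sets.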
# Evaluate algorithms
The goal is to find the best-performing subset of algorithms
1 Split off an evaluation data set to validate the model trained on the training set
2 Define the evaluation metric used to score the models
3 Spot-check a sample of linear and nonlinear algorithms
4 Compare the accuracy of the algorithms; this step can take a lot of time
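The four steps above can be sketched in plain Python. The two models here are stand-ins chosen for brevity, not a recommendation. Nearest centroid gives a linear decision boundary for two classes, and 1-nearest-neighbour gives a nonlinear one. The data is synthetic and all function names are hypothetical.

```python
import random

def train_test_split(rows, test_ratio=0.25, seed=7):
    """Shuffle a copy of rows and split off a held-out validation set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(model, rows):
    """Evaluation metric: fraction of rows whose label is predicted correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def fit_nearest_centroid(train):
    """Linear baseline: predict the class whose mean point is closest."""
    groups = {}
    for x, y in train:
        groups.setdefault(y, []).append(x)
    centroids = {y: tuple(sum(c) / len(c) for c in zip(*xs))
                 for y, xs in groups.items()}
    return lambda x: min(centroids, key=lambda y: sq_dist(x, centroids[y]))

def fit_one_nn(train):
    """Nonlinear baseline: predict the label of the nearest training point."""
    return lambda x: min(train, key=lambda row: sq_dist(x, row[0]))[1]

# Hypothetical toy data: two well-separated Gaussian clusters.
rng = random.Random(0)
data = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(40)]
data += [((rng.gauss(6, 1), rng.gauss(6, 1)), 1) for _ in range(40)]

train, valid = train_test_split(data)
results = {
    "nearest_centroid": accuracy(fit_nearest_centroid(train), valid),
    "one_nn": accuracy(fit_one_nn(train), valid),
}
print(results)
```

A single split keeps the example short. Repeating the comparison over several folds gives a more stable estimate, which is also where most of the time in this step goes.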
# Optimize the model
Methods:
1 Find the best parameters through parameter tuning
2 Improve the accuracy of the model with ensemble algorithms
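Both methods can be illustrated with a small k-nearest-neighbours example: a grid search over `k` scored by cross-validation, then a majority-vote ensemble of several k-NN models. This is a sketch on synthetic data under those assumptions. The helper names are hypothetical, and voting is only one of many ensemble techniques.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def fit_knn(train, k):
    """Return a k-NN classifier that takes a majority vote over the k nearest rows."""
    def predict(x):
        nearest = sorted(train, key=lambda row: sq_dist(x, row[0]))[:k]
        return Counter(y for _, y in nearest).most_common(1)[0][0]
    return predict

def accuracy(model, rows):
    """Fraction of rows whose label is predicted correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)

def cross_val_accuracy(rows, k, folds=5):
    """Mean accuracy of k-NN over `folds` contiguous validation folds."""
    size = len(rows) // folds
    scores = []
    for i in range(folds):
        valid = rows[i * size:(i + 1) * size]
        train = rows[:i * size] + rows[(i + 1) * size:]
        scores.append(accuracy(fit_knn(train, k), valid))
    return sum(scores) / folds

def vote(models):
    """Ensemble: combine several classifiers by majority vote."""
    return lambda x: Counter(m(x) for m in models).most_common(1)[0][0]

# Hypothetical toy data: two overlapping clusters, shuffled so folds are mixed.
rng = random.Random(1)
data = [((rng.gauss(0, 1.5), rng.gauss(0, 1.5)), 0) for _ in range(50)]
data += [((rng.gauss(3, 1.5), rng.gauss(3, 1.5)), 1) for _ in range(50)]
rng.shuffle(data)
train, test = data[:80], data[80:]

# Parameter tuning: try each candidate k and keep the best cross-validated score.
grid = [1, 3, 5, 7, 9]
cv_scores = {k: cross_val_accuracy(train, k) for k in grid}
best_k = max(cv_scores, key=cv_scores.get)

# Ensemble: let k-NN models with different k vote on each prediction.
ensemble = vote([fit_knn(train, k) for k in (3, 5, 7)])
ensemble_accuracy = accuracy(ensemble, test)
print("best k:", best_k, "cv accuracy:", round(cv_scores[best_k], 3))
print("ensemble test accuracy:", ensemble_accuracy)
```

The grid is scored on the training portion only, so the test rows stay untouched until the final check.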
# Deploy the results
1 Validate the optimized model on the test set
2 Build the final model from the entire data set
3 Serialize the model so it can be used to make predictions on new data
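A minimal sketch of these three steps, assuming a nearest-centroid model whose fitted state is just a dictionary, so it can be serialized with the standard `pickle` module. The data and the helper names `fit_centroids` and `predict` are invented for illustration.

```python
import pickle

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def fit_centroids(rows):
    """Fit a nearest-centroid model: a dict mapping each class to its mean point."""
    groups = {}
    for x, y in rows:
        groups.setdefault(y, []).append(x)
    return {y: tuple(sum(c) / len(c) for c in zip(*xs)) for y, xs in groups.items()}

def predict(centroids, x):
    """Predict the class whose centroid is closest to x."""
    return min(centroids, key=lambda y: sq_dist(x, centroids[y]))

# Hypothetical toy data.
train = [((0.0, 0.0), "a"), ((1.0, 0.0), "a"), ((5.0, 5.0), "b"), ((6.0, 5.0), "b")]
test = [((0.5, 0.5), "a"), ((5.5, 4.5), "b")]

# 1. Validate the model on the held-out test set.
model = fit_centroids(train)
test_accuracy = sum(predict(model, x) == y for x, y in test) / len(test)

# 2. Refit on the entire data set to produce the final model.
final_model = fit_centroids(train + test)

# 3. Serialize the model, then restore it to predict on new data.
blob = pickle.dumps(final_model)  # pickle.dump(final_model, f) would write to a file
restored = pickle.loads(blob)
print(test_accuracy, predict(restored, (5.2, 5.1)))
```

Only unpickle data from a trusted source, since loading a pickle can execute arbitrary code.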



Origin www.cnblogs.com/sugar-k/p/11480353.html