2. Feature acquisition

At Internet companies, very few data scientists work on the most complex models. Most engineers are essentially hauling bricks in the data warehouse: cleaning data over and over, and continually analyzing the business to find features.

Features come from two sources: (1) features that the business has already organized, from which we pick the ones suited to our problem; (2) higher-level features that we derive from those business features.
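As a rough illustration of source (2), the sketch below derives two higher-level features from raw business fields. The record layout, the field names (dose_mg, weight_kg, treatment_days), and the 14-day threshold are all hypothetical, chosen only to show the idea.

```python
def derive_features(record):
    """Return a copy of the record with derived (higher-level) features added."""
    derived = dict(record)
    # Ratio feature: combines two raw business fields into one signal.
    derived["dose_per_kg"] = record["dose_mg"] / record["weight_kg"]
    # Threshold feature: the 14-day cutoff is an assumption for illustration.
    derived["is_long_course"] = int(record["treatment_days"] >= 14)
    return derived


# Hypothetical raw record as it might come out of the data warehouse.
row = {"dose_mg": 500, "weight_kg": 62.5, "treatment_days": 21}
features = derive_features(row)
print(features["dose_per_kg"], features["is_long_course"])
```

Whether a derived feature like this actually helps still has to be checked against the problem at hand.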

For the features the business has already organized, how do we find the ones our problem needs? We need to find domain experts who understand the business and ask them for advice. For example, if we are classifying drugs by efficacy, we ask these experts which factors affect a drug's efficacy. Every factor they name, whether its impact is small or large, goes into our first candidate feature set.

This candidate feature set may be large, so we need to reduce its dimensionality, screen the features, and so on. That is the work of the data-cleaning stage.
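One simple way to screen a large candidate set can be sketched as follows: drop features that barely vary, then rank the rest by the absolute Pearson correlation with the label. This is only one possible screening method, not necessarily the one the author has in mind. The function names, the toy drug data, and the thresholds are assumptions made up for illustration.

```python
from math import sqrt
from statistics import mean, pvariance


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def screen_features(columns, labels, min_variance=1e-6, top_k=2):
    """Drop near-constant features, then keep the top_k most label-correlated ones."""
    kept = {name: vals for name, vals in columns.items()
            if pvariance(vals) > min_variance}
    ranked = sorted(kept, key=lambda n: abs(pearson(kept[n], labels)), reverse=True)
    return ranked[:top_k]


# Toy candidate set (invented values); labels mark whether the drug was effective.
candidates = {
    "dose_mg": [10, 20, 30, 40, 50, 60],
    "patient_age": [30, 60, 35, 52, 41, 58],
    "batch_flag": [1, 1, 1, 1, 1, 1],        # constant, carries no information
    "days_in_storage": [5, 3, 9, 2, 8, 4],
}
effective = [0, 0, 0, 1, 1, 1]

selected = screen_features(candidates, effective)
print(selected)
```

Correlation only captures linear, one-feature-at-a-time relationships, so in practice it is a first filter rather than a final selection.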

 


Origin www.cnblogs.com/pacino12134/p/11368641.html