【Data Analysis】

With the arrival of the digital era, every stage of a business can be recorded: all aspects of product sales are logged, and customers' purchasing behavior and online behavior are collected. Enterprises now hold multi-dimensional data, including sales data, customer consumption data, customer behavior data, and operations data. Once the data exists, data analysis becomes possible. Typical cases, such as Walmart's beer and diapers, tarts and flashlights, and Target inferring that a 16-year-old girl was pregnant, each illustrate the kind of relationship such data can reveal.

1. The value of data analysis

The first is to improve efficiency: helping companies process data more efficiently and reduce data storage costs.

The second is to guide the business, for example in precision marketing, anti-fraud, risk management, and business improvement.

2. Team and roles

The data analysis team should be an independent department that serves all business departments. With its own technical team, it can set up a separate big data computation and analysis platform and use the latest data processing techniques to analyze data and build models.

In addition, members of the data analysis team should come from the business side and be highly sensitive to business data, so that they can break business requirements down into data requirements and tie business scenarios, data scenarios, and data analysis together.

DBA: provides processed raw data to the data scientists and data analysts; this data is the basis of data analysis and modeling.

Business expert: data modeling draws on business experience and business knowledge. The business expert uses professional analysis to find the patterns in the business, identifies the direction for modeling, and gives recommendations on and explanations of the models.

Data scientist: uses professional skills to help business experts and data analysts with modeling and computation.

Data analyst: makes recommendations based on the results of data analysis, completing the key step from raw data to commercial application.

Operations specialist: implements business decisions. Through planned operational activities, applies the results of data analysis to actual business activities.

3. Preparatory work before data analysis

Data Source Selection

Data sampling selection

Data type selection

Missing values

Outlier detection and treatment

Data Standardization

Coarse classification (categorization) of data

Variable selection
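Several of the preparatory steps above (missing values, outlier treatment, standardization, coarse classification) can be sketched in a few lines. This is only a minimal illustration using the Python standard library, not a prescribed pipeline: the sample values, the median imputation, the 1.5 x IQR clipping rule, the z-score scaling, and the bin edges are all assumptions chosen for the example.

```python
import statistics


def impute_missing(values):
    """Replace None with the median of the observed values (one common choice)."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def clip_outliers(values, k=1.5):
    """Clip values that fall outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def standardize(values):
    """Z-score scaling: zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def coarse_bin(value, edges, labels):
    """Map a numeric value to a coarse category; labels has one more entry than edges."""
    for edge, label in zip(edges, labels):
        if value < edge:
            return label
    return labels[-1]


# Hypothetical monthly spend per customer; None marks a missing value.
spend = [120.0, 95.0, None, 130.0, 110.0, 5000.0, 105.0, None, 125.0]

filled = impute_missing(spend)    # missing values
clipped = clip_outliers(filled)   # outlier treatment
scaled = standardize(clipped)     # standardization
bins = [coarse_bin(v, [100.0, 125.0], ["low", "medium", "high"]) for v in clipped]

print(clipped)
print(bins)
```

Which imputation, outlier rule, and bin edges are appropriate depends on the data and the business scenario; the order of the steps here is also just one reasonable choice.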

 

4. Methods for evaluating the data model

(1) Evaluation by AUC value

AUC = 1: a perfect classifier.
AUC in [0.85, 0.95]: good results.
AUC in [0.7, 0.85]: average results.
AUC in [0.5, 0.7]: weak results, although for stock prediction this is already very good.
AUC = 0.5: equivalent to random guessing (for example, tossing a coin); the model has no predictive value.
AUC < 0.5: worse than random guessing; however, if its predictions are always inverted, the model does better than random guessing.
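As a rough illustration of what the AUC value measures, the sketch below computes it directly from its probabilistic reading: the chance that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The function name `auc` and the labels and scores are hypothetical, made up for the example; in practice a library routine would normally be used instead of this quadratic loop.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = positive) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

model_auc = auc(labels, scores)

# Inverting the ranking turns an AUC of a into 1 - a (when there are no ties),
# which is why a model with AUC < 0.5 becomes useful once its predictions
# are flipped.
inverted_auc = auc(labels, [-s for s in scores])

print(model_auc, inverted_auc)
```

Here 8 of the 9 positive/negative pairs are ordered correctly, so the AUC is 8/9, and the inverted ranking scores 1/9.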

(2) Evaluation by KS value

A KS value greater than 0.2 indicates good predictive power.
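For reference, the KS value is commonly computed as the largest gap between the cumulative score distributions of the positive and negative samples. The sketch below follows that definition; the function name `ks_statistic` and the sample data are hypothetical and only illustrate the calculation.

```python
def ks_statistic(labels, scores):
    """Largest gap between the cumulative score distributions of the two classes."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    best = 0.0
    for threshold in sorted(set(scores)):
        # Share of each class with a score at or below the threshold.
        cdf_pos = sum(s <= threshold for s in pos) / len(pos)
        cdf_neg = sum(s <= threshold for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Hypothetical labels (1 = positive) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

ks = ks_statistic(labels, scores)
print(ks)
```

On this toy data the largest gap is 2/3, well above the 0.2 threshold mentioned above.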

 

Reference documents:

On the data analysis and data modeling

Origin www.cnblogs.com/badboy200800/p/11099206.html