1. What is data analysis
Data analysis is the use of appropriate statistical methods to examine large amounts of data, study and summarize it in detail, extract valuable information, and form sound conclusions that inform business decisions.
2. The importance of data analysis
If we can't quantify something, we can't really understand it; if we can't understand it, we can't really control it; if we can't control it, we can't really change it.
In the era of big data, the volume and complexity of data exceed what people can grasp unaided, but data analysis can help interpret what the data means; where unknown factors are hard to control, data analysis can help identify patterns and make predictions.
Data analysis can offset our overconfidence in intuition, so that we think about problems and make decisions more scientifically and rationally.
3. The role of data analysis
Current situation analysis: what happened in the past? For example, diagnose business conditions through descriptive statistics.
Cause analysis: why did it happen? For example, use methods such as dimension breakdown and metric decomposition, combined with knowledge of the actual business, to locate anomalies.
Predictive analysis: what might happen in the future? For example, use user behavior data to predict which users are about to churn, and take measures to retain them.
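The first two roles above can be illustrated with a minimal sketch that uses only the Python standard library. The order records, channel names, and amounts are hypothetical values made up for illustration; real analysis would start from actual business data.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical order records: (channel, order_amount)
orders = [
    ("app", 120.0),
    ("app", 80.0),
    ("web", 200.0),
    ("web", 40.0),
    ("store", 60.0),
]

# Current situation analysis: descriptive statistics over all orders
amounts = [amount for _, amount in orders]
summary = {
    "count": len(amounts),
    "total": sum(amounts),
    "mean": mean(amounts),
    "median": median(amounts),
}

# Cause analysis: break the total down by one dimension (channel)
by_channel = defaultdict(float)
for channel, amount in orders:
    by_channel[channel] += amount

# Share of the total contributed by each channel
share = {channel: total / summary["total"] for channel, total in by_channel.items()}

print(summary)
print(share)
```

Comparing the per-channel shares across two periods is one simple way to see which dimension value is driving a change in the overall figure.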
4. How to analyze data?
1. Clarify the purpose and approach of the analysis
The approach determines the result. Clarify the purpose of the analysis and form a clear framework for thinking about it, so that you avoid doing analysis for its own sake.
2. Data Collection
Collect the relevant data sets based on the purpose of the analysis. Most of the data is internal to the company, but external data may also be involved.
Database: relational database management system (RDBMS, queried with SQL), data warehouse (queried with Hive SQL, also called HiveQL)
File: Excel, CSV, TXT, etc.
System/platform: manual export, or Python automation scripts using tools such as Selenium
Internet: web crawlers
API: the requests library, parsing JSON responses, etc.
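As a rough sketch of three of the sources listed above, the snippet below uses only the Python standard library. The in-memory SQLite database stands in for a company database, the CSV text stands in for a file on disk, and the JSON string stands in for an API response body; the table, column, and field names are all assumptions made for illustration.

```python
import csv
import io
import json
import sqlite3

# 1) Database: query with SQL (in-memory SQLite stands in for a real database)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, city TEXT)")
conn.executemany(
    "INSERT INTO users (id, city) VALUES (?, ?)",
    [(1, "Beijing"), (2, "Shanghai"), (3, "Beijing")],
)
city_counts = conn.execute(
    "SELECT city, COUNT(*) FROM users GROUP BY city ORDER BY city"
).fetchall()
conn.close()

# 2) File: parse CSV text (io.StringIO stands in for an open file)
csv_text = "id,amount\n1,9.5\n2,20\n"
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

# 3) API: parse a JSON payload (a literal string stands in for a response body)
payload = '{"data": [{"id": 1, "active": true}, {"id": 2, "active": false}]}'
active_ids = [item["id"] for item in json.loads(payload)["data"] if item["active"]]

print(city_counts)
print(csv_rows)
print(active_ids)
```

Note that csv.DictReader returns every field as a string, so numeric columns still need converting during the cleaning step.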
3. Data cleaning
Organize the data into a clean, consistent structure and format that suits the subsequent analysis. The data may be scattered, so multiple data sets often need to be integrated.
Handling of outliers, erroneous values, and missing values
Field splitting, merging, information extraction, format conversion, etc.
Table joins: left, right, full outer, and inner joins, Cartesian product (cross join), as well as left semi join, left anti join, etc.
Table structure conversion: rows to columns (long table to wide table), columns to rows (wide table to long table), row and column transposition, pivot and unpivot
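A few of the cleaning operations above can be sketched in plain Python. The rows, field names, and the choice of the median as a fill value are assumptions for illustration only; in practice a library such as pandas is commonly used for this work, and the right way to fill missing values depends on the data.

```python
from statistics import median

# Hypothetical order rows; None marks a missing amount
orders = [
    {"user_id": 1, "month": "Jan", "amount": 10.0},
    {"user_id": 1, "month": "Feb", "amount": None},
    {"user_id": 2, "month": "Jan", "amount": 30.0},
    {"user_id": 3, "month": "Feb", "amount": 20.0},
]
users = {1: "Ann", 2: "Bob"}  # user_id -> name; user 3 is deliberately absent

# Missing values: fill with the median of the observed amounts
observed = [row["amount"] for row in orders if row["amount"] is not None]
fill_value = median(observed)
cleaned = [
    {**row, "amount": fill_value if row["amount"] is None else row["amount"]}
    for row in orders
]

# Left join: keep every order, attach the name when a match exists (else None)
left_joined = [{**row, "name": users.get(row["user_id"])} for row in cleaned]

# Left anti join: orders whose user_id has no match in users
left_anti = [row for row in cleaned if row["user_id"] not in users]

# Long table to wide table: one entry per user, one key per month
wide = {}
for row in cleaned:
    wide.setdefault(row["user_id"], {})[row["month"]] = row["amount"]

print(left_joined)
print(left_anti)
print(wide)
```

A left anti join like the one here is a handy data quality check, since it lists the records that would silently get empty fields after a left join.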
4. Data analysis
This step requires a command of common analysis methods and machine learning algorithms.
Basic analysis methods: composition analysis, comparative analysis, group analysis, cross analysis, trend analysis, etc.
Advanced analysis methods: linear regression, logistic regression, decision tree, random forest, clustering and other algorithms
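As one small instance of the advanced methods listed above, here is a minimal sketch of simple (one-variable) linear regression fitted by ordinary least squares, written with the standard library only. The function name fit_line and the ad spend and sales figures are hypothetical; real projects would more likely use a library such as scikit-learn or statsmodels.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept (one feature)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: ad spend (x) and sales (y)
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)
predicted = slope * 5.0 + intercept  # forecast for an unseen spend level

print(slope, intercept, predicted)
```

The toy data lies exactly on a straight line, so the fit is perfect; real data is noisy, and the fit quality would need checking before the forecast is trusted.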
5. Data Visualization
Present the findings of the analysis in the form of charts
Words are not as good as a table, and a table is not as good as a chart: a picture is worth a thousand words
Basic statistical charts: pie charts, bar charts, line charts, scatter charts, radar charts, funnel charts, etc.
Professional statistical charts: histograms, heat maps, box plots, violin plots, kernel density estimate (KDE) plots, etc.
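To show what two of these charts actually encode, the sketch below computes the numbers behind a histogram and a box plot with the standard library and prints a rough text histogram. The values, the bin width of 2, and the 1.5 x IQR outlier rule are assumptions for illustration; an actual chart would normally be drawn with a plotting library such as matplotlib.

```python
from collections import Counter
from statistics import quantiles

# Hypothetical order amounts
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]

# Histogram: count values in equal-width bins [0, 2), [2, 4), ...
bin_width = 2
bin_counts = Counter((v // bin_width) * bin_width for v in values)

# Box plot: quartiles, interquartile range, and the usual 1.5 * IQR fences
q1, q2, q3 = quantiles(values, n=4, method="inclusive")
iqr = q3 - q1
low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in values if v < low_fence or v > high_fence]

# Rough text rendering of the histogram, one row per non-empty bin
for start in sorted(bin_counts):
    print(f"[{start}, {start + bin_width}) {'#' * bin_counts[start]}")
print("quartiles:", q1, q2, q3, "outliers:", outliers)
```

The same fences are one common way to flag the outliers mentioned in the data cleaning step.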
6. Data analysis report
Summarize the important conclusions and findings in a slide deck (for example, PowerPoint) to form a complete data analysis report
Pyramid structure, in a summary-details-summary form
Conclusion first, top-down, with points grouped by induction and arranged in a logical progression
Keep the structure and hierarchy clear, highlight the key points, and state the main points explicitly
7. Data application
Apply feasible proposals to actual business scenarios and solve the company's actual business problems
Provide data support for business decision-making and realize data-driven business growth
5. Data Analysis Tools
To do a job well, you must first sharpen your tools, so it is worth mastering the mainstream data analysis tools
Excel, a very important foundation
Power BI/Tableau, powerful business intelligence (BI) tools
SQL, the essential language for querying databases
Python, a widely used programming language for data analysis and artificial intelligence
6. How to get started with data analysis
Beginners with no prior background can refer to this learning route to start learning.