Introduction to data analysis: how to train data analysis thinking?

 This article is published by NetEase Cloud 


Author: Wu Binbin (This article is only for internal sharing in Zhihu. If you need to reprint, please obtain the author's consent and authorization.)

We often hear about two modes of reasoning: induction and deduction. Both help data analysts build up an initial understanding of business logic and, on that basis, locate business problems quickly and analyze more efficiently. But how can an analyst who is just getting started, and who lacks project experience, complete a project's analysis report quickly? This article introduces a third mode, outreach reasoning (more commonly known as abductive reasoning), which suits the daily work of entry-level analysts.

So what is outreach reasoning?

The McKinsey model of thinking divides the entities involved in human reasoning into three parts: rules, situations, and results.

  • Rule: usually a general view of how the world works;
  • Situation: a known fact about the world;
  • Result: what is expected to happen when the rule is applied to the situation.

Any of these three entities can be used as a starting point for reasoning, but different starting points mean different ways of reasoning.

  • Reasoning that starts from a rule is deductive reasoning.

For example: if you do not work hard in ordinary times, you will fail the test (rule); A does not work hard in ordinary times (situation); so A fails the test (result).

  • The method of reasoning that starts from a situation is inductive reasoning.

A usually does not work hard (situation); A failed the test (result); so A may have failed the test because of not working hard in ordinary times (rule).

  • Outreach reasoning is the method of reasoning that starts from a result.

A failed an exam (result); failing an exam is usually due to a lack of regular effort (rule); so check whether A has not been working hard in ordinary times (situation).

In daily work, outreach reasoning fits well with the multi-dimensional analysis and problem locating that data analysts do every day. It is a way of thinking that data analysts, especially entry-level ones, should have.

How do you do outreach reasoning? Put plainly, you force yourself to think of the various possible causes of a problem, then focus on collecting data to confirm or rule out each one. In practice, MECE structured decomposition is the main method. In daily work it can be simplified into the following three steps:

  • List all the factors relevant to the problem under consideration.
  • Compare the factors by level and by correlation: separate factors that sit at different levels, and merge duplicate factors at the same level, so that each factor is independent.
  • Arrange and combine the factors according to the correct logical relationship.
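
As a minimal sketch of this way of working, the snippet below starts from an observed result, lists candidate causes (the rules), and checks each one against the data (the situation). The metric names, the numbers, and the 5% threshold are all hypothetical and chosen only for illustration.

```python
# Result: week-over-week changes behind a revenue drop (hypothetical data).
observed = {"dau_change": -0.01, "pay_rate_change": -0.20, "arppu_change": 0.00}

# Assumed threshold: a drop of more than 5% counts as notable.
THRESHOLD = -0.05

# Rules: each candidate cause is paired with a check against the data.
candidate_causes = {
    "fewer active users": lambda d: d["dau_change"] < THRESHOLD,
    "fewer users paying": lambda d: d["pay_rate_change"] < THRESHOLD,
    "paying users spend less": lambda d: d["arppu_change"] < THRESHOLD,
}

# Situation: keep only the causes that the data supports.
supported = [cause for cause, check in candidate_causes.items() if check(observed)]
print(supported)  # ['fewer users paying']
```

Causes that the data does not support are ruled out, and the remaining ones are examined further.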

We can then decompose the problem. The decomposition follows two principles:

  • The parts are independent of each other (Mutually Exclusive)
  • Together, the parts cover the whole problem (Collectively Exhaustive)

On this basis, the data is analyzed level by level to locate the most specific cause.
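
The two MECE conditions can be checked mechanically when a decomposition is expressed as groups of items. The sketch below is only an illustration: `check_mece`, the user IDs, and the channel names are hypothetical and not part of the original method.

```python
def check_mece(population, segments):
    """Return (overlaps, missing) for a proposed split of `population`.

    overlaps: items found in more than one segment (not mutually exclusive)
    missing:  items found in no segment (not collectively exhaustive)
    """
    seen = set()
    overlaps = set()
    for members in segments.values():
        members = set(members)
        overlaps |= seen & members  # already placed in an earlier segment
        seen |= members
    missing = set(population) - seen
    return overlaps, missing


# Hypothetical user IDs split by acquisition channel.
users = {1, 2, 3, 4, 5}
by_channel = {"app_store": {1, 2}, "ads": {2, 3}, "organic": {4}}

overlaps, missing = check_mece(users, by_channel)
print(overlaps, missing)  # {2} {5}
```

A split is MECE only when both returned sets are empty. Here user 2 is counted twice and user 5 is not covered, so this split would need to be revised before analysis.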

In our work, we mainly use the following two decomposition methods.

  1. Decomposition by business function, such as channels, operations, functions, and other related modules. The relevant indicators are mapped to the main modules, and the cause of a problem can be located quickly through brief communication. The disadvantages are that the analysis results are not direct enough and that collecting information depends on external resources.
  2. Decomposition by causal structure (index decomposition), using causal relationships between indicators such as revenue = daily active users * payment rate * ARPU (here, average revenue per paying user). By tracing a fluctuation down to the most detailed indicator and then drilling down by auxiliary dimensions, the cause of a problem can be identified clearly. This is the relatively safe method and the main one in daily work; its disadvantage is that a fairly complete indicator logic system has to be built first.
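
One common way to attribute a change in a multiplicative metric to its factors is to compare log changes, since ln(R1/R0) equals the sum of ln(f1/f0) over the factors. The sketch below assumes this technique; the article itself does not prescribe a formula. The function name, the field names (`dau`, `pay_rate`, `arppu`), and all numbers are hypothetical.

```python
import math


def decompose_revenue_change(before, after):
    """Return each factor's share of the total log change in revenue.

    Assumes revenue is the product of the factors, so the shares sum to 1
    whenever revenue actually changed.
    """
    log_changes = {name: math.log(after[name] / before[name]) for name in before}
    total = sum(log_changes.values())
    if total == 0:
        # Revenue did not change, so there is nothing to attribute.
        return {name: 0.0 for name in before}
    return {name: change / total for name, change in log_changes.items()}


# Hypothetical weekly averages, for illustration only.
last_week = {"dau": 100_000, "pay_rate": 0.05, "arppu": 20.0}
this_week = {"dau": 100_000, "pay_rate": 0.04, "arppu": 20.0}

shares = decompose_revenue_change(last_week, this_week)
biggest = max(shares, key=lambda name: abs(shares[name]))
print(biggest)  # pay_rate
```

Once the factor with the largest share is known (payment rate in this made-up case), the analysis can drill down on that indicator by auxiliary dimensions such as channel or user segment.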

The two decomposition methods are combined according to the needs of each project. Gathering information from external resources and building a complete indicator logic system are the two hardest thresholds between entry-level and senior data analysts. Inductive and deductive thinking improve familiarity with the business. Once the initial business knowledge has been accumulated, later analyses need fewer levels and combinations of outreach reasoning, and problem causes can be located more and more efficiently.


