Data Analysis Specification Summary

 Structural specification and writing

1. Clear structure and clear priorities

A data analysis report must have a clear structure, which lowers the cost of reading and makes the information easier to convey. Different types of analysis reports have their own suitable presentation styles, but because a report is essentially an argumentative piece, most still fit the general-specific-(general) structure.

It is worth learning the Pyramid Principle: a clear central idea, the conclusion first, higher-level points summarizing those beneath them, similar points grouped together, and a logical progression. Arrange the text so that the important comes before the secondary, the big picture before the details, the conclusion before the reasons, and the result before the process. Touch on less important content only briefly, and discard trivia unrelated to the theme.

2. Put the core conclusion first, backed by logic and evidence

Conclusions should be few and precise, not numerous. In most cases the purpose of data analysis is to find problems, and a report that delivers even one truly important conclusion has achieved its purpose. Concise conclusions lower the barrier for the reader; conversely, a report that lists 100 cumbersome problems is worth about as much as one that lists none. The report should give clear answers and clear conclusions that address the background and purpose of the analysis.

Conclusions must rest on a rigorous process of data analysis and derivation. Try to avoid speculative conclusions: a conclusion that is too subjective loses its persuasiveness, and a conclusion you are not sure of yourself must not be allowed to mislead others.

In reality, though, some reasonable guesses cannot be verified directly or feasibly. When you do give a speculative conclusion, base it on reasonable, partly verified evidence, state it cautiously, and label it explicitly as a guess. If conditions permit, it can be tested through user research or follow-up interviews.

Don't shy away from "bad" conclusions. Discovering product or business problems and hitting the pain points directly, on the basis of accurate data and sound derivation, is in fact one of the great values of data analysis.

3. Make reasonable suggestions grounded in the actual business

The analysis conclusions should lead to targeted suggestions or detailed solutions. So how should suggestions be written?

First, work out who the suggestions are for. Different audiences hold different positions and look at problems from different angles. Senior managers care more about direction, so the report needs to offer in-depth insight into the business and point out potential opportunities. Middle managers and front-line employees focus on specific strategies, so the report should give them concrete measures, based on the conclusions, that they can use to improve the situation.

Second, base suggestions on the actual situation of the business. Although suggestions come out of data analysis, looking only at the data is limiting and can even lead to the mistake of losing touch with the business and ignoring the industry environment, producing suggestions that would have been better left unsaid. Suggestions must therefore rest on a deep understanding of the business and full consideration of real conditions.

Going a step further, if you can estimate the benefit of implementing a suggestion (how much order conversion will rise, how much transaction volume will grow, how much cost will be saved, and so on), the value is conveyed directly to the reader.

The above covers the writing principles for a report. For an example, see iResearch's "Retention and Future - Internet Development Trend Report Behind the Epidemic".

Tips: Write the analysis report from the reader's perspective, keep the content easy to understand, and keep the language standardized and careful. If the audience is not expert in the field, avoid obscure words and phrases. The terminology in the report must also be standardized and consistent with established standards (such as the company's indicator specifications) and industry-recognized terms.

 Data usage and graphs

Data analysis is often 80% data processing and 20% analysis. Collecting and processing data really does take most of the time, and the analysis itself can only be done on top of correct data. Because the whole point is to reach the correct conclusion, it is extremely important to ensure the data is accurate; otherwise all the effort only ends up misleading other people.

1. Analysis needs to be based on reliable data sources

There are four main methods for judging the reliability of information or data: similar comparison, narrow/broad comparison, related comparison, and reductio ad absurdum.

  • Similar comparison: compare information that has the same or a similar definition but comes from different sources.
  • Example: The most common case is cross-checking data you queried yourself against the data in a report.

  • Narrow/broad comparison: compare against information that is broader (contains it) or narrower (is contained by it).
  • Example: Compare the sales of the 3C category with the total sales of the mall; if 3C sales exceed the mall's total sales, the data is obviously wrong. The same applies to the UV of a single channel compared with the total UV of the app.

  • Related comparison: compare against other information that is related to it.
  • Example: For the day-N (Dn) retention rate of a platform, with the same base date, the D60 retention rate must be lower than the D30 retention rate. If it is higher, the data is wrong.

  • Reductio ad absurdum: reason forward from the existing evidence to a derived result, and judge whether that result is reasonable.
  • Example: Suppose the average spend per customer on a platform is 2,000 and total daily sales are 100 million. If the computed number of transacting users for the day is 100,000, multiplying by the average spend per customer gives daily sales of 200 million, which obviously does not match the business volume. The data is wrong.
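The four checks above can be sketched as small functions. This is only a minimal illustration: the function names, the tolerance values, and the 3C and retention figures are assumptions made for the example, while the 2,000 / 100,000 / 100 million figures come from the last example above.

```python
def check_similar(value_a, value_b, tolerance=0.01):
    """Similar comparison: two sources with the same definition should agree
    within a relative tolerance (the 1% default is an assumption)."""
    baseline = max(abs(value_a), abs(value_b))
    if baseline == 0:
        return True
    return abs(value_a - value_b) / baseline <= tolerance


def check_part_within_whole(part, whole):
    """Narrow/broad comparison: a part can never exceed the whole."""
    return 0 <= part <= whole


def check_retention_order(retention_by_day):
    """Related comparison: for one base date, retention on a later day
    should not exceed retention on an earlier day."""
    days = sorted(retention_by_day)
    return all(
        retention_by_day[later] <= retention_by_day[earlier]
        for earlier, later in zip(days, days[1:])
    )


def check_derived_total(users, spend_per_user, reported_total, tolerance=0.05):
    """Reductio ad absurdum: derive the total from related figures and
    compare it with the reported total (the 5% tolerance is an assumption)."""
    derived = users * spend_per_user
    return abs(derived - reported_total) / reported_total <= tolerance


# Hypothetical 3C sales (120) larger than total mall sales (100).
print(check_part_within_whole(part=120, whole=100))       # False
# Hypothetical D60 retention higher than D30 retention.
print(check_retention_order({30: 0.20, 60: 0.25}))        # False
# 100,000 users x 2,000 per user = 200 million, but 100 million was reported.
print(check_derived_total(100_000, 2_000, 100_000_000))   # False
```

Each function returns True when the data passes the check, so a False result is a signal to go back and inspect the data source before analyzing further.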

Tips: The above are commonly used methods. The core is to understand the business well and to know the key indicators' figures clearly; judging the accuracy of data then comes naturally. The suggestion here is to look at the core business data every day, analyze the reasons for fluctuations, and so build up business understanding and data sensitivity.

2. Use charts where possible to improve readability

Using charts in place of piles of numbers helps readers see problems and conclusions more clearly and intuitively. There should not be too many charts, though; too many will leave readers at a loss.

Make each chart complete. A chart must contain all of its elements for readers to understand it at a glance. Elements such as the title, legend, units, footnotes, and data source are like a chart's internal organs.

Follow the conventions.

  • First, avoid producing meaningless charts. The only criterion for deciding whether to make a chart is whether it helps you express the information effectively.
  • Second, don't overload a chart. A chart should ideally convey one point, highlight the key message, and let readers grasp the core idea quickly.
  • Third, choose the appropriate chart type, not the most complicated one.
  • Fourth, give the chart a one-sentence title.

  • Line chart: Use relatively thick lines, generally no more than 5 of them; do not use slanted labels; and the vertical axis generally starts from 0. Draw predicted values with a dashed line.
  • Column chart: Use the same color for the same data series. Do not use slanted labels, and the vertical axis normally starts at 0. It is generally best to add data labels to column charts; if data labels are added, the vertical axis scale and grid lines can be deleted.
  • Bar chart: Use the same color for the same data series. Do not use slanted labels; it is best to add data labels and to sort the data from largest to smallest for easy reading.
  • Pie chart: Pie charts have relatively few use cases. If you do use one, note the following: arrange the data starting from the 12 o'clock position, with the most important components closest to 12 o'clock. Do not include too many items (keep it to 6 or fewer), and do not explode the whole pie, although you can pull out a single sector if you want to emphasize it. Do not use a legend, and do not use 3D effects. When sectors are filled with color, a white border line is recommended because it separates them more cleanly.

  • Be wary of charts that lie

  • Exaggerated growth: People like to study the trend of a line, such as the growth of the stock market, housing prices, or sales. Sometimes the trend is deliberately exaggerated to attract readers. In Figure 1 the growth rate is exaggerated by truncating the value axis, whereas Figure 2 shows that growth is actually slow.

  • Camouflage by 3D effects: 3D graphics easily cause visual distortion. Figure 1 uses a 3D effect: it looks as if A -> B -> C -> D -> E increases in order, but in fact D > E. Be extra careful about this kind of chart camouflage.
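To see how much a truncated value axis can exaggerate growth, compare the drawn heights of two points under different axis starting values. The sketch below uses hypothetical sales figures (100 and 110) and a hypothetical helper named drawn_height_ratio; it is an illustration of the effect, not a measurement of any real chart.

```python
def drawn_height_ratio(first, last, axis_start=0.0):
    """Ratio of the drawn heights of two values when the value axis
    starts at axis_start instead of 0."""
    if axis_start >= min(first, last):
        raise ValueError("axis_start must be below both values")
    return (last - axis_start) / (first - axis_start)


# Hypothetical figures: real growth from 100 to 110 is 10%.
honest = drawn_height_ratio(100, 110)                     # axis starts at 0
truncated = drawn_height_ratio(100, 110, axis_start=95)   # axis starts at 95

print(f"axis from 0:  last point drawn {honest:.1f}x as high as the first")
print(f"axis from 95: last point drawn {truncated:.1f}x as high as the first")
```

With the axis starting at 0 the last point is drawn 1.1 times as high as the first, which matches the real 10% growth. With the axis starting at 95 it is drawn 3 times as high, although the data has not changed.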

Common Data Analysis Mistakes

"Speaking with data" has become a buzzword.

In many people's minds, data represents science, and science means truth. "Data doesn't lie" has become a common mantra when persuading others. Is this really the case? Let's talk about those common myths.

1. The control variable fallacy

In A/B testing, if variables are not well controlled, the test results do not reflect the effect of the factor being tested. Similarly, when comparing data, the two indicators may simply not be comparable. For example, an experiment is meant to test the impact of different marketing times on subsequent conversion, but experiment A uses SMS marketing and experiment B uses telemarketing. Because the marketing method was not controlled, no conclusion can be drawn from the experiment.
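One simple safeguard is to write down the settings of each experiment group and check, before launch, that only the variable under test differs. The sketch below is a minimal illustration; the setting names and values are hypothetical and merely mirror the SMS versus telemarketing example.

```python
def uncontrolled_variables(group_a, group_b, tested_variable):
    """Return every setting, other than the one under test, that differs
    between the two experiment groups."""
    keys = set(group_a) | set(group_b)
    return sorted(
        key for key in keys
        if key != tested_variable and group_a.get(key) != group_b.get(key)
    )


# Hypothetical experiment settings.
experiment_a = {"send_time": "10:00", "channel": "sms", "audience": "new users"}
experiment_b = {"send_time": "20:00", "channel": "phone", "audience": "new users"}

leaks = uncontrolled_variables(experiment_a, experiment_b, tested_variable="send_time")
print(leaks)  # ['channel']
```

An empty list means the two groups differ only in the tested variable. Any name in the list is a variable that must be aligned before the results can be attributed to the marketing time.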

2. Sample fallacy

  • Insufficient sample size

One of the cornerstones of statistics is the law of large numbers: a pattern shows up reliably only when the amount of data reaches a certain level. If the sample size is extremely small, extend the time window to obtain enough samples, or remove unimportant filter conditions to increase the sample count.
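A rough way to judge whether a sample is large enough is to look at the margin of error of the observed rate. The sketch below assumes the usual normal approximation for a proportion and a hypothetical observed conversion rate of 30%; both are assumptions for illustration, and the approximation is poor for very small samples.

```python
import math


def margin_of_error(rate, sample_size, z=1.96):
    """Approximate 95% margin of error for an observed rate, using the
    normal approximation (an assumption, not an exact interval)."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(rate * (1 - rate) / sample_size)


# Hypothetical observed conversion rate of 30% at different sample sizes.
for n in (100, 1_000, 10_000):
    print(f"n={n:>6}: 30% +/- {margin_of_error(0.30, n):.1%}")
```

At 100 samples the uncertainty is about 9 percentage points, so a 30% rate could plausibly be anywhere from roughly 21% to 39%. At 10,000 samples it shrinks to under 1 point, which is why extending the time window or relaxing filters makes the result more trustworthy.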

  • Presence of selection bias or survivorship bias

Another cornerstone of statistics is the central limit theorem. Put simply, the means of random samples drawn from a population cluster around the population mean. This holds only if the sample is drawn at random; a sample selected in a biased way can have a mean far from that of the population.

For example, during an app upgrade, you measure the number of logged-in users, the number of transacting users, and other indicators to determine whether users like the new version better than the old one. This sounds reasonable, but a selection bias is hidden here: when a new version is released, the first users to upgrade are often the most active users. These users tend to have better indicators anyway, which does not mean the new version is better.

  • Dirty data mixed in

Dirty data is quite destructive and may lead to wrong conclusions. Usually, data validation is used to screen out records that fail the checks. When analyzing a specific business, it is also necessary to restrict the data to what is reasonable for that business and to filter out abnormal outliers to ensure better data quality.
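The two steps, validation and outlier filtering, might look like the sketch below. The validation rules, the record fields, and the choice of the 1.5 x IQR fence for outliers are all assumptions made for illustration; a real business would define its own rules and thresholds.

```python
import statistics


def is_valid(record):
    """Validation rules (assumed for illustration): a user id is present
    and the amount is a non-negative number."""
    amount = record.get("amount")
    return (
        bool(record.get("user_id"))
        and isinstance(amount, (int, float))
        and amount >= 0
    )


def drop_outliers(values, k=1.5):
    """Keep values inside the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if q1 - k * iqr <= v <= q3 + k * iqr]


# Hypothetical order records.
orders = [
    {"user_id": "u1", "amount": 10},
    {"user_id": "u2", "amount": 12},
    {"user_id": "u3", "amount": 11},
    {"user_id": "", "amount": 13},      # fails validation: no user id
    {"user_id": "u5", "amount": -5},    # fails validation: negative amount
    {"user_id": "u6", "amount": 13},
    {"user_id": "u7", "amount": 12},
    {"user_id": "u8", "amount": 11},
    {"user_id": "u9", "amount": 500},   # passes validation but is an outlier
]

valid_amounts = [order["amount"] for order in orders if is_valid(order)]
clean_amounts = drop_outliers(valid_amounts)
print(clean_amounts)
```

Validation removes the two malformed records, and the outlier filter then removes the 500 order, which is structurally valid but far outside the range of the other amounts. Whether such a value is an error or a genuine large order still has to be judged against the business.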

3. The correlation-causation fallacy

This is mistaking correlation for causation and ignoring a confounding variable that drives both. For example, someone finds a clear correlation between ice cream sales and the number of children who drown in rivers and streams, and so orders ice cream sales to be cut. In fact, both may simply occur in hot summer weather: the hotter it is, the more people buy ice cream, and the more people go swimming in the river.
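One way to test for a hidden common cause is to compute the correlation again after removing the effect of the suspected third variable (a partial correlation). This technique is not named in the text above; it is offered here as one possible check. The figures below are made up and deliberately constructed so that temperature explains the whole relationship.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Made-up weekly figures, constructed so that temperature drives both series.
temperature = [20, 25, 30, 35]
ice_cream_sales = [201, 249, 299, 351]
drownings = [21, 22, 33, 34]

raw = pearson(ice_cream_sales, drownings)
controlled = partial_correlation(ice_cream_sales, drownings, temperature)
print(f"raw correlation: {raw:.2f}")
print(f"correlation controlling for temperature: {abs(controlled):.2f}")
```

In this constructed data the raw correlation between ice cream sales and drownings is above 0.9, but it drops to essentially zero once temperature is controlled for. Real data will rarely be this clean, and a low partial correlation is evidence about one suspected variable only, not proof about causation.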

4. Simpson's Paradox

Put simply, when data from groups that differ greatly is pooled together, the side that comes out ahead within each group can end up behind in the overall comparison.
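A small numerical example makes the paradox concrete. The conversion counts below are made up for illustration: version A beats version B within each segment, yet loses overall because the two versions received very different mixes of traffic.

```python
def rate(conversions, visitors):
    """Conversion rate as a fraction."""
    return conversions / visitors


# Illustrative (made-up) counts: (conversions, visitors) per segment.
data = {
    "A": {"mobile": (81, 87), "desktop": (192, 263)},
    "B": {"mobile": (234, 270), "desktop": (55, 80)},
}

for segment in ("mobile", "desktop"):
    a = rate(*data["A"][segment])
    b = rate(*data["B"][segment])
    print(f"{segment}: A={a:.1%}  B={b:.1%}")

# Pool the segments: add up conversions and visitors for each version.
overall = {
    name: rate(
        sum(conv for conv, _ in segments.values()),
        sum(vis for _, vis in segments.values()),
    )
    for name, segments in data.items()
}
print(f"overall: A={overall['A']:.1%}  B={overall['B']:.1%}")
```

A wins on mobile (about 93% versus 87%) and on desktop (about 73% versus 69%), but loses overall (78% versus about 83%), because most of A's traffic came from the lower-converting desktop segment. When group sizes differ this much, compare within segments before trusting the pooled figure.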

5. Personal Cognitive Fallacy

Treating subjective assumptions as fact, experience as fact, an individual case as the whole, a single characteristic as the whole picture, and what one happens to see as fact.

Take an example of subjective assumption: the conversion rate from page A to page B of a product is 30%, which is immediately judged to be very low, and it is assumed that it could be raised to 75%. In reality, the conversion rate for similar products or for this kind of user decision page is only about that high, so a wrong conclusion has been drawn.

Standards are crucial: data + standards = judgment. Only with a judgment can in-depth analysis proceed. Find the standard through group comparison (quadrant method, multi-dimensional method, 80/20 method, comparison method); once there is a standard, find the "good" and "bad" points through analysis and comparison.

Statistical laws and theories are not wrong; it is the people using them who make mistakes. So when we analyze data, we must be extra careful: wrong data under the cloak of science is hard to recognize.

Origin blog.csdn.net/xljlckjolksl/article/details/131619467