Data analysis - examples of five common methods

1. Comparison

Put simply, a single piece of data viewed in isolation tells you nothing. You have to compare it with another piece of data before it means anything. For example, see Figure a and Figure b below.

Figure a, on its own, tells you nothing.

In Figure b, once today's trading volume is set against yesterday's, you can see that the two days differ greatly.

Comparison is the most basic idea and also the most important one. It is widely used in practice, for example in selecting products and monitoring growth; both are acts of comparison. If the data handed to decision-makers stands alone and cannot be compared with anything, they cannot make a judgment, which amounts to reading no useful information from the data.
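A minimal sketch of the comparison idea, assuming two hypothetical daily trading volumes (the numbers and the `percent_change` helper are illustrative, not taken from the figures):

```python
def percent_change(current, previous):
    """Return the relative change of current versus previous, in percent."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous * 100.0


# Hypothetical trading volumes, for illustration only.
yesterday_volume = 1200
today_volume = 600

change = percent_change(today_volume, yesterday_volume)
print(f"Today vs. yesterday: {change:+.1f}%")
```

The single number `today_volume` says little by itself; the signed percentage is what makes the drop visible.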

2. Split

Taken literally, the word "analysis" means to split apart and examine. Splitting alone is not analysis, but analysis includes splitting. Splitting helps us find the cause of a problem, which is essentially its whole purpose.

Going back to the first idea, comparison: when a dimension can be compared, we compare it. When the comparison reveals a problem whose cause needs to be found, or when there is nothing to compare against at all, it is time to split.

For example:

XX, a colleague in the operations team, compares the transaction data and finds that today's sales are only 50% of yesterday's. At this point, further comparison of the sales figure itself tells us nothing more. Instead, the sales dimension needs to be decomposed into its component indicators.

Sales = number of transacting users * unit price per customer, and number of transacting users = number of visitors * conversion rate.

As shown in Figure c and Figure d:

Figure c breaks an indicator down by its formula.

Figure d is a simple decomposition of the components of traffic (it can also be broken down much more finely).

The results after splitting are much clearer than before, which makes it easier to analyze and spot details. Splitting is therefore an essential way of thinking for analysts.
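The formula split described in this section can be sketched as follows. The figures for visitors, conversion rate, and unit price are hypothetical, and `decompose` is an illustrative helper; the point is only that comparing each factor separately shows which one caused the 50% drop.

```python
def decompose(visitors, conversion_rate, unit_price):
    """Split sales into the factors: visitors * conversion rate * unit price."""
    buyers = visitors * conversion_rate
    return {
        "visitors": visitors,
        "conversion_rate": conversion_rate,
        "buyers": buyers,
        "unit_price": unit_price,
        "sales": buyers * unit_price,
    }


# Hypothetical store data, for illustration only.
yesterday = decompose(visitors=1000, conversion_rate=0.04, unit_price=50.0)
today = decompose(visitors=500, conversion_rate=0.04, unit_price=50.0)

# Compare each factor separately to see which one moved.
ratios = {name: today[name] / yesterday[name] for name in today}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0%} of yesterday")
```

In this made-up case the conversion rate and unit price are unchanged, so the drop in sales traces back to the number of visitors, which could then be split further by traffic source.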

3. Dimensionality reduction

When data has so many dimensions that we do not know where to start, it is impossible to analyze every one of them. Some indicators are related to each other, so we can pick out the representative dimensions and work with those. See the following table:

With so many dimensions, there is no need to analyze each one. We know that number of transacting users / number of visitors = conversion rate. When a dimension can be calculated from two others, we can drop it.

Of the number of transacting users, the number of visitors, and the conversion rate, you only need to keep two. Likewise, number of transacting users * unit price per customer = sales, so you can keep any two of those three.

We generally care only about data that is useful to us. Dimensions that are irrelevant to the analysis can also be filtered out, which is another way to reduce dimensionality.
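As a rough sketch of this kind of reduction, the snippet below drops the columns that can be recomputed from the others and shows that nothing is lost. The row values and the names `DERIVED`, `reduce_dimensions`, and `restore` are assumptions made for illustration.

```python
# One hypothetical row of store data, for illustration only.
row = {
    "visitors": 1000,
    "buyers": 40,
    "conversion_rate": 0.04,  # = buyers / visitors
    "unit_price": 50.0,
    "sales": 2000.0,          # = buyers * unit_price
}

# Columns that can be recomputed from the remaining ones.
DERIVED = {"conversion_rate", "sales"}


def reduce_dimensions(record, drop=DERIVED):
    """Keep only the columns that are not derivable from the others."""
    return {key: value for key, value in record.items() if key not in drop}


def restore(record):
    """Recompute the dropped columns from the ones that were kept."""
    full = dict(record)
    full["conversion_rate"] = record["buyers"] / record["visitors"]
    full["sales"] = record["buyers"] * record["unit_price"]
    return full


reduced = reduce_dimensions(row)
print(sorted(reduced))
print(restore(reduced))
```

Which two of the three related columns to keep is a judgment call; here the raw counts are kept and the ratio is dropped.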

4. Dimension increase

Dimension increase is the counterpart of dimensionality reduction. When the existing dimensions cannot explain the problem well, we perform a calculation on the data and add a new indicator. See the figure below.

Here we have a search index and a number of categories. One of these indicators represents demand and the other represents competition. Many people compute search index / number of categories = multiple, and use the multiple to represent how competitive a keyword is. This is adding a dimension, and the added dimension is called an auxiliary column.

Both increasing and reducing dimensions require a full understanding of what the data means, so that the data can be transformed purposefully to make analysis easier.
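The auxiliary column described above can be sketched like this. The keywords and their numbers are invented for illustration, and `add_multiple` is a hypothetical helper name.

```python
# Hypothetical keyword data, for illustration only.
keywords = [
    {"keyword": "keyword A", "search_index": 12000, "category_count": 300},
    {"keyword": "keyword B", "search_index": 8000, "category_count": 50},
]


def add_multiple(record):
    """Return a copy of the record with the auxiliary column 'multiple'."""
    enriched = dict(record)
    enriched["multiple"] = record["search_index"] / record["category_count"]
    return enriched


enriched = [add_multiple(record) for record in keywords]

# Sort by the new column so the keywords can be compared on one number.
ranked = sorted(enriched, key=lambda record: record["multiple"], reverse=True)
for record in ranked:
    print(record["keyword"], record["multiple"])
```

Neither original column ranks the keywords the same way the combined one does: keyword A has the higher search index, but keyword B has the higher multiple.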

5. Hypothesis

When we are unsure about the future, or simply at a loss, we can use hypotheses. "Hypothesis" is a technical term in statistics; in everyday language it just means an assumption. When we do not know the outcome, or there are several options to choose from, we call on a hypothesis: first assume a result, then reason backward.

Reasoning from result to cause, we ask what causes must be in place to produce this result. It is a bit like tracing something back to its roots. (For example, you could analyze the ultimate goal of your relationship with your girlfriend this way.) We can then see how many of those causes are already satisfied and how many are still missing. When there are several options, the same method helps find the best path, that is, the best course of action.

The power of hypotheses goes beyond that. A hypothesis can roam as freely as the imagination: besides the result, the process can be hypothesized as well.

When we return to the purpose of data analysis, we see that only once the problem and the requirements are clear can we choose an analysis method.
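One way to picture reasoning backward from an assumed result is to reuse the sales formula from the Split section: assume a sales figure, then work out how many visitors it would require. This is only a sketch; the target, the current visitor count, and the `required_visitors` helper are all hypothetical.

```python
def required_visitors(target_sales, conversion_rate, unit_price):
    """Work backward from an assumed sales result to the visitors it needs."""
    if conversion_rate <= 0 or unit_price <= 0:
        raise ValueError("conversion_rate and unit_price must be positive")
    buyers_needed = target_sales / unit_price
    return buyers_needed / conversion_rate


# Hypothetical figures, for illustration only.
target_sales = 10_000.0   # the assumed result
current_visitors = 3_000  # the cause that is already satisfied

needed = required_visitors(target_sales, conversion_rate=0.04, unit_price=50.0)
gap = needed - current_visitors
print(f"Visitors needed: {needed:.0f}, still missing: {gap:.0f}")
```

The gap is the part of the cause that is not yet satisfied. Repeating the calculation with different assumed conversion rates or prices is one way to compare several options.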

Three major data types: Strictly speaking, this borrows the term "data type" loosely. It is really a subdivision of time series, not a true data type, but it comes up often when processing sales data. Placing the data on a time axis divides it into past, present, and future.

The first major data type: the past

Past data is historical data, that is, data about what has already happened.

Function: used to summarize, compare and refine knowledge

For example: historical store operation data, refund data, order data

The second major data type: the present

The concept of "now" is relatively vague. Today, this month, and this year can all count as current data, depending on the time unit. If the unit is days, then today's data is the current data. Only by comparing current data with past data can you know where you stand now. Current data alone is of little use.

Function: used to understand the current situation and discover problems

For example: store data of the day

The third major data type: the future

Future data is data that has not yet occurred and is obtained through prediction. For example, when we do planning or budgeting, the time in question has not arrived, yet we already have figures for it. These figures serve as a reference. Predictions are never 100% accurate, and there is always some discrepancy.

Function: used for prediction

For example: store planning, sales plan

The three types of data flow in one direction: the future eventually becomes the present, and the present becomes the past.

For example, placing the data on a time axis and dividing it by period makes the role of each piece of data clear.
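A small sketch of dividing data by time period, assuming the time unit is one day. The reference date and the records are hypothetical, and `classify` is an illustrative helper.

```python
from datetime import date


def classify(record_date, today):
    """Label a record as past, present, or future relative to today."""
    if record_date < today:
        return "past"
    if record_date == today:
        return "present"
    return "future"


# Fixed reference date and records, for illustration only.
today = date(2023, 9, 21)
records = [
    ("order data", date(2023, 9, 20)),
    ("store data of the day", date(2023, 9, 21)),
    ("sales plan", date(2023, 9, 30)),
]

labels = {name: classify(record_date, today) for name, record_date in records}
for name, label in labels.items():
    print(f"{name}: {label}")
```

With a different time unit, such as months, the comparison would be made on the year and month instead of the full date.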


Origin blog.csdn.net/u014156887/article/details/133137644