How much Excel do you need to know to do data analysis?

When it comes to data analysis, many people naturally think of Excel, SQL, Python and other tools. That list has sent plenty of friends into a sea of books they can't climb out of. They often ask: how much do I need to learn before it counts as understanding a tool?

Let's start today with the simplest one: Excel.

1. What does Excel do?

Excel is not a dedicated data analysis tool. It is a basic tool for working with tables. Beyond simple table making, Excel offers table-based calculations, charting, and statistical and operations-research formulas. It can also be used without writing any code, so the barrier to entry is extremely low.

2. What is the use of Excel for data analysis?

First ask: What is the job of data analysis?

In short, there are 5 parts:

  • Store raw data
  • Extract data on request
  • Calculate data as required
  • Chart the data
  • Interpret the data and draw conclusions

All five parts can be done in Excel. And because Excel is so easy to operate, it is a very suitable starter tool for newcomers to data.

So, what does it take to understand Excel?

3. What does it mean to understand Excel?

In theory, nobody can understand 100% of Excel's features. Books that cover nothing but Excel functions run to hundreds of pages. Nor is there any need to understand all of it. As a starter tool for newcomers to data, the key to understanding Excel is to master four basic operations.

Operation 1: Build the table. People without data analysis habits lay data out in Excel exactly the way it will finally be presented. That layout is too rigid to reuse, so it often leads to pulling the same data again and again. People with data analysis habits separate dimensions from indicators, which makes calculations easy and keeps data work efficient.
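To make the contrast concrete, here is a minimal Python sketch. The region names, months and sales figures are invented for illustration, and `to_long` is a hypothetical helper, not an Excel feature. It takes a presentation-style table (one column per month) and rewrites it as one row per record, with the dimensions and the indicator in separate columns.

```python
# A "presentation-style" table: one row per region, one column per month.
# All names and numbers are invented for illustration.
wide_table = [
    {"region": "North", "Jan": 120, "Feb": 135},
    {"region": "South", "Jan": 90, "Feb": 80},
]


def to_long(rows, dimension, value_name):
    """Reshape so each row holds the dimensions (region, month) plus one indicator."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == dimension:
                continue  # the dimension column is carried over, not unpivoted
            long_rows.append(
                {dimension: row[dimension], "month": column, value_name: value}
            )
    return long_rows


long_table = to_long(wide_table, "region", "sales")
for record in long_table:
    print(record)
```

The long layout is the one that can be filtered, summed, or fed into a pivot table without being rebuilt each time the question changes.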

Operation 2: Distinguish indicator calculation from summary calculation. People with data analysis habits know exactly, while working in Excel, whether they are calculating a new indicator or summarizing an existing indicator across the data. This habit keeps the logical relationships between indicators clear during calculation. It is less error-prone, and it also allows further analysis based on the newly derived classification labels and indicators.
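A small sketch of why the distinction matters, using made-up store data. Average order value is a new indicator derived row by row, while total revenue and total orders are summaries of existing indicators. The overall ratio has to be rebuilt from the summed base indicators, not averaged from the per-row ratios.

```python
# Hypothetical order data: one dimension (store) plus two base indicators.
rows = [
    {"store": "A", "revenue": 1000.0, "orders": 50},
    {"store": "B", "revenue": 300.0, "orders": 5},
]

# Step 1: derive a NEW indicator row by row (average order value).
for row in rows:
    row["avg_order_value"] = row["revenue"] / row["orders"]

# Step 2: SUMMARIZE the existing indicators across all rows.
total_revenue = sum(row["revenue"] for row in rows)
total_orders = sum(row["orders"] for row in rows)

# The overall ratio comes from the summed base indicators...
overall_aov = total_revenue / total_orders
# ...whereas averaging the per-row ratios answers a different question.
naive_mean_of_ratios = sum(row["avg_order_value"] for row in rows) / len(rows)

print(overall_aov, naive_mean_of_ratios)
```

With these numbers the two results differ widely (roughly 23.6 versus 40), which is the kind of error this habit prevents.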

Operation 3: Use pivot tables fluently. The pivot table is the core feature of data analysis in Excel. Mastering it not only speeds up calculation but also helps you understand the concept of data aggregation.
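The aggregation idea behind a pivot table can be sketched in a few lines of Python. This is only an illustration of the concept with invented records; `pivot_sum` is a hypothetical helper, and it covers just the "sum" case with one row field and one column field.

```python
from collections import defaultdict

# Invented sales records: two dimensions (region, product) and one indicator (sales).
records = [
    {"region": "North", "product": "Tea", "sales": 100},
    {"region": "North", "product": "Coffee", "sales": 150},
    {"region": "South", "product": "Tea", "sales": 80},
    {"region": "North", "product": "Tea", "sales": 20},
]


def pivot_sum(rows, row_key, col_key, value_key):
    """Sum value_key for every (row_key, col_key) combination, like a pivot table."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[row_key]][row[col_key]] += row[value_key]
    # Convert nested defaultdicts to plain dicts for easier printing and comparison.
    return {r: dict(cols) for r, cols in table.items()}


pivot = pivot_sum(records, "region", "product", "sales")
print(pivot)
```

Rows play the role of the pivot table's row field, the inner keys the column field, and the summed number the value field. The same idea reappears later as GROUP BY in SQL.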

Operation 4: Choose the right chart for the requirement. Making a chart is easy; choosing one that fits the actual need is what is expected of a data analyst.

Once you have mastered these four operations, you already stand apart, as a data newcomer, from finance and HR staff. By then you no longer shuffle data around horizontally and vertically; you process it according to a definite logic and standard. You will meet these same four operations later when learning SQL, Python and other tools, and you can understand them by comparing them with how they work in Excel.

By contrast, Excel has some seemingly complicated features, such as statistical and financial functions, the linear programming toolbox, statistical testing tools, and linear regression models. The principles behind them are complex, but operating them is just a matter of applying a set of formulas, and you can learn them when an actual need arises. There is no need to pursue complexity for its own sake.

4. From understanding Excel to data analysis

Simply knowing Excel is not enough for data analysis, because real-world data analysis questions are often colloquial and casual, such as:

  • "Xiao Xiongmei, how have sales been recently?"
  • "Xiao Xiongmei, why did this event perform worse than the last one?"
  • "Xiao Xiongmei, what is the performance forecast for next month?"

There is no button in Excel that produces the answers to these questions with one click. We need to translate the questions ourselves.

There are three common needs:

Requirement 1: Provide data. For example, "Xiao Xiongmei, how have sales been recently?" Here we should first confirm the classification dimensions and indicators of the statistics. Then mock up an example in Excel, have the requester confirm that the format is right, and only then extract the data from the database.

Requirement 2: Comparative analysis. For example, "Xiao Xiongmei, why did this event perform worse than the last one?" Here the following three questions should be settled first, then the Excel example prepared, and then the corresponding data extracted:

  • Who is being compared with whom
  • Which indicators to compare
  • Whether to break the comparison down by category
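Once those three questions are answered, the comparison itself is mechanical. A minimal sketch under assumed answers: this event versus the last event, conversion rate as the indicator, and channel as the breakdown. All figures and the helper names `conversion_rate` and `compare` are invented for illustration.

```python
# Hypothetical results for two events, broken down by channel.
this_event = {
    "app": {"orders": 120, "visitors": 2000},
    "web": {"orders": 30, "visitors": 1500},
}
last_event = {
    "app": {"orders": 110, "visitors": 1800},
    "web": {"orders": 90, "visitors": 1600},
}


def conversion_rate(stats):
    """Indicator being compared: orders divided by visitors."""
    return stats["orders"] / stats["visitors"]


def compare(current, baseline):
    """Per-channel change in conversion rate (current minus baseline)."""
    return {
        channel: conversion_rate(current[channel]) - conversion_rate(baseline[channel])
        for channel in current
        if channel in baseline
    }


diff = compare(this_event, last_event)
for channel, change in diff.items():
    print(f"{channel}: {change:+.2%}")
```

In this made-up data the drop is concentrated in the web channel, which is exactly what a breakdown by category is meant to reveal.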

Requirement 3: Predictive analysis. For example, "Xiao Xiongmei, what is the performance forecast for next month?" Here the business side needs to choose between forecasting with a business model and forecasting with an algorithmic model. If a business model is used, the business logic, assumptions and other elements need to be laid out in Excel in advance.

If an algorithmic model is used, a simple time series or regression model can be calculated directly in Excel.

Of course, for complex models, use SPSS or Python; don't make things hard for Excel.

After this translation, Excel operations can be applied to data analysis work.

However, the above are just simple examples, and real-world data analysis problems may be more complicated. That makes it even more important to lay out a good template in Excel from the start, so that the business side can see what the output will look like and confirm that it meets the requirements. This avoids burying your head in the work, writing code for several days, and then getting blasted with: "What is this analysis even about!" That would be too miserable. As a bridge between business and technical staff, Excel really is handy.
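For a sense of what such a simple model computes, here is a least-squares trend line written out in plain Python. It is a sketch with invented monthly figures, comparable in spirit to what Excel's SLOPE and INTERCEPT functions return, not a recommendation of a specific forecasting method.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


months = [1, 2, 3, 4, 5, 6]
sales = [100, 110, 120, 130, 140, 150]  # invented, perfectly linear on purpose

slope, intercept = fit_line(months, sales)
forecast = slope * 7 + intercept  # extrapolate the trend to month 7
print(round(forecast, 2))
```

Real sales data will not be this tidy; seasonality and one-off events are the usual reasons a straight trend line is not enough.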

5. Why Excel has not been replaced

As data volumes grow, storing raw data is generally handed over to a database. Extraction and calculation can also be done with tools such as SQL and Python. So Excel's usefulness is greatly reduced today. But this still cannot shake its position, because it has three additional advantages:

Advantage 1: It can handle all kinds of scattered data. Examples include questionnaires, data exported from the back ends of official accounts, Tmall, and Douyin, and data attached to external PPTs. These are hard to load into a database but can be processed uniformly in Excel. For this you need to go a step further and master Excel's common data-cleaning functions, such as CONCAT (join strings), REPLACE (replace content), VLOOKUP (match fields), and so on.

Advantage 2: It lets business departments do their own calculations. Business teams often like to run their own numbers. In that case, we can prepare the classification dimensions and indicators they want, then teach them to use a pivot table to calculate whatever they like. That saves them from coming back to us to recount again and again.

Advantage 3: Finished charts can be dropped straight into a PPT, which makes writing reports easy. I won't go into all of that here.
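For readers who move on to Python later, the same three cleaning steps look roughly like this. The rows, SKU codes and lookup table are invented, and the Excel comparisons in the comments are loose analogies, not exact equivalents.

```python
# Hypothetical rows exported from a store back end.
orders = [
    {"order_id": "1001", "sku": " tea-01 ", "amount": "1,200"},
    {"order_id": "1002", "sku": "cof-02", "amount": "350"},
]

# Hypothetical lookup table mapping SKU code to product name.
sku_names = {"tea-01": "Green Tea", "cof-02": "Coffee Beans"}

cleaned = []
for row in orders:
    sku = row["sku"].strip()  # drop stray spaces, as Excel's TRIM would
    cleaned.append(
        {
            # Join strings into one key (similar to CONCAT).
            "key": row["order_id"] + "-" + sku,
            # Replace content, then convert text to a number (similar to SUBSTITUTE).
            "amount": int(row["amount"].replace(",", "")),
            # Match a field from another table (similar to an exact-match VLOOKUP).
            "product": sku_names.get(sku, "UNKNOWN"),
        }
    )

print(cleaned)
```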

Origin blog.csdn.net/xljlckjolksl/article/details/132252060