This section covers:
- Excel Functions
- Excel Pivot Table
- Excel Visualization
- Power BI dashboard
Excel is a useful helper for data analysts: it can be used to build dashboards, connect to databases, run statistical analyses, and even fit simple machine learning models. Even in the era of big data, Excel remains one of a data analyst's important tools.
Why should data analysts learn Excel?
- Excel is one of the most widely used data analysis tools, which makes communication across departments easier
- It is easy to use and requires no programming
- It is powerful and covers the whole data analysis process
Excel can cover all six stages of the data analysis process:
- Data collection
- Data cleansing
- Data transformation
- Data exploration
- Statistical analysis and modeling
- Presenting the analysis
Data exploration: building a comprehensive understanding of the information the data conveys and finding entry points for deeper analysis. Common techniques include charting (visualization) and statistical calculation.
Excel vs. R: both can cover all six stages of the data analysis process. In practice, Excel is usually used for only some of the steps, or for relatively simple analyses of small, clean data sets. Programming languages such as R and Python serve as the primary tools when the data or the analysis process is more complex and the analysis needs to go deeper. In terms of data size, Excel suits data on the order of 10,000 rows, R suits data under 4 GB, and anything larger calls for big data tools.
Work scenarios that suit Excel:
- Simple retrieval: Excel functions (e.g. quickly finding a value with particular characteristics)
- Intuitive distribution: visualization (e.g. a pie chart showing the share of each part)
- Dynamic presentation: pivot tables, which generate statistical summaries from a table
A quick warm-up:
- test1: quickly find the sales target of the business unit whose market is China, sales channel is online chat, and product type is CRM & ERP (multi-condition filtering)
- test2: extract the geographic information from an ID (LEFT)
- test3: calculate the maximum sales target (MAX)
- test4: calculate the total sales target for June 2019, counting only the units (rows) whose sales target is greater than ¥10,000 (SUMIF(range, ">10000"))
- test5: calculate the standard deviation of the sales targets in the table (STDEV)
- test6: use the IF and AND functions to check whether the sales targets in the table fall between 10,000 and 20,000; return TRUE if they do and FALSE if they do not
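The logic behind test3 through test6 can be sketched outside Excel as well. The snippet below is only an illustration in Python: the `targets` list is made-up data, not the workbook used in this series, and each line mirrors what the corresponding Excel function computes.

```python
from statistics import stdev

# Hypothetical sales targets; these numbers are not from the original workbook.
targets = [8000, 12000, 15000, 21000, 9500]

# MAX(range): the largest sales target.
largest = max(targets)

# SUMIF(range, ">10000"): add up only the targets above 10,000.
total_over_10k = sum(t for t in targets if t > 10000)

# STDEV(range): sample standard deviation (divides by n - 1).
spread = stdev(targets)

# IF(AND(x >= 10000, x <= 20000), TRUE, FALSE), evaluated row by row.
in_band = [10000 <= t <= 20000 for t in targets]

print(largest, total_over_10k, in_band)
```

In Excel, the same row-by-row check would be entered once and then dragged down the column.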
Basic knowledge of Excel functions:
- A simple formula example: =1+1 (select a cell, type the formula in the formula bar, starting with an equals sign, and the cell shows the calculated value)
- Use other cells as the arguments of a function
- Copy a function to other cells by dragging the fill handle
- "$" marks an absolute cell reference (toggle with F4)
- ":" specifies a range of cells as the input of a function
Data processing functions:
The data analysis process has six steps: data collection, data cleansing, data transformation, data exploration, statistical analysis and modeling, and presenting the analysis. The functions commonly used in the data cleansing and data transformation stages are referred to here collectively as "data processing" functions.
Commonly used data processing functions:
Function name | Description |
---|---|
LEFT/RIGHT | Extracts one or more characters from the left / right side of a text string |
CONCATENATE | Joins two or more text strings into one |
LEN | Returns the length of a text string |
TRIM | Removes all spaces from text except single spaces between words |
REPLACE | Replaces part of a text string with different text |
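For readers who think in code, here is a rough Python sketch of what each function in the table does. The ID format `CN-000123` and the helper names `trim` and `replace` are assumptions made up for this illustration; the argument order of `replace` follows Excel's REPLACE(old_text, start_num, num_chars, new_text).

```python
# Hypothetical ID whose first two characters encode the market.
unit_id = "CN-000123"

left = unit_id[:2]             # LEFT(text, 2)
right = unit_id[-6:]           # RIGHT(text, 6)
joined = "CRM" + " & " + "ERP"  # CONCATENATE("CRM", " & ", "ERP")
length = len(unit_id)          # LEN(text)


def trim(text):
    """Like TRIM: drop leading/trailing spaces, collapse inner runs to one."""
    return " ".join(word for word in text.split(" ") if word)


def replace(old_text, start_num, num_chars, new_text):
    """Like REPLACE: start_num is 1-based, as in Excel."""
    i = start_num - 1
    return old_text[:i] + new_text + old_text[i + num_chars:]


trimmed = trim("  online   chat ")
replaced = replace(unit_id, 1, 2, "US")
print(left, right, joined, length, trimmed, replaced)
```

Note that Excel counts character positions from 1, while Python slices count from 0, which is why `replace` subtracts one.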
How to find the Excel function you need:
Method one: browse the Excel function list (by category)
Method two: use a search engine to find and learn the Excel functions you need
Data analysis functions:
Excel functions often used in the data exploration and statistical analysis and modeling stages are referred to here as "data analysis" functions.
SUBTOTAL (works with filtered data) / IF / SUMIF
Excel functions (alphabetical)
Steps for using VLOOKUP:
- Create the lookup (query) table
- Choose the VLOOKUP function
- Select the function arguments
- Copy the function to every corresponding position in the lookup table
Note: in the source table, the column holding the values you want to return must be to the right of the column you look up by; otherwise VLOOKUP cannot find them.
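A minimal sketch of the exact-match lookup that VLOOKUP performs, written in Python for illustration. The table contents and the helper name `vlookup` are hypothetical; the sketch assumes the lookup key sits in the first column and the values to return sit to its right, as described in the note above.

```python
# Hypothetical source table: key column first, values to its right.
source_rows = [
    ("BU001", "China", 12000),
    ("BU002", "Brazil", 8000),
    ("BU003", "Germany", 15000),
]


def vlookup(lookup_value, table, col_index):
    """Exact match, like VLOOKUP(value, table, col_index, FALSE).

    col_index is 1-based. Returns None where Excel would show #N/A.
    """
    for row in table:
        if row[0] == lookup_value:  # only the first column is searched
            return row[col_index - 1]
    return None


# The lookup table: one formula per ID, copied down the column.
query_ids = ["BU003", "BU001", "BU999"]
results = [vlookup(q, source_rows, 3) for q in query_ids]
print(results)
```

Because only the first column is searched, a value stored to the left of the key column can never be returned, which is the limitation the note warns about.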
PivotTable:
What is a PivotTable (Pivot Table)?
A PivotTable is a one-stop means of reviewing and summarizing the information in a table: starting from a table with a large amount of information, it generates a table of statistical summaries. The "statistical summary" may be a sum, an average, or another statistic.
Scenarios and requirements:
- Present, in one place, the distribution of sales targets across all regions and markets
- Flexibly query the sales target for each specific sales channel and product type
Common steps:
- 1. Insert a PivotTable
- 2. Select the PivotTable fields
- 3. Select the calculation type for the table values
- 4. Select how the values are displayed ("Show Values As")
- 5. Tidy up the generated table so that it is easier to understand
Note: steps 4 and 5 depend on business needs.
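The steps above can be sketched in Python to show what a PivotTable computes behind the scenes. The records and field names here are hypothetical; the sketch assumes region as the row field, sales channel as the column field, sum as the calculation type, and "% of grand total" as the value display.

```python
from collections import defaultdict

# Hypothetical records: (region, sales channel, sales target).
records = [
    ("Asia", "online chat", 12000),
    ("Asia", "phone", 8000),
    ("Europe", "online chat", 15000),
    ("Europe", "phone", 5000),
    ("Asia", "online chat", 3000),
]

# Steps 2 and 3: rows = region, columns = channel, calculation type = sum.
pivot = defaultdict(lambda: defaultdict(int))
for region, channel, target in records:
    pivot[region][channel] += target

# Step 4: show each cell as a share of the grand total.
grand_total = sum(target for _, _, target in records)
share = {
    region: {channel: value / grand_total for channel, value in cols.items()}
    for region, cols in pivot.items()
}

for region, cols in pivot.items():
    print(region, dict(cols))
```

Swapping the grouping keys corresponds to dragging a different field into the row or column area in Excel.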
Data visualization:
What is data visualization? Using graphical means to convey the information behind the data clearly and effectively.
Why visualize data?
- It helps you understand the data comprehensively and from multiple angles
- It helps you explain the results of the analysis to people who are not familiar with the data
- It makes patterns and trends in the data easier to detect, which guides further research
Differences between Excel, Power BI, and R's ggplot2 for visualization
Charting with Excel:
This part introduces five commonly used chart types (column charts, line charts, pie charts, scatter plots, and histograms) and the different scenarios each one suits.
1. Column Chart
Small tasks:
- 1. Distribution of sales targets in each area
- 2. Changes in sales volume target in each month
The effect is as follows:
2. Line chart
Suitable scenario: finding trends
Recommended input data: a "value matrix" with row and column names (similar to the column chart)
Small tasks:
- 1. Distribution of sales targets in each area
- 2. The change in the sales goal each month
3. Pie chart
Suitable scenario: showing proportions
Recommended input data: a two-column table, with one column of category labels and one column of numeric values
Small task: the distribution of sales targets in each market
4. Scatter plot
Suitable scenario: showing the relationship between two numeric variables
Recommended input data: a table with two numeric columns
Small task: the relationship between sales and sales targets across different markets
5. Histogram
Suitable scenario: measuring how frequently values occur in a data set
Recommended input data: a numeric column
Small task: the distribution of sales across different markets
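What a histogram measures can be sketched in a few lines of Python: each value is assigned to a bin, and the chart shows how many values land in each bin. The sales figures and the bin width of 5,000 are made up for this illustration.

```python
from collections import Counter

# Hypothetical sales figures (one numeric column).
sales = [8200, 9100, 12500, 13400, 13900, 17800, 21000, 9900]
bin_width = 5000  # assumed bin size


def bin_start(value, width):
    """Lower edge of the bin that contains value."""
    return (value // width) * width


# Frequency of values per bin, which is what the bar heights represent.
counts = Counter(bin_start(v, bin_width) for v in sales)

for start in sorted(counts):
    print(f"{start}-{start + bin_width - 1}: {'#' * counts[start]}")
```

Unlike a column chart, the horizontal axis here is a numeric range rather than a set of category labels.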
Comparing the characteristics of some chart types
Switching axes:
Horizontal (category) axis labels: the values along the horizontal axis, for example 201901 through 201906.
Legend entries (series): the categories compared at each horizontal-axis value. For example, for every month we look at several categories: North America, Latin America, Europe & Middle East, and Asia.
Y values: the numeric value of each of those categories.
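Conceptually, switching the axes swaps the roles of the category labels and the legend entries, which amounts to transposing the underlying value matrix. The Python sketch below illustrates this under that assumption; the numbers and the helper name `switch_row_column` are hypothetical.

```python
# Before switching: months on the category axis, regions as series.
months = ["201901", "201902", "201903"]
series = {  # hypothetical values, one list entry per month
    "North America": [10, 12, 11],
    "Asia": [7, 9, 8],
}


def switch_row_column(categories, series):
    """Swap category labels and legend entries (transpose the values)."""
    new_categories = list(series)
    new_series = {
        category: [values[i] for values in series.values()]
        for i, category in enumerate(categories)
    }
    return new_categories, new_series


new_categories, new_series = switch_row_column(months, series)
print(new_categories)
print(new_series)
```

After switching, each month becomes a legend entry and each region becomes a position on the horizontal axis, while the Y values themselves are unchanged.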
Before switching:
After switching:
Power BI charting basics:
Data reports vs. dashboards
Report: a static document containing text, tabular data, and a small number of graphics. Features: delivered periodically to different stakeholders; not real-time; focused on narrative text.
Dashboard: a data visualization tool that can be personalized to display specific indicators and KPIs. Features: usually updated in real time, so stakeholders can view it live; focused on presenting complete information visually.
Power BI features:
- Interactive graphics: information-rich and engaging
- Easy to get started: compared with building interactive charts through programming
- Facilitates teamwork: publish to the cloud and share with the team in real time
- Automatic updates: a dashboard connected to real-time data refreshes automatically
Business need: present the performance targets of 500 different business units over time
Finished work:
https://app.powerbi.com/view?r=eyJrIjoiMDk0NTc2MWMtNjhjYy00ZTJjLTlkNmEtYjNkNjFiYWZkZWFjIiwidCI6ImE0NmQwMTZhLTA1NTQtNGE0Yy05OTM5LTgxMWQwM2U0Yzk1YyIsImMiOjEwfQ%3D%3D
Recommended Learning Resources:
A minimalist guide to using Power BI
Tutorial: Getting started with the Power BI service
Dashboards for Power BI service users
Recommended books:
Next update in the Excel / Power BI series: ① the dashboard development process ② the INDEX + MATCH functions