1. Data analysis step 1: Ask a question
- Which business metrics do we need to analyze?
2. Data analysis step 2: Understand the data
- Getting familiar with the Excel interface
- What does each field in Excel mean?
- What data types does Excel have?
- Basic data operations
3. Data analysis step 3: How to clean data with Excel
- How to use common Excel functions
- How to remove spaces from a data column
- How to standardize malformed values in the data source
- How to split cells
- How to handle time-format data
- How to sort and filter data
4. Data analysis step 4: How to obtain business metrics
- How to build a pivot table
- How to use VLOOKUP for data analysis
- How to use search engines to solve problems you run into
5. Hands-on project: analysis of recruitment website data
- Which cities have greater demand for data analyst jobs?
- How does the average data analyst salary compare across cities?
- What skills do data analyst jobs require?
First, the data analysis steps:
Ask a question, understand the data, clean the data, build the model, visualize the data.
Excel data types: string (text), numeric, and logical
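Data cleaning steps:
(1) Select a subset
Hide the data you do not need instead of deleting it. For example, select a column you do not need, right-click, and choose Hide.
To show hidden columns again: select the columns on either side of the hidden one, right-click, and choose Unhide.
(2) Rename columns
Change the column names so that they are clear
(3) Delete duplicate values
Select the data, then Data -> Remove Duplicates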
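For readers who want to see the logic behind removing duplicates, here is a minimal Python sketch. It assumes each row is a tuple and that the first occurrence of a row is the one to keep. The function name `remove_duplicates` and the sample rows are made up for illustration.

```python
def remove_duplicates(rows):
    """Keep the first occurrence of each row and preserve the original order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample rows: (job title, city)
rows = [
    ("Data Analyst", "Beijing"),
    ("Data Analyst", "Shanghai"),
    ("Data Analyst", "Beijing"),  # exact duplicate of the first row
]
deduped = remove_duplicates(rows)
```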
(4) Handle missing values
1) How to count how many values are missing?
2) How to locate all the missing values?
Home -> Find & Select -> Go To Special, then choose Blanks
3) How to fill in all the missing values manually in one go?
With all the blank cells selected, type the value into one cell and press Ctrl + Enter to fill them all
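The three missing-value questions above (count, locate, fill) can be sketched in Python as follows. This is only an illustration: it assumes a column is a plain list and that a missing value is either `None` or an empty string. The helper names and the sample column are hypothetical.

```python
def is_missing(value):
    """Treat None and empty strings as missing (an assumption of this sketch)."""
    return value is None or value == ""


def count_missing(column):
    """How many values are missing?"""
    return sum(1 for value in column if is_missing(value))


def locate_missing(column):
    """At which positions (0-based) are the missing values?"""
    return [i for i, value in enumerate(column) if is_missing(value)]


def fill_missing(column, fill_value):
    """Fill every missing value with the same value, like Ctrl + Enter."""
    return [fill_value if is_missing(value) else value for value in column]


# Hypothetical sample column
city = ["Beijing", "", "Shanghai", None, "Shenzhen"]
filled = fill_missing(city, "Unknown")
```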
(5) Make the data consistent
For example, put all dates into the same format
Splitting data into columns:
Select the data -> Data -> Text to Columns
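As a rough illustration of making dates consistent and splitting a cell, here is a small Python sketch. The list of accepted date formats, the delimiter, and the sample values are all assumptions chosen for the example. A real data set may need other formats.

```python
from datetime import datetime

# Date formats this sketch tries, in order (an assumption, not a complete list)
KNOWN_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%Y.%m.%d")


def normalize_date(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def split_cell(text, delimiter):
    """Split one cell into several parts, similar to Text to Columns."""
    return [part.strip() for part in text.split(delimiter)]
```

For example, `normalize_date("2019/3/5")` and `normalize_date("2019.03.05")` both give `"2019-03-05"`, and `split_cell("Beijing-Chaoyang", "-")` gives `["Beijing", "Chaoyang"]`.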
(6) Sort the data
(7) Handle outliers
Second, commonly used functions
AVERAGE
FIND: finds the starting position of one string inside another. FIND(string to find, cell containing the text to search)
FIND always starts searching from the specified position and returns the position of the first match it finds, regardless of whether more matches follow.
LEFT/RIGHT
MID(cell containing the string, starting position, number of characters to take)
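To make the behaviour of the text functions concrete, here is a minimal Python sketch of FIND, LEFT, RIGHT and MID. It is not how Excel implements them. The lowercase function names are hypothetical, and positions are 1-based to match Excel. When the text is not found, this sketch raises `ValueError`, whereas Excel shows an error value in the cell.

```python
def find(find_text, within_text, start_num=1):
    """1-based position of the first match at or after start_num."""
    # str.index raises ValueError when there is no match
    return within_text.index(find_text, start_num - 1) + 1


def left(text, num_chars):
    """First num_chars characters."""
    return text[:num_chars]


def right(text, num_chars):
    """Last num_chars characters."""
    return text[-num_chars:] if num_chars > 0 else ""


def mid(text, start_num, num_chars):
    """num_chars characters starting at the 1-based position start_num."""
    return text[start_num - 1:start_num - 1 + num_chars]
```

With the text `"7k-12k"`, `find("k", "7k-12k")` returns 2 even though a second `k` follows, while `find("k", "7k-12k", 3)` returns 6 because the search starts at position 3.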
For example, extracting the minimum and maximum salary from a value such as 7K-12k:
When cleaning the data, change the uppercase K to lowercase k
Check the minimum and maximum salary: use Data -> Filter and see whether there are abnormal values, for example values above 20k or a salary listed as Negotiable
Filter these out so the abnormal values can be handled separately.
The minimum salary is extracted with LEFT
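The salary cleaning described above can be sketched in Python like this. It assumes every normal value looks like `7k-12k` (in thousands) and treats anything else, such as Negotiable, as a value to review by hand. The 20k threshold comes from the example above. The function name, the constant, and the sample data are hypothetical.

```python
REVIEW_ABOVE_K = 20  # threshold taken from the 20k example; adjust as needed


def parse_salary(text):
    """Return (minimum, maximum) in thousands, or None if the text is malformed."""
    cleaned = text.strip().lower()  # uppercase K becomes lowercase k
    dash = cleaned.find("-")
    if dash == -1:
        return None  # for example "Negotiable"
    low = cleaned[:dash].strip().rstrip("k")       # the LEFT part
    high = cleaned[dash + 1:].strip().rstrip("k")  # the part after the dash
    if not (low.isdigit() and high.isdigit()):
        return None
    return int(low), int(high)


# Hypothetical sample data
salaries = ["7K -12k", "15k-25k", "Negotiable"]
parsed = [parse_salary(s) for s in salaries]

# Values to handle separately: malformed, or maximum above the threshold
needs_review = [
    s for s, p in zip(salaries, parsed)
    if p is None or p[1] > REVIEW_ABOVE_K
]
```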
Looking up values across multiple tables: the VLOOKUP function
Its arguments are: what to look for, where to look, which column to return, and whether to find an approximate or an exact match
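The four VLOOKUP arguments can be illustrated with a short Python sketch. It is a simplified model, not Excel's implementation. The table is a list of tuples, `None` stands in for Excel's error value, and approximate matching assumes the first column is sorted in ascending order. Note that this sketch defaults to an exact match, while Excel uses approximate matching when the fourth argument is omitted. The sample table is hypothetical.

```python
from bisect import bisect_right


def vlookup(lookup_value, table, col_index, approximate=False):
    """Look up lookup_value in the first column and return column col_index (1-based)."""
    if approximate:
        # Largest key that is <= lookup_value; requires a sorted first column
        keys = [row[0] for row in table]
        pos = bisect_right(keys, lookup_value)
        if pos == 0:
            return None  # lookup_value is smaller than every key
        return table[pos - 1][col_index - 1]
    for row in table:
        if row[0] == lookup_value:
            return row[col_index - 1]  # first exact match wins
    return None


# Hypothetical table: (minimum salary in k, salary band)
bands = [(0, "junior"), (10, "mid"), (20, "senior")]
```

Here `vlookup(10, bands, 2)` returns `"mid"`, `vlookup(15, bands, 2)` returns `None` because there is no exact match, and `vlookup(15, bands, 2, approximate=True)` returns `"mid"`.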
Third, common shortcuts
Quickly select a range: Ctrl + Shift + arrow keys