1 Data & Charts

Some rambling first: I majored in statistics, but I never really mastered it. I got little proper training at university and was frankly a poor student. So I decided to review the subject again now that I am working, using Jia Junping's "Statistics", 7th Edition, as the textbook. Weekly updates.
I do not follow the book's logical order or write down every topic. My notes cover the things that relate to my work (points that, on reflection, can be applied to data at work), and where I can I also note how I think they could be applied. Please point out anything I get wrong ~

 

The first week covers Chapter 3: Presenting Data with Charts.
1 Data audit: checking the data for errors (completeness and accuracy, including outliers).
2 The difference between proportion and ratio
A proportion is the share of each part of the data in the whole;
A ratio is the comparison between different categories of data.
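A small sketch of the two quantities in item 2, the share of each part in the whole versus the comparison between categories. The survey answers are made up for illustration.

```python
from collections import Counter

# Hypothetical survey answers; the categories are invented for this example.
answers = ["male", "female", "female", "male", "female"]

counts = Counter(answers)
total = sum(counts.values())

# Share of each category in the whole; the shares always sum to 1.
proportions = {category: n / total for category, n in counts.items()}

# Comparison of one category with another; not bounded by 1.
female_to_male = counts["female"] / counts["male"]

print(proportions)
print(female_to_male)
```

The shares add up to 1, while the category-to-category comparison can take any positive value.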
3 Data grouping
Single-value grouping: each distinct value forms its own group. Suitable for discrete variables with few distinct values.
Class-interval grouping: for continuous variables, or variables with many distinct values.
3.1 About class-interval grouping
Steps:
① Determine the number of groups, usually 5 to 15.
② Determine the class width of each group (the difference between the group's upper and lower limits). Class width = (maximum value - minimum value) / number of groups.
③ Build the frequency table from the grouping (group + frequency + relative frequency).
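The three steps can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: the function name `frequency_table` and the sample values are hypothetical, and the maximum value is clamped into the last group so that nothing is left out.

```python
def frequency_table(values, n_groups):
    """Group values into n_groups equal-width classes and count them."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical, nothing to group")
    # Step 2: class width = (maximum - minimum) / number of groups
    width = (hi - lo) / n_groups
    counts = [0] * n_groups
    for v in values:
        # Each class is [lower, upper); the overall maximum would fall into
        # a non-existent extra class, so it is clamped into the last one.
        idx = min(int((v - lo) / width), n_groups - 1)
        counts[idx] += 1
    total = len(values)
    # Step 3: one row per group: lower limit, upper limit, frequency,
    # relative frequency
    return [
        (lo + i * width, lo + (i + 1) * width, c, c / total)
        for i, c in enumerate(counts)
    ]

# Made-up sample data, grouped into 5 classes (step 1).
table = frequency_table([2, 3, 5, 7, 8, 10, 11, 12, 15, 22], 5)
for lower, upper, freq, rel in table:
    print(f"{lower:5.1f} ~ {upper:5.1f}  {freq:3d}  {rel:.2f}")
```

In practice the class width is usually rounded to a convenient number (a multiple of 5 or 10, say) rather than used raw.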
Grouping principle: nothing overlaps and nothing is left out, so every value falls into exactly one group.
① For continuous variables:
1. The upper limit is not included in the group: a ≤ X < b.
2. Alternatively, write each group's upper limit with decimals, e.g. 10 ~ 11.99, 12 ~ 13.99.
② For discrete variables: the limits of two adjacent groups do not touch, e.g. 140 ~ 149, 150 ~ 159.
If the maximum or minimum values differ greatly from all the other data, open-ended groups can be used.
The first group becomes "XX or below" and the last group "XX or above".
Unequal class widths: used, for example, when grouping by age.
Practical application: group analysis of commodities by price band, e.g. gross margin and sell-through rate per band.
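As a sketch of the price-band idea, the snippet below assigns prices to bands with an open-ended first and last group, using the rule that the upper limit is excluded. The band edges, the item records and the definition of sell-through (units sold divided by units stocked) are all my own assumptions for illustration.

```python
from bisect import bisect_right

# Hypothetical band edges: below 50, [50, 100), [100, 200), 200 or above.
edges = [50, 100, 200]
labels = ["below 50", "50 ~ 99.99", "100 ~ 199.99", "200 or above"]

def price_band(price):
    # bisect_right counts how many edges are <= price, so a price equal to
    # an edge goes to the upper group (a <= X < b).
    return labels[bisect_right(edges, price)]

# Made-up records: (price, units sold, units stocked).
items = [(39, 80, 100), (50, 45, 60), (99.99, 30, 40), (100, 10, 50), (450, 2, 10)]

sold, stocked = {}, {}
for price, units_sold, units_stocked in items:
    band = price_band(price)
    sold[band] = sold.get(band, 0) + units_sold
    stocked[band] = stocked.get(band, 0) + units_stocked

# Assumed definition: sell-through rate = units sold / units stocked.
sell_through = {band: sold[band] / stocked[band] for band in sold}
for band in labels:
    print(band, sell_through.get(band))
```

The open-ended last band absorbs the single very expensive item, so one extreme price does not force a long run of empty groups.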

 

4 Upward and downward cumulation

Suitable for ordinal data, such as: dissatisfied, neutral, satisfied.

You can plot the cumulative distribution.
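A minimal sketch of the two cumulations for ordinal data. I am assuming the usual convention here: upward cumulation adds up from the first category toward the last, and downward cumulation adds up from the last category back toward the first. The counts are made up.

```python
from itertools import accumulate

# Ordered categories and made-up frequencies.
levels = ["dissatisfied", "neutral", "satisfied"]
counts = [20, 50, 30]
total = sum(counts)

# Upward: running total from the first category to the last.
upward = list(accumulate(counts))
# Downward: running total from the last category back to the first.
downward = list(accumulate(reversed(counts)))[::-1]

for level, up, down in zip(levels, upward, downward):
    print(f"{level:13s} upward {up:3d} ({up / total:.0%})  "
          f"downward {down:3d} ({down / total:.0%})")
```

Read this way, the upward figure for "neutral" is the number of respondents at "neutral" or lower, and the downward figure is the number at "neutral" or higher.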

 

That is a brief look at the data itself; now on to the charts:

 

5 Overview of data types and their main chart methods

5.1 Qualitative data (summarized as follows)

Bar chart, pie chart, doughnut chart

5.2 Numerical data

Raw data: stem-and-leaf plot, box plot

Grouped data: histogram

Time series data: line chart

Multivariate data: scatter plot (two-dimensional), bubble chart (three-dimensional), radar chart (multi-dimensional)

 

5.3 About histograms

① Whichever side has the longer tail, left or right, the distribution is skewed to that side (left-skewed or right-skewed).

② What is the difference between a bar chart and a histogram?

First, a column chart laid on its side is called a bar chart ~

Then the differences between a bar chart and a histogram:

1. A bar chart shows frequency by the length of each bar; a histogram shows the frequency of each group by area (class widths may be unequal, so frequency is represented by area: the width of each rectangle is the class width, and the height is set so that the area matches the frequency, which for equal widths is simply the frequency itself);

2. Because grouped data are continuous, the rectangles of a histogram sit right next to each other, while the bars of a bar chart are separated;

3. A bar chart is mainly for displaying categorical data; a histogram displays numerical data.
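To make the "area, not height" point concrete, here is a small sketch with unequal class widths. The age groups and counts are made up. The height of each rectangle is computed as frequency divided by class width (often called frequency density), so that width times height gives back the frequency.

```python
# Made-up groups with unequal widths: (lower limit, upper limit, frequency).
groups = [(0, 10, 20), (10, 20, 30), (20, 40, 30), (40, 80, 20)]

rows = []
for lower, upper, freq in groups:
    width = upper - lower
    height = freq / width   # frequency density
    area = width * height   # recovers the frequency
    rows.append((lower, upper, width, height, area))

for lower, upper, width, height, area in rows:
    print(f"{lower:2d} ~ {upper:2d}  width {width:2d}  "
          f"height {height:.2f}  area {area:.0f}")
```

The groups 10 ~ 20 and 20 ~ 40 have the same frequency, yet the wider one is drawn half as tall. Plotting raw frequency as the height would make the wider group look twice as large as it is.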

 

5.4 Charts suitable for ungrouped numerical data

Stem-and-leaf plot: shows the distribution of the raw data.

Box plot: shows the dispersion of the data (the shape of the box reveals features of the distribution); it is most commonly used for comparisons.

A short excerpt on box plots: a box plot does not give an exact measure of how skewed a distribution is, and for large data sets the shape information it conveys is even blurrier. It is best combined with the mean, standard deviation, skewness, distribution function and the like to describe the shape of the data's distribution.
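A sketch of the numbers behind a box plot together with a skewness figure, since the box alone does not quantify skew. Several choices here are my assumptions and not from the book: the 1.5 × IQR fences for flagging outliers (a common convention), the default quartile method of Python's `statistics.quantiles`, and the moment-based population skewness formula. The data are made up.

```python
import statistics

def box_plot_summary(data):
    """Quartiles, IQR fences and outliers, as a box plot would show them."""
    q1, median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the box are outliers.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {"min": min(data), "q1": q1, "median": median, "q3": q3,
            "max": max(data), "iqr": iqr, "outliers": outliers}

def skewness(data):
    """Moment-based population skewness: m3 / m2 ** 1.5."""
    mean = statistics.fmean(data)
    n = len(data)
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

# Made-up data with one large value on the right.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 40]
summary = box_plot_summary(data)
print(summary)
print("skewness:", round(skewness(data), 2))
```

A positive skewness matches a longer right tail, which agrees with the rule in 5.3 that the distribution is skewed toward the side with the longer tail.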

 

5.5 A radar chart can be used to compare how similar the samples are to one another.


Application of the above charts at work:

Bar charts, pie charts and line charts are very common;

Box plots, doughnut charts, histograms, scatter plots, bubble charts, radar charts and Pareto charts I have hardly ever used.

Next I will try making these charts with the company's data and see whether I can find something interesting.

Let's go!

See you next time!

 


Origin www.cnblogs.com/dream-nalizi/p/11787503.html