1. Numerical data analysis
Four aspects of numerical data
Numerical data analysis has four main aspects:
Center: measures of central tendency
Spread: measures of dispersion
Shape: the shape of the data
Outliers: outlying values
Analysis of categorical data
Categorical data analysis has fewer aspects to consider. Categorical data is generally analyzed by looking at the number or proportion of individuals that fall into each group. For example, if we look at dog breeds, we care about the number of dogs of each breed, or the proportion of dogs that belong to each breed.
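As a rough illustration of counting and proportions per group, here is a minimal Python sketch. The breed labels are made-up sample data, and the helper name category_summary is hypothetical, not something from the original notes.

```python
from collections import Counter

# Hypothetical sample: one breed label per dog (made-up data for illustration).
breeds = ["beagle", "poodle", "beagle", "collie", "poodle", "beagle"]


def category_summary(labels):
    """Return {category: (count, proportion)} for a list of category labels."""
    counts = Counter(labels)
    total = len(labels)
    return {label: (n, n / total) for label, n in counts.items()}


summary = category_summary(breeds)
for breed, (count, proportion) in summary.items():
    print(f"{breed}: count={count}, proportion={proportion:.2f}")
```

The count answers "how many dogs of each breed", and the proportion answers "what share of all dogs belongs to each breed".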
1.1 Measures of central tendency
There are three measures of central tendency:
Mean
Median
Mode
1.1.1 Mean
In mathematics, the mean is often called the average or the expected value. We calculate it by adding up all the values and dividing by the number of values in the data set.
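The calculation described above can be sketched in a few lines of Python. The observations are made-up values chosen for illustration, and raising an error for an empty data set is an assumption of this sketch.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        # Assumption of this sketch: an empty data set has no mean.
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


# Made-up observations for illustration.
observations = [2, 4, 4, 6, 9]
print(mean(observations))  # (2 + 4 + 4 + 6 + 9) / 5 = 5.0
```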
1.1.2 Median
The median splits our data into two halves: half of the values fall below it and half fall above it. How we calculate the median depends on whether we have an even or an odd number of observations.
Median for an odd number of values
If we have an odd number of observations, the median is simply the middle value. For example, if we have seven observations sorted in ascending order, the median is the fourth value. If we have nine observations, the median is the fifth value.
Median for an even number of values
If we have an even number of observations, the median is the average of the two middle values. For example, if we have eight observations sorted in ascending order, the median is the average of the fourth and fifth values.
To calculate the median, we must first sort the values.
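The odd and even cases can be sketched together in Python as follows. The sample lists are made up for illustration, and the error for an empty data set is an assumption of this sketch.

```python
def median(values):
    """Median: the middle value of the sorted data, or the average of the two middle values."""
    if not values:
        # Assumption of this sketch: an empty data set has no median.
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)  # the values must be sorted first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value (index mid is the (mid + 1)-th value).
        return ordered[mid]
    # Even count: average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Seven observations: the median is the fourth sorted value.
print(median([7, 1, 5, 3, 9, 11, 13]))      # 7
# Eight observations: the median is the average of the fourth and fifth sorted values.
print(median([1, 3, 5, 7, 9, 11, 13, 15]))  # 8.0
```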
1.1.3 The mode
The mode is the value that appears most often in a data set.
A data set may have multiple modes, or it may have no mode at all.
No mode
If every value in the data set occurs with the same frequency, there is no mode. Suppose we have the data set:
1, 1, 2, 2, 3, 3, 4, 4
This data set has no mode, because every value occurs the same number of times.
Multiple modes
If two (or more) values are tied for the highest number of occurrences, there are multiple modes. Suppose we have the data set:
1, 2, 3, 3, 3, 4, 5, 6, 6, 6, 7, 8, 9
This data set has two modes, 3 and 6, because both values appear three times, the highest frequency, while every other value appears only once.
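Both cases can be checked with a short Python sketch that follows the convention used in these notes: when all values occur equally often, there is no mode. The function name modes is hypothetical, and treating a data set with only one distinct value as having that value as its mode is an assumption of this sketch. Note that Python's built-in statistics.multimode follows a different convention and returns every value when all are tied.

```python
from collections import Counter


def modes(values):
    """Return the sorted list of modes, or an empty list when there is no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Convention from the text: if all distinct values are equally frequent,
    # the data set has no mode. (Assumption: this applies only when there is
    # more than one distinct value.)
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 1, 2, 2, 3, 3, 4, 4]))                  # []
print(modes([1, 2, 3, 3, 3, 4, 5, 6, 6, 6, 7, 8, 9]))   # [3, 6]
```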
Notation
Aggregation
Random variables and notation
1.2 Measures of dispersion