Abstract: This article uses a simple example to illustrate the empirical distribution, and explains the concepts and definitions involved.
Example: From a population X, a sample of size 10 is drawn, and the observed values are:
[1.9, 2.5, 0.1, 0.5, 4, 5.9, 4.5, 7.9, 7.5, 9.9]
Step 1: Sort the sample observations and find the range
Sort: [0.1, 0.5, 1.9, 2.5, 4, 4.5, 5.9, 7.5, 7.9, 9.9]
Range: 9.9 - 0.1 = 9.8 ## maximum observed value minus minimum observed value
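Step 1 can be reproduced with a few lines of Python. This is only a minimal sketch: the variable names (observations, sorted_obs, sample_range) are chosen for illustration and do not come from the original example.

```python
# Sample of size 10 drawn from the population X (values from the example).
observations = [1.9, 2.5, 0.1, 0.5, 4, 5.9, 4.5, 7.9, 7.5, 9.9]

# Sort the observations in ascending order.
sorted_obs = sorted(observations)

# Range = maximum observed value minus minimum observed value.
sample_range = sorted_obs[-1] - sorted_obs[0]

print(sorted_obs)
print(round(sample_range, 1))
```

The result is rounded before printing because floating-point subtraction may not give exactly 9.8.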
Step 2: Determine the group distance and number of groups.
Interval: [0, 10] ## The interval should contain all observations; its left and right endpoints lie slightly beyond the smallest and largest observations
Number of groups: how many groups this interval is divided into, generally
Group distance: divide the interval [0, 10] into m cells of equal width; the width of each cell is called the group distance (class width)
For convenience, take m = 5 cells of width 2: [0,2), [2,4), [4,6), [6,8), [8,10)
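The cell boundaries follow directly from the interval and the number of groups. The sketch below assumes the choice m = 5 implied by the cells listed above; the names lower, upper, m, width and edges are illustrative.

```python
# Interval chosen to cover all observations, and the number of groups.
lower, upper = 0, 10
m = 5

# Group distance (class width): interval length divided by the number of cells.
width = (upper - lower) / m

# Cell boundaries: cell k is the half-open interval [edges[k], edges[k + 1]).
edges = [lower + k * width for k in range(m + 1)]

print(width)
print(edges)
```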
Step 3: Count the number of sample values (the frequency) that fall into each cell, and construct the empirical distribution function of the population X
[0,2) --- 3
[2,4) --- 1
[4,6) --- 3
[6,8) --- 2
[8,10) --- 1
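The frequencies can be checked by counting how many observations fall into each half-open cell. This is a minimal sketch assuming five cells of width 2 starting at 0; the names used are illustrative.

```python
observations = [1.9, 2.5, 0.1, 0.5, 4, 5.9, 4.5, 7.9, 7.5, 9.9]
lower, width, m = 0, 2, 5

# counts[k] is the number of observations in [lower + k*width, lower + (k+1)*width).
counts = [0] * m
for x in observations:
    k = int((x - lower) // width)  # index of the cell containing x
    counts[k] += 1

for k, c in enumerate(counts):
    left = lower + k * width
    print(f"[{left},{left + width}) --- {c}")
```

Floor division places a value that lies exactly on a boundary, such as 4, in the cell to its right, which matches the half-open cells [a, b).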
Step 4: Make a histogram to obtain an approximate density function
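For the histogram to approximate a density, the height of each bar is usually taken as frequency / (n * group distance), so that the total area of the bars is 1. The sketch below assumes this normalisation, which the example does not state explicitly, and reuses the frequencies from Step 3; it prints a rough text histogram rather than drawing a plot.

```python
counts = [3, 1, 3, 2, 1]   # frequencies from Step 3
n = sum(counts)            # sample size, 10
lower, width = 0, 2        # left endpoint and group distance

# Bar height = relative frequency divided by the cell width.
heights = [c / (n * width) for c in counts]

# The bar areas add up to 1, as a density function requires.
total_area = sum(h * width for h in heights)

for k, h in enumerate(heights):
    left = lower + k * width
    print(f"[{left},{left + width}) {'#' * counts[k]} height={h:.2f}")
```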
Empirical distribution function concept
The distribution function of the population X is the theoretical distribution, which is usually unknown. As in the example above, we can only observe sample values; we do not know the theoretical distribution function of the population. We therefore use the empirical distribution function to estimate the population distribution function, and the histogram to approximate the density function of X. As the number of observations grows, the empirical distribution function converges to the population distribution function.
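Under the standard definition, the empirical distribution function of a sample of size n is F_n(x) = (number of observations <= x) / n. A minimal sketch for the sample in the example follows; the function name ecdf is chosen for illustration.

```python
from bisect import bisect_right

sorted_obs = sorted([1.9, 2.5, 0.1, 0.5, 4, 5.9, 4.5, 7.9, 7.5, 9.9])

def ecdf(x):
    """Fraction of observations less than or equal to x."""
    # bisect_right returns how many sorted values are <= x.
    return bisect_right(sorted_obs, x) / len(sorted_obs)

for x in (0, 2, 4, 10):
    print(x, ecdf(x))
```

F_n is a step function that jumps by 1/n at each observed value, rising from 0 below the smallest observation to 1 at the largest.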