"Probability and Statistics": Continuous Random Variables: Distributions and Numerical Characteristics

Introduction

Previously, we introduced discrete random variables. In practice, however, random variables that take values in a continuous range are also very common, for example the speed of a car or the continuous uptime of a piece of equipment. Continuous random variables can describe problems that discrete random variables cannot.

Probability density function

A discrete random variable takes countably many values, and its distribution is described by a probability mass function (PMF). A continuous random variable typically takes uncountably many values, and the role of the PMF is played by the probability density function (PDF). The two concepts correspond exactly. We can speak of the distribution of a random variable or, equivalently, of its probability function (a mass function or a density function, depending on whether the variable is discrete or continuous).

Recall the PMF of a discrete random variable:

Adding the probabilities of the three values that make up the event gives the total probability of the set S:

$$P(X \in S) = p_X(1) + p_X(2) + p_X(3)$$

The most obvious difference between continuous and discrete random variables is that a continuous random variable takes infinitely, indeed uncountably, many values, so probabilities cannot be obtained by simple addition. Instead, we integrate the probability density function over an interval of the real axis.

Two special properties of the probability density function deserve emphasis:

  • First: the value of the PDF at a single point on the real axis is not a probability but a probability density (probability per unit length), so it can be greater than 1.
  • Second: for a continuous random variable, we generally discuss the probability of taking a value in a region, not at a single point. In fact, discussing a single point within a continuous interval is not meaningful.
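
The first point can be checked numerically. The sketch below is not from the original article: it uses Python's standard-library `statistics.NormalDist` with an arbitrarily chosen standard deviation of 0.1. The density at the mean comes out well above 1, yet the area under the curve, approximated with a midpoint Riemann sum, is still about 1.

```python
from statistics import NormalDist

# A narrow normal distribution; sigma = 0.1 is an illustrative choice.
narrow = NormalDist(mu=0.0, sigma=0.1)

# Density at the mean: 1 / (0.1 * sqrt(2 * pi)), which exceeds 1.
peak_density = narrow.pdf(0.0)


def midpoint_integral(f, lo, hi, n=10_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * width) for i in range(n)) * width


# Integrating over [-1, 1] covers ten standard deviations on each side.
total_area = midpoint_integral(narrow.pdf, -1.0, 1.0)

print(f"density at 0: {peak_density:.3f}")
print(f"area under the curve: {total_area:.6f}")
```

A density value is a rate, so only its integral over an interval has to stay at or below 1.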

The probability that a continuous random variable takes a value in a range is computed by integration. For the example in the figure above, the probability that the random variable falls in the interval [a, b] is:

$$P(a \le X \le b) = \int_a^b f_X(x)\,dx$$

This is the area of the shaded region. It further confirms the second point above: what we care about is the probability over a range of values, not at a single point.

  • When the interval shrinks to the single point x = a, P(a ≤ X ≤ a) = 0, so it makes no difference whether the endpoints of an interval are included.
  • Likewise, P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a < X < b)
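
As a rough illustration of computing an interval probability by integration, the sketch below integrates the PDF of a standard normal variable over [a, b]. The choice of distribution, the endpoints, and the helper name `interval_probability` are all assumptions made for this example. The result is cross-checked against the difference of the cumulative distribution function, which the standard library provides as `NormalDist.cdf`.

```python
from statistics import NormalDist

X = NormalDist(mu=0.0, sigma=1.0)  # standard normal, chosen for illustration


def interval_probability(dist, a, b, n=20_000):
    """Approximate P(a <= X <= b) as the area under dist.pdf over [a, b]."""
    if a == b:
        return 0.0  # a single point has zero width, hence zero area
    width = (b - a) / n
    return sum(dist.pdf(a + (i + 0.5) * width) for i in range(n)) * width


a, b = -1.0, 2.0
by_integration = interval_probability(X, a, b)
by_cdf = X.cdf(b) - X.cdf(a)  # same probability via the cumulative distribution

print(f"integral of the PDF: {by_integration:.6f}")
print(f"CDF difference:      {by_cdf:.6f}")
print(f"P(X = a):            {interval_probability(X, a, a)}")
```

Because a single point contributes no area, including or excluding the endpoints a and b does not change the result.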

Continuing the analogy, the non-negativity and normalization of probability for a continuous random variable are expressed as:

Non-negativity: $f_X(x) \ge 0$ for all $x$

Normalization: $P(-\infty < X < +\infty) = \int_{-\infty}^{+\infty} f_X(x)\,dx = 1$
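
Both conditions can be checked numerically for a candidate density. The sketch below uses a hypothetical triangular density f(x) = 2x on [0, 1], which is not from the original article, and a helper named `check_pdf` introduced only for illustration. It samples the function on a grid, confirms that no sampled value is negative, and approximates the total area with the midpoint rule.

```python
def triangular_pdf(x):
    """Hypothetical density: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def check_pdf(f, lo, hi, n=30_000):
    """Return (is_non_negative, total_area) for f sampled on [lo, hi]."""
    width = (hi - lo) / n
    midpoints = [lo + (i + 0.5) * width for i in range(n)]
    values = [f(x) for x in midpoints]
    return all(v >= 0.0 for v in values), sum(values) * width


# The grid extends beyond [0, 1] to include the region where the density is 0.
non_negative, area = check_pdf(triangular_pdf, -1.0, 2.0)

print(f"non-negative on the grid: {non_negative}")
print(f"total area: {area:.6f}")
```

A grid check like this can only refute a candidate density, not prove it valid, since it inspects finitely many points.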

Expectation and variance of continuous random variables

Don't panic at this new, continuous setting. For a discrete random variable, we used the PMF to compute a weighted average, which is the expectation of the discrete random variable. That is, multiply each possible value by its probability and add the results:

$$E[X] = \sum_x x\,p_X(x)$$

In the continuous case, the core definition of the expectation E[X] is the same: it is the average value of the random variable X over a large number of independent repeated experiments. (This is not the sum of the possible values divided by their count. As described above, each possible value is multiplied by its probability and the products are added, so that the weights are taken into account.) We now replace the PMF with the PDF and the summation with an integral:

$$E[X] = \int_{-\infty}^{+\infty} x\,f_X(x)\,dx$$

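The variance follows its definition in the same way: it is the expectation of the squared distance from the random variable to its expectation:

$$V[X] = E\left[(X - E[X])^2\right] = \int_{-\infty}^{+\infty} (x - E[X])^2 f_X(x)\,dx$$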
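
The integral definitions of expectation and variance can be evaluated numerically. The following sketch assumes a uniform random variable on [2, 6], an example chosen here because its exact mean and variance are easy to verify by hand, and approximates both integrals with the midpoint rule. The helper names are illustrative.

```python
def uniform_pdf(x, lo=2.0, hi=6.0):
    """Density of a uniform random variable on [lo, hi] (illustrative choice)."""
    return 1.0 / (hi - lo) if lo <= x <= hi else 0.0


def integrate(g, lo, hi, n=10_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    width = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * width) for i in range(n)) * width


# E[X] = integral of x * f(x) dx over the support [2, 6].
mean = integrate(lambda x: x * uniform_pdf(x), 2.0, 6.0)

# V[X] = integral of (x - E[X])^2 * f(x) dx over the same range.
variance = integrate(lambda x: (x - mean) ** 2 * uniform_pdf(x), 2.0, 6.0)

print(f"E[X] = {mean:.4f}")      # exact value: (2 + 6) / 2 = 4
print(f"V[X] = {variance:.4f}")  # exact value: (6 - 2)^2 / 12 = 4/3
```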

The variance can be a little confusing, so set random variables aside for a moment and consider a set of numbers: 10, 20, 30, 40. The variance of this set is the squared distance from each value to the mean, summed and divided by the count. For a discrete random variable, replace the mean with the expectation: first compute the squared distance from each value to the expectation, then, instead of summing and dividing by the count, multiply each squared distance by its probability and add the results, because the weights must be taken into account. For a continuous random variable, replace the PMF with the PDF and the sum with an integral, so the variance is again the expectation of the squared distance from the random variable to its expectation.

Next we look at several very important examples of continuous random variables.
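
Taking the four numbers 10, 20, 30, 40 as a concrete case, the sketch below computes the plain variance of the list and then the variance of a discrete random variable over the same values. The probabilities are made up for illustration and are not from the original article.

```python
values = [10, 20, 30, 40]

# Plain list of numbers: average the squared distances to the mean.
mean = sum(values) / len(values)
plain_variance = sum((v - mean) ** 2 for v in values) / len(values)

# Discrete random variable: weight each squared distance by its probability.
# These probabilities are invented for the example and sum to 1.
probabilities = [0.1, 0.2, 0.3, 0.4]
expectation = sum(v * p for v, p in zip(values, probabilities))
weighted_variance = sum((v - expectation) ** 2 * p
                        for v, p in zip(values, probabilities))

print(f"mean = {mean:.1f}, variance = {plain_variance:.1f}")
print(f"expectation = {expectation:.1f}, variance = {weighted_variance:.1f}")
```

With equal probabilities of 1/4 each, the weighted formulas reduce to the plain mean and variance, which is why dividing by the count is a special case of weighting by probability.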

Normal distribution

The normal distribution is the distribution of a continuous random variable, and it appears in almost every field: multi-year snowfall statistics for a given place in nature; the average height of middle-school boys, or college entrance examination scores, in a given region; noise in signal systems. A great many natural and social phenomena follow, at least approximately, a normal distribution.

The normal distribution has two parameters: μ, the mean of the random variable, and σ, its standard deviation. Its probability density function is:

$$f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

Different means and standard deviations give different normal density curves, but the shapes are all similar: each is a bell curve symmetric about the mean μ, and the density falls off rapidly as x moves away from the mean. When μ = 0 and σ = 1, the distribution is called the standard normal distribution.
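
To make the density formula concrete, here is a small sketch that writes out the normal PDF by hand and compares it with the standard library's `statistics.NormalDist`. The function name `normal_pdf` and the parameter values are illustrative assumptions. It also checks the symmetry about the mean described above.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density: exp(-(x - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi))."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


mu, sigma = 1.0, 2.0  # illustrative parameters

# The hand-written formula should agree with the library implementation.
reference = NormalDist(mu, sigma)
agrees = all(math.isclose(normal_pdf(x, mu, sigma), reference.pdf(x))
             for x in (-3.0, 0.0, 1.0, 4.5))

# Symmetry about the mean: points equally far from mu have equal density.
left = normal_pdf(mu - 1.5, mu, sigma)
right = normal_pdf(mu + 1.5, mu, sigma)

print(f"matches NormalDist: {agrees}")
print(f"f(mu - 1.5) = {left:.6f}, f(mu + 1.5) = {right:.6f}")
```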

import numpy as np
from scipy.stats import norm
import plotly.graph_objs as go

# Evaluate each density on a common grid of x values.
x = np.linspace(-10, 10, 1000)

# In scipy.stats.norm, loc is the mean and scale is the standard deviation.
normal_1 = norm(loc=0, scale=1).pdf(x)
normal_2 = norm(loc=1, scale=2).pdf(x)
normal_3 = norm(loc=-1, scale=2).pdf(x)

# One trace per curve; the legend names state the parameters used above.
trace1 = go.Scatter(x=x,
                    y=normal_1,
                    line={"width": 4, "color": "green"},
                    name="mean 0, std 1")

trace2 = go.Scatter(x=x,
                    y=normal_2,
                    line={"width": 4, "color": "yellow"},
                    name="mean 1, std 2")

trace3 = go.Scatter(x=x,
                    y=normal_3,
                    line={"width": 4, "color": "red"},
                    name="mean -1, std 2")

# Draw the three curves on a single figure with a dark theme.
fig = go.Figure(data=[trace1, trace2, trace3], layout={"template": "plotly_dark"})
fig.show()

We see that for a normal distribution, the x-coordinate of the curve's peak is the mean. As the mean increases the curve shifts to the right, and as it decreases the curve shifts to the left. The larger the variance, the lower and wider the curve; the smaller the variance, the taller and narrower the curve.
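
These observations can also be checked without a plot. The sketch below uses the standard library's `statistics.NormalDist` rather than `scipy`, a substitution made here only to keep the example dependency-free. It evaluates the same three parameter pairs on a grid and records where each curve peaks and how high the peak is.

```python
from statistics import NormalDist

# Same (mean, standard deviation) pairs as the three curves in the plot.
params = [(0.0, 1.0), (1.0, 2.0), (-1.0, 2.0)]

# Grid on [-10, 10] with step 0.01.
xs = [i / 100 for i in range(-1000, 1001)]

peaks = {}
for mu, sigma in params:
    dist = NormalDist(mu, sigma)
    densities = [dist.pdf(x) for x in xs]
    height = max(densities)
    # Record the x position of the maximum and the maximum density itself.
    peaks[(mu, sigma)] = (xs[densities.index(height)], height)

for (mu, sigma), (peak_x, height) in peaks.items():
    print(f"mean {mu:+.0f}, std {sigma:.0f}: "
          f"peak at x = {peak_x:+.2f}, height = {height:.4f}")
```

The peak height equals 1 / (σ√(2π)), so doubling the standard deviation halves the height while the total area stays at 1.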

Exponential distribution

The second continuous random variable we look at is the exponential random variable. It is very widely used, generally to model the time until some event occurs.

For example: the remaining lifetime of a piece of equipment measured from now, the time left until a light bulb burns out, the time until a meteorite falls in a desert, and so on.

The probability density function of an exponential random variable X is:

$$f_X(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$

Here λ is the parameter of the exponential distribution and must satisfy λ > 0. The characteristic feature of the exponential distribution is that the probability that X exceeds a given value decreases exponentially as that value increases. When discussing the probabilistic properties of the exponential distribution, we generally focus on three aspects:

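  • First: the probability that the random variable X exceeds a given value a, where a ≥ 0. By definition: P(X ≥ a) = ∫_a^∞ λe^(-λx) dx = e^(-λa)

  • Second: the probability that the random variable X lies in the interval [a, b], which is also simple: P(a ≤ X ≤ b) = P(X ≥ a) - P(X ≥ b) = e^(-λa) - e^(-λb)

  • Third: the numerical characteristics of the exponential distribution, which also reveal the physical meaning of the parameter λ. They can be obtained directly by integration from the definitions of expectation and variance. We skip the details and state the result: E[X] = 1 / λ, V[X] = 1 / λ^2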
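
The three points above can be sketched numerically. In the example below, the rate λ = 0.5, the helper names, and the cutoff of 100 that stands in for the infinite upper limit are all assumptions made for illustration. The closed forms for the tail and interval probabilities are written out directly, and the mean and variance are approximated by integrating against the PDF.

```python
import math

lam = 0.5  # rate parameter, chosen only for illustration; must be > 0


def exp_pdf(x, lam):
    """Exponential density: lam * exp(-lam * x) for x >= 0, else 0."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0


def tail_probability(a, lam):
    """P(X >= a) = exp(-lam * a) for a >= 0."""
    return math.exp(-lam * a)


def interval_probability(a, b, lam):
    """P(a <= X <= b) = exp(-lam * a) - exp(-lam * b) for 0 <= a <= b."""
    return math.exp(-lam * a) - math.exp(-lam * b)


def integrate(g, lo, hi, n=200_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    width = (hi - lo) / n
    return sum(g(lo + (i + 0.5) * width) for i in range(n)) * width


# Truncate the infinite upper limit at 100; the mass beyond it is exp(-50).
upper = 100.0
mean = integrate(lambda x: x * exp_pdf(x, lam), 0.0, upper)
variance = integrate(lambda x: (x - mean) ** 2 * exp_pdf(x, lam), 0.0, upper)

print(f"P(X >= 2)      = {tail_probability(2.0, lam):.4f}")
print(f"P(1 <= X <= 3) = {interval_probability(1.0, 3.0, lam):.4f}")
print(f"E[X] = {mean:.4f}  (closed form 1/lam = {1 / lam})")
print(f"V[X] = {variance:.4f}  (closed form 1/lam^2 = {1 / lam ** 2})")
```

Since E[X] = 1 / λ, the parameter λ can be read as a rate: the larger λ is, the shorter the average waiting time until the event.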

Origin www.cnblogs.com/traditional/p/12588052.html