[Statistics Notes] (14) Probability and Probability Distribution


Probability is a numerical measure of how likely a chance event is to occur. Suppose a trial is repeated many times (the number of trials denoted X) and a chance event (denoted A) occurs some of those times (the number of occurrences denoted Y). With X as the denominator and Y as the numerator, a numerical value (denoted P) is formed. In many experiments, P is relatively stable around a certain value, and P is called the probability of A occurring. A probability determined in this way, by long-term observation or a large number of repeated trials, is called a statistical probability or empirical probability.

The discipline that studies the internal laws governing chance events is called probability theory; it is a branch of mathematics. Probability theory reveals the lawful regularities hidden within chance phenomena, so probability plays an important role in how people understand natural and social phenomena.

Classical definition of probability

If a test satisfies two conditions:
(1) the test has only a finite number of basic results;
(2) each basic result of the test is equally likely,
then such an experiment is called a classical experiment.
For an event A in a classical experiment, its probability is defined as

\large P(A)= \tfrac{m}{n}

where n represents the total number of possible basic results of the experiment and m represents the number of basic results contained in event A. This method of defining probability is called the classical definition of probability.

Classical probability is limited to random trials with only a finite number of equally likely results, which restricts its application. People therefore proposed determining the probability of an event from its frequency in repeated trials, that is, the statistical definition of probability.

Statistical definition of probability

Under the same conditions, repeat a random test n times, and suppose an event A occurs m times ( \large m\leqslant n ). The ratio \large \frac{m}{n} is called the frequency of event A. As n increases, the frequency fluctuates around a constant p, the amplitude of the fluctuation gradually decreases, and the frequency tends to stabilize. This stable value of the frequency is the probability of the event, written as:

\large P(A)= \tfrac{m}{n}= p
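As an illustration of the statistical definition, here is a minimal Python sketch (the function name and seed are illustrative, not from the original text) that estimates the probability of heads for a fair coin from the frequency m/n:

```python
import random

# Estimate P(heads) for a fair coin by the frequency m/n of the
# statistical definition: toss n times, count heads m, return m/n.
random.seed(42)  # arbitrary seed for reproducibility

def frequency_of_heads(n):
    m = sum(random.randint(0, 1) for _ in range(n))  # m = number of heads in n tosses
    return m / n

for n in (100, 10_000, 1_000_000):
    print(n, frequency_of_heads(n))
# the frequency fluctuates around p = 0.5 and stabilizes as n grows
```

As n grows, the printed frequencies settle near 0.5, mirroring how the frequency m/n stabilizes at the probability p.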

Random events and their probabilities

Events that may or may not occur under certain conditions are called random events.
Usually an event in an experiment is composed of basic events. If an experiment has n possible results, that is, the experiment consists of n basic events, and all the results are equally likely, then such events are called equally likely events.
Mutually exclusive events: Two events that cannot occur simultaneously are called mutually exclusive events.
Opposite events: two mutually exclusive events, one of which must occur, are called opposite (complementary) events.

In a specific random experiment, each possible result is called a basic event, and the set of all basic events is called the basic space (that is, the sample space).

Random events (events for short) are composed of basic events. For example, in the random test of rolling two dice, let Z and Y denote the numbers shown by the first and second die, respectively. Each takes a value from 1, 2, 3, 4, 5, 6, and each pair (Z, Y) represents a basic event, so the basic space contains 36 elements. "The sum of the points is 2" is an event composed of the single basic event (1, 1), and can be represented by the set {(1, 1)}. "The sum of the points is 4" is also an event; it is composed of the 3 basic events (1, 3), (2, 2), (3, 1), and can be represented by the set {(1, 3), (2, 2), (3, 1)}.
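The dice example above can be checked by enumerating the basic space directly (a small Python sketch; the variable names are illustrative):

```python
from itertools import product

# Enumerate the 36 equally likely basic events (Z, Y) for two dice,
# then pick out the basic events making up "the sum of points is 4".
sample_space = list(product(range(1, 7), repeat=2))
event_sum_4 = [(z, y) for (z, y) in sample_space if z + y == 4]

print(len(sample_space))                     # 36
print(event_sum_4)                           # [(1, 3), (2, 2), (3, 1)]
print(len(event_sum_4) / len(sample_space))  # classical probability 3/36
```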

If "the sum of the points is 1" is regarded as an event, it contains no basic events and is called an impossible event; such an event cannot occur in the experiment, and P(impossible event) = 0.

If "the sum of the points is less than 40" is regarded as an event, it contains all basic events and must occur in the test; it is called an inevitable (certain) event, and P(inevitable event) = 1. In real life we need to study various events and their relationships, that is, the various subsets of the basic space and their relationships.


Discrete random variable and its distribution

Definition of random variables

A random variable is a real-valued, single-valued function of the results of a random experiment. Random events can be quantified whether or not they are directly numerical, that is, they can be expressed in numerical form. Simply put, a random variable is a number attached to a random event. For example, the number of points showing when a die is rolled, the number of calls received by a telephone exchange within a certain period, the height of a randomly selected person, and the displacement of a particle suspended in a liquid in a given direction are all examples of random variables.

In experiments we are often interested not in the test results themselves but in some function of the results. For example, when rolling two dice we often care about the sum of the points rather than the actual outcome: we may care that the sum is 7, without caring whether the actual result is (1, 6), (2, 5), (3, 4), (4, 3), (5, 2), or (6, 1). The quantities we focus on, or more formally, these real-valued functions defined on the sample space, are called random variables.
Because the value of a random variable is determined by the test result, we can assign probabilities to the possible values of the random variable.

Random variables can be divided into discrete random variables and continuous random variables.

Discrete
A discrete random variable is one whose possible values within an interval are finite or countable.

For example, the numbers of births and deaths in a certain area in a certain year, or the numbers of effective and ineffective outcomes among patients treated with a certain medicine for a certain disease.

Discrete random variables are usually classified according to the probability mass function, mainly divided into: Bernoulli random variables, binomial random variables, geometric random variables and Poisson random variables.
 

Continuous type
A continuous random variable is one that can take infinitely many values in an interval, values that cannot be listed one by one.

For example, the height and weight values of healthy adult males in a certain area, or the measured serum transaminase values of a group of infectious hepatitis patients.

There are several important continuous random variables that often appear in probability theory, such as uniform random variables, exponential random variables, gamma random variables, and normal random variables.


Probability distribution of discrete random variables

Expectation and variance of discrete random variables
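The standard forms (supplied here, since the original formulas were not preserved), for a discrete random variable taking values \large x_{k} with probabilities \large p_{k}, are:

\large E(X)= \sum_{k}x_{k}p_{k}

\large D(X)= \sum_{k}\left ( x_{k}-E(X) \right )^{2}p_{k}= E(X^{2})-\left [ E(X) \right ]^{2}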

 

Common distribution of discrete random variables

  • 0-1 distribution
  • Binomial distribution (Bernoulli distribution)
  • Poisson distribution

0-1 distribution

The random variable takes only the two values 0 and 1. Its distribution law is:

\large P(X=k)= p^{k}(1-p)^{1-k},\quad k=0,1

or, written out separately,

\large P(X=1)= p,\qquad P(X=0)= 1-p

The distribution law in table form is:

\large X \large 0 \large 1
\large p_{k} \large 1-p \large p


Binomial distribution (Bernoulli distribution)

A Bernoulli random variable X with parameter p (0 < p < 1) takes the value 1 with probability p and the value 0 with probability 1 - p. Its expectation and variance are EX = p and DX = p(1 - p). The indicator of success in a single Bernoulli trial follows the Bernoulli distribution, with parameter p equal to the probability of success. The Bernoulli distribution is a discrete probability distribution; it is the special case of the binomial distribution with N = 1. It is named in honor of the Swiss scientist Jacob Bernoulli (also known as James Bernoulli).

Bernoulli test

If an infinite sequence of random variables \large X{_{1}},X{_{2}},\ldots is independent and identically distributed (i.i.d.), and each random variable \large X_{i} follows a Bernoulli distribution with parameter \large p, then the random variables \large X{_{1}},X{_{2}},\ldots form a sequence of Bernoulli trials with parameter \large p. Similarly, if the \large n random variables \large X{_{1}},X{_{2}},\ldots, \large X_{n} are independent and identically distributed and each follows a Bernoulli distribution with parameter \large p, then the random variables \large X{_{1}},X{_{2}},\ldots, \large X_{n} form an \large n-fold Bernoulli trial with parameter \large p.

A few examples: suppose a fair coin is tossed repeatedly. If the \large i-th toss shows heads, set \large X_{i} = 1; if it shows tails, set \large X_{i} = 0. Then the random variables \large X{_{1}},X{_{2}},\ldots form a sequence of Bernoulli trials with parameter \large p= \frac{1}{2}. Again, suppose 10% of the parts produced by a particular machine are defective, and \large n parts are randomly selected for inspection. If the \large i-th part is defective, set \large X_{i} = 1; otherwise set \large X_{i} = 0. Then the random variables \large X{_{1}},X{_{2}},\ldots, \large X_{n} form an \large n-fold Bernoulli trial with parameter \large p= \frac{1}{10}.
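The defective-parts example can be sketched in Python (the seed and sample size are illustrative assumptions, not from the original text):

```python
import random

# n-fold Bernoulli trial: inspect n parts; each is defective with
# probability p = 0.1 independently. X_i = 1 marks a defective part.
random.seed(0)  # arbitrary seed for reproducibility
p, n = 0.1, 100_000
X = [1 if random.random() < p else 0 for _ in range(n)]

defect_rate = sum(X) / n
print(defect_rate)  # close to p = 0.1, since EX_i = p
```

For large n the observed defect rate stays close to p, in line with the statistical definition of probability above.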

Poisson distribution

The possible values of the random variable \large X are 0, 1, 2, ⋯, and the probability of each value is:

\large P(X=k)= \frac{\lambda ^{k}e^{-\lambda }}{k!},\quad k=0,1,2,\cdots

where \large \lambda > 0 is both the mathematical expectation and the variance of the distribution (for the Poisson distribution the mathematical expectation and the variance are equal, both equal to the parameter \large \lambda). We say that \large X follows the Poisson distribution with parameter \large \lambda, written \large X \sim \large \pi \left ( \lambda \right ).

The Poisson distribution has only one parameter \large \lambda.
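A short Python check (the choice of \large \lambda = 3 is illustrative): computing the probabilities from the formula shows that they sum to 1 and that both the mean and the variance equal \large \lambda.

```python
from math import exp, factorial

# Poisson pmf: P(X = k) = lam**k * e**(-lam) / k!
def poisson_pmf(k, lam):
    return lam ** k * exp(-lam) / factorial(k)

lam = 3.0
probs = [poisson_pmf(k, lam) for k in range(60)]  # tail beyond k = 60 is negligible

total = sum(probs)
mean = sum(k * p for k, p in enumerate(probs))
var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
print(total)     # ≈ 1
print(mean, var) # both ≈ lambda = 3
```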


Probability distribution of continuous random variables

Since a continuous random variable can take any value in a certain interval or on the whole real axis, we cannot, as for discrete random variables, list each value and its corresponding probability; other methods must be used, usually the probability density function and the distribution function. When a function \large f(x) is used to describe a continuous random variable, \large f(x) is called its probability density function (Probability Density Function).

It should be pointed out that \large f(x) is not a probability, that is, \large f(x)\neq P(X=x); \large f(x) is called a probability density function, while \large P(X=x) is zero for a continuous distribution. For a continuous distribution, the area under the curve represents probability. For example, the probability that the random variable X lies between a and b can be written as:

\large P(a< X\leqslant b)= \int_{a}^{b}f(x)dx

that is, the shaded area under the density curve between a and b.

The expectations and variances of continuous random variables are defined as:
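The standard forms (supplied here, since the original formulas were not preserved), for a density \large f(x), are:

\large E(X)= \int_{-\infty }^{+\infty }xf(x)dx

\large D(X)= \int_{-\infty }^{+\infty }\left ( x-E(X) \right )^{2}f(x)dx= E(X^{2})-\left [ E(X) \right ]^{2}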

Distribution of continuous random variables:

  • Uniform distribution
  • Exponential distribution
  • Normal distribution
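Their density functions, in standard form (supplied here as a supplement), are:

Uniform distribution on \large [a,b]:

\large f(x)= \frac{1}{b-a},\quad a\leqslant x\leqslant b

Exponential distribution with parameter \large \lambda > 0:

\large f(x)= \lambda e^{-\lambda x},\quad x\geqslant 0

Normal distribution \large N(\mu ,\sigma ^{2}):

\large f(x)= \frac{1}{\sigma \sqrt{2\pi }}e^{-\frac{(x-\mu )^{2}}{2\sigma ^{2}}}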

Uniform distribution

Exponential distribution

Normal distribution

 

The expected value \large \mu of the normal distribution determines its location, and its standard deviation \large \sigma determines the spread of the distribution. As can be seen from the formula for the maximum value, the smaller \large \sigma is, the taller and sharper the curve becomes, and thus the greater the probability that \large X falls near \large \mu.
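A small Python check of this (the chosen \large \sigma values are illustrative): the peak of the density, at x = \large \mu, equals \large \frac{1}{\sigma \sqrt{2\pi }}, so it grows as \large \sigma shrinks.

```python
from math import exp, pi, sqrt

# Normal density; its maximum, at x = mu, is 1 / (sigma * sqrt(2 * pi)).
def normal_pdf(x, mu, sigma):
    return exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

for sigma in (0.5, 1.0, 2.0):
    print(sigma, normal_pdf(0.0, 0.0, sigma))  # peak height falls as sigma grows
```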

 


Origin blog.csdn.net/seagal890/article/details/105472269