The underlying logic of unmanned vehicles "seeing the world" - probability density and Bayes' rule

Table of contents

Foreword

1. Core concept - probability density

1.1 "Point probability" is meaningless

1.2 How to use the probability density function?

2. "Bayes' rule" in unmanned vehicles

2.1 Joint probability density

2.2 Prior probability

2.3 Conditional probability 

2.4 Marginal probability

2.5 Posterior probability 

Final words


Foreword

In the previous article, we discussed probability in layman's terms.

That article, "The core concept of unmanned vehicles 'seeing the world' - probability and the two schools of thought", qualitatively discussed the two ways of understanding probability. It briefly introduced the similarities and differences between the frequentist school and the Bayesian school, and tried to clarify the meaning of probability in a question-and-answer form, laying a foundation for the study of the probability density function and Bayes' rule in this article. https://blog.csdn.net/slampai/article/details/127643964

Having understood probability, we now try to understand probability density, which is a prerequisite for understanding Bayes' rule.

The reason is simple: Bayes' rule is written in terms of probability density functions, not probabilities.

1. Core concept - probability density

Random variables come in two types: discrete and continuous.

A discrete random variable has a discrete value space: the number of possible outcomes can be finite or countably infinite. When the number of possible outcomes is finite, a probability value can be assigned to each outcome individually. For example, a tossed coin lands either heads or tails; even if we include the special case of the coin standing on its edge after landing, there are only 3 outcomes. Describing them directly with probabilities is clear and natural.

A continuous random variable, as its name implies, can take any value in a continuous interval. For example, when we measure our body temperature, we may get a range of values due to differences in the instrument or in how it is used. The characteristic is that the value is not restricted to a few specific points: we cannot state the probability of one particular value occurring on its own, but we can state the probability of the value falling in a given interval.

From the perspective of probability, both types have a well-defined notion of probability, although the meaning differs slightly. From the perspective of probability density, however, the difference is fundamental.

That is, only continuous random variables have a probability density. The probability density function (pdf) is the derivative of the cumulative distribution function (cdf), and it describes how the probability distribution changes.
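
As a quick numerical check of this relationship, here is a minimal Python sketch (assuming a normal distribution with made-up mean and standard deviation, and using scipy.stats.norm): numerically differentiating the cdf recovers the pdf.

```python
import numpy as np
from scipy.stats import norm

mu, sigma = 36.8, 0.3    # made-up parameters (think of a body-temperature reading)
x = np.linspace(35.0, 38.5, 2001)
dx = x[1] - x[0]

cdf = norm.cdf(x, loc=mu, scale=sigma)        # cumulative distribution function
pdf_numeric = np.gradient(cdf, dx)            # numerical derivative of the cdf
pdf_exact = norm.pdf(x, loc=mu, scale=sigma)  # probability density function

# The derivative of the cdf matches the pdf up to discretization error.
print(np.max(np.abs(pdf_numeric - pdf_exact)))  # a very small number
```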

The main point we want to emphasize here is that the Bayesian formula does not operate on probabilities, but directly on probability densities (pdfs).

1.1 "Point probability" is meaningless

Now that the cumulative distribution function (cdf) has been mentioned, let's briefly expand on it.

First of all, many people fall into a common misunderstanding: confusing point probability with probability density. Let me state the conclusion first: for continuous random variables, we do not care about the "point probability", only the "interval probability".

Take the one-dimensional normal distribution as an example: its density curve is a smooth bell shape (the blue curve in the figure below). At any point on this curve we can read off a probability density value.

Note that what we obtain is only the probability density value at that point, not a point probability value. Think of the relationship between mass and density: density measures how concentrated mass is, and the higher the density, the greater the mass in a unit of space. Likewise for probability and probability density: the higher the probability density, the greater the probability contained in a unit interval around that point.

You may ask: why not compute a point probability value? This can be understood from the viewpoint of integration. For a single point, the integral can certainly be computed, but the result is always 0 (recall the definition of the integral). The same holds for the probability at a single point: just as density differs from mass, the mass of a single point is 0, so a "point probability" is meaningless.
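
A small sketch of this point, again assuming a normal distribution with made-up parameters: the probability of a single exact value is 0, an interval carries a finite probability, and the density value itself is not a probability (it can even exceed 1).

```python
from scipy.stats import norm

mu, sigma = 36.8, 0.3   # assumed body-temperature distribution

# "Point probability": integrating the pdf over a single point gives zero.
point = 37.0
print(norm.cdf(point, mu, sigma) - norm.cdf(point, mu, sigma))   # 0.0

# "Interval probability": integrating the pdf over [36.5, 37.1] is meaningful.
print(norm.cdf(37.1, mu, sigma) - norm.cdf(36.5, mu, sigma))     # about 0.68

# The pdf value at the point is a density, not a probability;
# with a small sigma it can even be greater than 1.
print(norm.pdf(point, mu, sigma))                                # about 1.07
```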

1.2 How to use the probability density function?

Imagine that we go through a series of derivations and end up with a probability density function. This function can be very complicated, so complicated that you cannot even sketch the curve or surface it corresponds to, let alone draw it exactly. What should we do? If we force-fit this probability density with a known model, such as a normal distribution, can the result be expressed in a simpler way?

Correspondingly, there are two usages of the probability density function.

The first is to ignore the specific shape of the probability density function of the random variable x, perhaps because it is too complicated, or because we do not care about the overall distribution. Instead, we find, by optimization, the value corresponding to the maximum probability density and take that value directly as the best estimate of x. This idea is similar to the practice of the frequentist school: use a classical probability model, repeat the experiment many times to obtain statistical frequencies, and take the outcome with the highest frequency as the estimate for this type of experiment.

The second usage is more special, and for some people perhaps more natural. It does not focus on any single value of the random variable x, but on the entire value space of x and the corresponding probability density distribution. (Note: it does not have to be a Gaussian distribution; it can be any other, less common distribution.) In plain words: I cannot give you one exact answer, but I can circle out a region for you and tell you how probable each value in that region is.

To sum up, the first usage belongs to optimization: it aims to give one definite result, the point corresponding to the maximum probability density. The second usage appears frequently in probability derivations: it gives an interval (note: from minus to plus infinity is also an interval) together with the probability density over that interval.

At first glance, the former directly gives a so-called optimal value, while the latter has to give a whole probability density distribution; the latter seems to carry more information and to be more cumbersome to express. However, if you choose the normal distribution, that elegant, symmetric and concise family, you will find that the latter can also be expressed very compactly. Kalman filtering is a good example. (A hole dug here, to be filled in later.)
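
A rough sketch of the two usages, on a hypothetical "complicated" density (a two-component Gaussian mixture over a 1-D pose, with made-up parameters): the first usage keeps only the point of maximum density, the second keeps a summary of the whole distribution.

```python
import numpy as np
from scipy.stats import norm

# A hypothetical "complicated" pdf over a 1-D pose x: a mixture of two Gaussians.
x = np.linspace(-5.0, 10.0, 3001)
dx = x[1] - x[0]
pdf = 0.7 * norm.pdf(x, loc=2.0, scale=0.5) + 0.3 * norm.pdf(x, loc=6.0, scale=1.0)

# Usage 1 (optimization style): keep only the value with the largest density.
print("best point estimate:", x[np.argmax(pdf)])    # close to 2.0

# Usage 2 (distribution style): keep the whole distribution, here summarized by
# fitting a single Gaussian through its mean and standard deviation.
mean = np.sum(x * pdf) * dx
std = np.sqrt(np.sum((x - mean) ** 2 * pdf) * dx)
print("Gaussian fit: mean =", mean, "std =", std)
```

Note that the single-Gaussian summary loses the two-peak structure of the original density; whether that loss is acceptable is exactly the trade-off between the two usages.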

2. "Bayes' rule" in unmanned vehicles

Bayes' rule is also called the Bayesian probability density formula. Remember the name - Bayes - for future extended reading.

After understanding the meaning of probability density, we can happily move on to Bayes' rule. In particular, we will interpret it in the context of state estimation for unmanned vehicles.

Before dissecting Bayes' rule, let's first admire this simple, beautiful and symmetric formula.

p(A|B) = \frac{p(B|A) \, p(A)}{p(B)}

Having admired it, let's discuss each part of the formula.

2.1 Joint probability density p(A, B)

Why suddenly mention joint probability? There is no joint probability density in Bayes' rule!?

No, I am not mistaken: Bayes' rule itself is derived from the joint probability density.

p(A, B) = p(A|B) * p(B) = p(B|A) * p(A)
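
The symmetry of these two factorizations is exactly where Bayes' rule comes from. Here is a tiny numerical check with a made-up discrete joint distribution over two binary variables:

```python
# A made-up joint distribution p(A, B) over two binary random variables.
p_joint = {
    (0, 0): 0.10, (0, 1): 0.30,
    (1, 0): 0.20, (1, 1): 0.40,
}

# Marginals p(A) and p(B) by summing the joint over the other variable.
p_A = {a: sum(p for (ai, bi), p in p_joint.items() if ai == a) for a in (0, 1)}
p_B = {b: sum(p for (ai, bi), p in p_joint.items() if bi == b) for b in (0, 1)}

a, b = 1, 1
p_A_given_B = p_joint[(a, b)] / p_B[b]   # p(A|B)
p_B_given_A = p_joint[(a, b)] / p_A[a]   # p(B|A)

# Both factorizations recover the same joint probability:
print(p_A_given_B * p_B[b], p_B_given_A * p_A[a], p_joint[(a, b)])
# Dividing the second factorization by p(B) gives Bayes' rule:
print(p_A_given_B, p_B_given_A * p_A[a] / p_B[b])
```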

As the name implies, joint probability expresses the probability that two random variables occur at the same time. In order to strengthen the understanding of the concept, let's do an exercise first.

In the measurement model of the unmanned vehicle, A is the sensor measurement y, and B is the pose x of the vehicle at the current moment. The joint probability p(y, x) therefore represents the probability density value when the vehicle pose is x and the measurement is y.

Let's do another exercise. In the motion model of the unmanned vehicle, A is the pose xt of the vehicle at the current moment, and B consists of two state quantities: the pose xt-1 of the vehicle at the previous moment and the motion increment v for the current step. The joint probability p(xt, xt-1, v) therefore represents the probability density value when the poses at this moment and the previous moment are xt and xt-1 and the motion increment is v.

Exercises like these look very simple, but repeating them helps to internalize the meaning of probability density and builds fluency with Bayesian derivations.
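
As a sketch of the first exercise, with assumed Gaussian models and made-up numbers, the joint density p(y, x) can be evaluated through the factorization p(y|x) * p(x):

```python
from scipy.stats import norm

# Assumed prior over the 1-D pose x and an assumed measurement model:
# x ~ N(5.0, 1.0), and given x the measurement satisfies y ~ N(x, 0.2).
def p_x(x):
    return norm.pdf(x, loc=5.0, scale=1.0)

def p_y_given_x(y, x):
    return norm.pdf(y, loc=x, scale=0.2)

def p_joint(y, x):
    # Joint probability density p(y, x) = p(y|x) * p(x)
    return p_y_given_x(y, x) * p_x(x)

# Density value when the vehicle pose is x = 5.1 and the measurement is y = 5.3.
print(p_joint(5.3, 5.1))
```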

2.2 Prior probability p(A)

Don't be fooled by the name; it is just a probability.

In layman's terms, it is a somewhat "self-righteous" probability, based entirely on limited knowledge, information, or experience. As for how it is obtained: it can be a probability summarized from one's own experience, or provided by the supplier of the equipment (in practice these are frequencies obtained from a large number of repeated experiments), or computed in the previous step of a continuing derivation. It carries a lot of subjectivity.

For example, in an unmanned vehicle system, p(x) represents the probability that the vehicle pose is x at the current moment, and p(y) represents the probability that the latest sensor measurement is y.
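
As a toy illustration (a discretized 1-D pose with made-up numbers), a prior p(x) could simply be a Gaussian around our current best guess of the pose:

```python
import numpy as np
from scipy.stats import norm

# Discretize a 1-D pose axis and place a Gaussian prior around the best guess.
x_grid = np.linspace(0.0, 10.0, 1001)
dx = x_grid[1] - x_grid[0]

prior = norm.pdf(x_grid, loc=5.0, scale=1.0)   # p(x): "we believe we are near 5 m"
print(np.sum(prior) * dx)                      # integrates to about 1, as a density should
```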

2.3 Conditional probability p(B|A)

Conditional probability is also easy to understand. The general idea is: given that condition A holds, how much confidence do we have in random event B?

Again taking the measurement model of the unmanned vehicle as an example, p(y|x) denotes the probability of obtaining sensor measurement y when the current pose of the vehicle is x. In some sources it is also called the probability model of the sensor measurement.
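
Continuing the same toy setup (an assumed range-style sensor with Gaussian noise of made-up standard deviation), the measurement model p(y|x) can be evaluated as a function of the pose x for a fixed measurement y:

```python
import numpy as np
from scipy.stats import norm

SENSOR_STD = 0.2   # assumed standard deviation of the sensor noise

def likelihood(y, x):
    """p(y|x): density of observing measurement y when the true pose is x."""
    return norm.pdf(y, loc=x, scale=SENSOR_STD)

# For a fixed measurement y = 5.3, evaluate the likelihood over a grid of poses.
x_grid = np.linspace(0.0, 10.0, 1001)
values = likelihood(5.3, x_grid)
print(x_grid[np.argmax(values)])   # 5.3: the likelihood peaks where x matches y
```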

2.4 Marginal probability p(B)

The name "marginal probability" comes from the way discrete marginal probabilities used to be computed by hand. When the values of p(m, n) are written into a grid with m rows and n columns, it is natural to sum each row and write the result P(m) at the right-hand edge of the paper, next to each row. Note: this is where the word "margin" comes from.

For continuous variables, we need to use integrals instead of sums:

p(b) = ∫ p(b, a) da

The meaning is the same as for the discrete marginal probability: for any value b of the random variable, integrate the joint probability density p(b, a) over all possible values of a to obtain the probability of b on its own.
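
A numerical version of this integral, reusing the toy prior and measurement model from above (all numbers are assumed):

```python
import numpy as np
from scipy.stats import norm

x_grid = np.linspace(0.0, 10.0, 1001)
dx = x_grid[1] - x_grid[0]

prior = norm.pdf(x_grid, loc=5.0, scale=1.0)       # p(x)
y = 5.3
likelihood = norm.pdf(y, loc=x_grid, scale=0.2)    # p(y|x) as a function of x

# Marginal: p(y) = integral over x of p(y|x) * p(x) dx
p_y = np.sum(likelihood * prior) * dx
print(p_y)
```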

Chances are that when you first encounter this technical term, you will find it hard to grasp. (In SLAM, when you encounter bundle adjustment, there is a similar concept called marginalization, which is just as hard to grasp at first. Another hole dug here, to be expanded in the SLAM articles.) My own way of thinking about it: the marginal p(B) is an upper bound on the joint probability p(A, B). Since a conditional probability is at most 1, the definition of the joint probability gives

p(A, B) = p(A|B) * p(B) ≤ p(B)

Of course, p(A) can equally well be called a marginal probability. It is just a technical name.

In the unmanned vehicle system, the interpretations of p(B) and p(A) are similar to those above, so they are not repeated here.

2.5 Posterior probability p(A|B)

Clearly, the posterior probability is the counterpart of the prior probability. In layman's terms, it is hindsight.

Only after you have the result do you go back and discuss the cause; in other words, you analyze the cause from the result. In Bayesian probability derivations, it usually means: when the observer's knowledge or information increases by B, what is the probability that random event A occurs? It is an update of the prior probability p(A) of that event, reflecting the change in the observer's knowledge or information.

In the unmanned vehicle system, the common posterior probability is p(x|y), which indicates the probability that the vehicle pose is x given that the sensor measurement is y. In particular, p(xt|xt-1, v) represents the probability that the updated pose of the vehicle is xt, given that its previous pose was xt-1 and it has undergone a motion increment v. It can also be regarded as the probabilistic model of the state transition (or motion) of the unmanned vehicle.
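
Putting the pieces together, here is a minimal sketch of the posterior update p(x|y) ∝ p(y|x) * p(x) on the same toy 1-D example (all parameters are made up for illustration):

```python
import numpy as np
from scipy.stats import norm

# Discretized 1-D pose, a prior belief, and a range measurement with Gaussian noise.
x_grid = np.linspace(0.0, 10.0, 1001)
dx = x_grid[1] - x_grid[0]

prior = norm.pdf(x_grid, loc=5.0, scale=1.0)       # p(x)
y = 5.3
likelihood = norm.pdf(y, loc=x_grid, scale=0.2)    # p(y|x)

evidence = np.sum(likelihood * prior) * dx         # p(y), the marginal
posterior = likelihood * prior / evidence          # p(x|y), Bayes' rule

print("prior peak:    ", x_grid[np.argmax(prior)])
print("posterior peak:", x_grid[np.argmax(posterior)])
print("posterior integrates to:", np.sum(posterior) * dx)
```

In this sketch the posterior peak lands between the prior guess and the measurement, but much closer to the measurement, because the assumed sensor noise is much smaller than the prior uncertainty; the posterior also integrates to 1, as a proper density should.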

Final words

Important things should be said three times.

In Bayes' rule, probability density functions are used instead of probabilities.

In Bayes' rule, probability density functions are used instead of probabilities.

In Bayes' rule, probability density functions are used instead of probabilities.

The probability density function comes with many technical terms that feel like a mouthful and seem distant when you first encounter them. But if you think about them carefully, you will find that the naming is actually quite natural.

This is how I understand the underlying logic of unmanned vehicles seeing the world.

I hope it inspires and helps you.

 


Origin blog.csdn.net/slampai/article/details/127856872