Understanding Kalman filtering in simple terms

Important note: This article is compiled from information found online. It only records the blogger's process of learning the relevant topics and has not been edited down.

1. Reference materials

  • The Kalman filter I understand
  • About the Kalman filter: an easy-to-understand tutorial
  • A simple analysis of the Kalman filter
  • How a Kalman filter works, in pictures
  • Talk about Kalman filter
  • Extended Kalman Filter (EKF) With Python Code Example
  • Kalman filter
  • Kalman Filter Algorithm
  • SLAM Derivation of Mobile Robot Technology (7.5) -- Kalman Filter and Extended Kalman Filter for State Estimation

Video materials

  • From giving up to mastering! Kalman filtering, from theory to practice~
  • [Kalman Filter] The most detailed mathematical derivation on the entire network
  • How to explain Kalman filtering in as simple and detailed a way as possible?

2. Related concepts

Reference: Dirichlet distribution of probability distributions (prior, posterior, likelihood)

1. Probability distribution

Take radar ranging as an example. The radar measures that the missile is at a position of 7 m with probability 0.8, at 7.2 m with probability 0.1, and at 6.9 m with probability 0.1. These data are called a probability distribution: data consisting of a set of values together with the probabilities of those values occurring is called a probability distribution.

2. Noise error

When using the compass app on a mobile phone, the reading still fluctuates within a certain range even though the phone is clearly not moving; a scale behaves the same way, its reading keeps changing even though the object on it is still. These phenomena are noise errors. LiDAR, millimeter-wave radar, temperature sensors, and so on all exhibit noise errors in use.

  • The probability distribution of the distance between the missile and the target measured by the radar at time t is $z_t = N(7,\,0.1^2)$, where 0.1 is the radar's noise error.
  • The probability distribution of the missile's position at time t−1 is $x_{t-1} = N(10,\,0.2^2)$, where 0.2 is the noise error of the missile position at the previous moment.

Kalman filtering reduces the noise error by fusing the radar measurement with the missile state from the previous moment. It consists of two main steps:

  1. Based on the estimated missile position at the previous moment and the missile's speed (assuming uniform motion), roughly predict the missile's position at the current moment (a rough estimated value):
     $$\text{rough estimate of } x \text{ at time } t = x_{t-1} - v_{t-1} = N(10,\,0.2^2) - N(4,\,0.7^2) = N(6,\,0.2^2 + 0.7^2)$$

  2. Linearly weight the radar measurement of the missile position against the rough estimated value to obtain a precise estimate of the missile position. This value is called the best estimate.

    Simply multiply these two probability distributions to get the probability distribution of the precise estimate. Note that $0.2^2 + 0.7^2 = 0.53$ is a variance:
    $$x_t = \big(\text{rough estimate of } x \text{ at time } t\big) \times z_t = N(6,\,0.2^2 + 0.7^2) \times N(7,\,0.1^2) = N(6,\,0.53) \times N(7,\,0.1^2)$$

    Applying the Gaussian multiplication formula from Section 3.3 with $\sigma_1^2 = 0.53$ and $\sigma_2^2 = 0.1^2$:
    $$N(6,\,0.53) \times N(7,\,0.1^2) = N\!\left(\frac{6 \cdot 0.1^2 + 7 \cdot 0.53}{0.53 + 0.1^2},\ \frac{0.53 \cdot 0.1^2}{0.53 + 0.1^2}\right)$$

    Rewriting the mean as a weighted combination:
    $$x_t = N\!\left(\Big(1 - \frac{0.53}{0.53 + 0.1^2}\Big) \cdot 6 + \frac{0.53}{0.53 + 0.1^2} \cdot 7,\ \frac{0.53 \cdot 0.1^2}{0.53 + 0.1^2}\right)$$

    Among them, $\frac{0.53}{0.53 + 0.1^2}$ is called the Kalman gain.
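To make the fusion concrete, here is a minimal Python sketch of this one-dimensional Gaussian fusion, using the numbers from the missile example; the function name `fuse_gaussians` is ours, not from any library.

```python
import math

def fuse_gaussians(mu1, var1, mu2, var2):
    """Fuse two 1-D Gaussian estimates N(mu1, var1) and N(mu2, var2)."""
    K = var1 / (var1 + var2)    # Kalman gain: weight given to the second estimate
    mu = mu1 + K * (mu2 - mu1)  # fused mean
    var = (1 - K) * var1        # fused variance, equal to var1*var2/(var1+var2)
    return mu, var, K

# Rough prediction N(6, 0.2^2 + 0.7^2) fused with the radar measurement N(7, 0.1^2)
mu, var, K = fuse_gaussians(6.0, 0.2**2 + 0.7**2, 7.0, 0.1**2)
print(f"Kalman gain K = {K:.3f}")                             # ~0.981
print(f"best estimate: N({mu:.3f}, {math.sqrt(var):.3f}^2)")  # ~N(6.981, 0.099^2)
```

Because the radar's variance (0.01) is far smaller than the prediction's (0.53), the gain is close to 1 and the fused estimate sits almost on the measurement.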

3. Gaussian distribution

The Gaussian distribution, also known as the normal distribution, is written $N(\mu, \sigma^2)$, where the parameters $\mu$ and $\sigma^2$ are the expectation and the variance of the distribution, respectively. Once these two parameters are fixed, the probability density $p(x)$ is determined. In particular, when $\mu = 0$ and $\sigma^2 = 1$, the distribution is the standard normal distribution. A Gaussian distribution attains its highest probability density at the mean.

For a Gaussian random variable, the probability density function is:
$$\mathcal{N}(x;\,\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

3.1 Mean and variance

In a normal distribution, the variance of a random variable represents the degree of uncertainty in that variable. Variance serves as a measure of credibility: the larger the variance, the less credible the value.

The mean $\mu$ is the mathematical expectation, $\mu = \operatorname{E}(X)$. The variance $D(X) = \sigma^2 = \operatorname{E}\{[x - \operatorname{E}(x)]^2\}$ measures the dispersion of a random variable. The square root of the variance is the standard deviation $\sigma$.

3.2 Covariance

Reference: Introduction to the Covariance Matrix

Covariance describes the degree of correlation between two variables. The mathematical expression is:
$$\begin{aligned} \operatorname{Cov}(X, Y) &= E\left[(X - E[X])(Y - E[Y])\right] \\ &= E[XY] - 2E[X]E[Y] + E[X]E[Y] \\ &= E[XY] - E[X]E[Y] \end{aligned}$$

3.3 Gaussian distribution multiplication formula

$$N(\mu_1, \sigma_1^2) \times N(\mu_2, \sigma_2^2) = N\!\left(\frac{\sigma_1^2 \mu_2 + \sigma_2^2 \mu_1}{\sigma_1^2 + \sigma_2^2},\ \frac{\sigma_1^2 \sigma_2^2}{\sigma_1^2 + \sigma_2^2}\right)$$
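As a sanity check of this formula, the sketch below (assuming NumPy and SciPy are available; the means and standard deviations are arbitrary) compares the closed-form result with a numerical pointwise product of the two densities:

```python
import numpy as np
from scipy.stats import norm

mu1, s1 = 6.0, 0.5
mu2, s2 = 7.0, 0.3

# Closed form from the multiplication formula
mu = (s2**2 * mu1 + s1**2 * mu2) / (s1**2 + s2**2)
var = (s1**2 * s2**2) / (s1**2 + s2**2)

# Numerical check: the pointwise product of the two densities, renormalized,
# should again be a Gaussian with the mean and variance above
x = np.linspace(0.0, 12.0, 200001)
prod = norm.pdf(x, mu1, s1) * norm.pdf(x, mu2, s2)
prod /= np.trapz(prod, x)                      # renormalize to a density
mu_num = np.trapz(x * prod, x)                 # numerical mean
var_num = np.trapz((x - mu_num)**2 * prod, x)  # numerical variance

print(mu, mu_num)    # both ~6.735
print(var, var_num)  # both ~0.0662
```

(Strictly, the product of two Gaussian densities is only proportional to a Gaussian density, hence the renormalization step.)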

4. Bayesian formula

$$\text{posterior } p(\theta \mid x) = \frac{\text{prior } p(\theta) \times \text{likelihood } p(x \mid \theta)}{\text{total probability } P(x)}$$

Here the total probability $P(x)$ is a constant independent of $\theta$, so the posterior probability is proportional to the product of the prior and the likelihood.

5. Prior probability vs posterior probability

The prior probability is an assessment, made in advance, of how likely a certain event is to happen; the posterior probability is the probability that something which has already happened was caused by a certain cause.

5.1 Prior probability (prior)

Prior probability $p(\theta)$: before the event has occurred, the probability that the random variable $\theta$ occurs.

5.2 Posterior probability (posterior)

Posterior probability $p(\theta \mid x)$: given that the event $x$ has occurred, the probability that the random variable $\theta$ occurs.

5.3 Likelihood

Likelihood $p(x \mid \theta)$: the likelihood function is a function of the parameters that describes how reasonable the model parameters are given a specific observed value. That is, it measures the possibility of going from "cause" to "effect".

5.4 Examples

There are three random events A, B, and C, and the possible outcome is D.

  • A: Rain;
  • B: Traffic jam;
  • C: Got up late;
  • D: Late.

The probabilities of events A, B, and C occurring, $P(A), P(B), P(C)$, are the prior probabilities; the prior probabilities are independent of the outcome of being late.

Given that the lateness event has occurred, the probabilities that it was caused by each event, written $P(A \mid D), P(B \mid D), P(C \mid D)$, are called the posterior probabilities.

The probabilities that D (being late) occurs given that event A, B, or C occurred, written $P(D \mid A), P(D \mid B), P(D \mid C)$, are called the likelihoods. Which of the three causes (rain, traffic jam, getting up late) is it most reasonable to attribute the lateness to? The most reasonable answer is what we often call the maximum likelihood.
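A small sketch of this example in Python; all the prior and likelihood numbers below are made up purely for illustration:

```python
# Hypothetical priors for rain (A), traffic jam (B), getting up late (C),
# and likelihoods of being late (D) given each cause
priors = {"A": 0.3, "B": 0.5, "C": 0.2}
likelihoods = {"A": 0.4, "B": 0.6, "C": 0.9}  # P(D|A), P(D|B), P(D|C)

# Total probability P(D), summing over the causes
p_D = sum(likelihoods[c] * priors[c] for c in priors)

# Posterior P(cause | D) by the Bayesian formula
posteriors = {c: likelihoods[c] * priors[c] / p_D for c in priors}
print(posteriors)  # {'A': 0.2, 'B': 0.5, 'C': 0.3} -> B is the most probable cause
```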

6. Observed vs Predicted

The predicted value, also known as the estimate, is the value predicted by a mathematical model (Prediction).

The observed value, also known as the actual value, is the sensor measurement (Measurement). For example, a lidar directly measures a distance of 7 m between the robot and an obstacle.

7. Filtering and Filter (Filter)

Filtering is weighting. The function of filtering is to give different weights to different signal components. The simplest low-pass filter directly gives the low-frequency components a weight of 1 and the high-frequency components a weight of 0. More complex filters, such as the Wiener filter, design the weights based on statistical knowledge of the signal.

From the perspective of statistical signal processing, noise reduction can be regarded as a type of filtering. The purpose of noise reduction is to highlight the signal itself and suppress the influence of noise. From this perspective, noise reduction means giving a high weight to the signal and a low weight to the noise. Wiener filter is a typical noise reduction filter.

7.1 Filtering algorithm in the field of robotics

The filtering algorithm in the field of robotics refers to estimating the true position of the robot from noisy robot position data . Commonly used filtering algorithms include Bayes filtering and Kalman filtering.

According to the motion model (a mathematical-physical model), a rough estimated value of the robot's pose can be obtained; a filtering algorithm then fuses this rough estimate with the actual sensor measurements (e.g., GPS, gyroscope) to obtain an accurate estimated value of the robot's position and attitude.

7.2 Low pass filter

A filter that passes low-frequency signals easily is called a low-pass filter.

8. Motion model

Given the robot's position at the previous moment and a control command, the expected position of the robot after executing the command is computed through a series of formulas; this series of formulas is the motion model. Before implementing a filtering algorithm, the robot's motion must be modeled first, that is, the robot's motion model must be established. Two robot motion models are commonly used: the odometry-based motion model and the velocity-based motion model.

Odometry-based robot motion model

Taxis have an odometer that counts how far the tires have rotated. Intuitively, odometry is the amount of change in some quantity; the change can be a distance or an angle.

The odometry-based motion model (Odometry-based Motion Model) solves the following problem: given the robot's position and heading angle at the previous moment, and the expected change in moving distance and angle between the previous moment and now, find the robot's position and heading angle at the current moment. For example, if the robot's position at the previous moment was (1, 0) and the change over the interval was (0.1, 0.2), the motion model computes the estimated current position as (1, 0) + (0.1, 0.2) + (a random variable, a random variable), where the random terms model the motion noise; see the sketch below.
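A minimal sketch of this simplified model, with additive Gaussian noise standing in for the two random-variable terms (a full odometry model also handles the heading angle; the noise level here is an arbitrary choice):

```python
import numpy as np

def odometry_motion_model(pose, delta, noise_std):
    """Simplified odometry-based motion model for a 2-D position.

    pose:      (x, y) position at the previous moment
    delta:     (dx, dy) change reported by the odometer
    noise_std: standard deviation of the motion noise per axis
    """
    noise = np.random.normal(0.0, noise_std, size=2)  # the random-variable terms
    return np.asarray(pose) + np.asarray(delta) + noise

# The example from the text: previous pose (1, 0), odometry change (0.1, 0.2)
print(odometry_motion_model((1.0, 0.0), (0.1, 0.2), noise_std=0.01))
```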

9. Control model

Given a control command and the robot's position information, the control model tells us how far the robot is expected to travel and how much its body orientation changes after executing the command.

10. State estimation

10.1 Problem introduction

Take a simple example to describe the state estimation problem.

In a one-dimensional space, there is a signboard at coordinate $y_0$ along the positive direction of the coordinate axis. We stay put and use a rangefinder to measure the distance to the sign as $d$. What, then, is our current coordinate $x$?
[Figure: one-dimensional ranging setup with the observer at x and the signboard at y0]

The coordinate of the signboard $y_0$ is known, so as long as the distance $d$ to the sign is measured, we can obtain our coordinate $x = y_0 - d$.

However, our measuring instrument has an error, so the calculated result is not accurate. We can think about it this way: since we have not moved, we can measure multiple times and take the average. At this point, we have fully constructed a state estimation problem:

  • In a one-dimensional space, estimate your own coordinate $x$;
  • A solution is given: since the coordinate $y_0$ of the signboard is known, it can be used as a reference;
  • Use a measuring instrument to measure the distance $d$ from yourself to the sign, and use it to back-calculate your own coordinate $x$.

We call it "estimation" because we cannot spend infinite time making infinitely many measurements to obtain the true value of the coordinate $x$. But when the number of measurements is large enough, the result is considered accurate enough; that is, some means is used to approximate the true value.

10.2 Concepts related to state estimation

The above example can explain what is the state estimation of a discrete-time, linear, time-invariant, Gaussian system. Specifically, it can be explained as:

  • Discrete-time: our measurements are not continuous in time;

  • Linear: describes the type of mathematical model of the sensor (rangefinder). Since it is an equation relating an observation to a reference object, we also call it the observation equation / observation model. From the reference $y$ and the ranging result $d$, we calculate our own coordinate $x$, i.e. $x = f(y, d) = y - d$; this model is a linear function;

  • Time-invariant: the mapping in the sensor's mathematical model does not change with time. Time-invariant does not mean that the state to be estimated $x$ does not change with time; it means that the mapping $f$ between the state and the observation does not change with time. Simply put, the observation equation does not change over time. For example, the observation equation $f(x, y) = y - x$ will not turn into $f(x, y) = y - 2x$ or some other mapping;

  • Gaussian: describes the type of the measurement noise. The measurement noise $n$ obeys a Gaussian distribution.

    Considering that the rangefinder has an error, i.e. there is measurement noise, the measured value $d$ is actually the true value combined with measurement noise. The measurement equation is therefore refined by adding a noise term $n$ to the theoretical result:
    $$d = \underbrace{f(x, y)}_{\text{theoretical result} = d_{true}} + n$$
    An ideal assumption is that the measurement noise $n$ follows a Gaussian distribution.

10.3 Measurements based on the law of large numbers

Measurement noise that follows a Gaussian distribution is a random variable: repeated measurements give results that differ and fluctuate around the true value. But when the number of measurements $K$ is large enough, the accumulated noise terms tend toward their mean. If the mean of the noise is 0, an accurate result can be obtained approximately:
$$d_{mean} = \frac{1}{K}\sum d_k = d_{true} + \underbrace{\frac{1}{K}\sum n_k}_{\approx 0} \approx d_{true}$$
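A quick simulation of this averaging effect (the true distance and the noise level are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
d_true = 7.0   # true distance to the sign
sigma_n = 0.5  # standard deviation of the zero-mean Gaussian measurement noise

for K in (10, 100, 10000):
    d_k = d_true + rng.normal(0.0, sigma_n, size=K)  # K noisy measurements
    print(K, d_k.mean())  # the average approaches d_true as K grows
```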

3. Macroscopic understanding of Kalman filtering

Reference: Kalman filter series 1: Kalman filter

The Kalman filtering algorithm is a recursive predictive filtering algorithm. The algorithm involves filtering and also involves the prediction of data at the next moment.

1. Introduction to the Kalman filter

Let's introduce Kalman filtering with a small everyday example. There is a coin and a ruler, and the goal is to obtain the size of the coin as accurately as possible. We measure $k$ times in total, recording the results as $Z_1, Z_2, \ldots, Z_k$. How do we estimate the size of the coin? A natural idea is to take the average. The calculation proceeds as follows:
$$\hat{X}_k = \frac{1}{k}\sum_{i=1}^{k} Z_i = \frac{1}{k}\Big[(k-1)\hat{X}_{k-1} + Z_k\Big] = \hat{X}_{k-1} + \frac{1}{k}\big(Z_k - \hat{X}_{k-1}\big)$$

A simple way to see it: the estimate after a given measurement depends on the estimate from the previous one, which embodies the recursive idea of Kalman filtering. Expressed in words:

Current estimated value = last estimated value + coefficient * (current measured value - last estimated value)

$$\hat{X}_k = \hat{X}_{k-1} + \frac{1}{k}\big(Z_k - \hat{X}_{k-1}\big)$$

Parameter explanation:

  • $\hat{X}_k$ represents the $k$-th estimated value;
  • $Z_k$ represents the $k$-th measured value;
  • $\hat{X}_{k-1}$ represents the $(k-1)$-th estimated value.

It can be seen from the above formula that the $k$-th estimate = the $(k-1)$-th estimate + $\frac{1}{k}$ ($k$-th measurement − $(k-1)$-th estimate). As the number of measurements $k$ increases, $\frac{1}{k}$ tends to 0 and the formula becomes $\hat{X}_k = \hat{X}_{k-1}$; that is, for large $k$ the $k$-th estimate depends only on the $(k-1)$-th estimate and barely on the $k$-th measurement, so the measurements matter less and less. This is easy to understand: after many measurements, the previous estimate is already very close to the true value, so one more measurement adds little. Conversely, if $k$ is very small, say $k = 1$ ($k$ is the number of measurements; $k = 0$ is meaningless), the formula becomes $\hat{X}_k = Z_k$; that is, the estimate is completely determined by the measurement.
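A short sketch verifying that the recursive formula reproduces the batch average (the measurement data are synthetic):

```python
import numpy as np

rng = np.random.default_rng(1)
Z = 20.0 + rng.normal(0.0, 0.3, size=50)  # 50 noisy coin-size measurements

x_hat = Z[0]  # the first estimate is the first measurement
for k in range(2, len(Z) + 1):
    x_hat = x_hat + (1.0 / k) * (Z[k - 1] - x_hat)  # recursive update

print(x_hat, Z.mean())  # identical up to floating-point error
```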

2. Kalman coefficient

Replace $\frac{1}{k}$ in the formula above with $K_k$; this $K_k$ is the Kalman coefficient.

$K_k$ is calculated as:
$$K_k = \frac{e_{EST_{k-1}}}{e_{EST_{k-1}} + e_{MEA_k}}$$
where $e_{EST}$ is the estimation error and $e_{MEA}$ is the measurement error.

Macroscopic understanding of this formula:

  • When $e_{EST} \gg e_{MEA}$, i.e. the estimation error is much larger than the measurement error, $K_k$ tends to 1 and $\hat{X}_k$ tends to $Z_k$, indicating that we trust the measured value.
  • When $e_{EST} \ll e_{MEA}$, i.e. the estimation error is much smaller than the measurement error, $K_k$ tends to 0 and $\hat{X}_k$ tends to $\hat{X}_{k-1}$, indicating that we trust the estimate.
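Putting the Kalman coefficient to work, here is a minimal scalar iteration consistent with the two cases above, assuming a static true value; the shrinking rule `e_est = (1 - K) * e_est` for the estimation error is the usual companion update in this scalar scheme, and all the numbers are hypothetical:

```python
measurements = [51.2, 49.8, 50.5, 50.1, 49.9]  # hypothetical sensor readings

x_hat = 48.0  # initial estimate
e_est = 5.0   # initial estimation error
e_mea = 3.0   # measurement error (assumed constant)

for z in measurements:
    K = e_est / (e_est + e_mea)      # Kalman coefficient
    x_hat = x_hat + K * (z - x_hat)  # blend the estimate and the measurement
    e_est = (1 - K) * e_est          # the estimation error shrinks each step
    print(f"K = {K:.3f}  x_hat = {x_hat:.3f}")
```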

3. Kalman filter data fusion

This chapter uses a simple example to demonstrate the role of Kalman filter data fusion.

There are an electronic scale and an ordinary scale, and the same object is weighed on both at the same time. The electronic scale measures a weight of $Z_1 = 30$ with standard deviation $\sigma_1 = 2$; the ordinary scale measures $Z_2 = 32$ with standard deviation $\sigma_2 = 4$.

Assuming that they satisfy the normal distribution, you can draw the distribution diagram of the weight of the objects measured by the two scales, as follows:
[Figure: Gaussian distributions of the two measurements Z1 and Z2]

In the figure above, Z1 shows the distribution of the electronic scale's measurement and Z2 that of the ordinary scale's. Before doing the calculation, we can guess the result: the standard deviation of the electronic scale is only 2 while that of the ordinary scale is 4, which means the electronic scale's measurements are more stable and likely more accurate, so the fused result should be biased toward the electronic scale, i.e. closer to 30. Of course, this is just a guess; now let's calculate and see whether the optimal estimate is consistent with it.

Taking the first measurement $Z_1$ as the estimated value and the second measurement $Z_2$ as the measured value, the estimate $\hat{Z}$ is expressed as:
$$\hat{Z} = Z_1 + k\,(Z_2 - Z_1)$$
To make $\hat{Z}$ optimal, the standard deviation $\sigma_{\hat{Z}}$ of $\hat{Z}$, i.e. the variance $\operatorname{var}(\hat{Z})$, must be minimized. Since the two measurements are independent:
$$\sigma_{\hat{Z}}^2 = \operatorname{var}\big((1-k)Z_1 + kZ_2\big) = (1-k)^2\sigma_1^2 + k^2\sigma_2^2$$

Given the expression $\sigma_{\hat{Z}}^2 = (1-k)^2\sigma_1^2 + k^2\sigma_2^2$, to minimize $\sigma_{\hat{Z}}^2$ we take its derivative with respect to $k$, set it to 0, and solve for the extremum:
$$\frac{d\sigma_{\hat{Z}}^2}{dk} = -2(1-k)\sigma_1^2 + 2k\sigma_2^2 = 0 \;\Rightarrow\; k = \frac{\sigma_1^2}{\sigma_1^2 + \sigma_2^2} = \frac{2^2}{2^2 + 4^2} = 0.2$$
$$\hat{Z} = 30 + 0.2 \times (32 - 30) = 30.4, \qquad \sigma_{\hat{Z}}^2 = 0.8^2 \times 2^2 + 0.2^2 \times 4^2 = 3.2, \qquad \sigma_{\hat{Z}} \approx 1.79$$

So far we have the final estimated value and its standard deviation. The result is closer to 30, as guessed, and its standard deviation is smaller than that of either of the two measurements. The distributions of the three results are shown below:
[Figure: distributions of Z1, Z2, and the fused result (after_merge)]

after_merge represents the result after fusion. As can be seen from the above figure, the distribution of the result after fusion is higher and thinner, and the effect is better.
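The same two-scales fusion, computed in a few lines with the numbers from the example:

```python
import math

z1, s1 = 30.0, 2.0  # electronic scale: measurement and standard deviation
z2, s2 = 32.0, 4.0  # ordinary scale: measurement and standard deviation

k = s1**2 / (s1**2 + s2**2)                    # optimal weight from d(var)/dk = 0
z_fused = z1 + k * (z2 - z1)                   # fused estimate
var_fused = (1 - k)**2 * s1**2 + k**2 * s2**2  # fused variance

print(k, z_fused, math.sqrt(var_fused))  # 0.2, 30.4, ~1.79
```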

4. Popular understanding of Kalman filtering

Reference: How to explain Kalman filtering in simple terms and in as much detail as possible?

Suppose you raise a pig:

One week ago, the pig's weight was 46±0.5kg. Note that I used ±0.5 here, which means that I am not so sure about the weight of this pig a week ago, that is to say, the weight of 46kg has an error of 0.5kg.

Now, I've had it for another week. So I want to know how much it will weigh in one week, and what is the approximate error?

In order to get the weight after one week, I have two methods: one is to calculate an approximate value based on the pig weight formula obtained from my many years of pig raising experience, and the other is to directly weigh it. Of course, both methods have certain errors. Assume that the weight obtained by the empirical formula is 48kg, with an error of 2kg; the weight obtained by directly weighing is 49kg, with an error of 1kg.

However, I am a Virgo, and I feel that whether it is the value obtained by the empirical formula or the value obtained by direct weighing, it is not accurate enough. I hope there is a method that can combine the weight of the pig a week ago, the value estimated by the empirical formula and the value obtained by direct weighing, and comprehensively consider to obtain a value that is closest to the real weight of the pig and has the smallest error. This is what Kalman filtering does.

Now let’s abstract the pig-raising model into a mathematical formula:
[Figure: the pig-raising model abstracted into state equations, with the states at times k−1 and k each written as the optimal estimated value plus an error term e]

On the left side of the figure above, the pig's weight in the previous week is abstracted as the state value at time k−1, represented by the optimal estimated value at time k−1 plus an error term; the right side is analogous for time k. Here,
$$P_k = \mathbb{E}\left[\mathbf{e}_k \mathbf{e}_k^{\mathrm{T}}\right]$$
This term represents the covariance of the estimated value. Here are two points:

  1. All variables in the above figure are in bold, indicating that they are a vector or a matrix;
  2. The reason a (column) vector rather than a single number is used to represent the state value is that, although a pig's weight can be represented by one value, many states in practical applications cannot be represented by a single number (for example, a missile's position in space has three coordinates: x, y, and z).

The right side of the figure above represents the state value at time k, which can be estimated through the prediction module (i.e., estimating the pig's weight from the empirical formula) and the error-correction module (i.e., directly weighing the pig). Likewise, the prediction module and the error-correction module each have a corresponding error and error covariance matrix. What Kalman filtering does is, according to the theory of Bayesian estimation and taking the covariances of the prediction and correction modules into account, give a larger weight to the term with small error and a smaller weight to the term with large error, minimizing the prediction error.

The specific implementation process is as follows:
The implementation follows the standard Kalman filter recursion (prediction followed by correction), written here in the same notation as the formulas in the next chapter, where $Q$ and $R$ denote the process and measurement noise covariances:
$$\begin{aligned} \text{Predict:}\quad & \hat{x}_k^- = A\hat{x}_{k-1} + Bu_k, \qquad P_k^- = AP_{k-1}A^{\mathrm{T}} + Q \\ \text{Update:}\quad & K_k = P_k^- C^{\mathrm{T}}\big(CP_k^- C^{\mathrm{T}} + R\big)^{-1} \\ & \hat{x}_k = \hat{x}_k^- + K_k\big(y_k - C\hat{x}_k^-\big), \qquad P_k = (I - K_kC)P_k^- \end{aligned}$$

4. Introduction to Kalman filtering

1. Introduction

Kalman filtering is described by a series of recursive mathematical formulas, which provide an efficient and computable method to estimate the state of the process and minimize the estimated mean square error. The Kalman filter is widely used and powerful and can estimate the past state, current state, and future state of a signal .

The Kalman Filter (KF) is an optimal estimation algorithm (Optimal Estimation Algorithm), commonly used in drones, autonomous driving, satellite navigation, computer vision, signal processing, and other fields, including robot navigation and control systems, guidance systems, sensor data fusion systems, military radar systems, missile tracking systems, and so on. Its practical function is mainly to update predictions based on sensor measurements to achieve more accurate estimates.

2. Applicable scenarios

The Kalman filter is suitable for estimating the optimal state of a dynamic system. Even if the observed system state parameters contain noise and the observations are inaccurate, Kalman filtering can still complete an optimal estimate of the true value of the state.

The essence of the Kalman filter algorithm is to iterate using the characteristic that the fusion of two normal distributions is still a normal distribution.

3. State Observer

State Observers: used to optimally estimate states that cannot be measured directly but can be measured indirectly.

Scenario: Observing the internal temperature of a rocket injector

Suppose a rocket is flying from the Earth to Mars. The rocket uses liquid hydrogen as fuel, but excessive temperature will damage the mechanical parts of the rocket injector, so monitoring the internal temperature of the injector is very important. However, a sensor cannot be placed inside the injector, otherwise it would melt; but we can place a temperature sensor outside the injector to measure the internal temperature indirectly. There is an error between the indirect measurement and the real value, so we can use a state observer to estimate the true value from the indirect measurements.

4. Best state estimator

Optimal State Estimator: Estimates the optimal system state from error-affected sensor measurements.

The Kalman filter itself is an optimal state estimator. It is similar to the state observer mentioned above; the difference is that KF is designed for stochastic systems. The prediction formulas of the state observer and of the optimal state estimator (KF) are as follows:
$$\text{State observer:}\quad \hat{x}_{k+1} = A\hat{x}_k + Bu_k + K\,(y_k - C\hat{x}_k) \qquad \text{(deterministic system)}$$

$$\text{Kalman filter:}\quad \hat{x}_k = A\hat{x}_{k-1} + Bu_k + K_k\big(y_k - C(A\hat{x}_{k-1} + Bu_k)\big) \qquad \text{(stochastic system)}$$

Scenario: Estimating the position of a car in a tunnel

If a car enters a tunnel, the GPS receiver on the car can hardly receive satellite signals; when driving on a street flanked by tall buildings, positioning noise is large due to the multipath effect (signal distortion). We can simply regard both as cases where sensor measurements are affected by errors. To estimate the true position of the car, Kalman filtering is recommended.

The goal of KF is to combine the GPS measurement (Measurement) with the mathematical model's prediction (Prediction) to find the optimal estimated state of the car position.
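A minimal matrix-form sketch of this idea for a car moving at roughly constant velocity, where GPS measures position only; every matrix and noise level below is an illustrative assumption, and the control term Bu_k is omitted for simplicity:

```python
import numpy as np

# State x = [position, velocity]; GPS measures position only.
dt = 1.0
A = np.array([[1.0, dt], [0.0, 1.0]])  # state transition matrix
C = np.array([[1.0, 0.0]])             # measurement matrix (GPS sees position)
Q = 0.01 * np.eye(2)                   # process noise covariance (assumed)
R = np.array([[25.0]])                 # GPS noise covariance (std 5 m, assumed)

x_hat = np.array([0.0, 10.0])          # initial state estimate
P = np.eye(2)                          # initial estimate covariance

def kf_step(x_hat, P, y):
    # Prediction
    x_pred = A @ x_hat
    P_pred = A @ P @ A.T + Q
    # Update with the measurement y
    K = P_pred @ C.T @ np.linalg.inv(C @ P_pred @ C.T + R)  # Kalman gain
    x_new = x_pred + K @ (y - C @ x_pred)
    P_new = (np.eye(2) - K @ C) @ P_pred
    return x_new, P_new

rng = np.random.default_rng(2)
for t in range(1, 6):
    gps = np.array([10.0 * t + rng.normal(0.0, 5.0)])  # noisy position reading
    x_hat, P = kf_step(x_hat, P, gps)
    print(f"t = {t}: position estimate {x_hat[0]:.2f}")
```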

5. Kalman gain

For example, a lidar measures the distance between the robot and an obstacle as 7 m with an accuracy of 0.9; based on the speed, the distance is estimated as 6 m with an accuracy of 0.8. The final distance estimate is:
$$result = \left(1 - \frac{0.9}{0.8 + 0.9}\right) \times 6 + \frac{0.9}{0.8 + 0.9} \times 7 \approx 6.53 \text{ m}$$
Among them, $\frac{0.9}{0.8 + 0.9}$ is known as the Kalman gain, representing the reliability of this sensor measurement relative to the estimate calculated from the speed.

6. Other Kalman filters

The Kalman filter is linear, but in practice many data models are nonlinear; in that case, consider the extended Kalman filter (EKF). Interested readers can also look into the unscented Kalman filter (UKF) and the particle filter, as shown in Figure 12.
[Figure 12: the family of Kalman-style filters: KF, EKF, UKF, particle filter]

5. Kalman filter library

simdkalman

Python implementation of Kalman filtering

Reference: [Easy to understand] How to understand the Kalman filter algorithm that sent Chang'e to heaven?

Assume the missile moves horizontally. At each moment the measured speed is dv, the standard deviation of the speed-measuring instrument is v_std, the GPS sensor measures the missile position as position_noise at each moment, the GPS measurement variance is odo_var, and the variance of the prediction is tracked in predict_var.

# -*- coding: utf-8 -*-

import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(1, 100, 100)  # sample 100 times within 1~100 s
a = 0.5                       # acceleration
position = (a * t**2) / 2     # true position

# Simulated GPS position measurements (with noise)
position_noise = position + np.random.normal(0, 120, size=(t.shape[0]))

plt.plot(t, position, label='truth position')
plt.plot(t, position_noise, label='only use measured position')

# --------------- Kalman filtering ----------------
# Initialize the estimated missile position with the first GPS measurement
predicts = [position_noise[0]]
position_predict = predicts[0]

predict_var = 0
odo_var = 120**2  # variance of the GPS position measurement; the larger it is, the lower the weight given to the measurement
v_std = 50        # standard deviation of the speed-measuring instrument (in real life obtained by sensor calibration, e.g. Allan variance)

for i in range(1, t.shape[0]):
    # Simulate the speed read from an IMU
    dv = (position[i] - position[i - 1]) + np.random.normal(0, v_std)
    # Predict the current position from the previous position and the speed
    position_predict = position_predict + dv
    # Update the variance of the prediction
    predict_var += v_std**2
    # Kalman filtering: fuse the prediction and the measurement
    # (Gaussian multiplication formula)
    position_predict = position_predict * odo_var / (predict_var + odo_var) \
        + position_noise[i] * predict_var / (predict_var + odo_var)
    predict_var = (predict_var * odo_var) / (predict_var + odo_var)
    predicts.append(position_predict)

plt.plot(t, predicts, label='kalman filtered position')
plt.legend()
plt.show()

[Figure: true position, noisy GPS measurements, and the Kalman-filtered estimate produced by the script above]

Origin blog.csdn.net/m0_37605642/article/details/132439372