What can we do with Kalman filtering?



Let's take a toy example: you are developing a small robot that can move autonomously through the woods, and the robot needs to know exactly where it is in order to navigate.

We can use a set of state variables

x = [p, v]^T

to describe the state of the robot, namely its position and velocity.
Note that this state is only a subset of everything going on in the system; you can choose any quantities as state variables. In our example it is position and velocity, but it could be the water level in a tank, the temperature of a car's engine, the position of a user's finger on a tablet, or whatever data you want to track.
Our robot also has a GPS sensor with an accuracy of about 10 m. That is fine in general, but our robot needs to locate itself much more precisely than 10 m: the forest it is in is full of valleys and cliffs, and if the robot misjudges its position by even a few steps, it might fall off an edge. So GPS alone is not enough.

At the same time, we have some information about the robot's movement: the motor commands that drive the wheels are also useful to us. If nothing interferes and the robot simply rolls in one direction, then the next position should differ from the previous one by a fixed distance in that direction. Of course, we cannot capture everything that affects the motion: the robot may be pushed by the wind, the wheels may slip, or it may hit rough ground; so the distance the wheels turn does not exactly equal the distance the robot travels, and predicting the robot's position from wheel rotation alone is unreliable.
The GPS sensor tells us something about the robot's state, but with some uncertainty. Our wheel-based prediction tells us how the robot is moving, also with a certain degree of inaccuracy.

What if we combine the two sources of information? Can we get a better estimate than either source gives on its own? The answer is of course yes, and this is exactly the problem Kalman filtering solves.


How the Kalman filter sees your problem
Let's look at the problem to be solved, using the same system as above: the system state consists of position and velocity.

We don't know the exact values of position and velocity, but we can list a range where the true values might fall. Within this range, some combinations of values are more likely than others.

Kalman filtering assumes that all variables (position and velocity, in our case) are random and follow a Gaussian (normal) distribution. Each variable has a mean μ, which is the center of the distribution (and also its most likely value), and a variance, which represents the uncertainty.
In reality, velocity and position are correlated: once a position has been observed, certain velocity values become more probable than others.

Suppose we know the position at the previous step and want to predict the position at the next step. If the velocity is high, the robot will have traveled a bit farther; conversely, if the velocity is low, it will not have gone very far.

This kind of relationship is important when tracking the state of a system because it gives us more information: one measurement tells us something about what the others might be. This is the point of Kalman filtering: to extract as much valuable information as possible from uncertain information!
This relationship can be captured by a covariance matrix. In short, element (i, j) of the matrix measures the correlation between the i-th state variable and the j-th state variable. (As you may have guessed, the covariance matrix is symmetric: swapping the indices i and j changes nothing.) The covariance matrix is usually denoted Σ, and its elements are written Σij.

(Here we track only position and velocity, but any data variables can be placed in the system state.)
Next, we need to predict the state at time k from the state at time k-1. Note that we don't know which state is the true one, but our prediction function doesn't care: it takes the whole range of possible values at time k-1 and maps it to a new range of values at time k.

We can use a state transition matrix F_k to describe this transformation: it maps every possible state at time k-1 to a new range of states at time k, that is, the range the system could now be in if the estimate at time k-1 was accurate.

This prediction step is expressed by an equation of motion. For our position-and-velocity state,

p_k = p_{k-1} + Δt · v_{k-1}
v_k = v_{k-1}

which in matrix form reads x_k = F_k x_{k-1}, with F_k = [[1, Δt], [0, 1]].
We now have a state transition matrix and can easily predict the next state, but we still don't know how to update the covariance matrix.

Here we need another identity: if we apply a matrix A to every point of a distribution with covariance Σ, the covariance of the result is A Σ A^T. Combining this with the prediction equation gives:

x_k = F_k x_{k-1}
P_k = F_k P_{k-1} F_k^T
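As a sketch of this prediction step, here is a 1-D constant-velocity model with invented numbers (state = [position, velocity], Δt = 1):

```python
import numpy as np

dt = 1.0  # time step (assumed)

# State transition matrix for a constant-velocity model:
# position_new = position + dt * velocity, velocity_new = velocity
F = np.array([[1.0, dt],
              [0.0, 1.0]])

x = np.array([0.0, 1.0])   # estimate: at position 0, moving at 1 m/s
P = np.array([[1.0, 0.0],  # covariance: uncertainty of that estimate
              [0.0, 1.0]])

# Predict: push both the mean and the covariance through F
x_pred = F @ x             # x_k = F x_{k-1}
P_pred = F @ P @ F.T       # P_k = F P_{k-1} F^T
```

Note how the predicted covariance picks up an off-diagonal term: after the prediction, position and velocity have become correlated, exactly as described above.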
External forces

We have not yet considered every influence: the change of the system state does not depend only on the previous state, because external forces may also drive it.

For example, when tracking a train, the driver may open the throttle and speed the train up. Likewise, in our robot example, the navigation software might issue commands to spin up or brake the wheels. If we know this extra information, we can describe it with a vector and add it to the prediction equation as a correction. If the issued commands tell us the expected acceleration a, the equation of motion above becomes:

p_k = p_{k-1} + Δt · v_{k-1} + (1/2) a Δt^2
v_k = v_{k-1} + a Δt

In matrix form: x_k = F_k x_{k-1} + B_k u_k, where B_k is called the control matrix and u_k the control vector.
Let's add one more detail: what happens if our prediction is not 100% accurate?
External Uncertainty
There would be no problem if the state evolved only according to the system's own properties, or if we could compute the effect of every external force on the system. But what if some external forces cannot be predicted? If we are tracking a quadcopter, it is buffeted by the wind; if we are tracking a wheeled robot, the wheels may slip, or a bump in the ground may slow it down. We cannot track these factors, and when they occur, the prediction equations above may fail.

We can lump these "world" uncertainties together and add a noise term to the prediction, giving the complete prediction step:

x_k = F_k x_{k-1} + B_k u_k
P_k = F_k P_{k-1} F_k^T + Q_k
Kalman filtering can also handle sensor noise. In other words, the sensor has its own accuracy range: for a given true position and velocity, the readings are perturbed by Gaussian noise and fluctuate within a certain band.

We can describe the sensor with a matrix H_k that maps the state into the space of sensor readings; the distribution we expect to see is then the transformed prediction, with mean H_k x_k and covariance H_k P_k H_k^T.

Every reading we observe corresponds to some true state, but because of the uncertainty, some states are more likely than others to have produced the reading we saw.

So now we have two Gaussian distributions: one around the transformed prediction, and one around the actual sensor reading.
The two distributions are independent. How do we fuse two Gaussian distributions? This is where probability theory comes in: multiply them. The product of two one-dimensional Gaussians N(μ0, σ0^2) and N(μ1, σ1^2) is (up to normalization) a new Gaussian with

K = σ0^2 / (σ0^2 + σ1^2)
μ' = μ0 + K(μ1 - μ0)      (12)
σ'^2 = σ0^2 - K σ0^2      (13)

Notice how the new estimate is obtained from the previous ones: simply multiply the two distributions together. And see how simple these formulas are.
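A minimal sketch of this one-dimensional fusion, with made-up numbers:

```python
def fuse(mu0, var0, mu1, var1):
    """Multiply two 1-D Gaussians N(mu0, var0) and N(mu1, var1);
    the product is (proportional to) a new Gaussian."""
    k = var0 / (var0 + var1)      # how much to trust the second estimate
    mu = mu0 + k * (mu1 - mu0)    # fused mean lies between the two means
    var = var0 - k * var0         # fused variance shrinks
    return mu, var

# Prediction says 10 m (variance 4); measurement says 12 m (variance 4).
mu, var = fuse(10.0, 4.0, 12.0, 4.0)
```

With equal variances the fused mean lands halfway between the two estimates, and the fused variance is smaller than either input: combining two uncertain sources really does give a better estimate.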
What if the state is multidimensional? We can write (12) and (13) in matrix form, where Σ denotes a covariance matrix and μ a mean vector:

K = Σ0 (Σ0 + Σ1)^-1
μ' = μ0 + K(μ1 - μ0)
Σ' = Σ0 - K Σ0

Substituting our two distributions, the prediction (μ0, Σ0) = (H_k x_k, H_k P_k H_k^T) and the measurement (μ1, Σ1) = (z_k, R_k), and simplifying, we obtain the update equations:

K' = P_k H_k^T (H_k P_k H_k^T + R_k)^-1
x'_k = x_k + K' (z_k - H_k x_k)
P'_k = P_k - K' H_k P_k

K' is the Kalman gain. x'_k is the new best estimate, and together with P'_k it feeds into the next round of prediction.
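Putting the prediction and update steps together, a minimal 1-D constant-velocity filter might look like this (all matrices and sensor readings are invented for illustration):

```python
import numpy as np

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity transition
H = np.array([[1.0, 0.0]])             # we only measure position
Q = np.eye(2) * 0.01                   # small process noise
R = np.array([[4.0]])                  # noisy position sensor (variance 4)

x = np.array([0.0, 0.0])               # initial guess: at rest at origin
P = np.eye(2) * 100.0                  # very uncertain initially

for z in [1.1, 1.9, 3.2, 3.9, 5.1]:    # noisy readings of a ~1 m/s walk
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: K' = P H^T (H P H^T + R)^-1, then fuse prediction and reading
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    x = x + (K @ (np.array([z]) - H @ x))
    P = P - K @ H @ P
```

Even though velocity is never measured directly, after a few steps the filter infers it from the position readings, and the uncertainty P shrinks far below its initial value.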
The above is a complete and simple picture of the Kalman filter. But what if the relationships are nonlinear? (This part of the write-up is relatively brief; there is a lot of material that I have not fully sorted out, but as long as you understood the part above, what follows should be easy.)

One of the assumptions of the KF is that a Gaussian-distributed x is still Gaussian after the prediction step, and still Gaussian after being transformed into the measurement space. If F and H are nonlinear transformations, these conditions no longer hold.

Clearly, linear algebra is powerful: it lets us express a complex algorithm like the Kalman filter in a very compact form. But linear algebra is not everything. As the name suggests, it is limited to representing linear relationships, that is, relationships characterized by straight lines. To see this, look again at matrix-vector multiplication: each component of y = Ax is just a weighted sum of the components of x,

y_i = A_i1 x_1 + A_i2 x_2 + ... + A_in x_n

only scalings and additions, never squares, sines, or other nonlinear functions of the state.
Extended Kalman Filter Theory

Let us once again bring out the precious legacy that Mr. Kalman left us: the seven formulas used in the prediction and measurement-update steps of state estimation, shown in the figure below. The theory and code of the extended Kalman filter still use these formulas; compared with the original Kalman filter, only a few places differ.

When applying a Kalman filter, the most important task is to find the means and covariances of the two Gaussian distributions involved, the state transition and the measurement; then the formulas follow:

Prediction:
x' = Fx + u
P' = FPF^T + Q

Measurement update:
y = z - Hx'
S = HP'H^T + R
K = P'H^T S^-1
x = x' + Ky
P = (I - KH)P'

Initialization

Initializing the extended Kalman filter means setting up its variables. The state vector differs between motion models. At initialization, an initial state x_in must be supplied, representing the initial position and velocity of the obstacle; usually the first measurement is used directly.

Prediction

After finishing initialization, we can write the code of the Prediction part. First, the formula

x' = Fx + u

Here x is the state vector, and the predicted state vector x' is obtained by left-multiplying by a matrix F and adding the external influence u. F is called the state transition matrix. Taking 2-dimensional uniform motion as an example, x is

x = [px, py, vx, vy]^T
As an introductory course, we do not discuss overly complex models here, so the formula stays simple. According to the formula s1 = s0 + vt from middle-school physics textbooks, the predicted state vector after time Δt should be

px' = px + Δt · vx
py' = py + Δt · vy
vx' = vx
vy' = vy

For a two-dimensional uniform-motion model the acceleration is 0 and does not affect the predicted state, so u = 0. If you switch to an accelerating or decelerating motion model, you can introduce accelerations ax and ay; by s1 = s0 + vt + at^2/2, u becomes:

u = [ax·Δt^2/2, ay·Δt^2/2, ax·Δt, ay·Δt]^T

Since we keep the model simple here, the formula is finally written as

x' = Fx,  with  F = [[1, 0, Δt, 0],
                     [0, 1, 0, Δt],
                     [0, 0, 1,  0],
                     [0, 0, 0,  1]]
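As a sketch, the prediction x' = Fx for this 2-D constant-velocity model (state [px, py, vx, vy]; the Δt and state values are made up):

```python
import numpy as np

dt = 0.1  # sample time step (assumed)

# State: [px, py, vx, vy]; constant-velocity transition matrix
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)

x = np.array([5.0, 3.0, 2.0, -1.0])  # made-up current state
x_pred = F @ x                       # px' = px + vx*dt, py' = py + vy*dt
```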
Now look at the second formula of the prediction module:

P' = FPF^T + Q

Here P represents the uncertainty of the system. It is large when the Kalman filter is initialized; as more and more data flows into the filter, it shrinks. The technical term for P is the state covariance matrix. Q is the process covariance matrix: the noise that cannot be captured by x' = Fx + u.

The millimeter-wave radar measures the radial position and velocity of obstacles relatively accurately and with low uncertainty, so the state covariance matrix P can be initialized accordingly, for example with small position variances and larger velocity variances:

P = diag(1, 1, 1000, 1000)

Q affects the whole system, but it is hard to say exactly how much. For simple models, the identity matrix or zeros can be used directly, e.g.:

Q = 0
With the above quantities in place, the prediction step is complete:

x' = Fx + u
P' = FPF^T + Q
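A sketch of the full prediction step with the P and Q choices above (all numbers illustrative):

```python
import numpy as np

dt = 0.1
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)

# Small variance for position, large variance for velocity
P = np.diag([1.0, 1.0, 1000.0, 1000.0])
Q = np.zeros((4, 4))               # simple model: ignore process noise

x = np.array([5.0, 3.0, 2.0, -1.0])
u = np.zeros(4)                    # no external input (constant velocity)

x_pred = F @ x + u                 # x' = Fx + u
P_pred = F @ P @ F.T + Q           # P' = FPF^T + Q
```

The velocity uncertainty leaks into the position uncertainty through F: P' gains position-velocity cross terms even though P started out diagonal.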
Observation

The first formula of the observation step is

y = z - Hx'

Its purpose is to compute the difference y between the observed value z and the predicted value x' mapped into measurement space.

 

The data characteristics of the millimeter-wave radar observation z were mentioned earlier, as shown in the figure below:

Image credit: Udacity Self-Driving Car Engineer Nanodegree

As the figure shows, the observation z has dimension 3×1. For the matrix arithmetic to work, y and Hx' must also be 3×1.

Basic mathematical operations complete the coordinate transformation of the predicted value x' from the Cartesian coordinate system to the polar coordinate system, giving Hx', which can then be compared with the observed value z. Note that in this step of computing the difference y, the coordinate transformation lets us avoid the unknown measurement matrix H and obtain the value of Hx' directly.

To simplify the notation, we use px, py and vx, vy to denote the predicted position and velocity. The measured value z and the predicted state x' are both known quantities, so we can easily compute the difference y between the observation and the prediction.

The millimeter-wave radar conversion maps position and velocity between the Cartesian and polar coordinate systems, and this conversion is nonlinear. Therefore, when dealing with nonlinear models like the radar one, it is customary to write the formula for the difference y as

y = z - h(x')

to distinguish the nonlinear case from the linear one. Corresponding to the formula above, h(x') here is:

rho = sqrt(px^2 + py^2)
phi = atan2(py, px)
rho_dot = (px·vx + py·vy) / sqrt(px^2 + py^2)
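A sketch of this Cartesian-to-polar mapping h(x') in code (the sample state values are invented):

```python
import math

def h(px, py, vx, vy):
    """Map a predicted state to radar measurement space:
    rho (range), phi (bearing), rho_dot (radial velocity)."""
    rho = math.hypot(px, py)             # sqrt(px^2 + py^2)
    phi = math.atan2(py, px)             # bearing angle
    rho_dot = (px * vx + py * vy) / rho  # velocity projected onto range
    return rho, phi, rho_dot

rho, phi, rho_dot = h(3.0, 4.0, 1.0, 2.0)  # made-up predicted state
```

Using atan2 rather than a plain arctan keeps the bearing in the correct quadrant for any sign of px and py.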
Now look at the next two formulas of the Kalman filter:

S = HP'H^T + R
K = P'H^T S^-1

These compute a very important quantity in the Kalman filter, the Kalman gain K, which in plain terms is the weight given to the difference y. R in the first formula is the measurement covariance matrix, representing the gap between measured and true values. In general, the sensor manufacturer provides it; if not, it can be obtained through testing and tuning. S is just a temporary variable written to simplify the formula, so don't worry about it too much.

Since the measurement matrix H is required to obtain the Kalman gain K, the next task is to obtain H.

The millimeter-wave radar observation z is a 3x1 column vector containing range, angle and radial velocity, and the state vector x' is a 4x1 column vector containing position and velocity information. From the formula y = z - Hx', the measurement matrix H must therefore have 3 rows and 4 columns. That is:

[rho, phi, rho_dot]^T = H · [px, py, vx, vy]^T

It is easy to see that the transformation between the two sides of this equation is nonlinear: no constant matrix H can make it hold.

If a Gaussian distribution is passed through a nonlinear function, the result no longer follows a Gaussian distribution, and the Kalman filter formulas no longer apply. We therefore need to replace the nonlinear function with an approximate linear one.

Image credit: Udacity Self-Driving Car Engineer Nanodegree

In the university course "Advanced Mathematics", we learned that a nonlinear function y = h(x) can be expanded as a Taylor series around a point x0:

h(x) = h(x0) + h'(x0)(x - x0) + h''(x0)(x - x0)^2 / 2! + ...

Neglecting the second- and higher-order terms gives an approximate linear equation that can replace the nonlinear function h(x):

h(x) ≈ h(x0) + h'(x0)(x - x0)

Extending this to a multidimensional h(x) means taking the partial derivative with respect to each variable:

h(x) ≈ h(x0) + J(x0)·(x - x0)

where J, the matrix of first-order partial derivatives, is called the Jacobian.

Comparing this partial-derivative expansion with the linear formula we derived earlier and looking at the coefficient of x, we find that the measurement matrix H here is in fact the Jacobian from the Taylor expansion.

Students who are interested in the multidimensional Jacobian derivation can study it on their own; here we use its conclusion directly: take the first-order partial derivatives of the nonlinear function h(x') with respect to px, py, vx, vy, and arrange them into a matrix to obtain the Jacobian matrix H. In the following, let

c1 = px^2 + py^2,  c2 = sqrt(c1),  c3 = c1 · c2
Now it is time to test your partial-differentiation skills from advanced mathematics!

After a series of calculations, the measurement matrix H is finally obtained as:

H = [[ px/c2,                 py/c2,                 0,     0     ],
     [-py/c1,                 px/c1,                 0,     0     ],
     [ py(vx·py - vy·px)/c3,  px(vy·px - vx·py)/c3,  px/c2, py/c2 ]]
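A sketch of computing this Jacobian numerically (the function name and the sample state are made up):

```python
import numpy as np

def jacobian(px, py, vx, vy):
    """First-order partial derivatives of (rho, phi, rho_dot)
    with respect to (px, py, vx, vy)."""
    c1 = px * px + py * py
    c2 = np.sqrt(c1)
    c3 = c1 * c2
    return np.array([
        [px / c2,  py / c2,  0.0, 0.0],
        [-py / c1, px / c1,  0.0, 0.0],
        [py * (vx * py - vy * px) / c3,
         px * (vy * px - vx * py) / c3,
         px / c2,  py / c2],
    ])

H = jacobian(3.0, 4.0, 1.0, 2.0)  # evaluated at a made-up prediction
```

In a real filter, one would also guard against c1 being (near) zero, since the obstacle sitting exactly at the sensor origin makes the formulas divide by zero.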
According to the formula above, after each prediction of the obstacle's state, the corresponding measurement matrix H must be recomputed from the predicted position and velocity; this measurement matrix is the derivative of the nonlinear function h(x') evaluated at x'.

A possible point of doubt here: does the H in the following formulas simply use the Jacobian matrix, or do the other terms of the Taylor expansion also need to be calculated? (In the standard EKF, only the first-order term, i.e. the Jacobian, is used.)

Look at the last two formulas of the Kalman filter:

x = x' + Ky
P = (I - KH)P'

These two formulas close the loop of the Kalman filter. The first completes the update of the current state vector x, taking into account not only the prediction from the previous moment but also the measurement and the noise of the whole system. The second updates the system uncertainty P using the Kalman gain K, ready for the next cycle. I in this formula is an identity matrix with the same dimension as the state vector.

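The closing update step can be sketched as follows (all quantities are invented; in the EKF case, H would be the Jacobian evaluated at x'):

```python
import numpy as np

# Quantities from the earlier steps (made-up 2-state / 1-measurement case)
x_pred = np.array([1.0, 1.0])
P_pred = np.array([[2.0, 1.0],
                   [1.0, 1.0]])
H = np.array([[1.0, 0.0]])
R = np.array([[1.0]])
z = np.array([1.5])

y = z - H @ x_pred                   # innovation
S = H @ P_pred @ H.T + R
K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain

I = np.eye(2)
x_new = x_pred + K @ y               # x = x' + Ky
P_new = (I - K @ H) @ P_pred         # P = (I - KH)P'
```

x_new and P_new then become the inputs to the next prediction cycle, which is exactly the closed loop described above.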
With that, both the Kalman filter and the extended Kalman filter have been explained; I believe that after reading this you will have a good grasp of them.

Origin blog.csdn.net/m0_65075758/article/details/129199092