Constructing and applying Kalman filters (KF) and extended Kalman filters (EKF)

As data scientists, we occasionally encounter situations where we need to model a trend in order to predict future values. While people tend to focus on statistics- or machine-learning-based algorithms, I present here a different option: the Kalman filter (KF).

In the early 1960s, Rudolf E. Kalman revolutionized the way complex systems are modeled with the KF. From guiding an aircraft or spacecraft to its destination to letting your smartphone find its place in the world, the algorithm blends data and mathematics to provide estimates of future states with incredible accuracy.

In this blog, we'll take a closer look at how the Kalman filter works and walk through examples in Python to highlight the true power of this technique. Starting with a simple 2D example, we'll see how to modify the code for a more advanced 4D state space, and finally cover the extended Kalman filter (its non-linear successor). Join me on this journey as we enter the world of predictive algorithms and filters.

Kalman filter basics

The KF provides an estimate of the state of a system by constructing and continuously updating a set of covariance matrices (representing the noise and statistical distributions of past states) from observations and other measurements collected over time. Unlike other out-of-the-box algorithms, the solution can be directly extended and refined by defining mathematical relationships between the system and external sources. While this may sound quite complex and intricate, the process can be summarized in two steps: prediction and update. These stages work together to iteratively correct and refine the system's state estimate.

Prediction step:

This stage predicts the system's next state based on the model's known posterior estimates and the time step Δk. Mathematically, we express the state estimate as:

$$\hat{x}_k = F\hat{x}_{k-1} + B u_k$$

where F is our state transition matrix, which models how the state evolves from one step to the next independent of control inputs and process noise, and B models the effect of the control input uₖ on the state.

In addition to the state estimate, the algorithm computes the uncertainty of that estimate, represented by the covariance matrix P:

$$P_k = F P_{k-1} F^{T} + Q$$

The predicted state covariance matrix represents the confidence and accuracy of our predictions and is influenced by Q, the process noise covariance matrix of the system itself. We apply this matrix in the subsequent equations of the update step to correct the information the Kalman filter holds about the system, thereby improving future state estimates.
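
To make this concrete, with the constant-velocity model used in the first Python example below (the state is position and velocity, and F = [[1, Δk], [0, 1]]), the state prediction reduces to:

$$\hat{x}_k = \begin{bmatrix}1 & \Delta k\\0 & 1\end{bmatrix}\begin{bmatrix}\hat{p}_{k-1}\\\hat{v}_{k-1}\end{bmatrix} = \begin{bmatrix}\hat{p}_{k-1} + \hat{v}_{k-1}\,\Delta k\\ \hat{v}_{k-1}\end{bmatrix}$$

that is, position advances by velocity times the time step, and velocity carries over unchanged.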

Update step:

In the update step, the algorithm updates the Kalman gain, the state estimate, and the covariance matrix. The Kalman gain determines how much influence new measurements have on the state estimate. The calculation involves the observation model matrix H, which relates the state to the measurements we expect to receive, and the measurement noise covariance matrix R:

$$K_k = P_k H^{T} \left( H P_k H^{T} + R \right)^{-1}$$

Essentially, K attempts to balance the uncertainty in the predictions with the uncertainty present in the measurements. As mentioned above, the Kalman gain is used to correct the state estimate and the covariance, represented respectively by the following equations:

$$\hat{x}_k = \hat{x}_k + K_k\left(z_k - H\hat{x}_k\right)$$

$$P_k = \left(I - K_k H\right) P_k$$

The bracketed term in the state-estimate equation is the residual: the difference between the actual measurement and the value the model predicted.
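
Two limiting cases make this balancing act concrete: when the measurement noise vanishes, the gain fully trusts the measurements; when the prediction uncertainty vanishes, the gain ignores them:

$$R \to 0 \;\Rightarrow\; K_k \to H^{-1} \;\;(\text{for invertible } H), \qquad P_k \to 0 \;\Rightarrow\; K_k \to 0$$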

The real beauty of how the Kalman filter works is its recursive nature, constantly updating the state and covariance as new information is received. This allows the model to be refined over time in a statistically optimal way, which is a particularly powerful approach for modeling systems that receive a stream of noisy observations.

The Kalman filter in action

The equations underpinning the Kalman filter can be overwhelming, and fully understanding how it works from a mathematical perspective requires an understanding of state space (which is beyond the scope of this blog), but I'll try to bring it to life through some Python examples. In its simplest form, we can define the Kalman filter object as:

import numpy as np

class KalmanFilter:
    """
    An implementation of the classic Kalman Filter for linear dynamic systems.
    The Kalman Filter is an optimal recursive data processing algorithm which
    aims to estimate the state of a system from noisy observations.

    Attributes:
        F (np.ndarray): The state transition matrix.
        B (np.ndarray): The control input matrix.
        H (np.ndarray): The observation matrix.
        u (np.ndarray): The control input.
        Q (np.ndarray): The process noise covariance matrix.
        R (np.ndarray): The measurement noise covariance matrix.
        x (np.ndarray): The mean state estimate of the previous step (k-1).
        P (np.ndarray): The state covariance of previous step (k-1).
    """
    def __init__(self, F, B, u, H, Q, R, x0, P0):
        """
        Initializes the Kalman Filter with the necessary matrices and initial state.

        Parameters:
            F (np.ndarray): The state transition matrix.
            B (np.ndarray): The control input matrix.
            H (np.ndarray): The observation matrix.
            u (np.ndarray): The control input.
            Q (np.ndarray): The process noise covariance matrix.
            R (np.ndarray): The measurement noise covariance matrix.
            x0 (np.ndarray): The initial state estimate.
            P0 (np.ndarray): The initial state covariance matrix.
        """
        self.F = F  # State transition matrix
        self.B = B  # Control input matrix
        self.u = u  # Control vector
        self.H = H  # Observation matrix
        self.Q = Q  # Process noise covariance
        self.R = R  # Measurement noise covariance
        self.x = x0  # Initial state estimate
        self.P = P0  # Initial estimate covariance

    def predict(self):
        """
        Predicts the state and the state covariance for the next time step.
        """
        self.x = self.F @ self.x + self.B @ self.u
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x

    def update(self, z):
        """
        Updates the state estimate with the latest measurement.

        Parameters:
            z (np.ndarray): The measurement at the current step.
        """
        y = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        I = np.eye(self.P.shape[0])
        self.P = (I - K @ self.H) @ self.P
        
        return self.x

We will use the predict() and update() functions to iterate over the steps outlined earlier. The above filter design works for any time series; to show how our estimates compare to reality, let's build a simple example:

import numpy as np
import matplotlib.pyplot as plt

# Set the random seed for reproducibility
np.random.seed(42)

# Simulate the ground truth position of the object
true_velocity = 0.5  # units per time step
num_steps = 50
time_steps = np.linspace(0, num_steps, num_steps)
true_positions = true_velocity * time_steps

# Simulate the measurements with noise
measurement_noise = 10  # increase this value to make measurements noisier
noisy_measurements = true_positions + np.random.normal(0, measurement_noise, num_steps)

# Plot the true positions and the noisy measurements
plt.figure(figsize=(10, 6))
plt.plot(time_steps, true_positions, label='True Position', color='green')
plt.scatter(time_steps, noisy_measurements, label='Noisy Measurements', color='red', marker='o')

plt.xlabel('Time Step')
plt.ylabel('Position')
plt.title('True Position and Noisy Measurements Over Time')
plt.legend()
plt.show()

In reality the "true position" is unknown, but we plot it here for reference; the "noisy measurements" are the observation points fed into the Kalman filter. We will perform a very basic instantiation of the matrices. To some extent this does not matter, since the Kalman model converges quickly by applying the Kalman gain, but in some circumstances it may be reasonable to warm-start the model.
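
If a warm start is wanted, one possible approach (a hedged sketch; the x0_warm and P0_warm names are my own and are not used in the run below) is to seed the position with the first measurement and the velocity with a finite difference, sizing the initial covariance from the known measurement noise:

# Illustrative warm start: seed the state from the first two measurements
dt0 = time_steps[1] - time_steps[0]
x0_warm = np.array([[noisy_measurements[0]],
                    [(noisy_measurements[1] - noisy_measurements[0]) / dt0]])
# Variance of a single measurement, and of a finite-difference velocity estimate
P0_warm = np.diag([measurement_noise**2, 2 * measurement_noise**2 / dt0**2])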

# Kalman Filter Initialization
F = np.array([[1, 1], [0, 1]])   # State transition matrix
B = np.array([[0], [0]])          # No control input
u = np.array([[0]])               # No control input
H = np.array([[1, 0]])            # Measurement function
Q = np.array([[1, 0], [0, 3]])    # Process noise covariance
R = np.array([[measurement_noise**2]]) # Measurement noise covariance
x0 = np.array([[0], [0]])         # Initial state estimate
P0 = np.array([[1, 0], [0, 1]])   # Initial estimate covariance

kf = KalmanFilter(F, B, u, H, Q, R, x0, P0)

# Allocate space for estimated positions and velocities
estimated_positions = np.zeros(num_steps)
estimated_velocities = np.zeros(num_steps)

# Kalman Filter Loop
for t in range(num_steps):
    # Predict
    kf.predict()
    
    # Update
    measurement = np.array([[noisy_measurements[t]]])
    kf.update(measurement)
    
    # Store the filtered position and velocity
    estimated_positions[t] = kf.x[0, 0]
    estimated_velocities[t] = kf.x[1, 0]

# Plot the true positions, noisy measurements, and the Kalman filter estimates
plt.figure(figsize=(10, 6))
plt.plot(time_steps, true_positions, label='True Position', color='green')
plt.scatter(time_steps, noisy_measurements, label='Noisy Measurements', color='red', marker='o')
plt.plot(time_steps, estimated_positions, label='Kalman Filter Estimate', color='blue')

plt.xlabel('Time Step')
plt.ylabel('Position')
plt.title('Kalman Filter Estimation Over Time')
plt.legend()
plt.show()
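
One quick way to quantify the improvement (reusing the arrays from the snippet above) is to compare root-mean-square errors against the known ground truth:

# Compare RMSE of the raw measurements vs. the filtered estimates
rmse_measurements = np.sqrt(np.mean((noisy_measurements - true_positions)**2))
rmse_filtered = np.sqrt(np.mean((estimated_positions - true_positions)**2))
print(f"Measurement RMSE:   {rmse_measurements:.2f}")
print(f"Kalman filter RMSE: {rmse_filtered:.2f}")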

Even with this very simple solution design, the model does a good job of cutting through the noise to find the true position. This may be fine for simple applications, but trends are often more subtle and affected by external events. To handle those, we usually need to modify the state space representation, as well as how we calculate estimates and correct the covariance matrix when new information arrives. Let's explore this further with another example.

Tracking a moving object in a 4D state space

Suppose we want to track the movement of an object through space and time. To make this example more realistic, we will simulate forces acting on the object that cause angular rotation. To demonstrate the algorithm's adaptability to a higher-dimensional state space representation, we will assume the forces are linear, although in practice they are not (we will explore a more realistic example afterwards). The code below shows how we might modify the Kalman filter for this specific scenario.
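
Reading off the initialization in the code below, the state vector here can be interpreted as:

$$x = \begin{bmatrix} x & y & z & \omega \end{bmatrix}^{T}$$

where ω is the angular velocity, and the observation matrix H simply picks out the three position components.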

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

class KalmanFilter:
    """
    An implementation of the classic Kalman Filter for linear dynamic systems.
    The Kalman Filter is an optimal recursive data processing algorithm which
    aims to estimate the state of a system from noisy observations.

    Attributes:
        F (np.ndarray): The state transition matrix.
        B (np.ndarray): The control input matrix.
        H (np.ndarray): The observation matrix.
        u (np.ndarray): The control input.
        Q (np.ndarray): The process noise covariance matrix.
        R (np.ndarray): The measurement noise covariance matrix.
        x (np.ndarray): The mean state estimate of the previous step (k-1).
        P (np.ndarray): The state covariance of previous step (k-1).
    """
    def __init__(self, F=None, B=None, u=None, H=None, Q=None, R=None, x0=None, P0=None):
        """
        Initializes the Kalman Filter with the necessary matrices and initial state.

        Parameters:
            F (np.ndarray): The state transition matrix.
            B (np.ndarray): The control input matrix.
            H (np.ndarray): The observation matrix.
            u (np.ndarray): The control input.
            Q (np.ndarray): The process noise covariance matrix.
            R (np.ndarray): The measurement noise covariance matrix.
            x0 (np.ndarray): The initial state estimate.
            P0 (np.ndarray): The initial state covariance matrix.
        """
        self.F = F  # State transition matrix
        self.B = B  # Control input matrix
        self.u = u  # Control input
        self.H = H  # Observation matrix
        self.Q = Q  # Process noise covariance
        self.R = R  # Measurement noise covariance
        self.x = x0  # Initial state estimate
        self.P = P0  # Initial estimate covariance

    def predict(self):
        """
        Predicts the state and the state covariance for the next time step.
        """
        self.x = self.F @ self.x + self.B @ self.u
        self.P = self.F @ self.P @ self.F.T + self.Q

    def update(self, z):
        """
        Updates the state estimate with the latest measurement.

        Parameters:
            z (np.ndarray): The measurement at the current step.
        """
        y = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = self.P - K @ self.H @ self.P

# Parameters for simulation
true_angular_velocity = 0.1  # radians per time step
radius = 20
num_steps = 100
dt = 1  # time step

# Create time steps
time_steps = np.arange(0, num_steps*dt, dt)

# Ground truth state
true_x_positions = radius * np.cos(true_angular_velocity * time_steps)
true_y_positions = radius * np.sin(true_angular_velocity * time_steps)
true_z_positions = 0.5 * time_steps  # constant velocity in z

# Create noisy measurements
measurement_noise = 1.0
noisy_x_measurements = true_x_positions + np.random.normal(0, measurement_noise, num_steps)
noisy_y_measurements = true_y_positions + np.random.normal(0, measurement_noise, num_steps)
noisy_z_measurements = true_z_positions + np.random.normal(0, measurement_noise, num_steps)

# Kalman Filter initialization
F = np.array([[1, 0, 0, -radius*dt*np.sin(true_angular_velocity*dt)],
              [0, 1, 0, radius*dt*np.cos(true_angular_velocity*dt)],
              [0, 0, 1, 0],
              [0, 0, 0, 1]])
B = np.zeros((4, 1))
u = np.zeros((1, 1))
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0]])
Q = np.eye(4) * 0.1  # Small process noise
R = measurement_noise**2 * np.eye(3)  # Measurement noise
x0 = np.array([[0], [radius], [0], [true_angular_velocity]])
P0 = np.eye(4) * 1.0

kf = KalmanFilter(F, B, u, H, Q, R, x0, P0)

# Allocate space for estimated states
estimated_states = np.zeros((num_steps, 4))

# Kalman Filter Loop
for t in range(num_steps):
    # Predict
    kf.predict()
    
    # Update
    z = np.array([[noisy_x_measurements[t]],
                  [noisy_y_measurements[t]],
                  [noisy_z_measurements[t]]])
    kf.update(z)
    
    # Store the state
    estimated_states[t, :] = kf.x.ravel()

# Extract estimated positions
estimated_x_positions = estimated_states[:, 0]
estimated_y_positions = estimated_states[:, 1]
estimated_z_positions = estimated_states[:, 2]

# Plotting
fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(111, projection='3d')

# Plot the true trajectory
ax.plot(true_x_positions, true_y_positions, true_z_positions, label='True Trajectory', color='g')
# Plot the start and end markers for the true trajectory
ax.scatter(true_x_positions[0], true_y_positions[0], true_z_positions[0], label='Start (Actual)', c='green', marker='x', s=100)
ax.scatter(true_x_positions[-1], true_y_positions[-1], true_z_positions[-1], label='End (Actual)', c='red', marker='x', s=100)


# Plot the noisy measurements
ax.scatter(noisy_x_measurements, noisy_y_measurements, noisy_z_measurements, label='Noisy Measurements', color='r')

# Plot the estimated trajectory
ax.plot(estimated_x_positions, estimated_y_positions, estimated_z_positions, label='Estimated Trajectory', color='b')

# Plot settings
ax.set_xlabel('X position')
ax.set_ylabel('Y position')
ax.set_zlabel('Z position')
ax.set_title('3D Trajectory Estimation with Kalman Filter')
ax.legend()

plt.show()
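
As a quick sanity check on the run above, we can inspect the fourth state component, which should settle near the true angular velocity of 0.1:

# The filter's final estimate of the angular velocity state
print(f"Final angular velocity estimate: {estimated_states[-1, 3]:.3f} "
      f"(true value: {true_angular_velocity})")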

There are some interesting points to note here. In the plot above we can see how the model quickly corrects towards the true state as we start iterating over the observations. The model also performs reasonably well at identifying the true state of the system, with estimates intersecting the true trajectory. This design may be suitable for some real-world applications, but not for those where the forces acting on the system are non-linear. For those we need a different application of the Kalman filter: the extended Kalman filter, which linearizes the non-linearities that the filters we have explored so far cannot handle.

Extended Kalman Filter

When trying to model a system whose dynamics or observations are non-linear, we need to apply the extended Kalman filter (EKF). This algorithm differs from the previous one by incorporating Jacobian matrices into the solution and performing a Taylor series expansion to find a first-order linear approximation of the state transition and observation models. To express this extension mathematically, our key calculations now become:

For the state prediction:

$$\hat{x}_k = f(\hat{x}_{k-1}, u_{k-1})$$

where f is the non-linear state transition function applied to the previous state estimate and u is the control input from the previous time step.

For the error covariance prediction:

$$P_k = F_{k-1} P_{k-1} F_{k-1}^{T} + Q$$

where F is the Jacobian of the state transition function with respect to the state, P is the prior error covariance, and Q is the process noise covariance matrix.

The measurement z at time step k is modeled by:

$$z_k = h(\hat{x}_k) + v_k$$

where h is a non-linear observation function applied to the state prediction, with some additive observation noise v.

The Kalman gain calculation is updated accordingly, where H is the Jacobian of the observation function with respect to the state and R is the measurement noise covariance matrix:

$$K_k = P_k H_k^{T} \left( H_k P_k H_k^{T} + R \right)^{-1}$$

The modified state estimate calculation combines the Kalman gain with the non-linear observation function, and finally we have our equation for updating the error covariance:

$$\hat{x}_k = \hat{x}_k + K_k\left(z_k - h(\hat{x}_k)\right)$$

$$P_k = \left(I - K_k H_k\right) P_k$$
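
The linearization at the heart of these equations is a first-order Taylor expansion of f (and analogously of h) about the current state estimate:

$$f(x_{k-1}) \approx f(\hat{x}_{k-1}) + F_{k-1}\left(x_{k-1} - \hat{x}_{k-1}\right), \qquad F_{k-1} = \left.\frac{\partial f}{\partial x}\right|_{\hat{x}_{k-1}}$$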

In our last example, this means using the Jacobian to linearize the non-linear effects of angular rotation on our object and modifying the code appropriately. Designing an EKF is more challenging than a KF, because the first-order linear approximation may inadvertently introduce errors into the state estimates. In addition, the Jacobian calculations can become complex, computationally expensive, and difficult to define in some cases, which can also lead to errors. If designed correctly, however, an EKF will often outperform a KF implementation.
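
If deriving a Jacobian analytically becomes unwieldy, one fallback is a finite-difference approximation; the numerical_jacobian helper below is my own sketch, not part of the EKF implementation that follows:

def numerical_jacobian(f, x, u, eps=1e-6):
    """Approximate the Jacobian of f at (x, u) by central differences."""
    fx = np.asarray(f(x, u))
    J = np.zeros((fx.size, x.size))
    for i in range(x.size):
        dx = np.zeros_like(x, dtype=float)
        dx[i] = eps
        # Perturb one state component at a time
        J[:, i] = (np.asarray(f(x + dx, u)) - np.asarray(f(x - dx, u))) / (2 * eps)
    return J

This trades accuracy and speed for convenience, which is often acceptable when prototyping.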

Building on our previous Python examples, below is an implementation of the EKF:

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

class ExtendedKalmanFilter:
    """
    An implementation of the Extended Kalman Filter (EKF).
    This filter is suitable for systems with non-linear dynamics by linearising
    the system model at each time step using the Jacobian.
    
    Attributes:
        state_transition (callable): The state transition function for the system.
        jacobian_F (callable): Function to compute the Jacobian of the state transition.
        H (np.ndarray): The observation matrix.
        jacobian_H (callable): Function to compute the Jacobian of the observation model.
        Q (np.ndarray): The process noise covariance matrix.
        R (np.ndarray): The measurement noise covariance matrix.
        x (np.ndarray): The initial state estimate.
        P (np.ndarray): The initial estimate covariance.
    """
    def __init__(self, state_transition, jacobian_F, observation_matrix, jacobian_H, Q, R, x, P):
        """
        Constructs the Extended Kalman Filter.

        Parameters:
            state_transition (callable): The state transition function.
            jacobian_F (callable): Function to compute the Jacobian of F.
            observation_matrix (np.ndarray): Observation matrix.
            jacobian_H (callable): Function to compute the Jacobian of H.
            Q (np.ndarray): Process noise covariance.
            R (np.ndarray): Measurement noise covariance.
            x (np.ndarray): Initial state estimate.
            P (np.ndarray): Initial estimate covariance.
        """
        self.state_transition = state_transition  # Non-linear state transition function
        self.jacobian_F = jacobian_F  # Function to compute Jacobian of F
        self.H = observation_matrix  # Observation matrix
        self.jacobian_H = jacobian_H  # Function to compute Jacobian of H
        self.Q = Q  # Process noise covariance
        self.R = R  # Measurement noise covariance
        self.x = x  # Initial state estimate
        self.P = P  # Initial estimate covariance

    def predict(self, u):
        """
        Predicts the state at the next time step.

        Parameters:
            u (np.ndarray): The control input vector.
        """
        # Linearize about the prior state estimate before propagating it
        F = self.jacobian_F(self.x, u)
        self.x = self.state_transition(self.x, u)
        self.P = F @ self.P @ F.T + self.Q

    def update(self, z):
        """
        Updates the state estimate with a new measurement.

        Parameters:
            z (np.ndarray): The measurement vector.
        """
        H = self.jacobian_H()
        y = z - self.H @ self.x
        S = H @ self.P @ H.T + self.R
        K = self.P @ H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(len(self.x)) - K @ H) @ self.P

# Define the non-linear transition and Jacobian functions
def state_transition(x, u):
    """
    Defines the state transition function for the system with non-linear dynamics.

    Parameters:
        x (np.ndarray): The current state vector.
        u (np.ndarray): The control input vector containing time step and rate of change of angular velocity.

    Returns:
        np.ndarray: The next state vector as predicted by the state transition function.
    """
    dt = u[0]
    alpha = u[1]
    x_next = np.array([
        x[0] - x[3] * x[1] * dt,
        x[1] + x[3] * x[0] * dt,
        x[2] + x[3] * dt,
        x[3],
        x[4] + alpha * dt
    ])
    return x_next

def jacobian_F(x, u):
    """
    Computes the Jacobian matrix of the state transition function.

    Parameters:
        x (np.ndarray): The current state vector.
        u (np.ndarray): The control input vector containing time step and rate of change of angular velocity.

    Returns:
        np.ndarray: The Jacobian matrix of the state transition function at the current state.
    """
    dt = u[0]
    # Compute the Jacobian matrix of the state transition function
    F = np.array([
        [1, -x[3]*dt, 0, -x[1]*dt, 0],
        [x[3]*dt, 1, 0, x[0]*dt, 0],
        [0, 0, 1, dt, 0],
        [0, 0, 0, 1, 0],
        [0, 0, 0, 0, 1]
    ])
    return F

def jacobian_H():
    # Jacobian matrix for the observation function is simply the observation matrix
    return H

# Simulation parameters
num_steps = 100
dt = 1.0
alpha = 0.01  # Rate of change of angular velocity

# Observation matrix, assuming we can directly observe the x, y, and z position
H = np.eye(3, 5)

# Process noise covariance matrix
Q = np.diag([0.1, 0.1, 0.1, 0.1, 0.01])

# Measurement noise covariance matrix
R = np.diag([0.5, 0.5, 0.5])

# Initial state estimate and covariance
x0 = np.array([0, 20, 0, 0.5, 0.1])  # [x, y, z, v, omega]
P0 = np.eye(5)

# Instantiate the EKF
ekf = ExtendedKalmanFilter(state_transition, jacobian_F, H, jacobian_H, Q, R, x0, P0)

# Generate true trajectory and measurements
true_states = []
measurements = []
for t in range(num_steps):
    u = np.array([dt, alpha])
    true_state = state_transition(x0, u)  # This would be your true system model
    true_states.append(true_state)
    measurement = true_state[:3] + np.random.multivariate_normal(np.zeros(3), R)  # Simulate measurement noise
    measurements.append(measurement)
    x0 = true_state  # Update the true state

# Now we run the EKF over the measurements
estimated_states = []
for z in measurements:
    ekf.predict(u=np.array([dt, alpha]))
    ekf.update(z=np.array(z))
    estimated_states.append(ekf.x)

# Convert lists to arrays for plotting
true_states = np.array(true_states)
measurements = np.array(measurements)
estimated_states = np.array(estimated_states)

# Plotting the results
fig = plt.figure(figsize=(12, 9))
ax = fig.add_subplot(111, projection='3d')

# Plot the true trajectory
ax.plot(true_states[:, 0], true_states[:, 1], true_states[:, 2], label='True Trajectory')
# Increase the size of the start and end markers for the true trajectory
ax.scatter(true_states[0, 0], true_states[0, 1], true_states[0, 2], label='Start (Actual)', c='green', marker='x', s=100)
ax.scatter(true_states[-1, 0], true_states[-1, 1], true_states[-1, 2], label='End (Actual)', c='red', marker='x', s=100)

# Plot the measurements
ax.scatter(measurements[:, 0], measurements[:, 1], measurements[:, 2], label='Measurements', alpha=0.6)
# Plot the start and end markers for the measurements
ax.scatter(measurements[0, 0], measurements[0, 1], measurements[0, 2], c='green', marker='o', s=100)
ax.scatter(measurements[-1, 0], measurements[-1, 1], measurements[-1, 2], c='red', marker='x', s=100)

# Plot the EKF estimate
ax.plot(estimated_states[:, 0], estimated_states[:, 1], estimated_states[:, 2], label='EKF Estimate')


ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')
ax.legend()

plt.show()
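
As a quick check on the run above, we can compare the position RMSE of the raw measurements against the EKF estimates (the first three state components):

# Position RMSE: raw measurements vs. EKF estimates
rmse_meas = np.sqrt(np.mean((measurements - true_states[:, :3])**2))
rmse_ekf = np.sqrt(np.mean((estimated_states[:, :3] - true_states[:, :3])**2))
print(f"Measurement RMSE: {rmse_meas:.3f}")
print(f"EKF RMSE:         {rmse_ekf:.3f}")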

Brief summary

In this blog, we took a deep dive into how to construct and apply a Kalman filter (KF), and how to implement its non-linear extension, the extended Kalman filter (EKF). To finish, let's summarize the use cases, advantages, and disadvantages of these models.

KF: This model is suitable for linear systems where we can assume the state transition and observation matrices are linear functions of the state (with some Gaussian noise). You might consider applying this algorithm to:

  • Tracking the position and velocity of an object moving at a constant speed
  • Signal processing applications where the noise is random or can be represented by a linear model
  • Economic forecasting where the underlying relationships can be modeled linearly

The main advantages of the KF are that (once you follow the matrix calculations) the algorithm is very simple, less computationally intensive than other methods, and can provide very accurate estimates of the system's true state over time. The disadvantage is that the linearity assumption often does not hold in complex real-world scenarios.

EKF: We can essentially think of the EKF as the non-linear equivalent of the KF, enabled by the use of the Jacobian. You might consider the EKF if you are dealing with:

  • Robotic systems, where measurements and system dynamics are often non-linear
  • Tracking and navigation systems involving non-constant velocities or changes in angular rate, such as those tracking aircraft or spacecraft
  • Automotive systems implementing cruise control or lane keeping, as found in most modern "smart" cars

The EKF generally produces better estimates than the KF, especially for non-linear systems, but it can be computationally more expensive due to the Jacobian calculations. The method also relies on a first-order Taylor series approximation, which may not be an appropriate assumption for highly non-linear systems. Adding the Jacobian can also make the model design more challenging, so despite its advantages, implementing a KF may sometimes be more appropriate for simplicity and interpretability.
