There seems to be a lot of confusion in this thread about what Kalman filters are for. Perhaps that’s what happens when you seek a non-mathematical introduction to a mathematical topic, but I’m going to try to clear this up anyway.

Specifically, the Kalman filter is a recursive way to estimate the state of a dynamical system. That is, the thing you want to estimate varies as a function of time. It doesn’t matter whether that thing is the position and momentum of a robot or a stock price. What does matter is the following:

1. The dynamics are linear with additive Gaussian noise. That is, the next state is a linear function of the current state plus a sample from a Gaussian distribution. Optionally, if your system is controlled (i.e., there is a variable at each moment in time you can set exactly or at least with very high accuracy), the dynamics can include a linear function of that term as well.

2. The sensor feeding you data at each time step is a linear function of the state plus a second Gaussian noise variable independent of the first.

3. You know the dynamics and sensor specification. That is, you know the matrices specifying the linear functions as well as the means and covariances of the noise models. For a mechanical system, this knowledge could be acquired using some combination of physics, controlled experimentation in a lab, reading data sheets, and good old-fashioned tuning. For other systems, you apply a similarly appropriate methodology or guess.

4. The initial distribution of your state when you start running the filter is Gaussian and you know its mean and covariance. (If you don’t know those, you can guess: if the filter runs for long enough, they become irrelevant.)
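In symbols, the four assumptions above could be written roughly as follows. The names A, B, H, Q, R, m_0, and P_0 are just the usual textbook conventions, not anything from this thread, and I have assumed zero-mean noise to keep it short (a known nonzero mean only adds a constant offset). Here x_t is the state, u_t the optional control, y_t the sensor reading, and w_t, v_t are independent of each other.

```latex
\begin{aligned}
x_{t+1} &= A x_t + B u_t + w_t, & w_t &\sim \mathcal{N}(0, Q) \\
y_t &= H x_t + v_t, & v_t &\sim \mathcal{N}(0, R) \\
x_0 &\sim \mathcal{N}(m_0, P_0)
\end{aligned}
```

Assumption (3) then says you know A, B, H, Q, and R, and assumption (4) says you know (or guess) m_0 and P_0.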

The Kalman filter takes in the parameters of the model described in (3) and gives you a new linear dynamical system that incorporates a new measurement at each time step and outputs the distribution of the current state. Since we assumed everything is linear and Gaussian, this will be a Gaussian distribution.
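To make the recursion concrete, here is a minimal sketch of the scalar case, where the state, the measurement, and all the model parameters are plain numbers instead of vectors and matrices. The function names and the default parameter values are made up for illustration; the structure (predict through the dynamics, then correct with the measurement) is the standard one.

```python
def kalman_step(mean, var, y, a=1.0, q=0.01, h=1.0, r=0.25, b=0.0, u=0.0):
    """One predict-then-update step of a scalar Kalman filter.

    mean, var : Gaussian belief about the previous state
    y         : new sensor reading
    a, b, u   : dynamics x_next = a * x + b * u + noise (variance q)
    h         : sensor model y = h * x + noise (variance r)
    All parameter names and defaults are illustrative assumptions.
    """
    # Predict: push the current Gaussian through the linear dynamics.
    pred_mean = a * mean + b * u
    pred_var = a * var * a + q

    # Update: condition the predicted Gaussian on the measurement y.
    innovation = y - h * pred_mean          # surprise in the reading
    innovation_var = h * pred_var * h + r   # how uncertain that surprise is
    gain = pred_var * h / innovation_var    # Kalman gain
    new_mean = pred_mean + gain * innovation
    new_var = (1.0 - gain * h) * pred_var
    return new_mean, new_var


def run_filter(measurements, mean0, var0, **model):
    """Feed measurements in one at a time; return the (mean, var) after each."""
    mean, var = mean0, var0
    history = []
    for y in measurements:
        mean, var = kalman_step(mean, var, y, **model)
        history.append((mean, var))
    return history


if __name__ == "__main__":
    # A deliberately vague initial guess (mean 0, variance 10).
    for mean, var in run_filter([1.1, 0.9, 1.05], mean0=0.0, var0=10.0):
        print(f"mean={mean:.3f} var={var:.3f}")
```

Note that the output at every step is just a mean and a variance, which is all you need to describe a Gaussian. With a vague initial guess the first measurement dominates, which is the sense in which the initial condition stops mattering.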

From a Bayesian perspective, the state estimate is the posterior distribution given your sensor data, model, and initial condition. From a frequentist / decision theory perspective, you get the least squares estimate of the state subject to the constraints imposed by your dynamics.

If your dynamics and sensor are not linear, you either need to linearize them, which produces the “extended Kalman filter” that gives you a “local” estimate of the state, or you need another method. A common choice is a particle filter.
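As a rough sketch of what “linearize” means here, the scalar version of an extended Kalman filter step might look like the following. The nonlinear dynamics `f` and sensor `h`, along with their derivatives, are passed in as functions; all of these names are hypothetical. The only change from the linear filter is that the derivatives, evaluated at the current estimate, stand in for the fixed coefficients.

```python
def ekf_step(mean, var, y, f, f_prime, h, h_prime, q, r):
    """One scalar extended-Kalman-filter step (illustrative sketch).

    f, h             : nonlinear dynamics and sensor functions
    f_prime, h_prime : their derivatives, supplied by the caller
    q, r             : process and sensor noise variances
    """
    # Predict: the mean goes through the true nonlinear dynamics,
    # the variance goes through the local slope of f at the estimate.
    slope_f = f_prime(mean)
    pred_mean = f(mean)
    pred_var = slope_f * var * slope_f + q

    # Update: linearize the sensor around the predicted mean.
    slope_h = h_prime(pred_mean)
    gain = pred_var * slope_h / (slope_h * pred_var * slope_h + r)
    new_mean = pred_mean + gain * (y - h(pred_mean))
    new_var = (1.0 - gain * slope_h) * pred_var
    return new_mean, new_var


if __name__ == "__main__":
    # Made-up example: static state observed through a squaring sensor.
    mean, var = ekf_step(
        2.0, 1.0, 5.0,
        f=lambda x: x, f_prime=lambda x: 1.0,
        h=lambda x: x * x, h_prime=lambda x: 2.0 * x,
        q=0.0, r=1.0,
    )
    print(mean, var)
```

Because the slopes are only valid near the current estimate, the result is only trustworthy when the estimate is already close to the truth, which is why the estimate is “local”.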

If your model is garbage, then model-based methods like the Kalman filter will give you garbage. If your sensor is garbage (someone mentioned the case where it outputs a fixed number), the Kalman filter will just propagate your uncertainty about the initial condition through the dynamics.
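The stuck-sensor case can be checked with a small scalar sketch. It assumes the model honestly reflects that the sensor says nothing about the state, which I represent by setting the sensor coefficient `h` to zero; the function names and numbers are made up for illustration. With `h = 0` the Kalman gain is zero, so the measurement is ignored and the filter output matches pure propagation of the initial Gaussian.

```python
def kalman_step(mean, var, y, a, q, h, r):
    """Scalar predict-then-update step (no control input); r must be > 0."""
    pred_mean = a * mean
    pred_var = a * var * a + q
    gain = pred_var * h / (h * pred_var * h + r)
    return pred_mean + gain * (y - h * pred_mean), (1.0 - gain * h) * pred_var


def propagate_only(mean, var, a, q):
    """Push the Gaussian through the dynamics without any measurement."""
    return a * mean, a * var * a + q


def compare_stuck_sensor(steps=5, stuck_value=3.0, mean0=1.0, var0=2.0,
                         a=0.9, q=0.1, r=0.5):
    """Run the filter with an uninformative sensor next to pure propagation."""
    filtered = (mean0, var0)
    prior = (mean0, var0)
    rows = []
    for _ in range(steps):
        # h=0.0 encodes "the reading does not depend on the state".
        filtered = kalman_step(*filtered, y=stuck_value, a=a, q=q, h=0.0, r=r)
        prior = propagate_only(*prior, a=a, q=q)
        rows.append((filtered, prior))
    return rows


if __name__ == "__main__":
    for filtered, prior in compare_stuck_sensor():
        print(filtered, prior)
```

If instead the model still claims the sensor is informative (say `h = 1`) while the hardware is stuck, you are back in the garbage-model case from the first sentence.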