Derivation of the Kalman Filter (part one)

This is part one of three on the Kalman filter. This first part covers the basic setup for the derivation of the filter. The second part completes the derivation, and the third part contains some coding and an application of the filter to economic problems. The reason for doing this derivation is that I have not found a single consolidated resource that presents it the way I would like to present it to students. I want them to see some of the ‘scary math’ and appreciate how amazing this filter really is. I am sure there are many great places to find the derivation, since the Kalman filter is used in so many different disciplines. I will try to list the resources that I have used as I progress toward putting this filter companion together.

Resources
Books consulted for this derivation include Hamilton (1994), Harvey (1989), and Durbin and Koopman (2012). I have also drawn on notes by Paul Söderlind and Kristoffer Nimark.

Linear projections

The first thing that we need to understand before we attempt the derivation is linear projection. I will give a brief explanation of my understanding of linear projection and why it is useful in the context of the filter. A more general way to think about linear projections is in terms of projections in Hilbert spaces (complete inner product spaces, which are a special case of Banach spaces), but that requires some functional analysis. I am assuming that most people reading this don’t have much experience with this type of analysis.

However, for those interested, a good resource for the discussion of linear projections in terms of normed vector spaces is the third chapter of ‘Optimization by Vector Space Methods’ by David Luenberger (1969). Furthermore, a great overall book on functional analysis for beginners (like myself) is ‘Introductory Functional Analysis with Applications’ by Erwin Kreyszig (1978). I am not a mathematician, so recommendations are welcome. I just like reading math in my spare time.