My ultimate goal is to make a robot that maps its environment, knows where it is and can navigate its environment on its own. Previous experiments have learned me that lots of obstacles have to be overcome to achieve this goal. One of them is combining sensory input. For example, I want to know which way my robot is facing. I do have both a compass and a gyro sensor. The readings of the compass sensor are not very precise and do suffer from magnetical disturbances (motors, NXT, metal objects). The gyro readings do not suffer from magnetical disturbances. But it only gives me an idea of the change in direction. Not the direction itself. One can integrate these changes to an overall change in respect to the start position. But then small errors in every reading will add up to a forever growing error. It would be nice to combine both sensors so that one sensor corrects the error from the other sensor. The Kalman filter promises to do so. This filter however is hard to understand and implement. As a consequence there is a lot of talk about it in the NXT community but it is seldom used. I’ve set myself the goal of understanding the filter and implement it for the set up described above. This and later posts will document my quest. Sometimes I’ll take things for granted, sometimes I’ll simplify things when a statistician wouldn’t. Sometimes I might be wrong.
The basic ideas behind the filter are simple.
- Sensors are never very precise. Each sensor has an error. This tells you how precise this sensor is. When the compass reads 15 degrees north it could also be 14 or 16.
- Two sensors combined give more precise information than a single sensor. When two compasses give me different reading the truth might be somewhere in the middle.
- If you know how things were, then you have a better understanding of how things are. When the last observation from the compass was 13 and the current one is 15 then the true heading is more likely to be below 15 than over 15.
- If you know how things changed from the last observation to the current one you can even better predict the current situation. If the robot is turning clockwise it is likely that the current heading should be more than the last time it was calculated.
It is not hard to find the formulas used with the Kalman filter, check Wikipedia (although I do suspect there is an error in the formulas there) or this excellent course. It was hard for me to understand those formulas however. There were two things I had to understand. One was matrix calculus, the second was the meaning of all the scalars and matrices that are in the equations. But there is some good news. Matrix calculus is not too difficult, some of us will have had it in high school, if not, there are just a few tricks to learn, addition (+), multiplication (*) and inversion (1/matrix). There is even more good news, not all matrices have to be understood. Some can be ignored in the simple situation I am dealing with, others can be taken for granted without understanding them.
The only formulas of importance are those under Predict and Update in the Wikipedia article. The other formulas will not make it to code. Lets examine what goes into the formulas first and what the formulas actually do afterwards.
he first thing to understand is the state. This is the x with a hat. The state describes the important properties of your robot. Important in this context I mean, what is it we want to know about the robot? In my case it is the heading of the robot (what way is it pointing at) and its rate of turn (how fast is it turning and in what direction). So the state contains two items in my case. But in other situations it can be totally different things that go into the state. This is very important to understand. The Kalman filter is not a ready made filter that accepts a number as input and gives another number as output. You have to design your Kalman filter yourself, fitting it for your situation and problem.
The state is a scalar (a vector with just one column) and in my case it has just two items heading and rate of turn.
With the Kalman filter the state comes in different flavours.
– there are state scalars associated with the current and with the previous observations. These are denoted as Xk and Xk-1 respectively.
– The a priory state vector. This holds the predicted state. The prediction is based on the previous state (and transition) only. It does not take sensory input nor control input in account. It is denoted as Xk-1|k-1 for the previous observation and as Xk|k-1 for the current observation.
– The a posteriori state vector. This one describes the prediction of the filter when taking sensory input and control input into account. It is denoted as Xk-1|k for the previous observation and as Xk|k for the current observation. This is my state vector:
|Rate of turn|
(When one knows the number of rows in the state vector one also knows the dimensions of most of the other vectors. These all have the same number of rows, also the number of columns equals the number of rows in the state vector. (this probably isn’t always true but in this simple case it is.))
There is another element in the filter that looks very much like the state vector. This element is called the observation and written down as Z. It holds the input from the sensors. In my case it holds compass and gyro readings.
As my sensors return heading and rate of turn, this vector is very much the same as the state vector. This is not by definition so! If measurements are indirect or use different units than the state (think of degrees Celsius versus degrees Fahrenheit) then both scalars will differ from each other. We then have to provide formulas to translate measurements into state using an observation model (H). My observation model doesn’t have to change anything and therefore can be ignored in the formulas. But officially it is:
The next element to understand is the transition matrix, Wikipedia uses F to denote this element. The transition matrix describes how the state changes from observation to observation. In my example one can assume that the rate of turn does not change from observation to observation. If the robot was turning 5 degrees a second then we assume it is now still turning at the same rate. The transition matrix should tell that the rate of turn does not change over time, Rate of turnk=Rate of turnk-1. The heading however changes from observation to observation because the robot is turning. The change in heading being related to the rate of turn. So the new heading equals the old heading plus the rate of turn times the time interval, Headingk=Headingk-1 + Rat of turnK-1*deltaT. This is my transition matrix:
One can argue that the rate of turn is not constant over time. If the robot changes its steering then the rate of turn also changes. This is true and if one wishes this can be modelled in the control model, Bu. To keep things simple I pretend not to know anything about steering or speed. By ignoring control model I can simplify the first Kalman formula quite a bit.
I cannot totally ignore the fact that my robot does steer and the rate of turn does change. I consider changes in rate of turn to be a unknown factor, a kind of noise in the predictions the filter makes. This is called the process noise, Q. The heading is also affected of the steering of the robot changes. But as the interval between two predictions is very small I think it can be ignored. So I set the process noise of the heading to zero. The process noice is a covariance matrix, this means that there could also be a relation between the noise in rate of turn and the noise in heading. But because the noise in heading is zero there cannot be a relation between the two, the covariance is zero as well. I do not jet know the magnitude of the proces noise. In other words I do not know how often and how fast the rate of turn can change between two predictions. I’ll find out later. My process noise vector is:
The last vector to understand describes how accurate the sensors are. It is called the observation noise and written down as R. The accuracy of a sensor can be measured, guessed or found in the data sheet of the sensor. In data sheets it is denoted as ±number. This number is officially called the variance or error. This matrix also holds information about the relationship between the accuracy, or covariance, of the different sensors. In my case a have two different sensors, the errors in their readings will not be related. So the covariance can be set to zero. In other cases errors between sensors can very well be related, for example sensors might both suffer in a similar way from vibrations in the robot. Then covariance cannot be ignored. I do not jet know the error of my sensors. I hope to find out later. For the moment my observation noise matrix is:
Not only sensors are inaccurate, the predictions from the filter are also inaccurate. The Kalman filter knows how inaccurate the a prediction is, this is called the state error or P, and knowing this gives the filter its strength. The state error is calculated by the filter, I don’t have to bother giving it values. The state error comes in four flavours, just like the state itself.
I now understand al the elements that go into the filter and have modelled them for my goal. The other elements that are in the formulas as calculated by the filter. Their meaning can be ignored for the moment. It is time to look at the formulas themselves.
Each time the filter makes a prediction it takes two steps or phases. The first step is called the predict phase. In this step the filter calculates a new prediction from the previous prediction and the transition matrix. In my case it calculates a new heading. But it does not yet take readings from my sensor into account. It also calculates how good this prediction is (the state error P) taking the process noise into account.
In the second step the filter gets the input from the sensors. These readings will not be the same as the filter predicted (the difference is called the innovation or Y). The truth will be somewhere between the prediction and the measurement. The filter then calculates where between the two values the truth is most likely to be, this is called the Kalman gain or K. Using the Kalman gain it calculates a new prediction. And this prediction is the value I am after!
In this post I discussed the basic ideas behind a Kalman filter, I described the elements that go into the filter. Also I modelled this elements for my purpose, although I still have to fill in some values. Then I briefly described the steps the filter takes to make a prediction. In my next post I will describe the implementation of the filter in RobotC.