Check out http://www.cs.ubc.ca/spider/poole/cs522/2000/mdpapplet/vi.htm, which shows pictorially how value iteration works: you can set the rewards, change the discount rate, and step through the iterations of value iteration to see how the values stabilize. Very useful if you are still not completely clear on how this works. rao
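
If you would rather step through the same idea in code, here is a minimal sketch of value iteration. Note that the 3-state chain MDP, the action names, the reward numbers, and the `value_iteration` helper are all made up for illustration; they are assumptions and not what the applet uses. Each sweep applies the Bellman backup to every state and stops once the largest change in value falls below a tolerance, which is the "stabilizing" you can watch in the applet.

```python
# Hypothetical toy MDP, invented for this example (not the applet's world).
# transitions[state][action] = list of (probability, next_state, reward)
GAMMA = 0.9  # discount rate (assumed value)

transitions = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 0.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(0.8, 2, 1.0), (0.2, 1, 0.0)]},
    2: {"stay": [(1.0, 2, 0.0)]},  # absorbing state, no further reward
}


def value_iteration(transitions, gamma, tol=1e-6, max_iters=1000):
    """Repeat Bellman backups until the values change by less than tol."""
    values = {s: 0.0 for s in transitions}  # start from all zeros
    for iteration in range(1, max_iters + 1):
        new_values = {}
        for s, actions in transitions.items():
            # Best expected (reward + discounted next value) over actions.
            new_values[s] = max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
        # Largest change in any state's value during this sweep.
        delta = max(abs(new_values[s] - values[s]) for s in transitions)
        values = new_values
        if delta < tol:
            break
    return values, iteration


values, n_iters = value_iteration(transitions, GAMMA)
for state in sorted(values):
    print(f"V({state}) = {values[state]:.4f}")
print(f"sweeps until stable: {n_iters}")
```

Try changing `GAMMA` or the rewards and rerunning, the same way you would play with the sliders in the applet, to see how the number of sweeps and the final values shift.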