
Exercises, Chapter 3: Markov Decision Processes

Ex 3.1 (p. 51) Devise three tasks in the MDP framework

Ex 3.2 Exceptions to the MDP framework

Ex 3.3 The problem of driving

Ex 3.6 Pole balancing (Cartpole!)

Ex 3.7 Robot in a maze

Ex 3.8 Calculate returns G
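A minimal sketch of how the returns can be computed, working backwards from the terminal step with G_t = R_{t+1} + γ G_{t+1} and G_T = 0. The reward sequence and discount factor below are assumptions for illustration (this notebook does not state them); substitute the values given in the exercise. The helper name `returns` is hypothetical.

```python
def returns(rewards, gamma):
    """Return [G_0, ..., G_T] for rewards [R_1, ..., R_T].

    Uses the backward recursion G_t = R_{t+1} + gamma * G_{t+1}, with G_T = 0.
    """
    g = [0.0] * (len(rewards) + 1)  # last entry is G_T = 0
    for t in reversed(range(len(rewards))):
        # rewards[t] holds R_{t+1} because the list is 0-indexed
        g[t] = rewards[t] + gamma * g[t + 1]
    return g


# Assumed inputs: R_1..R_5 and gamma (not taken from this notebook)
rewards = [-1, 2, 6, 3, 2]
gamma = 0.5

G = returns(rewards, gamma)
print(G)  # [2.0, 6.0, 8.0, 4.0, 2.0, 0.0]
```

Computing backwards avoids re-summing the discounted tail for every time step.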

Ex 3.9 Calculate return G

Ex 3.12 v_pi

Ex 3.14 Verify the Bellman equation for the gridworld example
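A possible numerical check, sketched under stated assumptions: the centre state of the gridworld has four neighbours with values 2.3, 0.4, -0.4 and 0.7, the policy is equiprobable over the four moves, every move from the centre gives reward 0, and γ = 0.9. These numbers are assumed from the textbook figure and are not stated in this notebook. The Bellman equation then reduces to v(s) = Σ_a π(a|s) [r + γ v(s')].

```python
gamma = 0.9                          # assumed discount factor
neighbours = [2.3, 0.4, -0.4, 0.7]   # assumed values of the four neighbouring states
reward = 0.0                         # assumed reward for each move from the centre
prob = 0.25                          # equiprobable random policy over four actions

# Bellman expectation backup for the centre state (deterministic transitions)
v_center = sum(prob * (reward + gamma * v) for v in neighbours)

print(round(v_center, 3))  # 0.675
print(round(v_center, 1))  # 0.7
```

Under these assumptions the backup agrees with a tabulated centre value of 0.7 to one decimal place.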

Ex 3.17 Bellman equation for action values

Ex 3.20 Optimal state-value function for the golf example

Ex 3.21 Optimal action-value function for putting in the golf example

Ex 3.22 Optimal policies for an MDP

Ex 3.24 Compute the optimal value of the best state in the Gridworld
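One way to get the value, sketched under assumptions that this notebook does not spell out: from the best state every action gives reward +10 and jumps to a state from which the shortest way back takes four zero-reward moves, so the optimal policy collects +10 once every five steps, and γ = 0.9. The return is then the geometric series 10 + 10γ^5 + 10γ^10 + ... = 10 / (1 - γ^5).

```python
gamma = 0.9        # assumed discount factor
reward = 10.0      # assumed reward for leaving the best state
cycle_length = 5   # assumed: one jump plus four moves to return

# Closed form of the geometric series
v_closed = reward / (1 - gamma**cycle_length)

# Cross-check by summing the series directly (truncated after 200 cycles)
v_series = sum(reward * gamma**(cycle_length * k) for k in range(200))

print(f"{v_closed:.3f}")  # 24.419
```

The truncated sum is only a sanity check on the closed form; the discarded tail is far below floating-point precision.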
