
Approximate Dynamic Programming
Under the two sources of error (estimation and function approximation), what can we say about the resulting estimates?
The Bellman Optimality Operator
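For reference, the standard textbook definition of this operator can be written as follows. The notation is an assumption, not taken from the original slides: r(s,a) is the expected reward, p(s'|s,a) the transition kernel, and γ the discount factor.

```latex
% State-value form
(T^{*} f)(s) = \max_{a} \Big[ r(s,a) + \gamma \sum_{s'} p(s' \mid s,a)\, f(s') \Big]

% Action-value form
(T^{*} f)(s,a) = r(s,a) + \gamma \sum_{s'} p(s' \mid s,a)\, \max_{a'} f(s',a')
```

For γ < 1, T^{*} is a γ-contraction in the sup norm, so it has a unique fixed point: v^{*} in the state-value form and q^{*} in the action-value form.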

The Bellman Expectation Operator
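The standard definition for a fixed policy π, written in the same assumed notation (r(s,a), p(s'|s,a), γ), looks like this:

```latex
% State-value form
(T^{\pi} f)(s) = \sum_{a} \pi(a \mid s) \Big[ r(s,a) + \gamma \sum_{s'} p(s' \mid s,a)\, f(s') \Big]

% Action-value form
(T^{\pi} f)(s,a) = r(s,a) + \gamma \sum_{s'} p(s' \mid s,a) \sum_{a'} \pi(a' \mid s')\, f(s',a')
```

T^{\pi} is also a γ-contraction in the sup norm, with unique fixed point v^{\pi} (respectively q^{\pi}).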

Dynamic Programming with Bellman Operators
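As a minimal sketch of what dynamic programming with these operators looks like, value iteration simply applies the optimality operator repeatedly until the iterates stop changing. The tiny two-state MDP below (arrays `P` and `R`) is a hypothetical example made up for illustration, and the function names are my own.

```python
import numpy as np

def bellman_optimality_operator(v, P, R, gamma):
    """Apply T* to a state-value vector v.

    P has shape (S, A, S) with P[s, a, s'] = p(s' | s, a);
    R has shape (S, A) with R[s, a] = expected reward.
    """
    q = R + gamma * (P @ v)  # shape (S, A)
    return q.max(axis=1)

def value_iteration(P, R, gamma, tol=1e-10, max_iters=10_000):
    """Iterate v_{k+1} = T* v_k until the sup-norm change falls below tol."""
    v = np.zeros(P.shape[0])
    for _ in range(max_iters):
        v_next = bellman_optimality_operator(v, P, R, gamma)
        if np.max(np.abs(v_next - v)) < tol:
            return v_next
        v = v_next
    return v

# Hypothetical 2-state, 2-action MDP (illustration only).
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.0, 1.0]],
])
R = np.array([[0.0, 1.0], [2.0, 0.5]])
gamma = 0.9

v_star = value_iteration(P, R, gamma)
print(v_star)
```

Because T* is a γ-contraction, the loop converges geometrically from any starting vector; replacing the max with an average under π gives policy evaluation with the expectation operator instead.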

Approximate DP

Approximate Value Iteration

q-value version:
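A hedged sketch of the usual form of this update, where \mathcal{A} stands for the approximation step (for example, a regression fit onto a function class). The symbol \mathcal{A} and the indexing are assumptions about notation, not recovered from the original slides.

```latex
q_{k+1} = \mathcal{A}\,(T^{*} q_k), \qquad
\pi_{k+1}(s) \in \arg\max_{a} q_{k+1}(s,a)

% With sampled transitions (s, a, r, s'), T^{*} q_k is itself estimated
% through regression targets of the form
y = r + \gamma \max_{a'} q_k(s', a')
```

The two error sources show up here directly: the sampled target y stands in for (T^{*} q_k)(s,a) (estimation error), and \mathcal{A} can only return functions in the chosen class (approximation error).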

…
…
Some concrete instances of AVI
Fitted Q-iteration with Linear Approximation:
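A minimal sketch of fitted Q-iteration with a linear model q_w(s,a) = w · φ(s,a), assuming a fixed batch of transitions (s, a, r, s') and a least-squares fit at every iteration. The feature map `one_hot_features`, the small ridge term, and the four-transition dataset are all hypothetical choices for illustration, not part of the original notes.

```python
import numpy as np

def one_hot_features(s, a, n_states=2, n_actions=2):
    """Hypothetical feature map: one indicator per (state, action) pair."""
    phi = np.zeros(n_states * n_actions)
    phi[s * n_actions + a] = 1.0
    return phi

def fitted_q_iteration_linear(transitions, features, n_actions, gamma,
                              n_iters=200, ridge=1e-6):
    """Repeat: build targets r + gamma * max_a' q_w(s', a'), then refit w."""
    Phi = np.array([features(s, a) for s, a, _, _ in transitions])       # (N, d)
    rewards = np.array([r for _, _, r, _ in transitions], dtype=float)  # (N,)
    Phi_next = np.array([[features(s2, b) for b in range(n_actions)]
                         for _, _, _, s2 in transitions])               # (N, A, d)
    d = Phi.shape[1]
    gram = Phi.T @ Phi + ridge * np.eye(d)  # small ridge keeps the solve stable
    w = np.zeros(d)
    for _ in range(n_iters):
        targets = rewards + gamma * (Phi_next @ w).max(axis=1)
        w = np.linalg.solve(gram, Phi.T @ targets)
    return w

# Hypothetical deterministic batch: one transition (s, a, r, s') per pair.
transitions = [(0, 0, 0.0, 0), (0, 1, 1.0, 1), (1, 0, 2.0, 0), (1, 1, 0.5, 1)]
gamma = 0.9

w = fitted_q_iteration_linear(transitions, one_hot_features, n_actions=2, gamma=gamma)
Q = w.reshape(2, 2)  # Q[s, a] under the one-hot features
print(Q)
```

With one-hot features and full coverage of every (state, action) pair, the regression step is essentially exact and the procedure reduces to tabular Q-iteration. With coarser features the fit introduces approximation error at every iteration, and convergence is no longer guaranteed in general.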


Fitted Q-iteration with other Approximations:

Fitted Q-iteration (General recipe)

Fitted Q-iteration (General recipe: DQN)
Fitted Q-iteration (General recipe: Batch RL - 1)
Fitted Q-iteration (General recipe: Batch RL - 2)
Fitted Q-iteration (General recipe: Dyna)
…
- Title:
- Author: wy
- Created at : 2023-07-23 17:24:25
- Updated at : 2023-07-23 17:51:41
- Link: https://yuuee-www.github.io/blog/2023/07/23/RL/step8/RLstep8/
- License: This work is licensed under CC BY-NC-SA 4.0.