
Approximate Dynamic Programming

Under the two sources of error (estimation and function approximation), what can we say about the resulting estimates?

The Bellman Optimality Operator
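For reference, one standard way to write the Bellman optimality operator is sketched below. The notation is an assumption, since this page does not define it: $r(s,a)$ is the expected reward, $p(s' \mid s,a)$ the transition probability, and $\gamma \in [0,1)$ the discount factor.

```latex
(T^{*} v)(s) \;=\; \max_{a} \Big[\, r(s,a) + \gamma \sum_{s'} p(s' \mid s,a)\, v(s') \,\Big]
```

Under these assumptions $T^{*}$ is a $\gamma$-contraction in the max norm, and the optimal value function $v^{*}$ is its unique fixed point, $T^{*} v^{*} = v^{*}$.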

The Bellman Expectation Operator
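For reference, the Bellman expectation operator for a fixed policy $\pi$ is commonly written as below. The symbols are assumed rather than taken from this page: $\pi(a \mid s)$ is the policy, $r(s,a)$ the expected reward, $p(s' \mid s,a)$ the transition probability, and $\gamma \in [0,1)$ the discount factor.

```latex
(T^{\pi} v)(s) \;=\; \sum_{a} \pi(a \mid s) \Big[\, r(s,a) + \gamma \sum_{s'} p(s' \mid s,a)\, v(s') \,\Big]
```

Under these assumptions $T^{\pi}$ is also a $\gamma$-contraction in the max norm, with the policy's value function $v^{\pi}$ as its unique fixed point.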

Dynamic Programming with Bellman Operators
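As a minimal sketch of dynamic programming with a Bellman operator, value iteration can be read as repeatedly applying the optimality operator until the values stop changing. The toy two-state MDP, the array layout (`P[s, a, s']`, `R[s, a]`), and the function names are all assumptions made for illustration, not something defined on this page.

```python
import numpy as np


def bellman_optimality(v, P, R, gamma):
    """Apply the Bellman optimality operator once.

    P has shape (S, A, S) with P[s, a, s2] = p(s2 | s, a); R has shape (S, A).
    """
    q = R + gamma * (P @ v)  # expected one-step return for each (s, a)
    return q.max(axis=1)     # act greedily over actions


def value_iteration(P, R, gamma, tol=1e-8, max_iters=10_000):
    """Iterate v <- T* v until the max-norm change falls below tol."""
    v = np.zeros(P.shape[0])
    for _ in range(max_iters):
        v_next = bellman_optimality(v, P, R, gamma)
        if np.max(np.abs(v_next - v)) < tol:
            return v_next
        v = v_next
    return v


# Hypothetical deterministic MDP: action 0 moves to state 0, action 1 moves to state 1.
P = np.zeros((2, 2, 2))
P[:, 0, 0] = 1.0
P[:, 1, 1] = 1.0
# Reward 1 only for choosing action 1 while already in state 1.
R = np.array([[0.0, 0.0],
              [0.0, 1.0]])
gamma = 0.9

v_star = value_iteration(P, R, gamma)
print(v_star)
```

Because the operator is a contraction for `gamma < 1`, the loop converges from any starting vector; the zero initialisation is just a convenient choice.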

Approximate DP

Approximate Value Iteration

q-value version:
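One common way to write the action-value form of approximate value iteration is sketched here. The notation is assumed, not taken from this page: $\mathcal{A}$ stands for the approximation step (for example a regression fit within a function class), $r(s,a)$ is the expected reward, $p(s' \mid s,a)$ the transition probability, and $\gamma$ the discount factor.

```latex
q_{k+1} \;=\; \mathcal{A}\, T^{*} q_{k},
\qquad
(T^{*} q)(s,a) \;=\; r(s,a) + \gamma \sum_{s'} p(s' \mid s,a)\, \max_{a'} q(s',a')
```

When $T^{*} q_k$ is itself estimated from samples, both sources of error appear in a single iteration: estimation error in the targets and approximation error from $\mathcal{A}$.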

Some concrete instances of AVI

Fitted Q-iteration with Linear Approximation:
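A minimal sketch of fitted Q-iteration with a linear approximator $q_w(s,a) = \phi(s,a)^\top w$ follows. Everything named here is an assumption for illustration: the function `fitted_q_iteration_linear`, the transition tuple layout `(s, a, r, s_next, done)`, the one-hot feature map, and the toy data set. Each iteration builds bootstrapped targets from the current weights and then refits the weights by ordinary least squares.

```python
import numpy as np


def fitted_q_iteration_linear(transitions, features, n_actions, gamma, n_iters=200):
    """Fitted Q-iteration with q_w(s, a) = features(s, a) @ w.

    transitions: list of (s, a, r, s_next, done) tuples (a fixed batch).
    features:    hypothetical feature map returning a 1-D array.
    """
    X = np.stack([features(s, a) for s, a, _, _, _ in transitions])
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        # Regression targets: r + gamma * max_b q_w(s_next, b), computed with the old w.
        targets = np.array([
            r + (0.0 if done else
                 gamma * max(features(s2, b) @ w for b in range(n_actions)))
            for _, _, r, s2, done in transitions
        ])
        # Least-squares fit of the new weights to the targets.
        w, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return w


N_STATES, N_ACTIONS = 2, 2


def one_hot(s, a):
    """Tabular features: a special case of linear approximation."""
    phi = np.zeros(N_STATES * N_ACTIONS)
    phi[s * N_ACTIONS + a] = 1.0
    return phi


# Hypothetical batch covering every (s, a) pair once:
# action 0 leads to state 0, action 1 leads to state 1, reward 1 only for (s=1, a=1).
batch = [
    (0, 0, 0.0, 0, False),
    (0, 1, 0.0, 1, False),
    (1, 0, 0.0, 0, False),
    (1, 1, 1.0, 1, False),
]

w = fitted_q_iteration_linear(batch, one_hot, N_ACTIONS, gamma=0.9)
print(w)
```

With one-hot features and full coverage the regression is exact, so this reduces to tabular Q-value iteration and converges. With general features the combination of bootstrapping and least-squares fitting is not guaranteed to converge, which is part of what the error analysis in this section is concerned with.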

Fitted Q-iteration with other Approximations:

Fitted Q-iteration (General recipe)

Fitted Q-iteration (General recipe: DQN)

Fitted Q-iteration (General recipe: Batch RL - 1)

Fitted Q-iteration (General recipe: Batch RL - 2)

Fitted Q-iteration (General recipe: Dyna)

  • Author: wy
  • Created at : 2023-07-23 17:24:25
  • Updated at : 2023-07-23 17:51:41
  • Link: https://yuuee-www.github.io/blog/2023/07/23/RL/step8/RLstep8/
  • License: This work is licensed under CC BY-NC-SA 4.0.