Deep RL

Recap: Value function approximation

Deep value function approximation

JAX

Deep Q-learning

Deep Q-learning in JAX

General Value Functions

The reward hypothesis (Sutton and Barto 2018)

General value functions (Sutton et al. 2011)

Example: Simple predictive questions

GVFs as Auxiliary Tasks

Trade-offs in multi-task learning

Open problems in GVF learning

Distributional RL

  • Title:
  • Author: wy
  • Created at: 2023-07-23 18:59:37
  • Updated at: 2023-07-23 20:24:13
  • Link: https://yuuee-www.github.io/blog/2023/07/23/RL/step10/RLstep10/
  • License: This work is licensed under CC BY-NC-SA 4.0.