
Soft Actor-Critic (SAC)

Maximum entropy reinforcement learning: the agent maximizes the expected return plus the entropy of its policy.
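For reference, the maximum entropy objective is usually written as below. The symbols are the conventional ones and are assumptions here, since the notes above do not define any: r is the reward, \rho_\pi is the state-action distribution induced by the policy \pi, \mathcal{H} is the entropy, and \alpha is the temperature that trades off reward against entropy.

```latex
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
  \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right],
\qquad
\mathcal{H}\big(\pi(\cdot \mid s)\big) = - \mathbb{E}_{a \sim \pi(\cdot \mid s)} \left[ \log \pi(a \mid s) \right]
```

Setting \alpha = 0 recovers the standard expected-return objective.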

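Off-policy: it learns from transitions stored in a replay buffer, which may have been collected by earlier versions of the policy.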

Stochastic policy rather than a deterministic one (a deterministic policy treats only one action as optimal in each state).
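A minimal sketch of why the entropy term favors a stochastic policy, for a single state with a few discrete actions. This is an illustration only, not SAC itself (SAC uses neural networks and continuous actions). The Q-values, the temperature `alpha`, and the helper names `soft_policy`, `entropy`, and `soft_objective` are all hypothetical choices made for this example.

```python
import numpy as np

def soft_policy(q_values, alpha):
    """Boltzmann policy: pi(a) is proportional to exp(Q(a) / alpha)."""
    z = (q_values - q_values.max()) / alpha  # shift by the max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def entropy(p):
    """Shannon entropy in nats, skipping zero-probability actions."""
    nz = p[p > 0]
    return float(-(nz * np.log(nz)).sum())

def soft_objective(q_values, p, alpha):
    """Entropy-regularized value of policy p: E_p[Q] + alpha * H(p)."""
    return float(p @ q_values) + alpha * entropy(p)

q = np.array([1.0, 0.9, -2.0])  # hypothetical Q-values for three actions
alpha = 0.5                     # hypothetical temperature

# Deterministic policy: all probability on the single best action.
greedy = np.zeros_like(q)
greedy[q.argmax()] = 1.0

# Stochastic policy: spreads probability over near-optimal actions.
stochastic = soft_policy(q, alpha)

print("greedy    :", greedy, soft_objective(q, greedy, alpha))
print("stochastic:", stochastic, soft_objective(q, stochastic, alpha))
```

With these numbers the first two actions are almost equally good, so the stochastic policy keeps both in play and scores higher on the entropy-regularized objective, while the greedy policy commits to one action and earns no entropy bonus.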

Code:

  • Title:
  • Author: wy
  • Created at : 2023-07-23 20:16:31
  • Updated at : 2023-07-24 17:09:37
  • Link: https://yuuee-www.github.io/blog/2023/07/23/RL/step11/RLstep11/
  • License: This work is licensed under CC BY-NC-SA 4.0.