Soft Actor-Critic (SAC)
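Maximum-entropy reinforcement learning: the policy maximizes expected return plus the entropy of the policy
Off-policy: learns from past transitions stored in a replay buffer
Stochastic policy rather than a deterministic policy (a deterministic policy treats only one action as optimal in each state)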
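To make the maximum-entropy idea concrete, here is a minimal sketch for a single state with a few discrete actions. It is a toy illustration, not how SAC is implemented: SAC for continuous actions trains a parameterized policy network instead of using this closed form. The Q-values and the temperature `alpha` below are made-up numbers chosen only for the example.

```python
import math

def soft_policy(q_values, alpha):
    """Boltzmann policy: pi(a) is proportional to exp(Q(a) / alpha)."""
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / alpha) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def entropy(probs):
    """Shannon entropy H(pi) = -sum_a pi(a) * log pi(a)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def soft_value(q_values, probs, alpha):
    """Entropy-regularized value: E_pi[Q(a)] + alpha * H(pi)."""
    expected_q = sum(p * q for p, q in zip(probs, q_values))
    return expected_q + alpha * entropy(probs)

# Hypothetical Q-values for one state: two near-optimal actions, one bad one.
q_values = [1.0, 0.9, -2.0]
alpha = 0.5  # hypothetical temperature; larger values favor more entropy

# Deterministic (greedy) policy: all probability on the argmax action.
best = q_values.index(max(q_values))
greedy = [1.0 if i == best else 0.0 for i in range(len(q_values))]

# Stochastic maximum-entropy policy: spreads probability over good actions.
soft = soft_policy(q_values, alpha)

print("greedy policy:", greedy, "soft value:", soft_value(q_values, greedy, alpha))
print("soft policy:  ", soft, "soft value:", soft_value(q_values, soft, alpha))
```

The greedy policy commits to a single action, while the soft policy keeps most of its probability on the two near-optimal actions and almost none on the bad one. Under the entropy-regularized objective the soft policy scores higher, which is why a stochastic policy is preferred here.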
Code:
rail-berkeley/softlearning (TensorFlow)
[rail-berkeley/rlkit](https://github.com/rail-berkeley/rlkit) (PyTorch)
- Title: Soft Actor-Critic (SAC)
- Author: wy
- Created at : 2023-07-23 12:16:31
- Updated at : 2023-07-24 09:09:37
- Link: https://yue-ruby-w.site/2023/07/23/2023-07-23-RL-step11-RLstep11/
- License: This work is licensed under CC BY-NC-SA 4.0.