Soft Actor-Critic (SAC)
- Maximum-entropy reinforcement learning
- Off-policy
- Stochastic policy, rather than a deterministic policy (where only one action is considered optimal in each state)
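The maximum-entropy objective behind these properties augments the usual return with an entropy bonus (this is the standard form from the SAC papers; $\alpha$ is the temperature coefficient trading off reward against entropy):

$$
J(\pi) = \sum_{t} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}\!\left[\, r(s_t, a_t) + \alpha \,\mathcal{H}\big(\pi(\cdot \mid s_t)\big) \right]
$$

Maximizing entropy alongside reward is what keeps the learned policy stochastic and encourages broader exploration instead of collapsing onto a single action per state.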
Code:
- [rail-berkeley/softlearning](https://github.com/rail-berkeley/softlearning) (TensorFlow)
- [rail-berkeley/rlkit](https://github.com/rail-berkeley/rlkit) (PyTorch)
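To make the entropy-regularized actor update concrete, here is a minimal PyTorch sketch of the SAC actor loss (network sizes and names are illustrative assumptions, not the reference implementations linked above): the policy outputs a tanh-squashed Gaussian action, and the actor minimizes $\alpha \log \pi(a|s) - Q(s, a)$, i.e. maximizes Q-value plus an entropy bonus.

```python
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    """Squashed-Gaussian actor: outputs a tanh-bounded stochastic action.
    (Illustrative sketch; hidden sizes are arbitrary.)"""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, act_dim)
        self.log_std = nn.Linear(hidden, act_dim)

    def sample(self, obs):
        h = self.net(obs)
        mu, log_std = self.mu(h), self.log_std(h).clamp(-20, 2)
        dist = torch.distributions.Normal(mu, log_std.exp())
        u = dist.rsample()                      # reparameterized sample
        a = torch.tanh(u)                       # squash action to [-1, 1]
        # tanh change-of-variables correction for the log-density
        log_pi = (dist.log_prob(u) - torch.log(1 - a.pow(2) + 1e-6)).sum(-1)
        return a, log_pi

obs_dim, act_dim, alpha = 4, 2, 0.2             # alpha: entropy temperature
policy = GaussianPolicy(obs_dim, act_dim)
q_net = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(),
                      nn.Linear(64, 1))

obs = torch.randn(32, obs_dim)                  # stand-in replay-buffer batch
action, log_pi = policy.sample(obs)
q_value = q_net(torch.cat([obs, action], dim=-1)).squeeze(-1)

# Actor objective: minimize alpha * log_pi - Q  (= maximize Q + entropy bonus)
actor_loss = (alpha * log_pi - q_value).mean()
actor_loss.backward()                           # gradients flow via rsample()
```

The reparameterized sample (`rsample`) is what lets the gradient of the expected loss flow through the sampled action back into the policy parameters; the same critic and temperature machinery appears, with more engineering, in the repositories above.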
- Title: Soft Actor-Critic (SAC)
- Author: wy
- Created at: 2023-07-23 20:16:31
- Updated at: 2023-07-24 17:09:37
- Link: https://yuuee-www.github.io/blog/2023/07/23/RL/step11/RLstep11/
- License: This work is licensed under [CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0).