Debugging on a Specialized GPU Node with VS Code
In this post, we’ll explore how to connect to a GPU node on a cluster server via SSH in Visual Studio Code (VS Code) and debug a Python file in that ...
Using nvidia-smi on Integrated Servers
When logged into an integrated server and attempting to use the nvidia-smi command directly in the command line, you might encounter the following error:
“bash...
Using a Jump Server for Internal Network Access
In scenarios where direct access to internal network resources is restricted due to security policies, a jump server (also known as a bastion host) ac...
Learning Notebook on Hugging Face and TOFU
This notebook is used to explore the functionalities of Hugging Face, a leading platform for natural language processing (NLP), and TOFU, a framework focus...
Learning Notes on Optimal Transport
1. Introduction
OT allows us to define meaningful distances between point clouds (or datasets), and hence is applicable in most ML settings.
2. Mathematical Formulation
2.1 Mo...
Hexo Blog Setup Issues Summary
Issue 1: Web Page Fails to Render After Deployment
Description
After deploying the Hexo blog to a hosting platform such as GitHub Pages, the web page may not render prop...
env
TensorFlow - GPU - docker
docker
https://docs.docker.com/engine/install/ubuntu/#set-up-the-repository
tensorflow gpu
https://www.tensorflow.org/install/docker?hl=zh-cn
https://www.tensorflow.org/...
Soft Actor-Critic (SAC)
Maximum Entropy Reinforcement Learning
off-policy
stochastic policy rather than a deterministic policy (under a deterministic policy, only one action is considered optimal in each state)
Codes:
rail-berkeley...
Deep RL
Recap: Value function approximation
Deep value function approximation
JAX
Deep Q-learning
Deep Q-learning in JAX
General Value Functions
The reward hypothesis (Sutton and Barto 2018)...
Off-policy and multi-step learning
One-step off-policy
Multi-step off-policy
Off-policy corrections for policy gradients
Title:
Author: wy
Created at: 2023-07-23 18:50:21
...