GeneHit/drl_practice

drl_practice

Practice Deep Reinforcement Learning (DRL) with Gymnasium.

  • Easy hands-on practice on a laptop (macOS, Windows, or Linux).
  • No long training runs.

How to practice

Check the Command Guide for the step-by-step commands:

  • Create the conda environment, using pip to install packages.
  • Exercise:
    1. For each exercise, implement every part that raises NotImplementedError in the *_exercise.py file.
    2. Then train it with the provided command.
    3. [Optional] Generate a video and push the video/result to HuggingFace.
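The exercise workflow above can be pictured with a small sketch. The function names and the epsilon-greedy task are hypothetical illustrations, not taken from the repository's *_exercise.py files: the first function shows what an unimplemented stub might look like, and the second shows one possible way to fill it in.

```python
import random


def epsilon_greedy_exercise(q_values: list[float], epsilon: float, rng: random.Random) -> int:
    """Hypothetical exercise stub: choose an action index from a row of Q-values."""
    raise NotImplementedError("Explore with probability epsilon, otherwise act greedily.")


def epsilon_greedy(q_values: list[float], epsilon: float, rng: random.Random) -> int:
    """One possible solution to the stub above (an illustration, not the repo's answer)."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: index of the largest Q-value (the first one wins on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With `epsilon=0.0` the solved version always returns the greedy action, which makes it easy to check by hand before running the training command.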

Exercises

Avoid games that are too hard and neural networks that are too big, though you are free to try them on your own.

| Exercise | Algorithm | Verification Game | For Challenge | State | Action |
|---|---|---|---|---|---|
| 1. q_learning | Q Table | FrozenLake | Taxi | 📊 | 📊 |
| 2. dqn | Deep Q Network -> Rainbow | 1D LunarLander-v3 | img LunarLander-v3 | 🌊 | 📊 |
| 3. reinforce | Reinforce (Monte Carlo) | CartPole-v1 | - | 🌊 | 📊 |
| 4. curiosity | Curiosity (Reinforce, baseline, shaping reward) | - | MountainCar-v0 | 🌊 | 📊 |
| 5. A2C | A2C+GAE (or A2C+TD-n) | CartPole-v1 | LunarLander-v3 | 🌊 | 📊 |
| 6. A3C | A3C (using A2C+GAE) | CartPole-v1 | LunarLander-v3 | 🌊 | 📊 |
| 7. PPO | PPO | CartPole-v1 | LunarLander-v3 | 🌊 | 📊 |
| 8. TD3 | Twin Delayed DDPG (TD3) | Pendulum-v1 | Walker2d-v5 | 🌊 | 🌊 |
| 9. SAC | SAC (Soft Actor-Critic) | Pendulum-v1 | Walker2d-v5 | 🌊 | 🌊 |
| 10. PPO+DDP | PPO+Curiosity | Reacher-v5 | Pusher-v5 | 🌊 | 🌊 |
| 11. SAC+DDP | SAC+PER | Reacher-v5 | Pusher-v5 | 🌊 | 🌊 |
| 12. MBPO | Model-based Policy Optim. | Pusher-v5 | Walker2d-v5 | 🌊 | 🌊 |

where 🌊 = continuous and 📊 = discrete.
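To give a feel for the first exercise (Q Table, discrete state and action), here is a minimal sketch of tabular Q-learning. It is not the repository's code: the 5-state chain environment is a hand-written stand-in for FrozenLake so the example runs without Gymnasium, and the hyperparameters (`alpha`, `gamma`, `epsilon`, episode count) are illustrative assumptions.

```python
import random

N_STATES = 5      # states 0..4; state 4 is the terminal goal (toy stand-in for FrozenLake)
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state: int, action: int) -> tuple[int, float, bool]:
    """Deterministic toy chain: reward 1.0 only on reaching the goal state."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes: int = 500, alpha: float = 0.5, gamma: float = 0.9,
          epsilon: float = 0.2, seed: int = 0) -> list[list[float]]:
    """Tabular Q-learning with epsilon-greedy exploration and random tie-breaking."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(q[state])
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action unless terminal.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy action for each non-terminal state.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(greedy_policy)
```

On this toy chain the learned greedy policy moves right in every non-terminal state. The same update rule applies to FrozenLake or Taxi once `step` is replaced by a Gymnasium environment.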

Motivation

After studying HuggingFace's DRL course and Pieter Abbeel's The Foundations of Deep RL in 6 Lectures, I want to build a deeper and broader understanding by writing the code myself.

Other

  1. RL Algorithms
  2. OpenAI's Spinning Up
  3. Stable-Baselines3
