Learning References

A curated collection of resources for learning reinforcement learning (RL), from fundamentals to advanced topics.


Courses & Tutorials

Comprehensive Courses


Blog Series & Articles


Research Papers

Foundational Papers

  • DQN Paper - Playing Atari with Deep RL

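    • The 2013 paper that started the deep RL revolution by learning Atari control policies directly from pixels
    • Trains a Q-network with experience replay; target networks were added in the 2015 Nature follow-up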
  • PPO Paper - Proximal Policy Optimization

    • Policy gradient method that limits each update with a clipped surrogate objective
    • Widely used as the RL step when training LLMs with RLHF
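
PPO's central idea is a clipped surrogate objective: the probability ratio between the new and old policy is clipped to a small interval around 1, so a single update cannot move the policy too far. The following is a minimal per-sample sketch in plain Python, not a full training loop. The function name `ppo_clipped_objective` and the default `clip_eps=0.2` are illustrative assumptions, not taken from any particular library.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantage, clip_eps=0.2):
    """Per-sample PPO clipped surrogate objective (to be maximized).

    logp_new / logp_old: log-probability of the taken action under the
    current policy and under the policy that collected the data.
    advantage: advantage estimate for that action.
    clip_eps: clipping range (0.2 is an assumed, illustrative default).
    """
    # Probability ratio r = pi_new(a|s) / pi_old(a|s)
    ratio = math.exp(logp_new - logp_old)
    # Clip the ratio to [1 - eps, 1 + eps]
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the pessimistic (smaller) of the unclipped and clipped terms
    return min(ratio * advantage, clipped_ratio * advantage)


# With a positive advantage, a ratio of 1.5 is clipped to 1.2,
# so the objective stops rewarding further movement in that direction.
value = ppo_clipped_objective(math.log(1.5), 0.0, advantage=1.0)
```

In practice this quantity is averaged over a batch and combined with value-function and entropy terms; those parts are omitted here.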

LLM-Specific Papers

  • InstructGPT Paper - RLHF for LLMs

    • How OpenAI fine-tuned GPT-3 to follow instructions using human feedback; ChatGPT is a sibling model trained with a similar method
    • Foundation for modern LLM alignment
  • DPO Paper - Direct Preference Optimization

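    • Alternative to PPO-based RLHF for LLM training
    • Simpler and often more stable: optimizes the policy directly on preference pairs, with no separate reward model or RL loop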
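
To make the DPO entry concrete, here is a minimal sketch of the per-pair DPO loss in plain Python. It assumes you already have the summed log-probabilities of the chosen and rejected responses under both the trainable policy and a frozen reference model. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) preference pair.

    Each argument is the total log-probability of a response under the
    trainable policy or the frozen reference model. beta scales how
    strongly the policy is pushed away from the reference
    (0.1 is an assumed, illustrative default).
    """
    # How much more the policy likes each response than the reference does
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)), written in a numerically stable form
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# When the policy equals the reference, the loss is log(2);
# it drops as the policy prefers the chosen response more than the reference does.
baseline = dpo_loss(-1.0, -2.0, -1.0, -2.0)
improved = dpo_loss(-1.0, -3.0, -1.5, -2.0)
```

A real implementation would compute these log-probabilities from model outputs and average the loss over a batch; only the loss formula itself is shown here.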

Libraries & Tools

Core RL Libraries

  • Gymnasium

    • Standard RL environment library
    • Successor to OpenAI Gym
  • Stable Baselines3

    • Reliable PyTorch implementations of standard RL algorithms
    • Easy to use, well-documented

LLM Training