Shridipa/RL-Pytorch

🤖 AI Educator RL Dashboard

A self-contained, interactive mini reinforcement learning (RL) simulation built for hackathons and educational demonstrations. It provides a real-time Streamlit dashboard where users can watch an agent learn, episode by episode, to solve a 1D grid world from scratch.

✨ Features

  • Custom RL Environment: A bounded 1D grid world modeled on the Gymnasium (formerly OpenAI Gym) interface (🟥 🤖 🟩 🏁), with built-in penalties and rewards.
  • Deep RL Agent (REINFORCE): Implements the REINFORCE policy gradient algorithm in PyTorch, with dynamic epsilon-greedy exploration parameters.
  • Live "Educator" Commentary: At every step, the dashboard displays the policy network's action probabilities and explains which move the agent chose and why.
  • Dynamic Training Metrics: Tracks multi-episode averages, live-updating plots, and progressive phase indicators (Exploration > Learning > Mastery).
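
The epsilon-greedy exploration mentioned above can be illustrated with a minimal, standard-library sketch. The function names, the decay schedule, and its default values are assumptions made for illustration; they are not taken from agent.py or config.py.

```python
import random


def select_action(probs, epsilon, rng=random):
    """Return an action index.

    With probability `epsilon` the action is chosen uniformly at random
    (exploration); otherwise it is sampled from the policy's probabilities.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(probs))
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def decayed_epsilon(episode, start=1.0, end=0.05, decay=0.95):
    """Hypothetical schedule: shrink epsilon each episode, never below `end`."""
    return max(end, start * decay ** episode)


# Example: early episodes explore almost always, later ones rarely.
rng = random.Random(0)
for episode in (0, 10, 200):
    eps = decayed_epsilon(episode)
    action = select_action([0.2, 0.8], eps, rng)
    print(f"episode={episode} epsilon={eps:.3f} action={action}")
```

A decaying epsilon is one common way to make exploration "dynamic"; the actual project may use a different schedule.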

πŸ“ Repository Structure

RL-Pytorch/
│
├── streamlit_app.py   # Primary dashboard UI & Streamlit frontend execution
├── train.py           # Alternate CLI/headless training pipeline
├── agent.py           # Core Policy Gradient algorithm & Action Extractor logic
├── model.py           # PyTorch Multi-layer Perceptron (Policy Network)
├── custom_env.py      # The custom 1D grid interaction environment rules engine
├── config.py          # Centralized Global Configuration and Hyperparameters
├── utils.py           # Supplemental chart & logging helper functions
└── README.md          # Project roadmap

🚀 Running Locally

Assuming you have Python and pip installed, the project runs with two commands.

All controls are exposed in the dashboard UI, so non-technical evaluators do not need to use the command line beyond launching the app.

1. Install Core Dependencies

pip install torch numpy pandas matplotlib streamlit gymnasium

2. Start the AI Dashboard

streamlit run streamlit_app.py

🧠 What The Architecture Does

  1. The MiniGridEnv starts the agent at coordinate 0.
  2. The policy network takes the current state as input and outputs action probabilities. In early episodes, random exploration dominates.
  3. If the agent hits the start boundary (wall), the step yields a -5 penalty. An ordinary step yields -1, a per-step cost that discourages wandering. When the agent reaches the flag, it receives a +10 reward and the episode ends.
  4. Using gradient ascent, the PyTorch agent computes discounted cumulative returns backward through each episode and raises the probability of the actions that led to higher returns.
  5. The improvement curve updates live in the browser as training progresses over the episodes.
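
The reward scheme and the discounted returns described in the steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the code in custom_env.py or agent.py: the class name `MiniGridSketch`, the grid size, the action encoding, the choice that a wall hit costs exactly -5 with no additional step cost, and the discount factor are all hypothetical.

```python
class MiniGridSketch:
    """Hypothetical 1D grid: the agent starts at cell 0, the flag is the last cell."""

    LEFT, RIGHT = 0, 1  # assumed action encoding

    def __init__(self, size=5):
        self.size = size
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        """Return (state, reward, done) using the README's reward values."""
        if action == self.LEFT and self.pos == 0:
            return self.pos, -5.0, False      # bumped into the start wall
        self.pos += 1 if action == self.RIGHT else -1
        if self.pos == self.size - 1:
            return self.pos, 10.0, True       # reached the flag, episode ends
        return self.pos, -1.0, False          # ordinary step cost


def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} by walking the episode backward."""
    returns, running = [], 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# Roll out one episode that always moves right, then compute returns.
env = MiniGridSketch(size=5)
env.reset()
rewards, done = [], False
while not done:
    _, reward, done = env.step(MiniGridSketch.RIGHT)
    rewards.append(reward)

print(rewards)                                  # [-1.0, -1.0, -1.0, 10.0]
print(discounted_returns(rewards, gamma=0.5))   # [-0.5, 1.0, 4.0, 10.0]
```

In REINFORCE, each of these returns weights the log-probability of the action taken at the same step, so actions followed by higher returns become more likely after the gradient update.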
