Real-time AI coaching for interviews and presentations using YOLO pose detection and computer vision.
- YOLO Pose Detection: Tracks 17 body keypoints in real-time
- Body Language Analysis: Monitors posture, gestures, and movement
- Real-time Feedback: Analyzes your presentation skills live
- Privacy First: All processing happens locally on your machine
git clone <your-repo-url>
cd InterviewPresentationCoach
# Create virtual environment with uv
uv venv
# Activate it
source .venv/bin/activate # Mac/Linux
# OR
.venv\Scripts\activate # Windows
uv pip install -r requirements_direct.txt
Copy env_template.txt to .env and add your keys:
cp env_template.txt .env
Then edit .env with your API keys:
- STREAM_API_KEY: Get from Stream Dashboard
- STREAM_API_SECRET: Also from the Stream Dashboard
- GEMINI_API_KEY: Get from Google AI Studio
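To confirm the keys are filled in before running the coach, a small check like the one below can help. It is a minimal sketch using only the standard library: it assumes `.env` contains plain `KEY=VALUE` lines, and the helper names `parse_env` and `missing_keys` are hypothetical, not part of this project. The project itself may load `.env` differently.

```python
from pathlib import Path

# Key names taken from the list above.
REQUIRED_KEYS = ("STREAM_API_KEY", "STREAM_API_SECRET", "GEMINI_API_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    values = parse_env(env_path.read_text()) if env_path.exists() else {}
    missing = missing_keys(values)
    print("Missing keys:", ", ".join(missing) if missing else "none")
```

Run it from the project root after editing `.env`; any key it reports still needs a value.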
python coach_direct.py
- Your webcam opens
- YOLO draws green boxes around your body
- 17 keypoints tracked: nose, eyes, ears, shoulders, elbows, wrists, hips, knees, ankles
- Real-time pose detection at around 30 FPS (actual rate depends on your hardware)
- Press 'q' to quit
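The 17 keypoints follow the COCO ordering that pose models trained on COCO emit. The sketch below lists them by index and shows one way a posture signal could be derived from them. The `shoulder_tilt_degrees` helper is a hypothetical illustration, not necessarily how `coach_direct.py` analyzes posture, and it assumes keypoints arrive as `(x, y)` pixel pairs with y increasing downward.

```python
import math

# COCO keypoint order; index i in this list is keypoint i in the model output.
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def shoulder_tilt_degrees(keypoints):
    """Angle of the shoulder line relative to horizontal, in degrees.

    `keypoints` is a sequence of 17 (x, y) pixel pairs in COCO order.
    The result is positive when the left shoulder sits lower in the image.
    """
    lx, ly = keypoints[COCO_KEYPOINTS.index("left_shoulder")]
    rx, ry = keypoints[COCO_KEYPOINTS.index("right_shoulder")]
    # Use the horizontal distance so the angle stays within -90..90 degrees.
    return math.degrees(math.atan2(ly - ry, abs(lx - rx)))


if __name__ == "__main__":
    # Hypothetical frame: level shoulders 100 px apart.
    points = [(0.0, 0.0)] * 17
    points[5] = (200.0, 100.0)  # left_shoulder
    points[6] = (100.0, 100.0)  # right_shoulder
    print(f"Shoulder tilt: {shoulder_tilt_degrees(points):.1f} degrees")
```

A value near zero suggests level shoulders; a coaching rule might flag sustained tilts beyond some threshold, which would need tuning per camera setup.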
- Python 3.10+
- Webcam
- API keys (Stream, Gemini)
- YOLO v11: Pose detection
- OpenCV: Video processing
- Ultralytics: YOLO implementation
- Stream: Video infrastructure (future)
- Gemini AI: Coaching intelligence (future)
- YOLO pose detection
- Real-time video processing
- Stream Video SDK integration
- Gemini AI voice coaching
- Web interface
- Session recording
- Analytics dashboard
- On macOS, check System Settings → Privacy & Security → Camera
- Grant camera permission to Terminal (or whichever app runs the script)
- Make sure no other app is using the camera
- Check internet connection
- Model auto-downloads on first run (~12MB)
- Manual download: https://github.com/ultralytics/assets/releases
- Make sure virtual environment is activated
- Run:
uv pip install -r requirements_direct.txt
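If imports still fail after reinstalling, it can help to check which interpreter is running and whether the packages are visible to it. The following is a minimal sketch using only the standard library. The module names `cv2` and `ultralytics` are assumptions based on the tech stack listed above, and `missing_modules` is a hypothetical helper.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
    in_venv = sys.prefix != sys.base_prefix
    print("Interpreter:", sys.executable)
    print("Virtual environment active:", in_venv)

    # Assumed import names for OpenCV and Ultralytics.
    missing = missing_modules(["cv2", "ultralytics"])
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

If the interpreter path does not point inside `.venv`, activate the environment and run the install command again.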
MIT
Contributions welcome! Please open an issue or PR.