Automated short-form video pipeline. One trigger in, an upload-ready 1080×1920 MP4 out. Fully local, fully open-source, zero ongoing cost.
Press a button. A few minutes later, a finished reel lands in your Telegram. The pipeline discovers a trending topic, researches it, writes a script with a local LLM, generates narration, fetches/generates visuals, assembles a captioned 9:16 video, and delivers it.
Target hardware: Apple M4, 16GB unified memory, macOS 15+. Models run sequentially, one at a time, which keeps peak RAM under ~12GB.
- Server: FastAPI + Uvicorn on port 8420
- DB: SQLite (aiosqlite)
- LLM: Ollama (Llama 3.1 8B Q4)
- TTS: Piper / edge-tts / gTTS (fallback chain, tried in that order)
- Timestamps: faster-whisper
- AI video: ComfyUI headless (AnimateDiff+SDXL, Wan 1.3B, LTX 2B) + MLX on Apple Silicon
- Stock: Pexels API (free tier)
- Assembly: MoviePy + ffmpeg + Pillow
- Delivery: Telegram Bot API
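The TTS fallback chain in the stack list can be pictured roughly as below. This is a minimal sketch, not the project's actual code: `synthesize_with_fallback` and the `Engine` signature are hypothetical, and each real engine (Piper, edge-tts, gTTS) is assumed to be wrapped in a function that writes audio to a given path.

```python
from pathlib import Path
from typing import Callable, Sequence

# Hypothetical engine signature: take text and a target path, write audio there.
Engine = Callable[[str, Path], None]


def synthesize_with_fallback(
    text: str, out_path: Path, engines: Sequence[tuple[str, Engine]]
) -> str:
    """Try each engine in order; return the name of the first that succeeds."""
    errors: list[str] = []
    for name, engine in engines:
        try:
            engine(text, out_path)
            # Treat a missing or empty file as a failure too.
            if out_path.exists() and out_path.stat().st_size > 0:
                return name
            errors.append(f"{name}: produced no audio")
        except Exception as exc:  # any engine failure moves on to the next one
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all TTS engines failed: " + "; ".join(errors))
```

Returning the engine name lets the caller record which voice actually produced the narration for a given run.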
Trigger → Discover → Research → Script → Audio → Visuals → Assembly → Validate → Deliver.
Each stage writes output to disk before marking complete, so any run is resumable after a crash or kill. Only one run executes at a time.
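The write-then-mark-complete rule is what makes resuming safe. A minimal sketch of the idea follows, assuming a hypothetical `state.json` checkpoint file and hypothetical per-stage handler functions. Only the stage names come from the pipeline line above, and the trigger is treated as the entry point rather than a stage.

```python
import json
from pathlib import Path
from typing import Callable

# Stage names follow the pipeline line above; everything else is hypothetical.
STAGES = ["discover", "research", "script", "audio",
          "visuals", "assembly", "validate", "deliver"]


def run_pipeline(run_dir: Path, handlers: dict[str, Callable[[Path], None]]) -> list[str]:
    """Run stages in order, skipping any already recorded as complete.

    Returns the names of the stages executed in this call.
    """
    run_dir.mkdir(parents=True, exist_ok=True)
    state_file = run_dir / "state.json"  # hypothetical checkpoint file
    done = json.loads(state_file.read_text()) if state_file.exists() else []
    executed = []
    for stage in STAGES:
        if stage in done:
            continue  # output is already on disk from an earlier attempt
        handlers[stage](run_dir)  # the stage writes its artifacts first...
        done.append(stage)
        # ...and only then is it marked complete, via an atomic rename.
        tmp = state_file.with_suffix(".tmp")
        tmp.write_text(json.dumps(done))
        tmp.replace(state_file)
        executed.append(stage)
    return executed
```

If a stage raises or the process is killed mid-stage, the checkpoint still lists only the stages that finished, so calling `run_pipeline` again on the same directory picks up at the stage that failed.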
./setup.sh
Installs Homebrew deps (python@3.11, ffmpeg), pulls the Llama model via Ollama, creates a venv, installs Python deps, and initializes the SQLite DB.
For AI video generation on Apple Silicon:
./setup_ai_video.sh
Then configure config.yaml with your Telegram bot token, chat ID, and Pexels API key.
source .venv/bin/activate
python -m server.main
Dashboard at http://localhost:8420.
curl -X POST http://localhost:8420/api/trigger \
-H "Content-Type: application/json" \
-d '{}' # auto-discover trending topic
# or
curl -X POST http://localhost:8420/api/trigger \
-H "Content-Type: application/json" \
-d '{"topic": "india australia critical minerals deal"}'
Works from an iPhone Shortcut on the local network too: same endpoint, same JSON body.
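The same trigger request can be built from Python using only the standard library. This is a sketch: `build_trigger_request` and `TRIGGER_URL` are hypothetical names, and the response format is not documented here, so the example stops at constructing the request.

```python
import json
import urllib.request
from typing import Optional

TRIGGER_URL = "http://localhost:8420/api/trigger"  # endpoint from the curl example


def build_trigger_request(topic: Optional[str] = None) -> urllib.request.Request:
    """Build the POST request; an empty JSON object asks for auto-discovery."""
    payload = {"topic": topic} if topic else {}
    return urllib.request.Request(
        TRIGGER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the server running, passing the result to `urllib.request.urlopen(...)` sends it. From another device on the network, replace `localhost` with the host machine's address.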
| Method | Path | Description |
|---|---|---|
| POST | /api/trigger | Start a new run. Optional { "topic": "..." }. |
| GET | /api/runs | List all runs. |
| GET | /api/runs/:id | Run detail with per-step status. |
| POST | /api/runs/:id/kill | Kill active run. |
| POST | /api/runs/:id/resume | Resume from last completed step. |
| GET | /api/runs/:id/output | Download final MP4. |
| GET | /api/health | Health, model availability, disk space. |
autoreels/
├── server/ FastAPI app, routes, DB, Telegram bot
├── pipeline/ Orchestrator + one module per stage
├── comfyui/workflows/ JSON workflow templates per video model
├── dashboard/ Single-page dashboard (index.html)
├── assets/ SFX, music, fonts, flags, pronunciation.json
├── scripts/ Asset + model setup helpers
├── output/<run_id>/ Per-run artifacts (script.json, narration.wav, segments/, final.mp4)
├── db/autoreels.db SQLite DB
├── config.yaml All tunable params
└── setup.sh One-command setup
Audio drives timing: narration is generated first, Whisper extracts word timestamps, visuals cut to match. Stock assets are the backbone; AI generation is the accent — used only when stock doesn't fit. Script quality is the bottleneck, so most prompt-engineering effort lives there.
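To illustrate the audio-first timing, here is a rough sketch of turning word timestamps into visual cut windows. The `Word` type and `segment_cuts` function are hypothetical. It assumes the transcribed words match the script's whitespace-split words one-to-one and in order, which real transcripts will not always do. With faster-whisper, word timings of this shape are available when transcribing with `word_timestamps=True`.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds into the narration
    end: float    # seconds into the narration


def segment_cuts(words: list[Word], segment_texts: list[str]) -> list[tuple[float, float]]:
    """Map each script segment to a (start, end) window in the narration.

    Assumes transcript words line up one-to-one with the script's words;
    a real pipeline may need fuzzy alignment instead.
    """
    cuts = []
    i = 0
    for text in segment_texts:
        n = len(text.split())
        chunk = words[i:i + n]
        if n == 0 or len(chunk) < n:
            raise ValueError("empty segment or script/transcript word-count mismatch")
        cuts.append((chunk[0].start, chunk[-1].end))
        i += n
    # Close gaps: each visual holds until the next segment's first word.
    return [
        (start, cuts[k + 1][0] if k + 1 < len(cuts) else end)
        for k, (start, end) in enumerate(cuts)
    ]
```

Extending each window to the start of the next one avoids blank frames during pauses between sentences.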
Work in progress. See AUTOREELS-PRD.md for the full spec.
Open source. All dependencies are free and open-source.