Whisperlater 🎙️

A Dockerized, self-hosted web platform for high-performance transcription and translation. Leveraging faster-whisper (CTranslate2), Whisperlater provides professional-grade subtitles with optimized readability.

🚀 Key Features

  • GPU Accelerated: Optimized for NVIDIA GPUs using faster-whisper and CTranslate2.
  • Subtitle Safety Cap: Each subtitle cue is limited to a maximum of 15 words.
  • Readability Pass: Automatic word-wrapping (42 chars) and minimum display duration (1.2s) for better viewer experience.
  • Smart VAD: Built-in Voice Activity Detection (Silero VAD) to eliminate hallucinations in silent audio segments.
  • Suppression Defaults: Uses suppress_tokens=[-1,50364] to reduce repetitive hallucinations.
  • Long Media Support: Media longer than 4 hours is split into 4-hour chunks for stable processing.
  • Modern Web UI: Responsive Vue 3 + Tailwind CSS interface with advanced job configuration.
  • Persistence: Mountable volumes for permanent storage of outputs and temporary uploads.
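The readability pass described above can be sketched in a few lines of Python. This is an illustrative sketch of the documented defaults (15-word cap, 42-character wrap, 1.2 s minimum display duration), not the project's actual implementation; the function name is hypothetical.

```python
import textwrap

MAX_WORDS_PER_CUE = 15  # subtitle safety cap
WRAP_WIDTH = 42         # characters per wrapped line
MIN_DURATION = 1.2      # minimum seconds a cue stays on screen

def readability_pass(text: str, start: float, end: float):
    """Apply the documented readability defaults to one subtitle cue."""
    # Enforce the 15-word safety cap.
    words = text.split()
    text = " ".join(words[:MAX_WORDS_PER_CUE])
    # Word-wrap at 42 characters without breaking words.
    lines = textwrap.wrap(text, width=WRAP_WIDTH)
    # Extend too-short cues to the minimum display duration.
    end = max(end, start + MIN_DURATION)
    return lines, start, end

lines, start, end = readability_pass(
    "this is a fairly long transcribed sentence that should wrap nicely",
    start=10.0, end=10.5,
)
```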

🛠️ Tech Stack

  • Backend: FastAPI, Uvicorn, Python 3.10
  • Engine: faster-whisper, ctranslate2
  • Frontend: Vue.js 3 (CDN), Tailwind CSS (CDN)
  • Containerization: Docker with NVIDIA CUDA 12.2

📋 Prerequisites

  • Linux OS (Ubuntu 22.04 recommended)
  • NVIDIA GPU with 8GB+ VRAM (for large-v3)
  • Docker & NVIDIA Container Toolkit installed

📥 Installation & Quick Start

1. Clone the Repository

git clone https://github.com/jeric-n/Whisperlater.git
cd Whisperlater

2. Build the Docker Image

docker build -t whisperlater -f docker/Dockerfile .

3. Run the Container

docker run -d \
  --name whisperlater \
  --gpus all \
  -p 8000:8000 \
  -v "$(pwd)"/outputs:/app/outputs \
  -v "$(pwd)"/temp:/app/temp \
  --restart unless-stopped \
  whisperlater

Access the UI at http://localhost:8000.
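If you prefer Docker Compose, the docker run command above maps to a compose file roughly like the following. This is a sketch for convenience; the repository itself may not ship a compose file.

```yaml
services:
  whisperlater:
    image: whisperlater
    ports:
      - "8000:8000"
    volumes:
      - ./outputs:/app/outputs
      - ./temp:/app/temp
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

Start it with `docker compose up -d`.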

⚙️ Configuration

Model

Whisperlater uses a single model:

  • whisper-large-v3 (internally mapped to large-v3)

Runtime Defaults

  • Temperature: 0.0,0.2,0.4,0.6,0.8
  • Suppress tokens: [-1,50364]
  • No-repeat n-gram guard: 3 for translate tasks (0 for transcribe)
  • Long-media split threshold: 4 hours
  • Long-media chunk size: 4 hours
  • Max subtitle words on screen: 15 per cue
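The long-media defaults above can be illustrated with a small sketch: media at or under the 4-hour threshold is processed whole, and anything longer is cut into 4-hour chunks. The function name is hypothetical and this is not the project's actual code.

```python
FOUR_HOURS = 4 * 3600  # split threshold and chunk size, in seconds

def chunk_boundaries(duration: float, chunk: float = FOUR_HOURS):
    """Return (start, end) offsets, in seconds, for processing one media file."""
    if duration <= chunk:      # at or under the threshold: no split
        return [(0.0, duration)]
    bounds, start = [], 0.0
    while start < duration:    # cut into fixed-size chunks; last one may be short
        bounds.append((start, min(start + chunk, duration)))
        start += chunk
    return bounds

# A 9-hour file becomes three chunks: two full 4-hour pieces and a 1-hour tail.
```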

Volumes

  • /app/outputs: Finished .srt and .txt files are stored here.
  • /app/temp: Temporary upload storage (auto-cleaned after processing).

🧪 Development & Testing

Run the app locally:

  1. Install dependencies: pip install -r requirements.txt
  2. Start server: uvicorn app.main:app --host 0.0.0.0 --port 8000

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.
