Whisperlater 🎙️

A Dockerized, self-hosted web platform for high-performance transcription and translation. Leveraging faster-whisper (CTranslate2), Whisperlater provides professional-grade subtitles with optimized readability.

🚀 Key Features

  • GPU Accelerated: Optimized for NVIDIA GPUs using faster-whisper and CTranslate2.
  • Subtitle Safety Cap: Each subtitle cue is limited to a maximum of 15 words.
  • Readability Pass: Automatic word-wrapping (42 chars) and minimum display duration (1.2s) for better viewer experience.
  • Smart VAD: Built-in Voice Activity Detection (Silero VAD) to eliminate hallucinations in silent audio segments.
  • Suppression Defaults: Uses suppress_tokens=[-1,50364] to reduce repetitive hallucinations.
  • Long Media Support: Media longer than 4 hours is split into 4-hour chunks for stable processing.
  • Modern Web UI: Responsive Vue 3 + Tailwind CSS interface with advanced job configuration.
  • Persistence: Mountable volumes for permanent storage of outputs and temporary uploads.
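The readability pass described above can be sketched in a few lines of Python. This is an illustrative sketch of the documented defaults (15-word cap, 42-character wrap, 1.2 s minimum display duration), not the project's actual implementation; the function name is hypothetical.

```python
import textwrap

MAX_WORDS_PER_CUE = 15  # subtitle safety cap
WRAP_WIDTH = 42         # characters per wrapped line
MIN_DURATION = 1.2      # minimum seconds a cue stays on screen

def readability_pass(text: str, start: float, end: float):
    """Apply the documented readability defaults to one subtitle cue."""
    # Enforce the 15-word safety cap.
    words = text.split()
    text = " ".join(words[:MAX_WORDS_PER_CUE])
    # Word-wrap at 42 characters without breaking words.
    lines = textwrap.wrap(text, width=WRAP_WIDTH)
    # Extend too-short cues to the minimum display duration.
    end = max(end, start + MIN_DURATION)
    return lines, start, end

lines, start, end = readability_pass(
    "this is a fairly long transcribed sentence that should wrap nicely",
    start=10.0, end=10.5,
)
```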

🛠️ Tech Stack

  • Backend: FastAPI, Uvicorn, Python 3.10
  • Engine: faster-whisper, ctranslate2
  • Frontend: Vue.js 3 (CDN), Tailwind CSS (CDN)
  • Containerization: Docker with NVIDIA CUDA 12.2

📋 Prerequisites

  • Linux OS (Ubuntu 22.04 recommended)
  • NVIDIA GPU with 8GB+ VRAM (for large-v3)
  • Docker & NVIDIA Container Toolkit installed

📥 Installation & Quick Start

1. Clone the Repository

git clone https://github.com/jeric-n/Whisperlater.git
cd Whisperlater

2. Build the Docker Image

docker build -t whisperlater -f docker/Dockerfile .

3. Run the Container

docker run -d \
  --name whisperlater \
  --gpus all \
  -p 8000:8000 \
  -v "$(pwd)"/outputs:/app/outputs \
  -v "$(pwd)"/temp:/app/temp \
  --restart unless-stopped \
  whisperlater

Access the UI at http://localhost:8000.
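If you prefer Docker Compose, the docker run command above maps to a compose file roughly like the following. This is a sketch for convenience; the repository itself may not ship a compose file.

```yaml
services:
  whisperlater:
    image: whisperlater
    ports:
      - "8000:8000"
    volumes:
      - ./outputs:/app/outputs
      - ./temp:/app/temp
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

Start it with `docker compose up -d`.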

⚙️ Configuration

Model

Whisperlater uses a single model:

  • whisper-large-v3 (internally mapped to large-v3)

Runtime Defaults

  • Temperature: 0.0,0.2,0.4,0.6,0.8
  • Suppress tokens: [-1,50364]
  • No-repeat n-gram guard: 3 for translate tasks (0 for transcribe)
  • Long-media split threshold: 4 hours
  • Long-media chunk size: 4 hours
  • Max subtitle words on screen: 15 per cue
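The long-media defaults above can be illustrated with a small sketch: media at or under the 4-hour threshold is processed whole, and anything longer is cut into 4-hour chunks. The function name is hypothetical and this is not the project's actual code.

```python
FOUR_HOURS = 4 * 3600  # split threshold and chunk size, in seconds

def chunk_boundaries(duration: float, chunk: float = FOUR_HOURS):
    """Return (start, end) offsets, in seconds, for processing one media file."""
    if duration <= chunk:      # at or under the threshold: no split
        return [(0.0, duration)]
    bounds, start = [], 0.0
    while start < duration:    # cut into fixed-size chunks; last one may be short
        bounds.append((start, min(start + chunk, duration)))
        start += chunk
    return bounds

# A 9-hour file becomes three chunks: two full 4-hour pieces and a 1-hour tail.
```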

Volumes

  • /app/outputs: Finished .srt and .txt files are stored here.
  • /app/temp: Temporary upload storage (auto-cleaned after processing).

🧪 Development & Testing

Run the app locally:

  1. Install dependencies: pip install -r requirements.txt
  2. Start server: uvicorn app.main:app --host 0.0.0.0 --port 8000

📄 License

This project is licensed under the MIT License. See the LICENSE file for details.
