jguida941/voiceterm

VoiceTerm

VoiceTerm is a voice-first terminal overlay for Codex and Claude. Gemini preset support remains experimental and is currently nonfunctional. It runs Whisper on your machine and types what you say into your existing CLI. Your tools still run in a normal PTY; VoiceTerm just adds a HUD on top. Use push-to-talk or wake phrases (hey codex, hey claude), then say send / submit for hands-free delivery.

Whisper runs locally by default. No cloud API keys required. Release history: dev/CHANGELOG.md.

Install and Start

Install one supported AI CLI first:

Codex:

npm install -g @openai/codex

Claude Code:

curl -fsSL https://claude.ai/install.sh | bash

Then choose one VoiceTerm setup path:

Homebrew (recommended)

brew tap jguida941/voiceterm
brew install voiceterm
cd ~/your-project
voiceterm

If needed, authenticate once:

voiceterm --login --codex
voiceterm --login --claude

PyPI (pipx / pip)

pipx install voiceterm
# or: python3 -m pip install --user voiceterm

cd ~/your-project
voiceterm

If needed, authenticate once:

voiceterm --login --codex
voiceterm --login --claude

From source

Requires Rust toolchain. See Install Guide for details.

git clone https://github.com/jguida941/voiceterm.git
cd voiceterm
./scripts/install.sh

If you are developing from a source checkout, run the CI check profile:

python3 dev/scripts/devctl.py check --profile ci

macOS App

Double-click app/macos/VoiceTerm.app, pick a folder, and it opens Terminal with VoiceTerm running.

For model options and startup/IDE tuning:

How It Works

VoiceTerm listens to your mic, converts speech to text on your machine, and types the result into your AI CLI input.

Requirements

  • macOS or Linux (Windows needs WSL2)
  • Microphone access
  • ~0.5 GB disk for the default small model (base is ~142 MB, medium is ~1.5 GB)

Features

Main features

| Feature | What it does |
|---|---|
| Local speech-to-text | Whisper runs on your machine (no cloud calls) |
| Fast voice-to-text | Local Whisper turns speech into text quickly |
| Keep your CLI as-is | Your backend CLI layout and behavior stay the same |
| Auto voice mode | Keep listening on so you can talk instead of typing |
| Wake mode + voice send | Say hey codex/hey claude, then say send/submit in insert mode |
| Image prompts | Use Ctrl+X for one-shot screenshot prompts, or enable persistent image mode for HUD [rec] (IMG badge) |
| Transcript queue | If the CLI is busy, VoiceTerm waits and sends text when ready |
| Codex + Claude support | Primary support for Codex and Claude Code |
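
The transcript queue row above can be pictured as a small first-in, first-out buffer that holds finished transcripts while the CLI is busy. The sketch below is illustrative only: `TranscriptQueue`, the `cli_ready` flag, and the `send` callback are hypothetical names, not VoiceTerm's actual implementation.

```python
from collections import deque
from typing import Callable, Deque


class TranscriptQueue:
    """Hypothetical FIFO that holds transcripts until the CLI can accept input."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._pending: Deque[str] = deque()
        self._send = send  # assumed callback that writes text to the CLI

    def push(self, text: str, cli_ready: bool) -> None:
        """Queue a transcript, then flush right away if the CLI is idle."""
        self._pending.append(text)
        if cli_ready:
            self.flush()

    def flush(self) -> int:
        """Send queued transcripts in arrival order; return how many were sent."""
        sent = 0
        while self._pending:
            self._send(self._pending.popleft())
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

In this model, a caller would invoke `flush()` whenever it detects that the CLI is ready again, so nothing spoken during a busy turn is lost or reordered.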

Everyday tools

  • Voice macros: expand phrases from .voiceterm/macros.yaml (toggle in Settings)
  • Voice navigation: spoken scroll, send, show last error, copy last error, explain last error
  • Dev mode tools: launch with --dev first (look for DEV badge), then use Ctrl+D for Dev panel tools; add --dev-log for JSONL diagnostics
  • Prompt-safe HUD: VoiceTerm suppresses HUD rows for high-confidence Codex/Claude approval prompts and fences PTY scrolling above the HUD so the active input row stays visible
  • Latency clarity: latency badges show completed-turn STT timing and hide while actively recording/processing
  • Transcript history: Ctrl+H to search and replay past text
  • Notification history: Ctrl+N to review recent status messages
  • Saved settings: stored in ~/.config/voiceterm/config.toml
  • Built-in themes: 11 themes including ChatGPT, Catppuccin, Dracula, Nord, Tokyo Night, and Gruvbox
  • Style-pack border settings: VOICETERM_STYLE_PACK_JSON supports components.overlay_border and components.hud_border (HUD applies when border mode is theme)
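
The prompt-safe HUD item above mentions fencing PTY scrolling above the HUD. This README does not say how VoiceTerm does that. One common terminal technique for the same effect is the DECSTBM "set scrolling region" control sequence, sketched below under that assumption. The function name and parameters are hypothetical.

```python
def scroll_region_sequence(term_rows: int, hud_rows: int) -> str:
    """Build a DECSTBM sequence that confines scrolling to the rows above the HUD."""
    if hud_rows < 0 or hud_rows >= term_rows:
        raise ValueError("HUD must leave at least one row for the PTY")
    bottom = term_rows - hud_rows  # last 1-based row that is allowed to scroll
    # CSI <top> ; <bottom> r sets the top and bottom margins of the scroll region.
    return f"\x1b[1;{bottom}r"
```

With the region set this way, output from the wrapped CLI scrolls only within the upper rows, and whatever is drawn in the bottom `hud_rows` rows stays in place.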

For full behavior details and controls, see guides/USAGE.md.

Important: if you did not launch with --dev, Ctrl+D is forwarded to the wrapped CLI as EOF (0x04) and can close/exit that CLI session.
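
The Ctrl+D behavior above amounts to a routing decision on a single byte. A minimal sketch, assuming a boolean `dev_mode` flag and hypothetical action names (`open_dev_panel`, `forward`) that are not taken from VoiceTerm's source:

```python
from typing import Tuple

CTRL_D = b"\x04"  # ASCII EOT, the byte a terminal sends for Ctrl+D


def route_ctrl_d(dev_mode: bool) -> Tuple[str, bytes]:
    """Return (action, bytes_for_pty) for a Ctrl+D key press."""
    if dev_mode:
        # Launched with --dev: the overlay consumes the key for the Dev panel.
        return ("open_dev_panel", b"")
    # Otherwise the byte passes through and the wrapped CLI sees EOF.
    return ("forward", CTRL_D)
```

The second branch is why pressing Ctrl+D without `--dev` can end the wrapped CLI session: the CLI receives the same byte it would get in a plain terminal.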

Dev panel usage guide: guides/DEV_MODE.md

Supported AI CLIs

VoiceTerm is optimized for Codex and Claude Code. For full backend status and setup details, see Usage Guide -> Backend Support.

Codex

Use the same workflow and controls documented for backend support in guides/USAGE.md.

Claude Code

Claude Backend

IDE Support

Active verified hosts are Cursor terminal and JetBrains terminals. AntiGravity is deferred and not supported in current releases.

| IDE host | Codex | Claude Code | Status |
|---|---|---|---|
| Cursor terminal | Fully supported | Fully supported | Recommended primary host |
| JetBrains terminals (IntelliJ, PyCharm, WebStorm, CLion) | Fully supported | Fully supported | Supported on current release; see troubleshooting for rare host-specific edge cases |
| AntiGravity | Not supported | Not supported | Deferred until runtime fingerprint evidence exists (not supported in current releases) |
| Other IDE terminals | Unverified | Unverified | Treat as experimental until listed here |

JetBrains + Claude rare edge case (long parallel turns): after very long parallel tool calls or parallel web-search turns, HUD/transcript overlap can appear at turn completion. Quick workaround: resize the terminal once (even by 1 row/column) to force layout recalculation. During these high-churn turns, VoiceTerm already applies a single-line full-HUD fallback for JetBrains+Claude to keep controls reachable while redraw settles. Details: Troubleshooting -> JetBrains + Claude overlay overlap after long parallel output.

Canonical matrix: Usage Guide -> IDE Compatibility.

Hands-Free Quick Start

voiceterm --auto-voice --wake-word --voice-send-mode insert

Think of this like Alexa for your terminal:

  1. Say the wake phrase (hey codex or hey claude)
  2. Speak your prompt
  3. Say send (or submit)

UI Tour

Theme Picker

Press Ctrl+Y to open Theme Studio and choose Theme picker. Use Ctrl+G to cycle themes quickly. Use Tab / Shift+Tab to move between Theme Studio pages (Home, Colors, Borders, Components, Preview, Export). For editor details, see Themes. For theme-file flags/env vars, see CLI Flags.

Settings Menu

Mouse control is enabled by default. Open Settings with Ctrl+O. Cursor note: when Mouse is ON, wheel/touchpad scrolling may not move chat history, but the scrollbar can still be dragged. If you prefer touchpad/wheel scrolling, set Mouse to OFF and use keyboard focus (Tab/arrows) + Enter for HUD buttons. For details, use:

Transcript History

Use Ctrl+H to open transcript history, type to filter, and press Enter to replay into the active CLI input. Mouse click selection is also supported. History rows are labeled by source (mic, you, ai); only mic and you rows are replayable, and ai rows are output-only. Detailed behavior: Transcript History.
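
The filter-and-replay rules above can be sketched as follows. `HistoryRow`, `filter_rows`, and `replay_text` are hypothetical names, and case-insensitive substring matching is an assumption; the README does not specify how the filter matches.

```python
from dataclasses import dataclass
from typing import List, Optional

# Per the text above: mic and you rows are replayable, ai rows are output-only.
REPLAYABLE_SOURCES = {"mic", "you"}


@dataclass
class HistoryRow:
    source: str  # "mic", "you", or "ai"
    text: str


def filter_rows(rows: List[HistoryRow], query: str) -> List[HistoryRow]:
    """Keep rows whose text contains the query (assumed matching rule)."""
    needle = query.lower()
    return [row for row in rows if needle in row.text.lower()]


def replay_text(row: HistoryRow) -> Optional[str]:
    """Return the text to replay into the CLI, or None for output-only rows."""
    return row.text if row.source in REPLAYABLE_SOURCES else None
```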

Help Overlay

Press ? to open grouped shortcuts (Recording, Mode, Appearance, Sensitivity, Navigation) with clickable Docs/Troubleshooting links on terminals that support clickable links. Details: Core Controls.

Controls

For shortcuts and behavior, see:

For CLI flags and command-line options:

  • voiceterm --help (or voiceterm -h)
  • CLI Flags

Voice Macros

Voice macros are project-local shortcuts in .voiceterm/macros.yaml. Turn macros on in Settings when you want phrase expansion. Setup and examples: Project Voice Macros.
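
Phrase expansion can be pictured as a lookup from a spoken trigger to replacement text. The sketch below deliberately uses an in-memory dict rather than guessing the `.voiceterm/macros.yaml` schema, which is documented in Project Voice Macros. The function name, the whole-phrase matching rule, and the sample macro are all hypothetical.

```python
from typing import Dict


def expand_macro(transcript: str, macros: Dict[str, str], enabled: bool = True) -> str:
    """Return the macro expansion when the whole transcript matches a trigger."""
    if not enabled:
        # Mirrors the Settings toggle: with macros off, text passes through.
        return transcript
    key = transcript.strip().lower()
    return macros.get(key, transcript)


# Hypothetical macro table for illustration only.
example_macros = {"run tests": "cargo test --all"}
```

Calling `expand_macro("Run tests", example_macros)` would hand the expanded command to the CLI, while any unmatched transcript is passed through unchanged.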

Documentation

| Audience | Document |
|---|---|
| User | Quick Start |
| User | Guides Index |
| User | Install Guide |
| User | Usage Guide |
| User | CLI Flags |
| User | Troubleshooting |
| Developer | Developer Index |
| Developer | Project Integrations Playbook |
| Developer | Engineering History |

Support

Contributing

PRs welcome. See CONTRIBUTING.md. Before opening a PR, run:

  • python3 dev/scripts/devctl.py check --profile prepush
  • python3 dev/scripts/devctl.py hygiene

License

MIT - LICENSE

About

Low-latency Rust terminal overlay for Codex and Claude Code with local Whisper STT, PTY passthrough, wake words, macros, memory tools, and a customizable HUD.
