Give OpenClaw a voice.
OpenClaw is the open-source personal AI assistant that runs on your devices — your files, your shell, your messaging apps (WhatsApp, Telegram, Slack, Discord, and more). It lives where you work. VoxClaw gives it a voice.
Run VoxClaw on your Mac and hear OpenClaw speak to you. When OpenClaw runs on another computer — a server, a headless box, or a different machine — send text to your Mac over the network and VoxClaw speaks it aloud with high-quality text-to-speech. Apple's built-in voices work out of the box; add your own OpenAI or ElevenLabs API key for neural voices when you want that extra polish. Paste text, pipe from the CLI, or stream from any device on your LAN — and listen.
VoxClaw also includes an iPhone app in this repo (VoxClawIOS/) with the same core listener + teleprompter flow for iOS. With iCloud relay turned on, your Mac can even speak through your iPhone when it's locked, backgrounded, or on a different network — VoxClaw wakes it with a silent CloudKit push and reads the text aloud, with a "Now reading" Live Activity on the lock screen. The iOS app also adds a Control Center control, a home-screen widget, and a "Read Text Aloud" Siri/Shortcuts action.
For Codex plugin installation and marketplace distribution, use the separate malpern/codex-marketplace repository. This repository is the main product repo for the app, website assets, releases, and source code.
A macOS menu bar app + CLI tool that reads text aloud using Apple TTS (default), OpenAI TTS (BYOK), or ElevenLabs TTS (BYOK), with an optional teleprompter-style floating overlay and synchronized word highlighting.
- Onboarding + Agent Handoff — First-run setup configures voice, key, network mode, and launch at login, and can copy a 🦞 VoxClaw setup pointer for your agent
- Teleprompter Overlay — Floating overlay with word-by-word highlighting synced to speech; includes presets, deep appearance controls, and audio-only mode
- Three Voice Engines — Apple (no setup), OpenAI (BYOK), or ElevenLabs (BYOK), with Apple fallback when cloud auth fails
- Automatic Updates — In-app updates via Sparkle (notarized + EdDSA-signed); a "Check for Updates…" menu item too
- Multiple Input Methods — Arguments, stdin pipe, file, clipboard, URL scheme, and LAN HTTP
- Network API for Agents — POST /read, POST /agent-notify, POST /ack, GET /status, and GET /claw, with request validation and structured status payloads
- Multi-Agent Aware — project_id + agent_id give each concurrent agent its own voice, and scope "stop reading" (/ack) so prompting one agent never cuts off another
- Cross-Device iCloud Relay — Speak agent output on your iPhone/iPad even when it's locked, backgrounded, or off your LAN, via a silent CloudKit push that wakes the device (opt-in, same iCloud account on both ends)
- Bonjour Discovery — Advertises _voxclaw._tcp on LAN for peer/device discovery
- Menu Bar App + CLI — Lightweight menu bar controls plus full terminal control via voxclaw
- iPhone Companion App — iOS app (VoxClawIOS/) on the shared core engine/settings stack, adding a Control Center control, a home-screen widget, a "Read Text Aloud" Siri/Shortcuts action, a "Now reading" Live Activity, and the iCloud relay receiver
- macOS Services Integration — Read selected text from other apps via Services
- Keyboard Controls While Reading — Space (pause/resume), Escape (stop), Arrow keys (skip ±3s)
- macOS 26+
- OpenAI API key (optional)
- ElevenLabs API key (optional)
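A quick preflight check can confirm the macOS requirement before you build or install. The sketch below is not part of VoxClaw: `meets_min_macos` is a hypothetical helper name, and it compares only the major version of a string such as the one printed by the stock `sw_vers -productVersion` tool.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with VoxClaw): succeed when a macOS
# version string such as "26.0.1" has a major version of at least 26.
meets_min_macos() {
  version="$1"
  min_major="${2:-26}"            # default taken from the "macOS 26+" requirement
  major="${version%%.*}"          # keep everything before the first dot
  case "$major" in
    ''|*[!0-9]*) return 2 ;;      # not a plain number: cannot decide
  esac
  [ "$major" -ge "$min_major" ]
}

# Usage on a Mac:
#   meets_min_macos "$(sw_vers -productVersion)" && echo "macOS version OK"
```

The version string is passed in as an argument rather than read inside the function, so the check can be exercised on any machine.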
The onboarding wizard walks you through setup on first launch. To store an API key manually:
security add-generic-password -a "openai" -s "openai-voice-api-key" -w "sk-..."
Or set the environment variable:
export OPENAI_API_KEY="sk-..."
To build from source and install the CLI:
swift build -c release
./Scripts/package_app.sh
./Scripts/install-cli.sh
The repo includes an iPhone app target with a widget extension. Open
VoxClawIOS/VoxClawIOS.xcodeproj in Xcode (26+) and run the VoxClawIOS scheme
on iOS 26+. The app ships to internal testers via TestFlight (see
.github/workflows/ios-release.yml, triggered by ios-v* tags).
It adds, on top of the shared listener + teleprompter:
- Control Center control + home-screen widget — "Read Clipboard" with one tap
- Siri / Spotlight / Shortcuts — "Read Text Aloud"
- Live Activity — a "Now reading" card on the lock screen / Dynamic Island
- iCloud relay receiver — enable "Remote Relay" in iOS Settings (and "Relay to my devices over iCloud" on the Mac) so your Mac's agent speech plays here even when the app is closed or the phone is locked. Both devices must be signed into the same iCloud account with the toggle on.
The relay uses your private CloudKit database
(iCloud.com.malpern.voxclaw); speech text never leaves your iCloud account.
voxclaw "Hello, this is a test." # direct text
echo "Read this aloud" | voxclaw # piped stdin
voxclaw --file ~/speech.txt # from file
voxclaw --clipboard # from clipboard
voxclaw --audio-only "No overlay" # audio only, no panel
voxclaw --voice nova "Hello" # OpenAI voice override
voxclaw --rate 1.5 "Hello" # 1.5x speech speed
voxclaw --output hello.mp3 "Hello" # save audio to file (OpenAI)
voxclaw --listen # network mode: listen for text from LAN
voxclaw --send "Hello from CLI" # send text to a running listener
voxclaw --status # check if listener is running
voxclaw # launch menu bar app (no args)
Start VoxClaw in network listener mode on your Mac:
voxclaw --listen # listens on port 4140
voxclaw --listen --port 8080 # custom port
Send text from another device on your local network:
# JSON body
curl -X POST http://your-mac-ip:4140/read \
-H 'Content-Type: application/json' \
-d '{"text": "Hello from my phone"}'
# With voice and rate overrides
curl -X POST http://your-mac-ip:4140/read \
-H 'Content-Type: application/json' \
-d '{"text": "Hello", "voice": "nova", "rate": 1.3}'
# Force a specific engine for this read (apple | openai | elevenlabs).
# ElevenLabs gives the tightest word-highlight sync.
curl -X POST http://your-mac-ip:4140/read \
-H 'Content-Type: application/json' \
-d '{"text": "Hello", "engine": "elevenlabs"}'
# Multi-agent: pass project_id + agent_id to get a distinct voice per agent
# and so prompting one agent never cuts off another that is still speaking.
curl -X POST http://your-mac-ip:4140/read \
-H 'Content-Type: application/json' \
-d '{"text": "Hello", "project_id": "/path/to/repo", "agent_id": "session-123"}'
# Ack: stop reading just this agent's current response (scoped by project_id +
# agent_id), e.g. when the user sends the agent a new prompt. Other agents keep speaking.
curl -X POST http://your-mac-ip:4140/ack \
-H 'Content-Type: application/json' \
-d '{"project_id": "/path/to/repo", "agent_id": "session-123"}'
# Plain text body
curl -X POST http://your-mac-ip:4140/read -d 'Hello from my phone'
# Health check (returns reading state, session state, word count)
curl http://your-mac-ip:4140/status
# Easter egg / connectivity check
curl http://your-mac-ip:4140/claw
Cross-machine tip: use the Mac's numeric LAN IP only (for example http://192.168.1.50:4140) unless a human explicitly tells your agent to use a specific .local hostname.
If you want a one-paste handoff for your agent, open VoxClaw Settings and use Copy Agent Setup in the Network section. It copies a 🦞 setup pointer with website/docs plus your live local /read and /status URLs.
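An agent-side script can enforce the numeric-IP rule before it builds any URL. The following is a minimal sketch under stated assumptions: `voxclaw_base_url` is a hypothetical helper name, it handles IPv4 only, and 4140 is used as the default because that is the listener's default port.

```shell
#!/bin/sh
# Hypothetical helper: build a VoxClaw base URL from a numeric IPv4 address.
# Hostnames, including .local names, are rejected.
voxclaw_base_url() {
  ip="$1"
  port="${2:-4140}"               # default listener port
  case "$ip" in
    ''|.*|*.|*[!0-9.]*)
      echo "expected a numeric LAN IP, got: $ip" >&2
      return 1 ;;
  esac
  # Split on dots and require exactly four octets, each in 0..255.
  old_ifs="$IFS"; IFS=.
  set -- $ip
  IFS="$old_ifs"
  if [ "$#" -ne 4 ]; then
    echo "expected four octets, got: $ip" >&2
    return 1
  fi
  for octet in "$@"; do
    if [ -z "$octet" ] || ! [ "$octet" -le 255 ] 2>/dev/null; then
      echo "invalid octet in: $ip" >&2
      return 1
    fi
  done
  printf 'http://%s:%s\n' "$ip" "$port"
}
```

For example, `voxclaw_base_url 192.168.1.50` prints `http://192.168.1.50:4140`, while `voxclaw_base_url my-mac.local` fails with a message on stderr. If a human explicitly provides a `.local` hostname, bypass the helper and use that URL directly.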
Reliable bring-up order:
- On the VoxClaw Mac: curl -sS http://localhost:4140/status
- From the agent host, use numeric IP: curl -sS http://<lan-ip>:4140/status
- Then send speech: curl -sS -X POST http://<lan-ip>:4140/read -H 'Content-Type: application/json' -d '{"text":"hello"}'
- Do not auto-switch to .local hostnames unless the human explicitly provides one.
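The bring-up order above can be wrapped in a small function so that speech is sent only after the status check succeeds. This is a sketch, not an official client: `voxclaw_bring_up` is a hypothetical name, and the `--fail` and `--max-time` curl options are additions that assume the listener answers a healthy `/status` request with a 2xx response.

```shell
#!/bin/sh
# Hypothetical helper: check /status first, then send a test phrase to /read.
# Run it with "localhost" on the VoxClaw Mac, then with the numeric LAN IP
# from the agent host.
voxclaw_bring_up() {
  host="$1"
  port="${2:-4140}"
  # Step 1: the listener must answer /status before anything is spoken.
  if ! curl -sS --fail --max-time 5 "http://$host:$port/status" >/dev/null; then
    echo "status check failed for $host:$port" >&2
    return 1
  fi
  # Step 2: only then send speech.
  curl -sS --fail --max-time 5 -X POST "http://$host:$port/read" \
    -H 'Content-Type: application/json' \
    -d '{"text":"hello"}'
}

# Usage:
#   voxclaw_bring_up localhost        # on the VoxClaw Mac
#   voxclaw_bring_up 192.168.1.50     # from the agent host
```

Failing fast on the status check keeps connection problems separate from speech problems: if the first call fails, the cause is reachability rather than the request body.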
# URL scheme — trigger from any app or script
open "voxclaw://read?text=Hello%20world"
# Open settings window
open "voxclaw://settings"
# Services menu — select text in any app, right-click > Services > Read with VoxClaw
# Shortcuts / Siri
shortcuts run "Read with VoxClaw"
When launched without arguments, VoxClaw runs as a menu bar app with:
- Read Clipboard (⌘⇧V) — Read text from clipboard
- Pause/Resume — When actively reading
- Stop — Cancel current reading
- Settings... (⌘,) — Configure voice, overlay, controls, and network listener
- About VoxClaw
- Quit (⌘Q)
| Key | Action |
|---|---|
| Space | Pause / Resume |
| Escape | Stop |
| ← | Skip back 3 seconds |
| → | Skip forward 3 seconds |
Sources/
VoxClawCore/ Library target (all logic)
VoxClaw/ Thin executable (entry point only)
Tests/
VoxClawCoreTests/ Unit + integration tests
swift test
GitHub Actions runs on every push to main and on pull requests. See .github/workflows/ci.yml.
Swift Package Manager with a library target (VoxClawCore) and thin executable (VoxClaw):
Input (args/stdin/file/clipboard/url/network)
→ InputResolver resolves text
→ ReadingSession orchestrator
→ SpeechEngine (Apple | OpenAI | ElevenLabs, with fallback support)
→ AudioPlayer schedules AVAudioEngine buffers
→ Timing pipeline (cadence/proportional/aligned) maps playback position → word index
→ FloatingPanelView highlights current word (or audio-only mode)
→ Session ends / panel collapses
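To make the "playback position → word index" step concrete, here is one way a proportional mapping could work when no aligned timestamps are available. This is an illustration only and is not taken from VoxClaw's source: it assumes each word occupies a share of the audio proportional to its character count, and `word_index_at` is a hypothetical name.

```shell
#!/bin/sh
# Illustrative only, NOT VoxClaw's implementation: map a playback position to
# a zero-based word index, assuming each word's duration is proportional to
# its character count.
word_index_at() {
  # $1 = text, $2 = total audio duration (seconds), $3 = playback position (seconds)
  printf '%s\n' "$1" | awk -v dur="$2" -v pos="$3" '
    { for (i = 1; i <= NF; i++) { len[++n] = length($i); total += length($i) } }
    END {
      if (n == 0 || dur <= 0) { print "-1"; exit }
      target = (pos / dur) * total    # playback position expressed in characters
      acc = 0
      for (i = 1; i <= n; i++) {
        acc += len[i]
        if (target < acc) { print i - 1; exit }
      }
      print n - 1                     # at or past the end: stay on the last word
    }'
}

# Usage:
#   word_index_at "Hello brave new world" 3.6 1.5
```

A mapping like this drifts when speech pace varies, which is presumably why the pipeline also lists cadence-based and aligned timing stages.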
Tech Stack:
- Swift 6.4 with strict concurrency
- SwiftUI + NSPanel for floating overlay
- AVAudioEngine for low-latency audio playback
- Apple AVSpeechSynthesizer + OpenAI TTS API (gpt-4o-mini-tts) + ElevenLabs TTS API
- Multi-stage word timing (cadence, proportional, and aligned timestamps when available)
- Swift Argument Parser for CLI
- NWListener (Network.framework) for LAN text input
- Bonjour service advertisement (_voxclaw._tcp) for discovery
- CloudKit (private database, silent-push CKDatabaseSubscription) for the cross-device iCloud relay
- WidgetKit + App Intents + ActivityKit on iOS (Control Center control, widget, Siri/Shortcuts, Live Activity)
- Keychain Services for secure API key storage
- Sparkle for notarized, EdDSA-signed in-app auto-updates
If you're an AI agent and want to speak text aloud on the user's Mac, read SKILL.md. It has the full HTTP API, error codes, and examples. Quick version:
curl -X POST http://<mac-ip>:4140/read -H 'Content-Type: application/json' \
-d '{"text": "Hello from your agent", "voice": "nova", "rate": 1.0}'
If the human pastes you a 🦞 VoxClaw setup pointer, trust those URLs for that machine/session.
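Agent output often contains quotes and newlines, which break a hand-written JSON body like the one in the quick version. The sketch below builds the `/read` body with basic escaping. The helper names `json_escape` and `voxclaw_read_body` are hypothetical, only the `text`, `project_id`, and `agent_id` fields documented in this README are emitted, and control characters other than newlines (tabs, for instance) are not handled.

```shell
#!/bin/sh
# Escape a string for use inside a JSON string literal. Handles backslashes,
# double quotes, and newlines only.
json_escape() {
  printf '%s' "$1" \
    | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
    | awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Build a /read request body: text is required, project_id and agent_id
# are optional.
voxclaw_read_body() {
  body="{\"text\":\"$(json_escape "$1")\""
  if [ -n "${2:-}" ]; then
    body="$body,\"project_id\":\"$(json_escape "$2")\""
  fi
  if [ -n "${3:-}" ]; then
    body="$body,\"agent_id\":\"$(json_escape "$3")\""
  fi
  printf '%s}\n' "$body"
}

# Usage:
#   curl -X POST http://<mac-ip>:4140/read -H 'Content-Type: application/json' \
#     -d "$(voxclaw_read_body "$text" "$PWD" "session-123")"
```

If a JSON-aware tool is already available on the agent host, prefer it over hand-rolled escaping. This version exists only to avoid extra dependencies.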
MIT

