Voice Realtime

Real-time conversational AI for macOS with push-to-talk interface and multiple personas.

Features

  • Push-to-Talk: Hold Cmd+Shift+T to speak, release to hear the AI response
  • Push-to-Dictate: Hold Cmd+Shift+D to speak, release to type the transcription at your cursor
  • Read Selection: Select text, then press Cmd+Shift+S to hear it read aloud
  • Multiple Personas: Switch between Assistant, Tutor, Creative, and Casual modes
  • Hybrid LLM Routing: Personas can use RedPill GPU TEE models (with cryptographic attestation) or a local Ollama model
  • Fast STT: Lightning Whisper MLX, optimized for Apple Silicon
  • Natural TTS: Moshi neural speech synthesis
  • Continuous Conversation: Maintains context across turns

Requirements

  • macOS with Apple Silicon (M1/M2/M3/M4)
  • Python 3.12+
  • Hammerspoon (for hotkeys)
  • RedPill API key (get one at redpill.ai)

Getting Started

1. Clone and Setup

git clone https://github.com/HashWarlock/nobody.git
cd nobody
./setup.sh

The setup script will:

  • Create a Python virtual environment at ~/voice-env
  • Install all dependencies
  • Symlink Hammerspoon config (auto-updates when you pull changes)
  • Download required ML models (~1GB)
  • Create a symlink at ~/voice-realtime

2. Configure API Key

Create a .env file with your RedPill API key:

cp .env.example .env
# Edit .env and add your REDPILL_API_KEY

Or manually:

echo "REDPILL_API_KEY=your-key-here" > .env

3. Grant Permissions

  • System Settings > Privacy & Security > Accessibility
    • Add Hammerspoon and grant access
  • System Settings > Privacy & Security > Microphone
    • Grant access to Terminal (or your terminal app)
  • System Settings > Privacy & Security > Automation (for dictation)
    • Allow Hammerspoon to control "System Events"

4. Start Using

  1. Open Hammerspoon (you should see "Voice Realtime ready!" alert)
  2. Hold Cmd+Shift+T and speak
  3. Release to hear the AI response

Hotkeys

Hotkey | Action
Cmd+Shift+T | Push-to-talk (hold to speak, release to get AI response)
Cmd+Shift+D | Push-to-dictate (hold to speak, release to type at cursor)
Cmd+Shift+S | Read selection (highlight text, press to hear it spoken)
Cmd+Shift+X | Stop/Cancel
Cmd+Shift+M | Model picker (searchable menu to switch LLM)
Cmd+Shift+1 | Switch to Assistant persona
Cmd+Shift+2 | Switch to Tutor persona
Cmd+Shift+3 | Switch to Creative persona
Cmd+Shift+4 | Switch to Casual persona
Cmd+Shift+R | Reload Hammerspoon config

Personas

Configure in personas.yaml:

Persona | Provider | Description
Assistant | RedPill | Fast, concise answers
Tutor | RedPill | Patient explanations with examples
Creative | RedPill | Brainstorming and ideation
Casual | Ollama | Friendly local conversation
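
To illustrate the hybrid routing, here is a minimal sketch of how a persona's llm block in personas.yaml (see the snippet under Configuration) can select between the RedPill API and a local Ollama server. The function names and the RedPill base URL are illustrative assumptions, not this repo's actual code:

import yaml

def load_persona_llm(name, path="personas.yaml"):
    # Return the llm config (provider + model) for one persona
    with open(path) as f:
        return yaml.safe_load(f)["personas"][name]["llm"]

def base_url_for(llm):
    # Cloud personas go to RedPill; local ones to Ollama's OpenAI-compatible port
    if llm["provider"] == "redpill":
        return "https://api.redpill.ai/v1"  # assumed endpoint
    return "http://localhost:11434/v1"

llm = load_persona_llm("casual")
print(base_url_for(llm), llm["model"])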

Voice Pipeline

Component | Technology | Description
STT | Lightning Whisper MLX | 10x faster than whisper.cpp
TTS | Moshi MLX | Neural speech synthesis
LLM | RedPill API | GPU TEE models with cryptographic attestation
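
A single push-to-talk turn passes through these stages in order. A minimal sketch of that loop, using illustrative class and method names rather than the project's actual interfaces:

def handle_turn(audio_path, transcriber, llm_client, tts, history):
    # 1. STT: transcribe the recorded audio clip
    text = transcriber.transcribe(audio_path)

    # 2. LLM: send the transcript plus prior turns to keep context
    history.append({"role": "user", "content": text})
    reply = llm_client.chat(history)
    history.append({"role": "assistant", "content": reply})

    # 3. TTS: synthesize and play the response
    tts.speak(reply)
    return reply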

STT Models

The default STT model is distil-medium.en. Available models:

Model | Size | Speed | Accuracy
tiny | 39M | Fastest | Basic
small | 244M | Fast | Good
distil-medium.en | 394M | Fast | Great (English)
large-v3 | 1.5B | Slower | Best

Change the model in main.py:

transcriber = WhisperTranscriber(model="distil-medium.en")
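
Any model from the table above drops into the same constructor; for example, to trade speed for accuracy:

transcriber = WhisperTranscriber(model="large-v3")  # slower, but most accurate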

Configuration

Environment Variables

Create a .env file (see .env.example):

# Required for cloud personas
REDPILL_API_KEY=your-key-here

# Optional: Override Ollama host for local personas
OLLAMA_HOST=http://localhost:11434
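
At runtime these are typically picked up with python-dotenv; a sketch of the usual pattern (the actual loading code in main.py may differ):

import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the working directory

api_key = os.environ["REDPILL_API_KEY"]  # required for cloud personas
ollama_host = os.getenv("OLLAMA_HOST", "http://localhost:11434")  # optional override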

Changing LLM Models

Edit personas.yaml to change which model each persona uses:

personas:
  assistant:
    llm:
      provider: "redpill"
      model: "deepseek/deepseek-v3.2"  # Default, strong reasoning
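
Assuming RedPill exposes an OpenAI-compatible chat endpoint, a configured model can be exercised directly with the openai client. The base URL below is an assumption for illustration, not taken from this repo:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.redpill.ai/v1",  # assumed RedPill endpoint
    api_key="your-key-here",
)

response = client.chat.completions.create(
    model="deepseek/deepseek-v3.2",
    messages=[{"role": "user", "content": "Say hello in five words."}],
)
print(response.choices[0].message.content)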

GPU TEE Models (19 Total)

All models run in hardware-secured GPU TEE environments with cryptographic attestation.

Phala Network (9 models)

Model ID | Name | Features
deepseek/deepseek-v3.2 | DeepSeek v3.2 | Reasoning
phala/uncensored-24b | Uncensored 24B | Uncensored
phala/glm-4.7-flash | GLM 4.7 Flash | Multilingual, fast
phala/qwen3-vl-30b-a3b-instruct | Qwen3 VL 30B | Vision
phala/qwen2.5-vl-72b-instruct | Qwen 2.5 VL 72B | Vision
phala/qwen-2.5-7b-instruct | Qwen 2.5 7B | General, fast
phala/gemma-3-27b-it | Gemma 3 27B | General
phala/gpt-oss-120b | GPT OSS 120B | General
phala/gpt-oss-20b | GPT OSS 20B | General, fast

Tinfoil (5 models)

Model ID | Name | Features
moonshotai/kimi-k2.5 | Kimi K2.5 | Default, reasoning
moonshotai/kimi-k2-thinking | Kimi K2 Thinking | Reasoning
deepseek/deepseek-r1-0528 | DeepSeek R1 | Reasoning
qwen/qwen3-coder-480b-a35b-instruct | Qwen3 Coder 480B | Code
meta-llama/llama-3.3-70b-instruct | Llama 3.3 70B | General

Chutes (1 model)

Model ID | Name | Features
minimax/minimax-m2.1 | MiniMax M2.1 | General

Near-AI (4 models)

Model ID | Name | Features
deepseek/deepseek-chat-v3.1 | DeepSeek Chat v3.1 | General
qwen/qwen3-30b-a3b-instruct-2507 | Qwen3 30B | General
z-ai/glm-4.7 | GLM 4.7 | Multilingual
z-ai/glm-4.7-flash | GLM 4.7 Flash | Multilingual, fast

Which Model Should I Use?

Use Case | Recommended Model | Why
General chat | moonshotai/kimi-k2.5 | Default, strong reasoning
Complex reasoning | deepseek/deepseek-r1-0528 | Reasoning-optimized with R1 architecture
Long context | moonshotai/kimi-k2-thinking | Deep reasoning, extended thinking
Coding | qwen/qwen3-coder-480b-a35b-instruct | Code-specialized, 262k context
Vision tasks | phala/qwen2.5-vl-72b-instruct | Best vision model
Fast + balanced | meta-llama/llama-3.3-70b-instruct | Llama 3.3, good all-around
Uncensored | phala/uncensored-24b | No content restrictions
Multilingual | z-ai/glm-4.7 | Strong multilingual support

Troubleshooting

"launch path not accessible" error

The Python path or script path is wrong. Run ./setup.sh again or check that ~/voice-env/bin/python exists.

No response from AI

Check that REDPILL_API_KEY is set in your .env file.

Transcription is wrong/gibberish

Try a larger Whisper model (e.g., small or distil-medium.en).

Hammerspoon not responding

Reload config with Cmd+Shift+R or restart Hammerspoon.

Dictation not typing

  1. Check System Settings > Privacy & Security > Automation: Hammerspoon needs permission to control "System Events"
  2. Make sure a text field is focused when you release the hotkey
  3. Try Cmd+V manually after dictating; if the text pastes, it's a permission issue

License

MIT
