This guide covers setting up a development environment, contributing to MacBot, and understanding the codebase architecture.
# Install system dependencies
xcode-select --install
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install cmake ffmpeg portaudio python@3.11 git git-lfs
# Clone the repository
git clone https://github.com/lukifer23/MacBot.git
cd MacBot
# Initialize submodules
git submodule update --init --recursive

# Create virtual environment
python3.11 -m venv macbot_env
source macbot_env/bin/activate
# Install dependencies
pip install -r requirements.txt
# Install Piper TTS (recommended)
pip install piper-tts
# Download Piper voice models
mkdir -p piper_voices/en_US-lessac-medium
curl -L "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/medium/en_US-lessac-medium.onnx" \
-o piper_voices/en_US-lessac-medium/model.onnx
curl -L "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json" \
-o piper_voices/en_US-lessac-medium/model.onnx.json
# Install development dependencies
pip install -r requirements-dev.txt

# Build llama.cpp
make build-llama
# Build whisper.cpp
make build-whisper
# Build all dependencies
make build-all

MacBot/
├── src/macbot/ # Main package
│ ├── cli.py # Command-line interface
│ ├── __init__.py # Package initialization
│ ├── voice_assistant.py # Voice assistant with interruption
│ ├── audio_interrupt.py # TTS interruption handler
│ ├── conversation_manager.py # Conversation state management
│ ├── message_bus.py # Real-time communication
│ ├── message_bus_client.py # Message bus client
│ ├── orchestrator.py # Service orchestration
│ ├── web_dashboard.py # Web interface
│ ├── health_monitor.py # Health monitoring & resilience
│ ├── rag_server.py # RAG knowledge base
│ ├── config.py # Configuration management
│ ├── auth.py # JWT authentication system
│ ├── validation.py # Input validation and sanitization
│ ├── resource_manager.py # Resource lifecycle management
│ ├── error_handler.py # Centralized error handling
│ └── logging_utils.py # Structured logging utilities
├── scripts/ # Shell scripts
│ ├── bootstrap_mac.sh # Bootstrap script
│ └── start_macbot.sh # Startup script
├── tests/ # Test files
│ ├── test_interruptible_conversation.py
│ ├── test_message_bus.py
│ └── test_message_bus_client.py
├── config/ # Configuration files
│ └── config.yaml # Main configuration
├── docs/ # Documentation
│ ├── API_REFERENCE.md
│ ├── CONFIGURATION.md
│ ├── TROUBLESHOOTING.md
│ └── DEVELOPMENT.md
├── data/ # Data directories
│ ├── rag_data/ # Knowledge base data
│ │ ├── documents.json
│ │ └── chroma_db/
│ └── rag_database/ # Vector database
│ └── chroma.sqlite3
├── models/ # Model directories
│ ├── llama.cpp/ # LLM inference engine
│ └── whisper.cpp/ # Speech recognition
├── logs/ # Log files
│ └── macbot.log # Application logs
├── requirements.txt # Python dependencies
├── requirements-dev.txt # Development dependencies
├── pyproject.toml # Modern Python packaging
├── setup.py # Legacy packaging
├── Makefile # Build and run commands
├── docker-compose.yml # Docker orchestration
├── Dockerfile # Container definition
└── README.md
- Piper is the sole TTS engine. Voices are discovered from `piper_voices/*/model.onnx`.
- The control API allows previewing/applying voices and selecting output devices.
- Mic can be automatically muted during TTS to avoid feedback.
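The discovery rule above can be sketched as a small helper. This is a minimal illustration, not MacBot's actual API; `discover_voices` and its return shape are assumptions.

```python
from pathlib import Path

def discover_voices(root: str = "piper_voices") -> dict[str, str]:
    """Map voice names to model paths, following the piper_voices/*/model.onnx convention."""
    voices = {}
    for model in Path(root).glob("*/model.onnx"):
        # The directory name (e.g. en_US-lessac-medium) identifies the voice
        voices[model.parent.name] = str(model)
    return voices
```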
- WebSocket Communication (primary): Cross-process, resilient client with auto-reconnect
- In-Process Fallback: Lightweight queue bus remains available for same-process usage
- Cross-Service Integration: Voice assistant and orchestrator can exchange events via WS
- Thread-Safe Operations: Safe handler dispatch and send locking
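The in-process fallback can be pictured as a tiny topic/handler bus with locked dispatch. A minimal sketch, using hypothetical names (`InProcessBus`, `subscribe`, `publish`) rather than the real `message_bus` interface:

```python
import threading

class InProcessBus:
    """Minimal same-process fallback bus: publish fans out to per-topic handlers."""
    def __init__(self):
        self._handlers = {}
        self._lock = threading.Lock()  # thread-safe handler registration/dispatch

    def subscribe(self, topic, handler):
        with self._lock:
            self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic, payload):
        with self._lock:
            handlers = list(self._handlers.get(topic, []))
        for handler in handlers:
            handler(payload)
```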
- Audio Handler Integration: Fixed interrupt flag reset issues
- State Coordination: Conversation manager and audio handler stay synchronized
- Race Condition Prevention: Protected against double interruption calls
- Circuit Breaker Fix: Corrected datetime comparison logic
- Automatic Recovery: Services properly recover after failures
- Better Error Handling: Comprehensive exception handling throughout
- Purpose: Manages all services and their lifecycle
- Features:
- Automatic startup and shutdown
- Health monitoring
- Service dependency management
- Graceful error handling
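Dependency-ordered startup and reverse-order shutdown, the core of the features above, can be sketched as follows. Names and structure are illustrative, not the actual `orchestrator.py` implementation:

```python
class Orchestrator:
    """Minimal sketch: start services after their dependencies, stop in reverse."""
    def __init__(self):
        self._services = {}   # name -> (start_fn, stop_fn, deps)
        self._started = []

    def register(self, name, start, stop, deps=()):
        self._services[name] = (start, stop, tuple(deps))

    def start_all(self):
        def start(name):
            if name in self._started:
                return
            start_fn, _, deps = self._services[name]
            for dep in deps:
                start(dep)            # bring dependencies up first
            start_fn()
            self._started.append(name)
        for name in self._services:
            start(name)

    def stop_all(self):
        # Shut down in the reverse of startup order
        for name in reversed(self._started):
            self._services[name][1]()
        self._started.clear()
```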
- Purpose: Main voice interaction interface
- Features:
- Voice activity detection
- Speech-to-text (Whisper)
- LLM processing (llama.cpp)
- Text-to-speech (Piper)
- Tool integration
- Purpose: Web-based monitoring and control interface
- Features:
- Real-time system statistics
- Service status monitoring
- Chat interface
- API endpoints
- Purpose: Knowledge base and document search
- Features:
- Document ingestion
- Semantic search
- Vector embeddings
- ChromaDB integration
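At its core, semantic search ranks documents by embedding similarity. A library-free sketch of the idea (the real `rag_server.py` delegates this to ChromaDB; `semantic_search` here is purely illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def semantic_search(query_vec, docs, top_k=3):
    """docs: list of (doc_id, embedding); returns ids ranked by similarity to query_vec."""
    ranked = sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:top_k]]
```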
- Purpose: Service health monitoring and resilience
- Features:
- Circuit breaker pattern implementation
- Service health checks with configurable intervals
- Automatic failure detection and recovery
- Graceful degradation support
- Alert system for service failures
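A circuit breaker of the kind described above can be sketched in a few lines. This is a generic illustration using monotonic-clock arithmetic, not `health_monitor.py`'s actual code:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: open after N failures, allow a retry after a cooldown."""
    def __init__(self, max_failures=3, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # Compare elapsed seconds on a monotonic clock, not raw datetimes
        if time.monotonic() - self.opened_at >= self.reset_timeout:
            self.opened_at = None   # half-open: permit one probe request
            self.failures = 0
            return True
        return False

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = time.monotonic()

    def record_success(self):
        self.failures = 0
        self.opened_at = None
```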
Audio Input → VAD → Whisper (STT) → LLM → TTS → Audio Output
↓
Tool Calls → Native macOS Integration
↓
RAG Search → Knowledge Base
↓
Health Monitor → Service Resilience
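The pipeline above is essentially a composition of stages. A minimal sketch, with the stage functions passed in as stand-ins for the real Whisper/llama.cpp/Piper components:

```python
def run_pipeline(audio: bytes, stt, llm, tts) -> bytes:
    """Audio in -> transcription -> LLM reply -> synthesized audio out."""
    text = stt(audio)      # Whisper STT stage
    reply = llm(text)      # llama.cpp generation stage
    return tts(reply)      # Piper TTS stage
```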
git checkout -b feature/your-feature-name

Follow the coding standards and add tests for new functionality.
# Run unit tests
python -m pytest tests/
# Test integration
make test
# Manual testing
python -m macbot.voice_assistant --debug

- Update relevant documentation files
- Add docstrings to new functions
- Update API documentation if endpoints change
git add .
git commit -m "feat: add your feature description"
git push origin feature/your-feature-name

- Follow PEP 8
- Use comprehensive type hints for all function parameters and return values
- Add detailed docstrings to all public functions and classes
- Use meaningful variable names
- Zero type checker errors (mypy/pyright compliant)
- Complete Type Coverage: All modules are fully typed with mypy/pyright validation
- Strict Type Checking: No `Any` types except where absolutely necessary
- Type Imports: Use proper typing imports (`from typing import List, Dict, Optional`, etc.)
- Generic Types: Use appropriate generic types for collections and containers
- auth.py: JWT token generation, validation, and Flask decorators
- validation.py: Input sanitization, XSS prevention, and request validation
- resource_manager.py: Automatic cleanup of temporary files, threads, and processes
- error_handler.py: Centralized error handling with severity classification
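As a rough picture of what `validation.py`-style sanitization involves (this exact helper is hypothetical), input can be length-checked and HTML-escaped before it reaches the dashboard:

```python
import html

def sanitize_text(value: str, max_length: int = 1000) -> str:
    """Reject oversized input and escape HTML to blunt XSS in rendered output."""
    if not isinstance(value, str):
        raise TypeError("expected a string")
    if len(value) > max_length:
        raise ValueError(f"input exceeds {max_length} characters")
    return html.escape(value, quote=True)
```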
def process_audio_data(audio_data: bytes, sample_rate: int = 16000) -> str:
    """
    Process audio data and return transcription.

    Args:
        audio_data: Raw audio bytes
        sample_rate: Audio sample rate in Hz

    Returns:
        Transcribed text from audio

    Raises:
        AudioProcessingError: If audio processing fails
    """
    # Implementation here
    pass

- Use custom exceptions for specific error types
- Provide meaningful error messages
- Log errors with appropriate levels
- Don't expose internal errors to users
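The guidelines above might look like this in practice. `MacBotError` is an illustrative base-class name, and `transcribe` is a stand-in function:

```python
class MacBotError(Exception):
    """Illustrative base class for MacBot-specific errors."""

class AudioProcessingError(MacBotError):
    """Raised when STT/TTS audio handling fails."""

def transcribe(audio: bytes) -> str:
    if not audio:
        # Specific exception type, meaningful message, no internals leaked
        raise AudioProcessingError("received empty audio buffer")
    return "transcription"
```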
import logging
logger = logging.getLogger(__name__)
def some_function():
    logger.debug("Detailed debug information")
    logger.info("General information")
    logger.warning("Warning message")
    logger.error("Error message")
    logger.critical("Critical error")

# tests/test_voice_assistant.py
import pytest
from macbot.voice_assistant import VoiceAssistant
class TestVoiceAssistant:
    def test_initialization(self):
        va = VoiceAssistant()
        assert va is not None

    def test_process_command(self):
        va = VoiceAssistant()
        result = va.process_command("test command")
        assert isinstance(result, str)

# tests/test_integration.py
def test_full_pipeline():
    # Test complete audio -> response pipeline
    pass

# Run all tests
pytest
# Run specific test file
pytest tests/test_voice_assistant.py
# Run with coverage
pytest --cov=src --cov-report=html

# tools/custom_tools.py
def take_screenshot(save_path: str = "~/Desktop") -> str:
    """Take a screenshot and save to specified path."""
    import subprocess
    import os
    from datetime import datetime

    path = os.path.expanduser(save_path)
    filename = f"screenshot_{datetime.now().strftime('%Y%m%d_%H%M%S')}.png"
    filepath = os.path.join(path, filename)
    subprocess.run(["screencapture", "-x", filepath], check=True)
    return f"Screenshot saved to {filepath}"

# src/macbot/voice_assistant.py
from .tools.custom_tools import take_screenshot

class VoiceAssistant:
    def __init__(self):
        self.tools = {
            "take_screenshot": take_screenshot,
            # ... other tools
        }

# config.yaml
tools:
  enabled:
    - take_screenshot
  take_screenshot:
    save_path: "~/Desktop"

# config.yaml
prompts:
  system: |
    You can use these tools:
    - take_screenshot: Take a screenshot
    # ... other tools

import cProfile
import pstats
def profile_function():
    profiler = cProfile.Profile()
    profiler.enable()
    # Code to profile
    your_function()
    profiler.disable()
    stats = pstats.Stats(profiler)
    stats.sort_stats('cumulative').print_stats(10)

- Use generators for large data processing
- Implement proper cleanup in `__del__` methods
- Monitor memory usage with `tracemalloc`
- Use multiprocessing for CPU-intensive tasks
- Implement caching for expensive operations
- Profile with `line_profiler`
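Caching an expensive operation is often a single decorator. A sketch using `functools.lru_cache` (the loader here is a stand-in, with a call counter so the cache effect is visible):

```python
from functools import lru_cache

CALLS = {"count": 0}

@lru_cache(maxsize=128)
def load_voice_metadata(path: str) -> tuple:
    """Stand-in for an expensive load; repeated calls with the same path hit the cache."""
    CALLS["count"] += 1
    return ("metadata-for", path)
```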
# Enable debug logging
export DEBUG=1
python -m macbot.voice_assistant
# Or set in config
logging:
  level: DEBUG

# Add to code for remote debugging
import pdb; pdb.set_trace()

# config.yaml
logging:
  level: INFO
  format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
  file: macbot.log

- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Update documentation
- Submit pull request
type(scope): description
[optional body]
[optional footer]
Types:
- feat: New feature
- fix: Bug fix
- docs: Documentation
- style: Code style changes
- refactor: Code refactoring
- test: Adding tests
- chore: Maintenance
- Code follows style guidelines
- Tests are included
- Documentation is updated
- No breaking changes
- Performance impact assessed
# Full deployment
./scripts/start_macbot.sh
# Individual services
make run-assistant
make run-llama
python -m macbot.web_dashboard

- Set up log rotation
- Configure monitoring
- Set appropriate resource limits
- Implement backup strategies
- All services run locally - no external data exposure
- Microphone permissions required
- File system access limited to user directories
- Keep dependencies updated
- Use virtual environments
- Don't commit sensitive configuration
- Regular security audits
- Check existing issues on GitHub
- Review documentation
- Join community discussions
Include:
- System information
- Steps to reproduce
- Expected vs actual behavior
- Log excerpts
- Configuration used
source macbot_env/bin/activate
# Speak via VA control API
curl -s -X POST http://localhost:8123/speak \
-H 'Content-Type: application/json' \
-d '{"text":"Hello, this is a test"}'
# List Piper voices and apply one
curl -s http://localhost:8123/voices | jq
curl -s -X POST http://localhost:8123/set-voice \
-H 'Content-Type: application/json' \
-d '{"voice_path":"piper_voices/en_US-lessac-medium/model.onnx"}'
# Choose output device
curl -s http://localhost:8123/devices | jq
curl -s -X POST http://localhost:8123/set-output -H 'Content-Type: application/json' -d '{"device":5}'