MacBot Development Guide

Overview

This guide covers setting up a development environment, contributing to MacBot, and understanding the codebase architecture.

Development Setup

Prerequisites

# Install system dependencies
xcode-select --install
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install cmake ffmpeg portaudio python@3.11 git git-lfs

# Clone the repository
git clone https://github.com/lukifer23/MacBot.git
cd MacBot

# Initialize submodules
git submodule update --init --recursive

Python Environment

# Create virtual environment
python3.11 -m venv macbot_env
source macbot_env/bin/activate

# Install dependencies
pip install -r requirements.txt

# Install Piper TTS (required: Piper is the only TTS engine)
pip install piper-tts
# Download Piper voice models
mkdir -p piper_voices/en_US-lessac-medium
curl -L "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/medium/en_US-lessac-medium.onnx" \
     -o piper_voices/en_US-lessac-medium/model.onnx
curl -L "https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json" \
     -o piper_voices/en_US-lessac-medium/model.onnx.json

# Install development dependencies
pip install -r requirements-dev.txt

Build Dependencies

# Build llama.cpp
make build-llama

# Build whisper.cpp
make build-whisper

# Build all dependencies
make build-all

Project Structure

MacBot/
├── src/macbot/              # Main package
│   ├── cli.py              # Command-line interface
│   ├── __init__.py         # Package initialization
│   ├── voice_assistant.py  # Voice assistant with interruption
│   ├── audio_interrupt.py  # TTS interruption handler
│   ├── conversation_manager.py # Conversation state management
│   ├── message_bus.py      # Real-time communication
│   ├── message_bus_client.py # Message bus client
│   ├── orchestrator.py     # Service orchestration
│   ├── web_dashboard.py    # Web interface
│   ├── health_monitor.py   # Health monitoring & resilience
│   ├── rag_server.py      # RAG knowledge base
│   ├── config.py           # Configuration management
│   ├── auth.py             # JWT authentication system
│   ├── validation.py       # Input validation and sanitization
│   ├── resource_manager.py # Resource lifecycle management
│   ├── error_handler.py    # Centralized error handling
│   └── logging_utils.py    # Structured logging utilities
├── scripts/                # Shell scripts
│   ├── bootstrap_mac.sh   # Bootstrap script
│   └── start_macbot.sh    # Startup script
├── tests/                  # Test files
│   ├── test_interruptible_conversation.py
│   ├── test_message_bus.py
│   └── test_message_bus_client.py
├── config/                 # Configuration files
│   └── config.yaml        # Main configuration
├── docs/                   # Documentation
│   ├── API_REFERENCE.md
│   ├── CONFIGURATION.md
│   ├── TROUBLESHOOTING.md
│   └── DEVELOPMENT.md
├── data/                   # Data directories
│   ├── rag_data/          # Knowledge base data
│   │   ├── documents.json
│   │   └── chroma_db/
│   └── rag_database/      # Vector database
│       └── chroma.sqlite3
├── models/                 # Model directories
│   ├── llama.cpp/         # LLM inference engine
│   └── whisper.cpp/       # Speech recognition
├── logs/                   # Log files
│   └── macbot.log         # Application logs
├── requirements.txt        # Python dependencies
├── requirements-dev.txt    # Development dependencies
├── pyproject.toml          # Modern Python packaging
├── setup.py               # Legacy packaging
├── Makefile               # Build and run commands
├── docker-compose.yml     # Docker orchestration
├── Dockerfile            # Container definition
└── README.md

Recent Architecture Changes (Phase 6)

TTS System (Piper only)

  • Piper is the sole TTS engine. Voices are discovered from piper_voices/*/model.onnx.
  • Control API allows preview/apply of voices and selecting output devices.
  • Mic can be automatically muted during TTS to avoid feedback.
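
Voice discovery over the piper_voices/*/model.onnx layout can be sketched as follows. The function name and the check for the JSON sidecar are assumptions made for illustration, not MacBot's actual discovery code.

```python
from pathlib import Path


def discover_piper_voices(root: str = "piper_voices") -> dict[str, Path]:
    """Map each voice directory name to its model.onnx path.

    Hypothetical helper that mirrors the piper_voices/*/model.onnx layout.
    """
    voices: dict[str, Path] = {}
    for model in sorted(Path(root).glob("*/model.onnx")):
        # Assumption: a voice is only usable when the model.onnx.json
        # config downloaded alongside it is present as well.
        if model.with_name("model.onnx.json").exists():
            voices[model.parent.name] = model
    return voices
```

With the setup steps from this guide, calling discover_piper_voices() from the repository root would be expected to return a single entry keyed by en_US-lessac-medium.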

Message Bus Architecture

  • WebSocket Communication (primary): Cross-process, resilient client with auto-reconnect
  • In-Process Fallback: Lightweight queue bus remains available for same-process usage
  • Cross-Service Integration: Voice assistant and orchestrator can exchange events over WebSocket
  • Thread-Safe Operations: Safe handler dispatch and send locking
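
As a rough illustration of the in-process fallback and the safe handler dispatch mentioned above, here is a minimal publish/subscribe sketch. The class and method names are hypothetical and do not reflect the API of message_bus.py.

```python
import threading
from collections import defaultdict
from typing import Any, Callable

Handler = Callable[[dict[str, Any]], None]


class InProcessBus:
    """Hypothetical same-process pub/sub bus, for illustration only."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)
        self._lock = threading.Lock()

    def subscribe(self, topic: str, handler: Handler) -> None:
        with self._lock:
            self._handlers[topic].append(handler)

    def publish(self, topic: str, message: dict[str, Any]) -> int:
        """Deliver message to every handler; return how many succeeded."""
        # Copy the handler list under the lock, then dispatch outside it so
        # a handler can subscribe or publish without deadlocking.
        with self._lock:
            handlers = list(self._handlers.get(topic, ()))
        delivered = 0
        for handler in handlers:
            try:
                handler(message)
                delivered += 1
            except Exception:
                # One failing handler must not block the others.
                continue
        return delivered
```

Dispatching outside the lock is a design choice: it keeps slow handlers from stalling subscribers on other threads, at the cost of a handler possibly seeing one more message after it has been removed.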

Conversation State Synchronization

  • Audio Handler Integration: Fixed interrupt flag reset issues
  • State Coordination: Conversation manager and audio handler stay synchronized
  • Race Condition Prevention: Protected against double interruption calls
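
One way to guard against double interruption calls is a lock-protected flag that only the first caller can set. This is a sketch under that assumption; InterruptState and its methods are hypothetical names, not the ones used in audio_interrupt.py or conversation_manager.py.

```python
import threading


class InterruptState:
    """Hypothetical interrupt flag shared by two components."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._interrupted = False

    def request_interrupt(self) -> bool:
        """Return True for the first caller only, until reset() is called."""
        with self._lock:
            if self._interrupted:
                return False  # someone already interrupted this utterance
            self._interrupted = True
            return True

    def reset(self) -> None:
        """Clear the flag, for example before the next TTS playback starts."""
        with self._lock:
            self._interrupted = False

    @property
    def interrupted(self) -> bool:
        with self._lock:
            return self._interrupted
```

Callers would stop playback only when request_interrupt() returns True, so a second, concurrent call becomes a no-op.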

Health Monitoring Improvements

  • Circuit Breaker Fix: Corrected datetime comparison logic
  • Automatic Recovery: Services properly recover after failures
  • Better Error Handling: Comprehensive exception handling throughout

Architecture Overview

Core Components

1. Orchestrator (src/macbot/orchestrator.py)

  • Purpose: Manages all services and their lifecycle
  • Features:
    • Automatic startup and shutdown
    • Health monitoring
    • Service dependency management
    • Graceful error handling

2. Voice Assistant (src/macbot/voice_assistant.py)

  • Purpose: Main voice interaction interface
  • Features:
    • Voice activity detection
    • Speech-to-text (Whisper)
    • LLM processing (llama.cpp)
    • Text-to-speech (Piper)
    • Tool integration

3. Web Dashboard (src/macbot/web_dashboard.py)

  • Purpose: Web-based monitoring and control interface
  • Features:
    • Real-time system statistics
    • Service status monitoring
    • Chat interface
    • API endpoints

4. RAG Server (src/macbot/rag_server.py)

  • Purpose: Knowledge base and document search
  • Features:
    • Document ingestion
    • Semantic search
    • Vector embeddings
    • ChromaDB integration

5. Health Monitor (src/macbot/health_monitor.py)

  • Purpose: Service health monitoring and resilience
  • Features:
    • Circuit breaker pattern implementation
    • Service health checks with configurable intervals
    • Automatic failure detection and recovery
    • Graceful degradation support
    • Alert system for service failures
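
The circuit breaker pattern listed above can be sketched in a few lines. The thresholds, attribute names, and the injectable now parameter are assumptions chosen to keep the example small and testable; health_monitor.py may be structured differently.

```python
from datetime import datetime, timedelta
from typing import Optional


class CircuitBreaker:
    """Hypothetical breaker: opens after N failures, retries after a timeout."""

    def __init__(self, failure_threshold: int = 3, recovery_timeout: float = 30.0) -> None:
        self.failure_threshold = failure_threshold
        self.recovery_timeout = timedelta(seconds=recovery_timeout)
        self.failures = 0
        self.opened_at: Optional[datetime] = None

    def allow_request(self, now: Optional[datetime] = None) -> bool:
        now = now or datetime.now()
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        # Subtracting datetimes yields a timedelta, so compare it against
        # a timedelta rather than a raw number of seconds.
        return now - self.opened_at >= self.recovery_timeout

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None  # close the breaker again

    def record_failure(self, now: Optional[datetime] = None) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now or datetime.now()
```

A health check loop would call allow_request() before probing a service and then report the outcome with record_success() or record_failure().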

Data Flow

Audio Input → VAD → Whisper (STT) → LLM → TTS → Audio Output
                      ↓
                Tool Calls → Native macOS Integration
                      ↓
                RAG Search → Knowledge Base
                      ↓
           Health Monitor → Service Resilience
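
The main path of this flow (Audio Input → VAD → STT → LLM → TTS) can be written as a single function over pluggable stages. All five stage callables are hypothetical stand-ins for the real components; tool calls and RAG search are left out of this sketch.

```python
from typing import Callable, Optional


def run_turn(
    audio: bytes,
    is_speech: Callable[[bytes], bool],
    transcribe: Callable[[bytes], str],
    generate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Optional[bytes]:
    """Run one conversational turn; return audio to play, or None."""
    if not is_speech(audio):
        return None  # VAD rejected the chunk, nothing to transcribe
    text = transcribe(audio).strip()
    if not text:
        return None  # STT produced no words
    reply = generate(text)
    return synthesize(reply)
```

Passing the stages in as arguments makes the pipeline easy to exercise in tests with stubs instead of real models.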

Development Workflow

1. Create Feature Branch

git checkout -b feature/your-feature-name

2. Make Changes

Follow the coding standards and add tests for new functionality.

3. Test Changes

# Run unit tests
python -m pytest tests/

# Test integration
make test

# Manual testing
python -m macbot.voice_assistant --debug

4. Update Documentation

  • Update relevant documentation files
  • Add docstrings to new functions
  • Update API documentation if endpoints change

5. Commit and Push

git add .
git commit -m "feat: add your feature description"
git push origin feature/your-feature-name

Coding Standards

Python Style

  • Follow PEP 8
  • Use comprehensive type hints for all function parameters and return values
  • Add detailed docstrings to all public functions and classes
  • Use meaningful variable names
  • Zero type checker errors (mypy/pyright compliant)

Type Safety

  • Complete Type Coverage: All modules are fully typed with mypy/pyright validation
  • Strict Type Checking: No Any types except where absolutely necessary
  • Type Imports: Use proper typing imports (from typing import Optional, Callable, etc.); on Python 3.9+ the built-in list and dict can be used as generic types directly
  • Generic Types: Use appropriate generic types for collections and containers

Security Modules

  • auth.py: JWT token generation, validation, and Flask decorators
  • validation.py: Input sanitization, XSS prevention, and request validation
  • resource_manager.py: Automatic cleanup of temporary files, threads, and processes
  • error_handler.py: Centralized error handling with severity classification
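
To make the input sanitization idea concrete, here is a small sketch using only the standard library. The function name, the length limit, and the exact rules are assumptions for illustration; validation.py may apply different checks.

```python
import html

MAX_INPUT_CHARS = 2000  # illustrative limit, not taken from MacBot


def sanitize_text(raw: str, max_chars: int = MAX_INPUT_CHARS) -> str:
    """Return an HTML-escaped, bounded copy of user-supplied text."""
    if not isinstance(raw, str):
        raise TypeError("input must be a string")
    # Drop control characters but keep newlines and tabs.
    cleaned = "".join(ch for ch in raw if ch in "\n\t" or ch.isprintable())
    cleaned = cleaned.strip()
    if not cleaned:
        raise ValueError("input is empty")
    if len(cleaned) > max_chars:
        raise ValueError(f"input exceeds {max_chars} characters")
    # Escape <, >, &, and quotes so the text is safe to embed in HTML.
    return html.escape(cleaned)
```

Escaping at the boundary like this covers text that is echoed back into the dashboard; it is not a substitute for output encoding in templates.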

Example Function

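class AudioProcessingError(Exception):
    """Raised when audio cannot be processed."""


def process_audio_data(audio_data: bytes, sample_rate: int = 16000) -> str:
    """
    Process audio data and return transcription.

    Args:
        audio_data: Raw audio bytes
        sample_rate: Audio sample rate in Hz

    Returns:
        Transcribed text from audio

    Raises:
        AudioProcessingError: If audio processing fails
    """
    if not audio_data:
        raise AudioProcessingError("audio_data is empty")
    if sample_rate <= 0:
        raise AudioProcessingError(f"invalid sample rate: {sample_rate}")
    # Implementation here (call the transcription backend and return its text)
    raise NotImplementedError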

Error Handling

  • Use custom exceptions for specific error types
  • Provide meaningful error messages
  • Log errors with appropriate levels
  • Don't expose internal errors to users
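
These four points can be combined in a small exception hierarchy. AudioProcessingError is the name used in the example function above; MacBotError, user_message, and to_user_message are hypothetical names introduced for this sketch.

```python
import logging

logger = logging.getLogger(__name__)


class MacBotError(Exception):
    """Hypothetical base class for errors the application raises itself."""

    user_message = "Something went wrong. Please try again."


class AudioProcessingError(MacBotError):
    user_message = "I could not process that audio."


def to_user_message(exc: Exception) -> str:
    """Log the full error and return text that is safe to show a user."""
    if isinstance(exc, MacBotError):
        logger.warning("Handled error: %s", exc)
        return exc.user_message
    # Unknown failures are logged with a traceback but never shown verbatim.
    logger.error("Unexpected error", exc_info=exc)
    return MacBotError.user_message
```

The internal detail (for example a failing subprocess or file path) ends up in the log, while the caller only ever sees the fixed user_message.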

Logging

import logging

logger = logging.getLogger(__name__)

def some_function() -> None:
    logger.debug("Detailed debug information")
    logger.info("General information")
    logger.warning("Warning message")
    logger.error("Error message")
    logger.critical("Critical error")

Testing

Unit Tests

# tests/test_voice_assistant.py
import pytest
from macbot.voice_assistant import VoiceAssistant

class TestVoiceAssistant:
    def test_initialization(self):
        va = VoiceAssistant()
        assert va is not None

    def test_process_command(self):
        va = VoiceAssistant()
        result = va.process_command("test command")
        assert isinstance(result, str)

Integration Tests

# tests/test_integration.py
def test_full_pipeline():
    # Test complete audio -> response pipeline
    pass

Running Tests

# Run all tests
pytest

# Run specific test file
pytest tests/test_voice_assistant.py

# Run with coverage
pytest --cov=src --cov-report=html

Adding New Tools

1. Define Tool Function

# src/macbot/tools/custom_tools.py
import os
import subprocess
from datetime import datetime


def take_screenshot(save_path: str = "~/Desktop") -> str:
    """Take a screenshot and save it under the specified directory."""
    directory = os.path.expanduser(save_path)
    os.makedirs(directory, exist_ok=True)
    filename = f"screenshot_{datetime.now().strftime('%Y%m%d_%H%M%S')}.png"
    filepath = os.path.join(directory, filename)

    # -x silences the capture sound; check=True raises if screencapture fails
    subprocess.run(["screencapture", "-x", filepath], check=True)
    return f"Screenshot saved to {filepath}"

2. Register Tool

# src/macbot/voice_assistant.py
from .tools.custom_tools import take_screenshot

class VoiceAssistant:
    def __init__(self):
        self.tools = {
            "take_screenshot": take_screenshot,
            # ... other tools
        }

3. Add to Configuration

# config/config.yaml
tools:
  enabled:
    - take_screenshot

  take_screenshot:
    save_path: "~/Desktop"

4. Update Prompts

# config/config.yaml
prompts:
  system: |
    You can use these tools:
    - take_screenshot: Take a screenshot
    # ... other tools

Performance Optimization

Profiling

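import cProfile
import pstats
from typing import Any, Callable


def profile_function(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Run func under cProfile and print the 10 most expensive calls."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        # Code to profile
        result = func(*args, **kwargs)
    finally:
        profiler.disable()

    stats = pstats.Stats(profiler)
    stats.sort_stats('cumulative').print_stats(10)
    return result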

Memory Optimization

  • Use generators for large data processing
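  • Release resources deterministically with context managers or explicit close() methods; __del__ is not guaranteed to run promptly, so use it only as a last resort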
  • Monitor memory usage with tracemalloc
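
A short sketch of two of these points: a generator that yields fixed-size chunks instead of building a full list, and a helper that reports peak allocation with tracemalloc. Both function names are hypothetical.

```python
import tracemalloc
from typing import Any, Callable, Iterator


def iter_chunks(data: bytes, size: int) -> Iterator[bytes]:
    """Yield data in slices of at most size bytes, one at a time."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def peak_memory_bytes(func: Callable[..., Any], *args: Any) -> int:
    """Run func and return the peak traced allocation in bytes."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak
```

Comparing peak_memory_bytes for a list-building version and a generator-based version of the same loop gives a quick indication of whether the change is worth it.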

CPU Optimization

  • Use multiprocessing for CPU-intensive tasks
  • Implement caching for expensive operations
  • Profile with line_profiler
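
Caching can often be added with functools.lru_cache from the standard library. The function below is a cheap, hypothetical stand-in for an expensive operation; the pattern is the same for anything that is pure and called repeatedly with the same arguments.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def normalize_command(text: str) -> str:
    """Lowercase and collapse whitespace; results are memoized per input."""
    return " ".join(text.lower().split())
```

normalize_command.cache_info() reports hits and misses, which helps confirm that the cache is actually being used before relying on it.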

Debugging

Debug Mode

# Enable debug logging
export DEBUG=1
python -m macbot.voice_assistant

# Or set the level in config/config.yaml
logging:
  level: DEBUG
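
How the DEBUG environment variable and the configured level interact is not spelled out here. The sketch below assumes the environment variable takes precedence; resolve_log_level is a hypothetical helper, not part of MacBot.

```python
import logging
import os
from typing import Mapping, Optional


def resolve_log_level(env: Optional[Mapping[str, str]] = None, config_level: str = "INFO") -> int:
    """Pick a logging level; assumes DEBUG=1 overrides the config value."""
    env = os.environ if env is None else env
    if env.get("DEBUG", "").strip().lower() in {"1", "true", "yes"}:
        return logging.DEBUG
    level = getattr(logging, config_level.upper(), None)
    # Fall back to INFO when the config holds an unknown level name.
    return level if isinstance(level, int) else logging.INFO
```

The result can be passed straight to logging.basicConfig(level=...).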

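Interactive Debugging

# Pause execution here and open an interactive pdb prompt in the terminal.
# breakpoint() is the built-in equivalent on Python 3.7+.
import pdb; pdb.set_trace()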

Logging Configuration

# config/config.yaml
logging:
  level: INFO
  format: '%(asctime)s - %(name)s - %(levelname)s - %(message)s'
  file: macbot.log

Contributing

Pull Request Process

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Update documentation
  6. Submit pull request

Commit Message Format

type(scope): description

[optional body]

[optional footer]

Types:

  • feat: New feature
  • fix: Bug fix
  • docs: Documentation
  • style: Code style changes
  • refactor: Code refactoring
  • test: Adding tests
  • chore: Maintenance
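
A commit header in this format can be checked with a regular expression, for example from a commit-msg hook. The sketch treats the scope as optional, since the example commit earlier in this guide ("feat: add your feature description") has none; the function name is hypothetical.

```python
import re

COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")

# type, optional (scope), then ": " and a non-empty description
_HEADER = re.compile(
    r"^(?P<type>%s)(\((?P<scope>[^()\s]+)\))?: (?P<description>\S.*)$"
    % "|".join(COMMIT_TYPES)
)


def is_valid_commit_header(message: str) -> bool:
    """Check only the first line; body and footer are free-form."""
    lines = message.splitlines()
    first_line = lines[0] if lines else ""
    return _HEADER.match(first_line) is not None
```

Only the header is validated here; rules for the body or footer would need to be added separately.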

Code Review Checklist

  • Code follows style guidelines
  • Tests are included
  • Documentation is updated
  • No breaking changes
  • Performance impact assessed

Deployment

Local Deployment

# Full deployment
./scripts/start_macbot.sh

# Individual services
make run-assistant
make run-llama
python -m macbot.web_dashboard

Production Considerations

  • Set up log rotation
  • Configure monitoring
  • Set appropriate resource limits
  • Implement backup strategies
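
Log rotation can be handled inside the application with the standard library's RotatingFileHandler. The size limit, backup count, and helper name below are illustrative assumptions; the format string is the one from the logging configuration above.

```python
import logging
from logging.handlers import RotatingFileHandler

LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"


def add_rotating_file_handler(
    logger: logging.Logger,
    path: str,
    max_bytes: int = 5_000_000,
    backup_count: int = 3,
) -> RotatingFileHandler:
    """Attach a size-based rotating file handler and return it."""
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.addHandler(handler)
    return handler
```

When the file reaches max_bytes it is renamed with a .1 suffix and a fresh file is started, so at most backup_count old files are kept.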

Security Considerations

Local Security

  • All services run locally, so no data is sent to external services
  • Microphone permissions required
  • File system access limited to user directories

Best Practices

  • Keep dependencies updated
  • Use virtual environments
  • Don't commit sensitive configuration
  • Regular security audits

Support

Getting Help

  • Check existing issues on GitHub
  • Review documentation
  • Join community discussions

Reporting Bugs

Include:

  • System information
  • Steps to reproduce
  • Expected vs actual behavior
  • Log excerpts
  • Configuration used

Local TTS/Audio Quick Checks

source macbot_env/bin/activate

# Speak via VA control API
curl -s -X POST http://localhost:8123/speak \
  -H 'Content-Type: application/json' \
  -d '{"text":"Hello, this is a test"}'

# List Piper voices and apply one
curl -s http://localhost:8123/voices | jq
curl -s -X POST http://localhost:8123/set-voice \
  -H 'Content-Type: application/json' \
  -d '{"voice_path":"piper_voices/en_US-lessac-medium/model.onnx"}'

# Choose output device
curl -s http://localhost:8123/devices | jq
curl -s -X POST http://localhost:8123/set-output \
  -H 'Content-Type: application/json' \
  -d '{"device":5}'
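
The same checks can be scripted from Python with the standard library. The endpoint paths and payloads are taken from the curl commands above; the helper names and the timeout are assumptions, and the response format is not specified here, so the raw bytes are returned.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8123"


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request for the voice assistant control API."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def speak(text: str, timeout: float = 5.0) -> bytes:
    """POST to /speak and return the raw response body."""
    request = build_post("/speak", {"text": text})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

speak("Hello, this is a test") mirrors the first curl command and requires the voice assistant to be running locally.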