Complete developer guide for contributing to Discogsography
This guide covers the development workflow, tools, and best practices for working on Discogsography. Whether you're fixing bugs, adding features, or improving performance, this guide will help you get started.
Discogsography leverages cutting-edge Python tooling for maximum developer productivity and code quality.
| Tool | Purpose | Configuration |
|---|---|---|
| uv | 10-100x faster package management | pyproject.toml |
| ruff | Lightning-fast linting & formatting | pyproject.toml |
| mypy | Strict static type checking | pyproject.toml |
| bandit | Security vulnerability scanning | pyproject.toml |
| pre-commit | Git hooks for code quality | .pre-commit-config.yaml |
| just | Task runner (like make, but better) | justfile |
- **uv**: 10-100x faster than pip, with better dependency resolution
- **ruff**: Replaces flake8, isort, pyupgrade, and more, all in one fast tool
- **mypy**: Catch type errors before runtime
- **bandit**: Find security vulnerabilities automatically
- **pre-commit**: Ensure code quality before every commit
- **just**: Simple, cross-platform task automation
```
discogsography/
├── api/                     # User auth, graph queries, OAuth, sync
│   ├── api.py               # FastAPI application entry point
│   ├── auth.py              # JWT helpers and OAuth token encryption
│   ├── limiter.py           # Shared slowapi rate-limiter instance
│   ├── setup.py             # discogs-setup CLI tool
│   ├── routers/             # FastAPI routers (auth, explore, sync, user, snapshot, oauth, nlq, rarity, etc.)
│   ├── README.md
│   └── __init__.py
├── brainzgraphinator/       # MusicBrainz Neo4j enrichment service
│   ├── brainzgraphinator.py # Enriches Neo4j nodes with MusicBrainz metadata
│   ├── README.md
│   └── __init__.py
├── brainztableinator/       # MusicBrainz PostgreSQL storage service
│   ├── brainztableinator.py # Stores MusicBrainz data in PostgreSQL
│   ├── README.md
│   └── __init__.py
├── common/                  # Shared utilities and configuration
│   ├── config.py            # Centralized configuration management
│   ├── health_server.py     # Health check endpoint server
│   └── __init__.py
├── dashboard/               # Real-time monitoring dashboard
│   ├── dashboard.py         # FastAPI backend with WebSocket
│   ├── admin_proxy.py       # Admin panel proxy to API service
│   ├── tailwind.config.js   # Tailwind CLI configuration (content paths, plugins)
│   ├── tailwind.input.css   # Tailwind source directives (@tailwind base/…)
│   ├── static/              # Frontend HTML/CSS/JS (Tailwind, SVG gauges)
│   │   ├── index.html
│   │   ├── tailwind.css     # Generated at Docker build time by css-builder stage
│   │   ├── styles.css
│   │   └── dashboard.js
│   ├── README.md
│   └── __init__.py
├── extractor/               # Rust-based high-performance extractor
│   ├── src/
│   │   └── main.rs          # Rust processing logic
│   ├── benches/             # Rust benchmarks
│   ├── tests/               # Rust unit tests
│   ├── Cargo.toml           # Rust dependencies
│   └── README.md
├── explore/                 # Static frontend for graph exploration UI
│   ├── explore.py           # FastAPI static file server (health check only)
│   ├── tailwind.config.js   # Tailwind CLI configuration (content paths, plugins)
│   ├── tailwind.input.css   # Tailwind source directives (@tailwind base/…)
│   ├── static/              # Frontend HTML/CSS/JS (Tailwind, Alpine.js, D3.js, Plotly.js)
│   │   ├── index.html
│   │   ├── tailwind.css     # Generated at Docker build time by css-builder stage
│   │   ├── css/styles.css
│   │   └── js/              # Modular JS (app, graph, trends, auth, etc.)
│   ├── README.md
│   └── __init__.py
├── graphinator/             # Neo4j graph database service
│   ├── graphinator.py       # Graph relationship builder
│   ├── README.md
│   └── __init__.py
├── insights/                # Precomputed analytics and music trends
│   ├── insights.py          # Insights service entry point (scheduler + endpoints)
│   ├── computations.py      # Computation orchestration (fetches from API over HTTP)
│   ├── cache.py             # Redis cache-aside layer
│   ├── models.py            # Pydantic response models
│   ├── README.md
│   └── __init__.py
├── mcp-server/              # AI assistant MCP server
│   ├── server.py            # FastMCP server exposing knowledge graph
│   ├── README.md
│   └── __init__.py
├── schema-init/             # One-shot database schema initializer
│   ├── schema_init.py       # Entry point: creates Neo4j + PostgreSQL schema
│   ├── neo4j_schema.py      # Neo4j constraints and indexes
│   ├── postgres_schema.py   # PostgreSQL tables and indexes
│   ├── Dockerfile
│   └── __init__.py
├── tableinator/             # PostgreSQL storage service
│   ├── tableinator.py       # Relational data management
│   ├── README.md
│   └── __init__.py
├── utilities/               # Operational tools
│   ├── check_errors.py      # Log analysis
│   ├── monitor_queues.py    # Real-time queue monitoring
│   ├── system_monitor.py    # System health dashboard
│   └── __init__.py
├── tests/                   # Comprehensive test suite
│   ├── api/                 # API service tests
│   ├── brainzgraphinator/   # Brainzgraphinator tests
│   ├── brainztableinator/   # Brainztableinator tests
│   ├── common/              # Common module tests
│   ├── dashboard/           # Dashboard tests (including E2E)
│   ├── explore/             # Explore service tests
│   ├── graphinator/         # Graphinator tests
│   ├── insights/            # Insights service tests
│   ├── mcp-server/          # MCP server tests
│   ├── schema-init/         # Schema initializer tests
│   └── tableinator/         # Tableinator tests
├── docs/                    # Documentation
├── scripts/                 # Utility scripts
│   ├── update-project.sh    # Dependency upgrade automation
│   └── README.md
├── docker-compose.yml       # Container orchestration
├── .env.example             # Environment variable template
├── .pre-commit-config.yaml  # Pre-commit hooks configuration
├── justfile                 # Task automation
├── pyproject.toml           # Project configuration (root)
└── README.md                # Project overview
```
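Every Python service above wires in the health check server from `common/health_server.py`. As a rough, stdlib-only sketch of the idea (names, port, and response shape here are illustrative; the real implementation differs):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Minimal /health endpoint; the project's actual server lives in common/health_server.py."""

    def do_GET(self) -> None:
        if self.path == "/health":
            body = json.dumps({"status": "healthy"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args) -> None:
        # Keep request logging quiet so it doesn't pollute service logs
        pass


def start_health_server(port: int = 8001) -> HTTPServer:
    """Run the health endpoint on a daemon thread so it never blocks the service."""
    server = HTTPServer(("127.0.0.1", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Running the server on a daemon thread means orchestrators (Docker, Kubernetes) can poll `/health` without the service's main loop being involved.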
```bash
# Install uv (package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install just (task runner)
brew install just  # macOS
# or: cargo install just
# or: https://just.systems/install.sh

# Verify installations
uv --version
just --version
```

```bash
# Clone repository
git clone https://github.com/SimplicityGuy/discogsography.git
cd discogsography

# Install all dependencies (including dev dependencies)
just install

# Or using uv directly
uv sync --all-extras
```

```bash
# Install pre-commit hooks
just init

# Or using uv directly
uv run pre-commit install
```

```bash
# Start databases and message queue
docker-compose up -d neo4j postgres rabbitmq redis

# Verify they're running
docker-compose ps
```

```bash
# Copy example environment file
cp .env.example .env

# Edit for local development (or use defaults)
nano .env
```

See the Configuration Guide for all environment variables.
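As an illustrative sketch of the centralized configuration that `common/config.py` provides (the variable names and defaults below are hypothetical; check `.env.example` for the real ones), settings can be modeled as a frozen dataclass built from the environment:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceConfig:
    """Illustrative env-driven config; the real loader lives in common/config.py."""

    amqp_connection: str
    neo4j_address: str
    log_level: str

    @classmethod
    def from_env(cls) -> "ServiceConfig":
        # Fall back to local-development defaults when a variable is unset
        return cls(
            amqp_connection=os.environ.get("AMQP_CONNECTION", "amqp://guest:guest@localhost:5672/"),
            neo4j_address=os.environ.get("NEO4J_ADDRESS", "bolt://localhost:7687"),
            log_level=os.environ.get("LOG_LEVEL", "INFO"),
        )
```

A frozen dataclass keeps configuration immutable after startup, so a service can't accidentally mutate shared settings at runtime.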
```bash
# Dashboard (monitoring UI)
just dashboard

# Explore (graph exploration & trends)
just explore

# Extractor (Rust-based data ingestion - requires cargo)
just extractor-run

# Graphinator (Neo4j builder)
just graphinator

# Insights (precomputed analytics & trends)
just insights

# Tableinator (PostgreSQL builder)
just tableinator

# Brainzgraphinator (MusicBrainz → Neo4j enrichment)
just brainzgraphinator

# Brainztableinator (MusicBrainz → PostgreSQL)
just brainztableinator

# MCP Server (AI assistant integration)
just mcp-server
```

```bash
# Run all quality checks
just lint         # Linting with ruff
just format       # Code formatting with ruff
just lint-python  # Linting with ruff + type checking with mypy
just security     # Security scan with bandit

# Or run everything at once
uv run pre-commit run --all-files
```
1. Create a branch:

   ```bash
   git checkout -b feature/my-feature
   ```

1. Make your changes:

   - Follow coding standards (see below)
   - Add type hints
   - Write docstrings
   - Update tests

1. Test your changes:

   ```bash
   just test      # Run tests
   just test-cov  # With coverage
   ```

1. Check code quality:

   ```bash
   just lint
   just format
   just typecheck
   just security
   ```

1. Commit changes:

   ```bash
   git add .
   git commit -m "feat: add amazing feature"
   # Pre-commit hooks will run automatically
   ```

1. Push and create PR:

   ```bash
   git push origin feature/my-feature
   # Create pull request on GitHub
   ```
```
tests/
├── api/                  # API service tests (auth, routers, queries)
├── brainzgraphinator/    # Brainzgraphinator tests
├── brainztableinator/    # Brainztableinator tests
├── common/               # Common module tests
├── dashboard/            # Dashboard tests
│   └── test_dashboard_ui.py  # E2E tests with Playwright
├── explore/              # Explore service tests
├── graphinator/          # Graphinator tests
├── insights/             # Insights service tests
├── mcp-server/           # MCP server tests
├── schema-init/          # Schema initializer tests
└── tableinator/          # Tableinator tests
```
Tests run in parallel by default using pytest-xdist (`-n auto --dist loadfile` is set in `pyproject.toml`). This reduces the full suite from ~15 minutes to ~5 minutes.
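Because `--dist loadfile` assigns every test in a file to the same worker, module-scoped fixtures stay safe under parallel runs. A small illustrative test module (names are hypothetical) showing the pattern:

```python
import pytest


# Under pytest-xdist with --dist loadfile, all tests in this file run on the
# same worker, so this module-scoped fixture is created once per file.
@pytest.fixture(scope="module")
def fake_connection() -> dict:
    # Hypothetical stand-in for an expensive shared resource (e.g. a DB session)
    return {"connected": True}


def test_connection_is_open(fake_connection: dict) -> None:
    assert fake_connection["connected"]


def test_connection_is_reused(fake_connection: dict) -> None:
    # Same worker, same fixture instance as the test above
    assert fake_connection["connected"]
```

With the default `--dist load`, individual tests could land on different workers and the fixture would be rebuilt per worker; `loadfile` avoids that surprise.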
```bash
# All tests (excluding E2E): runs in parallel automatically
just test

# With coverage report (parallel)
just test-cov

# Specific test file
uv run pytest tests/api/test_neo4j_queries.py

# Specific test function
uv run pytest tests/api/test_neo4j_queries.py::test_search_artists

# Sequential execution (for debugging, shows cleaner output)
uv run pytest -n 0 -s

# With verbose output
uv run pytest -v
```

```bash
# One-time setup
uv run playwright install chromium
uv run playwright install-deps chromium

# Run E2E tests
just test-e2e

# Or directly
uv run pytest tests/dashboard/test_dashboard_ui.py -m e2e

# With specific browser
uv run pytest tests/dashboard/test_dashboard_ui.py -m e2e --browser firefox

# Run in headed mode (see browser)
uv run pytest tests/dashboard/test_dashboard_ui.py -m e2e --headed
```

The Explore frontend's modular JavaScript files are tested using Vitest:

```bash
# Install JS dependencies (one-time)
just install-js

# Run JS tests
just test-js

# Run JS tests with coverage
just test-js-cov
```

JavaScript tests are also included in the CI pipeline (test.yml) and in `just test-parallel`.

See the Testing Guide for comprehensive testing documentation.
Follow PEP 8 with these tools:
- **ruff**: Linting and formatting (replaces flake8, isort, pyupgrade, and black; 150-character line length)
- **mypy**: Static type checking

```bash
# Auto-format code
just format

# Check for issues
just lint

# Type check
just typecheck
```

Always use type hints for function parameters and return values:
```python
# ✅ Good
def process_artist(artist_id: str, data: dict) -> bool:
    """Process artist data."""
    ...


# ❌ Bad
def process_artist(artist_id, data):
    ...
```

Write docstrings for all public functions and classes:
```python
def calculate_similarity(artist1: str, artist2: str) -> float:
    """Calculate similarity score between two artists.

    Args:
        artist1: Name of first artist
        artist2: Name of second artist

    Returns:
        Similarity score between 0.0 and 1.0

    Raises:
        ValueError: If artist names are empty
    """
    ...
```

Use emoji-prefixed logging for consistency (with structlog; see the Logging Guide):
```python
import structlog

logger = structlog.get_logger(__name__)

# Startup
logger.info("🚀 Starting service...")

# Success
logger.info("✅ Operation completed successfully")

# Error
logger.error("❌ Failed to connect to database")

# Warning
logger.warning("⚠️ Connection timeout, retrying...")

# Progress
logger.info("📊 Processed 1000 records")
```

See the Logging Guide and Emoji Guide for complete logging standards.
Always handle errors gracefully:
```python
# ✅ Good
try:
    result = perform_operation()
except ValueError as e:
    logger.error(f"❌ Invalid value: {e}")
    raise
except ConnectionError as e:
    logger.warning(f"⚠️ Connection failed: {e}, retrying...")
    retry_operation()


# ❌ Bad
try:
    result = perform_operation()
except:  # Don't use bare except
    pass  # Don't silently ignore errors
```

Never log sensitive data:
```python
# ✅ Good
logger.info(f"🔌 Connecting to database at {host}")

# ❌ Bad
logger.info(f"🔌 Connecting with password: {password}")
```

Use parameterized queries:
```python
# ✅ Good
cursor.execute(
    "SELECT * FROM artists WHERE name = %s",
    (artist_name,)
)

# ❌ Bad (SQL injection vulnerability)
cursor.execute(f"SELECT * FROM artists WHERE name = '{artist_name}'")
```

```bash
# Set environment variable
export LOG_LEVEL=DEBUG

# Run service
uv run python dashboard/dashboard.py

# Or with Docker
LOG_LEVEL=DEBUG docker-compose up
```

```python
# Add breakpoint
import pdb; pdb.set_trace()

# Or Python 3.7+
breakpoint()
```

Create `.vscode/launch.json`:
```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Python: Dashboard",
      "type": "python",
      "request": "launch",
      "program": "${workspaceFolder}/dashboard/dashboard.py",
      "console": "integratedTerminal",
      "env": {
        "LOG_LEVEL": "DEBUG"
      }
    }
  ]
}
```

```python
import cProfile
import pstats

# Profile function
profiler = cProfile.Profile()
profiler.enable()

# Your code here
process_data()

profiler.disable()
stats = pstats.Stats(profiler)
stats.sort_stats('cumulative')
stats.print_stats(20)  # Top 20 functions
```

```bash
# Install memory profiler
uv add --dev memory-profiler

# Run with profiler
uv run python -m memory_profiler script.py
```

```bash
# Scan for vulnerabilities
just security

# Or directly
uv run bandit -r . -ll
```

```bash
# Check for known vulnerabilities
uv run pip-audit

# Update dependencies
./scripts/update-project.sh
```

- Use Markdown for all documentation
- Follow the Emoji Guide for consistency
- Add Mermaid diagrams where helpful
- Include code examples
- Keep documentation up-to-date
```bash
# Add new documentation to docs/
# Update docs/README.md
# Link from main README.md
```

- Write tests first (TDD when possible)
- Keep functions small and focused
- Use descriptive variable names
- Avoid magic numbers - use constants
- Handle errors explicitly
- Log important events
- Document complex logic
- Optimize only when needed (measure first)
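To illustrate a couple of these points (named constants instead of magic numbers, explicit error handling), a small hypothetical helper:

```python
# Named constants instead of magic numbers (the values here are illustrative)
MAX_RETRIES = 3
BATCH_SIZE = 500


def chunk_records(records: list[dict], batch_size: int = BATCH_SIZE) -> list[list[dict]]:
    """Split records into batches, failing loudly on a nonsensical batch size."""
    if batch_size <= 0:
        raise ValueError(f"batch_size must be positive, got {batch_size}")
    return [records[i : i + batch_size] for i in range(0, len(records), batch_size)]
```

A reader of `chunk_records(rows)` immediately sees where the batch size comes from, and a bad argument raises a clear `ValueError` instead of silently misbehaving.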
Use Conventional Commits:
```
feat: add new feature
fix: correct bug
docs: update documentation
style: format code
refactor: restructure code
test: add tests
chore: update dependencies
```
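These prefixes are easy to check mechanically. A hypothetical validator sketch (the project's pre-commit hooks may enforce this differently, if at all):

```python
import re

# The allowed types mirror the list above; an optional scope like "(api)" and a
# breaking-change "!" are part of the Conventional Commits spec.
COMMIT_PATTERN = re.compile(
    r"^(feat|fix|docs|style|refactor|test|chore)(\([\w-]+\))?!?: .+"
)


def is_conventional(message: str) -> bool:
    """Check whether a commit message's first line follows Conventional Commits."""
    return COMMIT_PATTERN.match(message.splitlines()[0]) is not None
```

Such a check could run as a `commit-msg` hook to reject non-conforming messages before they reach CI.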
- Code follows style guide
- Tests are included and pass
- Type hints are complete
- Documentation is updated
- No security vulnerabilities
- Performance is acceptable
- Error handling is robust
- Logging is appropriate
The project uses GitHub Actions for CI/CD:
- Build: Verify Docker images build correctly
- Code Quality: Run linters and type checkers
- Tests: Run unit and integration tests
- E2E Tests: Run Playwright tests
- Security: Scan for vulnerabilities
See GitHub Actions Guide for details.
Local checks before commit:
# .pre-commit-config.yaml
repos:
- repo: https://github.com/astral-sh/ruff-pre-commit
hooks:
- id: ruff
args: [--fix]
- id: ruff-format
- repo: https://github.com/pre-commit/mirrors-mypy
hooks:
- id: mypy# Clear cache
uv cache clean
# Reinstall dependencies
rm -rf .venv
uv sync --all-extras# Update pre-commit
uv run pre-commit autoupdate
# Re-install hooks
uv run pre-commit install --install-hooks# Run single test with verbose output
uv run pytest tests/path/to/test.py::test_name -vv
# Show stdout
uv run pytest -s
# Debug with pdb
uv run pytest --pdb- Testing Guide - Comprehensive testing documentation
- Contributing Guide - How to contribute
- GitHub Actions Guide - CI/CD workflows
- Logging Guide - Logging standards
- Emoji Guide - Emoji conventions
Last Updated: 2026-03-27