Add examples, tests, documentation, and CI/CD pipeline #1

Merged
craftsangjae merged 10 commits into main from feat/add-examples on Feb 16, 2026
@craftsangjae
Owner

Add examples, tests, documentation, and CI/CD pipeline

Summary

This PR adds comprehensive examples, real-world testing, documentation, and CI/CD infrastructure to the muvera-python project.

🎯 Major Changes

1. Examples & Benchmarks

  • examples/basic_usage.py: Demonstrates core API with random data and Recall@N evaluation
  • examples/colbert_nanobeir.py: Full ColBERT + NanoBEIR benchmark comparing native MaxSim vs FDE
    • Fixed critical qrels loading bug: a dict comprehension overwrote earlier judgments for queries with several relevant documents, dropping 59.3% of the data
    • Improved Recall@10: 0.52 → 0.72 (native), 0.32 → 0.50 (FDE)
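For readers unfamiliar with the "native MaxSim" baseline that the benchmark compares against FDE, here is a minimal sketch of late-interaction scoring. It assumes the usual ColBERT-style definition (for each query token, take the best dot product against any document token, then sum); `maxsim_score` is a hypothetical helper name, not a function from this repository.

```python
import numpy as np


def maxsim_score(query: np.ndarray, doc: np.ndarray) -> float:
    """Sum over query tokens of the best dot product with any document token.

    query has shape (Nq, D) and doc has shape (Nd, D).
    Hypothetical helper for illustration only.
    """
    sim = query @ doc.T                  # (Nq, Nd) token-to-token similarities
    return float(sim.max(axis=1).sum())  # best document token per query token


# Random multi-vector data with variable-length documents (illustrative only).
rng = np.random.default_rng(0)
query = rng.standard_normal((4, 8))
docs = [rng.standard_normal((n, 8)) for n in (5, 9, 3)]

scores = [maxsim_score(query, d) for d in docs]
ranking = np.argsort(scores)[::-1]       # document indices, best score first
```

Scoring every document this way costs one matrix product per query-document pair, which is the cost a single fixed-dimensional vector per document is meant to avoid.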

2. Real-World Testing

  • tests/test_real_colbert.py: 6 tests using actual ColBERT embeddings from NanoBEIR
    • Validates FDE performance on real data (35 docs, 5 queries)
    • Tests correlation between FDE and native MaxSim scores (>0.7), Recall@K, and ranking quality
    • Cached fixtures (~2.2MB) for fast CI execution
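The two metrics named above can be sketched as follows. This is an illustration under assumptions, not the code in tests/test_real_colbert.py: Recall@K is taken here as the fraction of a query's relevant documents found in the top K results, and correlation as the Pearson coefficient between two score lists. `recall_at_k` and `pearson` are hypothetical names.

```python
import numpy as np


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant documents that appear among the first k ranked ids."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = len(set(ranked_ids[:k]) & relevant)
    return hits / len(relevant)


def pearson(a, b):
    """Pearson correlation between two equally long score sequences."""
    return float(np.corrcoef(a, b)[0, 1])


# Toy ranking: two relevant documents, one of which is in the top 2.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
r2 = recall_at_k(ranked, relevant, k=2)
r4 = recall_at_k(ranked, relevant, k=4)

# Perfectly linearly related scores correlate at 1.0 (up to rounding).
corr = pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

A threshold check such as `corr > 0.7` then compares approximate scores against native MaxSim scores for the same query-document pairs.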

3. Code Refactoring

  • Simplified API: Removed uniform batch support (3D arrays)
    • Single: np.ndarray (N, D)
    • Batch: list[np.ndarray] (variable-length, recommended)
  • Improved structure: Split large methods into focused helpers (<20 lines each)
    • _compute_sketches() - SimHash projection
    • _compute_projection() - Identity or AMS sketch
    • _aggregate_single/batch() - Partition aggregation
  • 26% smaller: 560 → 416 lines
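To make the helper roles above more concrete, here is a rough sketch of SimHash partitioning followed by per-partition aggregation. It is not the refactored code from this PR: the function names are hypothetical, bucket ids use plain binary encoding of the sign bits, and token vectors are summed per bucket, all of which are simplifying assumptions.

```python
import numpy as np


def simhash_partition_ids(vectors: np.ndarray, hyperplanes: np.ndarray) -> np.ndarray:
    """Map each vector to a bucket id from the signs of its random projections."""
    bits = (vectors @ hyperplanes) > 0               # (N, k) sign bits
    weights = 1 << np.arange(hyperplanes.shape[1])   # 1, 2, 4, ...
    return bits.astype(np.int64) @ weights           # (N,) ids in [0, 2**k)


def aggregate_by_partition(vectors: np.ndarray, ids: np.ndarray, num_partitions: int) -> np.ndarray:
    """Sum the vectors that fall into each bucket (assumed aggregation rule)."""
    out = np.zeros((num_partitions, vectors.shape[1]))
    np.add.at(out, ids, vectors)                     # unbuffered add handles repeated ids
    return out


rng = np.random.default_rng(42)                      # fixed seed keeps the result deterministic
k, dim = 3, 8
hyperplanes = rng.standard_normal((dim, k))
tokens = rng.standard_normal((10, dim))              # one multi-vector input, shape (N, D)

ids = simhash_partition_ids(tokens, hyperplanes)
blocks = aggregate_by_partition(tokens, ids, 2**k)   # (8, 8): one row per bucket
fde_like = blocks.reshape(-1)                        # flattened fixed-size vector, length 64
```

Because the output length depends only on `k` and `dim`, inputs with different token counts map to vectors of the same size, which is why a variable-length `list[np.ndarray]` batch is a natural input type.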

4. Documentation

  • CLAUDE.md: Comprehensive guide for Claude Code
    • Development commands (setup, testing, deployment)
    • Architecture overview (algorithm flow, batch processing)
    • Code conventions and testing strategy
    • CI/CD and deployment policy

5. CI/CD Pipeline

  • GitHub Actions:
    • .github/workflows/test.yml: Test on Python 3.9-3.13, run ruff/mypy/pytest
    • .github/workflows/publish.yml: Tag-based PyPI publishing with OIDC
  • Pre-commit hooks:
    • ruff (lint + format)
    • mypy (type checking)
    • pytest (all tests)

6. Comprehensive Testing

  • 70 tests total (was 64):
    • test_helper.py: Low-level utilities (24 tests)
    • test_muvera.py: Core class functionality (25 tests)
    • test_reference.py: Validation vs reference impl (15 tests)
    • test_real_colbert.py: Real-world ColBERT data (6 tests) ✨ NEW

📊 Test Results

pytest tests/ -q
# 70 passed in 2.86s ✓

Real ColBERT Test Results:

  • Fixture loading: ✓
  • FDE encoding shapes: ✓
  • Correlation with native MaxSim: >0.7 ✓
  • Recall@K metrics: ✓
  • Ranking quality: ✓
  • Determinism: ✓

🐛 Bug Fixes

Critical: ColBERT NanoBEIR qrels loading (examples/colbert_nanobeir.py:154)

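# Before (59.3% data loss!): each row replaces the inner dict for its query,
# so only the last judgment per query survives
qrels = {str(row["query-id"]): {str(row["corpus-id"]): 1} for row in qrels_ds}
# → Only 50 relevance judgments loaded

# After (correct): every judgment is merged into the query's inner dict
from collections import defaultdict

qrels = defaultdict(dict)
for row in qrels_ds:
    qrels[str(row["query-id"])][str(row["corpus-id"])] = 1
# → All 123 relevance judgments loaded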

Impact: Native Recall@10 improved from 0.52 → 0.72, FDE from 0.32 → 0.50
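The overwrite behind this bug can be reproduced on a toy input. The rows below are made up for illustration and only mimic the `query-id` / `corpus-id` fields used in the snippet above; `count_judgments` is a hypothetical helper.

```python
from collections import defaultdict

# Made-up qrels rows: query q1 has two relevant documents.
rows = [
    {"query-id": "q1", "corpus-id": "d1"},
    {"query-id": "q1", "corpus-id": "d2"},
    {"query-id": "q2", "corpus-id": "d3"},
]

# Dict comprehension: a repeated key replaces the whole inner dict.
overwritten = {str(r["query-id"]): {str(r["corpus-id"]): 1} for r in rows}

# defaultdict: each row adds one entry to the query's inner dict.
merged = defaultdict(dict)
for r in rows:
    merged[str(r["query-id"])][str(r["corpus-id"])] = 1


def count_judgments(qrels):
    """Total number of (query, document) relevance judgments."""
    return sum(len(docs) for docs in qrels.values())


print(count_judgments(overwritten))  # 2: q1 kept only d2
print(count_judgments(merged))       # 3: all judgments kept
```

With the overwritten version, a retrieved document that is relevant but was dropped from qrels counts as a miss, which is consistent with the lower Recall@10 reported before the fix.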

📝 Files Changed

  • Added: 13 new files (examples, tests, workflows, docs, fixtures)
  • Modified: 7 files (muvera.py refactored, configs updated)
  • Total: +2,166 / -274 lines

🚀 Deployment

Ready for PyPI release via tag-based workflow:

git tag v0.1.0
git push origin v0.1.0
# → Automatically tests, builds, and publishes to PyPI

✅ Checklist

  • All tests pass (70/70)
  • Code formatted (ruff)
  • Type checked (mypy)
  • Pre-commit hooks configured
  • Documentation complete
  • Examples working
  • CI/CD tested

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Signed-off-by: craftsangjae <craftsangjae@gmail.com>
@craftsangjae craftsangjae merged commit 0fe67a3 into main Feb 16, 2026
6 checks passed
@craftsangjae craftsangjae deleted the feat/add-examples branch February 16, 2026 01:27