
Unified SDK + CLI + Asset Manager + Smoke Tests#22

Open
Copilot wants to merge 1 commit into master from copilot/fix-20

Conversation


Copilot AI commented Sep 24, 2025

Summary

Refactor the repo into a small, installable SDK with a single CLI, an asset manager for model files, and a smoke-test suite. Keep all current scripts working, but route them through the new common APIs to reduce duplication and make onboarding dead simple.


Why

  • Every classifier has its own entrypoint/IO. A common interface makes demos, docs, and CI far easier.
  • Model weights are scattered (LFS/Drive/manual). A manifest + downloader prevents broken setups.
  • Quick smoke tests in CI catch regressions (OpenCV/TF updates, path issues, missing assets).

Deliverables

  1. SDK layer (installable)

    • New package: ai_ml_classifiers/

    • Base protocol:

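      from __future__ import annotations  # lets "str | Path | int" parse on older Pythons
      from pathlib import Path
      from typing import Protocol

      class Classifier(Protocol):
          name: str
          tasks: tuple[str, ...]  # e.g. ("vehicle",)
          def load(self, device: str = "cpu") -> None: ...
          def predict(self, source: str | Path | int, **kwargs) -> "Prediction": ...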
    • Registry: from ai_ml_classifiers import get, list_tasks

    • Uniform Prediction dataclass: boxes/labels/scores or text for OCR/ASR, with timestamps for video.

  2. CLI (aimc)

    • aimc run <task> --source webcam|image|video --device cpu|cuda --top-k 5
    • aimc assets sync (download all needed weights)
    • aimc assets doctor (verify checksums and paths)
    • aimc list (available tasks)
  3. Asset Manager

    • assets/manifest.yaml for every weight/file: {id, task, filename, bytes, sha256, urls:[lfs,hf,gdrive]}
    • Downloader with progress + checksum + cache (~/.aimc/assets)
    • Graceful fallbacks (try next URL if one fails)
  4. Smoke tests (pytest)

    • Tiny inputs per task in assets/samples/
    • pytest -m smoke runs one frame/sample per classifier (skip if asset missing)
    • CI: Ubuntu job runs assets sync, then pytest -m smoke -q
  5. Docs

    • Top-level README: “Quickstart (CLI)”, “SDK usage”, “Assets”
    • Per-task mini READMEs become short pages under docs/ or sections in main README
    • Table mapping old scripts → new commands
  6. Back-compat

    • Keep legacy scripts (vehicle_detection.py, etc.) but refactor internals to call the SDK
    • Flask app imports SDK instead of duplicating logic
  7. Nice-to-have (optional if time)

    • --onnx path for 1–2 models to speed up CPU demos
    • --half (FP16) when CUDA is detected
    • Simple benchmark: aimc bench <task> --source image --repeat 50
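
The SDK layer in deliverable 1 could be sketched roughly as follows. This is an illustration under assumptions, not the implementation in this PR: only Classifier, Prediction, get, and list_tasks are named in the description, while the register decorator, the _REGISTRY dict, the Prediction field names, and the EchoClassifier stand-in are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import Path
from typing import Protocol


@dataclass
class Prediction:
    """Uniform result type; the field names are assumptions."""
    labels: list[str] = field(default_factory=list)
    scores: list[float] = field(default_factory=list)
    boxes: list[tuple[float, float, float, float]] = field(default_factory=list)
    text: str | None = None          # filled by OCR/ASR tasks
    timestamp: float | None = None   # seconds into a video, if any


class Classifier(Protocol):
    name: str
    tasks: tuple[str, ...]

    def load(self, device: str = "cpu") -> None: ...
    def predict(self, source: str | Path | int, **kwargs) -> Prediction: ...


_REGISTRY: dict[str, type] = {}  # hypothetical: task name -> classifier class


def register(cls):
    """Class decorator that indexes a classifier under each of its tasks."""
    for task in cls.tasks:
        _REGISTRY[task] = cls
    return cls


def list_tasks() -> list[str]:
    return sorted(_REGISTRY)


def get(task: str) -> Classifier:
    """Instantiate the classifier registered for `task`."""
    if task not in _REGISTRY:
        raise KeyError(f"unknown task {task!r}; known: {list_tasks()}")
    return _REGISTRY[task]()


@register
class EchoClassifier:
    """Stand-in adapter; a real one would wrap the existing task code."""
    name = "echo"
    tasks = ("vehicle",)

    def load(self, device: str = "cpu") -> None:
        self.device = device

    def predict(self, source, **kwargs) -> Prediction:
        # Echo the source back as the only label, with full confidence.
        return Prediction(labels=[str(source)], scores=[1.0])
```

With a registry like this, a legacy script would reduce to get("vehicle"), load(), and predict(source), which is the "thin wrapper" shape the back-compat deliverable asks for.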

Suggested file layout

ai-ml-classifiers/
  ai_ml_classifiers/
    __init__.py
    api.py               # registry, base types, Prediction
    assets.py            # downloader, checksums, cache
    utils/io.py          # image/video/webcam loaders
    tasks/
      vehicles.py
      faces.py
      mood.py
      flowers.py
      objects.py
      ocr.py
      animals.py
      speech.py
      sentiment.py
  cli/aimc.py            # click/typer CLI entrypoint
  assets/manifest.yaml
  tests/smoke/
    test_vehicles.py
    ...
  pyproject.toml         # package + [project.scripts] aimc = "cli.aimc:main"
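
For cli/aimc.py, the command surface from deliverable 2 might look like the sketch below. The description suggests click or typer; this version uses the standard library's argparse as a dependency-free stand-in. The default values and the printed placeholder messages are assumptions, and a real implementation would dispatch to the SDK and asset manager instead of printing.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="aimc")
    sub = parser.add_subparsers(dest="command", required=True)

    # aimc run <task> --source ... --device cpu|cuda --top-k N
    run = sub.add_parser("run", help="run one task on a source")
    run.add_argument("task")
    run.add_argument("--source", default="webcam")  # assumed default
    run.add_argument("--device", choices=["cpu", "cuda"], default="cpu")
    run.add_argument("--top-k", type=int, default=5)  # assumed default

    # aimc list
    sub.add_parser("list", help="show available tasks")

    # aimc assets sync|doctor
    assets = sub.add_parser("assets", help="manage model weights")
    assets.add_argument("action", choices=["sync", "doctor"])
    return parser


def main(argv=None) -> int:
    """Entry point the console script would call; returns an exit code."""
    args = build_parser().parse_args(argv)
    if args.command == "run":
        print(f"would run {args.task} on {args.source} "
              f"({args.device}, top-{args.top_k})")
    elif args.command == "list":
        print("would list registered tasks")
    else:
        print(f"would run assets {args.action}")
    return 0
```

Accepting argv as a parameter keeps main callable from tests without touching sys.argv.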

Acceptance criteria

  • pip install -e . exposes aimc in PATH.
  • aimc list shows at least the 9 current tasks.
  • aimc assets sync downloads required weights; doctor reports OK with checksums.
  • aimc run vehicle --source assets/samples/traffic.mp4 produces labeled frames.
  • pytest -m smoke passes locally and in GitHub Actions.
  • Legacy scripts continue to work (but now import from the SDK).
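
The "doctor reports OK with checksums" criterion could be checked along these lines. This is a sketch under assumptions: manifest entries are passed in as already-parsed dicts with the id, filename, bytes, and sha256 keys from the manifest description, and the status strings and function names are hypothetical.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weights need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def doctor(entries: list[dict], cache_dir: Path) -> dict[str, str]:
    """Return one status per manifest id, cheapest check first."""
    report = {}
    for entry in entries:
        path = cache_dir / entry["filename"]
        if not path.is_file():
            report[entry["id"]] = "missing"
        elif path.stat().st_size != entry["bytes"]:
            report[entry["id"]] = "size mismatch"
        elif sha256_of(path) != entry["sha256"]:
            report[entry["id"]] = "checksum mismatch"
        else:
            report[entry["id"]] = "ok"
    return report
```

Comparing the byte count before hashing avoids reading a multi-gigabyte file that is already known to be truncated.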

Task checklist

  • Create ai_ml_classifiers package, base Classifier + Prediction
  • Implement registry + per-task adapters (wrap existing code)
  • Implement assets/manifest.yaml + downloader with checksum verification
  • Add CLI (typer or click) with run/list/assets
  • Add sample inputs and tiny gold outputs for smoke tests
  • Wire GitHub Actions: setup-python, pip install -e ., aimc assets sync, pytest -m smoke
  • Update README (Quickstart, SDK, Assets, CI)
  • Refactor Flask app to call SDK (keep routes/UX unchanged)
  • Mark legacy scripts as “thin wrappers” (one-line import + call)
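
The "downloader with checksum verification" item, combined with the mirror fallback from deliverable 3, might be sketched as below. The names sync_asset and http_fetch are hypothetical, the progress bar is omitted, and each file is read fully into memory, which is a simplification that real multi-gigabyte weights would need to replace with streaming.

```python
from __future__ import annotations

import hashlib
import urllib.request
from pathlib import Path
from typing import Callable


def http_fetch(url: str) -> bytes:
    """Default fetcher; reads the whole response into memory."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def sync_asset(entry: dict, cache_dir: Path,
               fetch: Callable[[str], bytes] = http_fetch) -> Path:
    """Download one manifest entry, trying each mirror until the checksum matches."""
    target = cache_dir / entry["filename"]
    if target.is_file() and \
            hashlib.sha256(target.read_bytes()).hexdigest() == entry["sha256"]:
        return target  # cache hit, nothing to download

    errors = []
    for url in entry["urls"]:
        try:
            data = fetch(url)
        except OSError as exc:  # urllib's URLError derives from OSError
            errors.append(f"{url}: {exc}")
            continue
        if hashlib.sha256(data).hexdigest() != entry["sha256"]:
            errors.append(f"{url}: checksum mismatch")
            continue
        cache_dir.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        return target
    raise RuntimeError(
        f"all mirrors failed for {entry['id']}: " + "; ".join(errors)
    )
```

Passing the fetcher as a parameter lets smoke tests exercise the fallback order without network access; a caller would typically pass Path.home() / ".aimc" / "assets" as cache_dir.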

Notes / risks

  • Large weights: keep LFS references, but manifest should include multiple mirrors.
  • Platform deps: gate PyAudio/Tesseract tests with markers & skips; document install hints per OS.
  • CUDA optional: auto-detect, but default to CPU; no hard runtime dependency on CUDA.
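
The "CUDA optional" note could translate into a small helper like the one below. It assumes PyTorch is the framework that answers the CUDA question, which the description does not state (the repo also mentions TensorFlow), and the "auto" value and the resolve_device name are hypothetical.

```python
def resolve_device(requested: str = "cpu") -> str:
    """Map 'cpu' | 'cuda' | 'auto' to a usable device without requiring CUDA."""
    if requested not in ("cpu", "cuda", "auto"):
        raise ValueError(f"unsupported device {requested!r}")
    if requested == "cpu":
        return "cpu"
    try:
        import torch  # optional dependency; its absence simply means CPU
        has_cuda = torch.cuda.is_available()
    except ImportError:
        has_cuda = False
    # Fall back to CPU rather than failing when CUDA was requested but is absent.
    return "cuda" if has_cuda else "cpu"
```

Importing torch inside the function keeps the package importable on machines that have no GPU stack installed.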

Fixes #20.

Author

Copilot AI commented Sep 24, 2025

@hoangsonww 👋 This repository doesn't have Copilot instructions. With Copilot instructions, I can understand the repository better, work faster and produce higher quality PRs.

I can generate a .github/copilot-instructions.md file for you automatically. Click here to open a pre-filled issue and assign it to me. I'll write the instructions, and then tag you for review.


netlify bot commented Sep 24, 2025

Deploy Preview for ai-classifiers failed.

Name Link
🔨 Latest commit d26a1f8
🔍 Latest deploy log https://app.netlify.com/projects/ai-classifiers/deploys/68d34bd91b647b0008f88a87


Copilot AI left a comment


Copilot wasn't able to review any files in this pull request.



@hoangsonww added the documentation, enhancement, good first issue, and dependencies labels Sep 24, 2025
Copilot AI requested a review from hoangsonww September 24, 2025 01:40
@hoangsonww marked this pull request as ready for review September 24, 2025 01:42
@hoangsonww changed the title from "[WIP] Unified SDK + CLI + Asset Manager + Smoke Tests" to "Unified SDK + CLI + Asset Manager + Smoke Tests" Sep 24, 2025
@hoangsonww moved this from Ready to Backlog in AI/ML Classifiers Kanban Board Feb 5, 2026
@hoangsonww moved this from Backlog to Ready in AI/ML Classifiers Kanban Board Feb 5, 2026
@hoangsonww moved this from Ready to In progress in AI/ML Classifiers Kanban Board Apr 13, 2026
@hoangsonww moved this from In progress to Ready in AI/ML Classifiers Kanban Board Apr 13, 2026

Labels

dependencies, documentation, enhancement, good first issue

Projects

Status: Ready

Development

Successfully merging this pull request may close these issues.

Unified SDK + CLI + Asset Manager + Smoke Tests

3 participants