Enrich library_artists.txt with WXYC cross-reference data #1
Merged
jakebromberg merged 4 commits into main on Feb 11, 2026
added 4 commits
February 10, 2026 17:23
Stage 1 filtering uses exact artist name matching, which misses releases credited under alternate names (e.g., "Body Count" filed under Ice-T).

Add scripts/enrich_library_artists.py to generate library_artists.txt from library.db and optionally enrich it with three WXYC MySQL sources:

- LIBRARY_RELEASE.ALTERNATE_ARTIST_NAME (~3,935 alternate names)
- LIBRARY_CODE_CROSS_REFERENCE (~189 artist-to-artist links)
- RELEASE_CROSS_REFERENCE (~29 artist-to-release collaboration links)

Integrate as optional step 2.5 in run_pipeline.py via --wxyc-db-url. Add ruff format check to CI and a pre-commit hook for ruff check+format.
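The enrichment step described above amounts to taking the union of the base artist names and the names pulled from the extra sources. A minimal sketch of that merge, working on in-memory lists rather than library.db or MySQL; the function name `merge_artist_names` and the case-insensitive de-duplication are assumptions for illustration, not the script's confirmed behavior:

```python
from collections.abc import Iterable


def merge_artist_names(base: Iterable[str], *extra: Iterable[str]) -> list[str]:
    """Union several sources of artist names into one sorted list.

    Hypothetical helper: names are compared case-insensitively, and the
    first spelling seen for a given name is the one that is kept.
    """
    seen: dict[str, str] = {}
    for source in (base, *extra):
        for raw in source:
            name = raw.strip()
            if not name:
                continue  # skip blank rows
            seen.setdefault(name.casefold(), name)
    return sorted(seen.values(), key=str.casefold)


# Base names plus one source of alternate names (illustrative data only).
library_names = ["Ice-T", "Stereolab"]
alternate_names = ["Body Count", "ice-t", "  "]
merged = merge_artist_names(library_names, alternate_names)
```

With the alternate names merged in, an exact-match filter over `merged` would also accept releases credited to "Body Count".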
Move __future__ imports after module docstrings to fix E402, sort imports to fix I001, unquote type annotations to fix UP037, and run ruff format. Set packages=[] in pyproject.toml so setuptools does not discover non-package directories (hooks, schema, migrations).
The @dataclass decorator needs cls.__module__ to resolve in sys.modules. Without registration, a module loaded via importlib has no sys.modules entry, and the lookup fails on Python 3.12+.
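A minimal sketch of the registration pattern this commit message describes, assuming the module is loaded from a file path with `importlib.util`. The helper name `load_module_from_path` and the example module are hypothetical, not taken from this repository:

```python
import importlib.util
import sys
import tempfile
from pathlib import Path
from types import ModuleType


def load_module_from_path(name: str, path: Path) -> ModuleType:
    """Load a Python file as a module and register it in sys.modules.

    Registration happens before the module body runs, so decorators such
    as @dataclass can look up cls.__module__ in sys.modules.
    """
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {name!r} from {path}")
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module  # register first, then execute
    try:
        spec.loader.exec_module(module)
    except BaseException:
        # Do not leave a half-initialized module behind on failure.
        sys.modules.pop(name, None)
        raise
    return module


# Demo: a throwaway module that defines a dataclass with string annotations.
SOURCE = '''"""Example module."""

from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Row:
    name: str
    count: int = 0
'''

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "example_hook.py"
    script.write_text(SOURCE)
    example = load_module_from_path("example_hook", script)
    row = example.Row(name="Body Count")
```

Assigning into `sys.modules` before `exec_module` follows the recipe in the importlib documentation for importing a source file directly.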