29 changes: 25 additions & 4 deletions AGENTS.md
@@ -60,6 +60,13 @@ uv run pytest tests/integration/
| `--version` | | Show version and exit | |
| `--help` | | Show help message and exit | |

### Publish Command

| Option | Alias | Description | Default |
|--------|-------|-------------|---------|
| `--path` | `-p` | Path to iTunes library containing `.minkdb/album.json` | Required |
| `--url` | | Lidarr server URL | `http://localhost:8686` |

### Examples

```bash
@@ -77,20 +84,34 @@ uv run python -m minkdb --path "M:\Music\iTunes" -o ids.json

# Retry matching for previously unmatched albums
uv run python -m minkdb --path "M:\Music\iTunes" --rematch

# Publish curated exact albums to Lidarr
LIDARR_API_KEY="<your-api-key>" uv run python -m minkdb publish --path "M:\Music\iTunes"

# Publish to a non-default Lidarr URL
LIDARR_API_KEY="<your-api-key>" uv run python -m minkdb publish --path "M:\Music\iTunes" --url "http://lidarr.local:8686"
```

### Output

The CLI outputs a JSON array of matched MusicBrainz IDs:
The CLI outputs a JSON array of unique matched MusicBrainz artist IDs:

```json
[
{"MusicBrainzId": "41656317-c512-456f-9fe7-1f7fb8482a34"},
{"MusicBrainzId": "8ccd44fb-1c4a-4c5f-98b5-cf3b35a2aa5c"}
{"MusicBrainzArtistId": "11111111-1111-1111-1111-111111111111"},
{"MusicBrainzArtistId": "22222222-2222-2222-2222-222222222222"}
]
```

### Data Storage

- **User settings**: `~/.minkdb/settings.json`
- **Library catalog**: `<library_path>/.minkdb/catalog.json` (append-only)
- **Album database**: `<library_path>/.minkdb/album.json`
- **Artist database**: `<library_path>/.minkdb/artist.json` (unique by artist MusicBrainz ID)

### Lidarr Publish Behavior

- Reads matched entries from `.minkdb/album.json`
- Requires `LIDARR_API_KEY` to be set
- Adds artists with broad (artist-level) monitoring disabled
- Monitors only the exact albums present in the Mink-db album database
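
The publish behavior above amounts to grouping matched albums under their artist, so that each artist is added once and only the listed albums are monitored. A minimal sketch of that grouping, assuming album records are dicts carrying the `status`, `musicbrainz_id`, and `artist_musicbrainz_id` fields seen on `AlbumEntry`. The `build_publish_plan` helper and the shape of the plan are hypothetical, and no Lidarr API calls are shown:

```python
from collections import defaultdict


def build_publish_plan(albums: list[dict]) -> dict[str, list[str]]:
    """Map artist MusicBrainz ID -> release group IDs to monitor.

    Hypothetical helper: the real publish workflow may structure this differently.
    """
    plan: dict[str, list[str]] = defaultdict(list)
    for album in albums:
        artist_id = album.get("artist_musicbrainz_id")
        release_group_id = album.get("musicbrainz_id")
        # Skip unmatched rows and rows missing either identifier.
        if album.get("status") != "matched" or not artist_id or not release_group_id:
            continue
        # Keep each album once per artist, preserving first-seen order.
        if release_group_id not in plan[artist_id]:
            plan[artist_id].append(release_group_id)
    return dict(plan)
```

Each key in the returned mapping would correspond to one artist added with broad monitoring off, and each value to the exact albums to monitor for that artist.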
31 changes: 27 additions & 4 deletions MINK.md
@@ -2,7 +2,7 @@

## Overview

CLI tool that reads iTunes library metadata (XML), queries MusicBrainz for Release Group IDs (album IDs), and outputs a JSON array of MusicBrainz IDs.
CLI tool that reads iTunes library metadata (XML), queries MusicBrainz, and outputs a JSON array of unique MusicBrainz artist IDs.

## Architecture

@@ -18,6 +18,7 @@ minkdb/
├── __init__.py
├── __main__.py
├── cli.py # Click CLI
├── publish.py # Lidarr publish workflow
├── itunes.py # iTunes XML parser
├── musicbrainz.py # MusicBrainz client
├── catalog.py # Orchestration
@@ -31,7 +32,7 @@ minkdb/
|----------|--------|-----------|
| Package name | `minkdb` | `mink` was taken on PyPI |
| CLI name | `mink-db` | User-friendly with hyphen |
| Output IDs | Release Group IDs | User requirement |
| Output IDs | Artist IDs | User requirement |
| Auth | None (1 req/sec) | YAGNI |
| Matching | Exact only | User requirement |
| Rate limiting | 1 req/second | MusicBrainz requirement |
@@ -55,6 +56,28 @@ minkdb/
| 4 | Catalog & Matching | ✅ Complete |
| 5 | Output & Integration | ✅ Complete |
| 6 | Fuzzy Matching | ⏳ Deferred |
| 7 | JSON Database v2 (Breaking) | ✅ Complete |
| 8 | Lidarr Publish CLI (Curated Albums) | ✅ Complete |

### Workstream 7: JSON Database v2 (Breaking)

This workstream intentionally introduces breaking data model changes for the local JSON database.

- Rename `catalog.json` to `album.json`
- Introduce new `artist.json` containing unique artists discovered in the iTunes library, keyed by MusicBrainz artist ID
- Add `artist_musicbrainz_id` to each album record as a foreign-key reference to `artist.json`
- Keep this as a proof-of-concept change with no backward compatibility or migration requirement for old files
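
The v2 data model can be pictured as two record types linked by the artist MusicBrainz ID. The sketch below is illustrative only: `ArtistRecord`, `AlbumRecord`, and `dangling_artist_refs` are hypothetical names, the `name` field on artists is an assumption, and the album fields mirror the ones passed to `AlbumEntry` in `catalog.py`:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArtistRecord:
    """Assumed shape of one artist.json row, keyed by MusicBrainz artist ID."""

    musicbrainz_id: str
    name: str  # assumption: the actual file may store different fields


@dataclass
class AlbumRecord:
    """Subset of album.json fields; artist_musicbrainz_id is the foreign key."""

    artist: str
    album: str
    musicbrainz_id: Optional[str]
    artist_musicbrainz_id: Optional[str]


def dangling_artist_refs(
    albums: list[AlbumRecord], artists: dict[str, ArtistRecord]
) -> list[AlbumRecord]:
    """Return albums whose foreign key points at an artist missing from artist.json."""
    return [
        album
        for album in albums
        if album.artist_musicbrainz_id and album.artist_musicbrainz_id not in artists
    ]
```

A check like `dangling_artist_refs` is one way to confirm the two files stay consistent, since there is no migration path to repair older files.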

### Workstream 8: Lidarr Publish CLI (Curated Albums)

Add a new CLI command to publish library metadata from Mink-db storage to Lidarr using `lidarr-py`.

- Add a `publish` CLI command that reads the Mink-db JSON database from a provided path
- Add optional `--url` argument for Lidarr server URL
- Load `LIDARR_API_KEY` from environment
- If `LIDARR_API_KEY` is missing, exit with a clear user-friendly error message
- Use `lidarr-py` for API integration
- Publish curated exact albums only (no artist-level broad monitoring behavior)
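
The API-key requirement above could be handled along these lines. This is a sketch under stated assumptions, not the contents of `publish.py`: `load_lidarr_api_key` and `MissingApiKeyError` are hypothetical names, and the error wording is illustrative:

```python
import os
from collections.abc import Mapping


class MissingApiKeyError(RuntimeError):
    """Raised when LIDARR_API_KEY is absent or empty (hypothetical error type)."""


def load_lidarr_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the Lidarr API key from the environment or fail with a clear message."""
    api_key = env.get("LIDARR_API_KEY", "").strip()
    if not api_key:
        raise MissingApiKeyError(
            "LIDARR_API_KEY is not set. Set it in your environment before running `publish`."
        )
    return api_key
```

In a Click command, one option would be to catch this error and raise `click.ClickException` with the same message, which Click reports as an error and exits with a non-zero status instead of showing a traceback.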

### Future: Fuzzy Matching

@@ -67,8 +90,8 @@ minkdb/

1. CLI accepts library path as argument
2. Parses iTunes XML metadata correctly
3. Queries MusicBrainz for Release Group IDs
4. Outputs valid JSON array of unique MusicBrainz IDs
3. Queries MusicBrainz for release group and artist identifiers
4. Outputs valid JSON array of unique MusicBrainz artist IDs
5. Respects rate limiting (1 req/sec)
6. Passes ruff linting and ty type checking
7. Handles errors gracefully (skips unmatched, continues)
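
For criterion 5, a simple way to stay within one request per second is to enforce a minimum interval between calls. The `RateLimiter` class below is a hypothetical sketch, not the code in `musicbrainz.py`; the clock and sleep functions are injectable only so the behavior can be tested without real delays:

```python
import time
from typing import Callable, Optional


class RateLimiter:
    """Blocks so that consecutive calls to wait() are at least min_interval apart."""

    def __init__(
        self,
        min_interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> None:
        """Sleep for whatever remains of the interval since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self._min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling `limiter.wait()` immediately before each MusicBrainz request would space the requests out; the first call returns without sleeping.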
49 changes: 27 additions & 22 deletions README.md
@@ -1,15 +1,16 @@
# Mink-db

A metadata-link between iTunes and MusicBrainz. A CLI tool that catalogs a music library by scanning iTunes libraries and retrieving MusicBrainz Release Group IDs.
A metadata-link between iTunes and MusicBrainz. A CLI tool that catalogs a music library by scanning iTunes libraries and retrieving MusicBrainz artist IDs.

## How-it-works
## How It Works

1. **Reads iTunes XML**: Parses `iTunes Music Library.xml` for album metadata
2. **Deduplicates**: Groups tracks by (artist, album) to avoid duplicate queries
3. **Queries MusicBrainz**: Searches release groups using exact artist + album matching
4. **Caches Results**: Stores album and artist data in `.minkdb/album.json` and `.minkdb/artist.json`
5. **Outputs**: Prints matched IDs to stdout or file

1. **Locates your iTunes Library:** Scans a directory for the `iTunes Music Library.xml` file.
2. **Parses Tracks:** Reads and indexes individual track metadata from the XML library.
3. **Aggregates Albums:** Groups tracks into unique album entities based on tags.
4. **References the Catalog:** Compares found albums against the local `./minkdb` database of existing matches.
5. **Reconciles (MusicBrainz):** Queries the MusicBrainz API to link local albums to official IDs.
6. **Generates Output:** Finalizes the metadata-link and updates the local data store.
On subsequent runs, Mink-db will skip already-matched albums and only query MusicBrainz for new ones.
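
The deduplication and skip-on-rerun behavior can be sketched as follows. This assumes tracks are dicts with `artist` and `album` keys and that previously matched albums are available as a set of `(artist, album)` pairs; `albums_to_query` is a hypothetical helper, not a function from the package:

```python
def albums_to_query(
    tracks: list[dict], already_matched: set[tuple[str, str]]
) -> list[tuple[str, str]]:
    """Return unique (artist, album) pairs that still need a MusicBrainz lookup."""
    seen: set[tuple[str, str]] = set()
    pending: list[tuple[str, str]] = []
    for track in tracks:
        key = (track["artist"], track["album"])
        # Skip albums already queued in this run or matched in a previous run.
        if key in seen or key in already_matched:
            continue
        seen.add(key)
        pending.append(key)
    return pending
```

Because many tracks share one album, grouping first means MusicBrainz is queried once per album instead of once per track.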

## Installation

@@ -50,33 +51,37 @@ minkdb --path "M:\Music\iTunes" -o ids.json

# Retry matching for previously unmatched albums
minkdb --path "M:\Music\iTunes" --rematch

# Publish curated exact albums to Lidarr
LIDARR_API_KEY="<your-api-key>" minkdb publish --path "M:\Music\iTunes"

# Publish to a non-default Lidarr URL
LIDARR_API_KEY="<your-api-key>" minkdb publish --path "M:\Music\iTunes" --url "http://lidarr.local:8686"
```

## Output Format

Mink-db outputs a JSON array of matched MusicBrainz IDs:
Mink-db outputs a JSON array of unique matched MusicBrainz artist IDs:

```json
[
{"MusicBrainzId": "41656317-c512-456f-9fe7-1f7fb8482a34"},
{"MusicBrainzId": "8ccd44fb-1c4a-4c5f-98b5-cf3b35a2aa5c"}
{"MusicBrainzArtistId": "11111111-1111-1111-1111-111111111111"},
{"MusicBrainzArtistId": "22222222-2222-2222-2222-222222222222"}
]
```

## How It Works

1. **Reads iTunes XML**: Parses `iTunes Music Library.xml` for album metadata
2. **Deduplicates**: Groups tracks by (artist, album) to avoid duplicate queries
3. **Queries MusicBrainz**: Searches for Release Group IDs using exact artist + album matching
4. **Caches Results**: Stores results in `.minkdb/catalog.json` (append-only)
5. **Outputs**: Prints matched IDs to stdout or file
## Data Storage

On subsequent runs, Mink-db will skip already-matched albums and only query MusicBrainz for new ones.
- **Album database**: `<library_path>/.minkdb/album.json`
- **Artist database**: `<library_path>/.minkdb/artist.json` (unique by artist MusicBrainz ID)
- **Album FK**: Each album row includes `artist_musicbrainz_id` referencing `artist.json`

## Data Storage
## Lidarr Publish

- **Catalog database**: `<library_path>/.minkdb/catalog.json`
- **Append-only**: Previous entries are preserved and updated
- Uses `lidarr-py` for API integration
- Reads curated matched album metadata from `.minkdb/album.json`
- Requires `LIDARR_API_KEY` in environment
- Adds artists with broad monitoring disabled and monitors only exact albums from Mink-db

## Requirements

20 changes: 12 additions & 8 deletions minkdb/catalog.py
@@ -5,7 +5,7 @@

from minkdb.database import AlbumEntry, append_to_catalog, load_catalog
from minkdb.itunes import Track, find_itunes_xml, parse_itunes_xml
from minkdb.musicbrainz import search_release_group
from minkdb.musicbrainz import search_release_group_match


def find_and_parse_itunes(
@@ -63,14 +63,15 @@ def process_albums(
if (artist, album) in existing:
entry = existing[(artist, album)]
else:
mbid = search_release_group(artist, album)
match = search_release_group_match(artist, album)
timestamp = datetime.now(timezone.utc).isoformat()  # aware UTC time; .replace(tzinfo=...) would mislabel local time as UTC
entry = AlbumEntry(
artist=artist,
album=album,
musicbrainz_id=mbid,
matched_at=timestamp if mbid else None,
status="matched" if mbid else "unmatched",
musicbrainz_id=match.release_group_id,
matched_at=timestamp if match.release_group_id else None,
status="matched" if match.release_group_id else "unmatched",
artist_musicbrainz_id=match.artist_musicbrainz_id,
)
append_to_catalog(entry, library_path)

@@ -80,9 +81,12 @@ def process_albums(


 def get_catalog_output(entries: list[AlbumEntry]) -> list[dict]:
-    """Generate the output list of matched MusicBrainz IDs."""
+    """Generate unique MusicBrainz artist ID output records."""
     output = []
+    seen_artist_ids: set[str] = set()
     for entry in entries:
-        if entry.musicbrainz_id:
-            output.append({"MusicBrainzId": entry.musicbrainz_id})
+        artist_id = entry.artist_musicbrainz_id
+        if artist_id and artist_id not in seen_artist_ids:
+            seen_artist_ids.add(artist_id)
+            output.append({"MusicBrainzArtistId": artist_id})
     return output