feat(sync): schema v3 nested Y.Map metadata — lazy migration #52

Open
kavinsood wants to merge 1 commit into fix/sql-storage-migration from fix/nested-ymaps-metadata

@kavinsood
Owner

Summary

Introduces YAOS schema v3 metadata storage.

File metadata is now written as nested Yjs maps instead of opaque JSON objects. This gives field-level CRDT behavior for metadata updates — renaming a file on device A while device B updates the mtime no longer causes one device's change to silently overwrite the other. mtime-only saves no longer generate a whole-object tombstone for every file save.

The migration is lazy: existing flat v2 metadata remains readable indefinitely and is converted one entry at a time when touched by v3 writes. There is no full-vault metadata migration.

Dependency: This PR targets fix/sql-storage-migration (PR #51). It should only be merged after that PR is merged and baked. After that merge, this branch rebases onto main — it is a single clean commit.


What changed

Data model

  • meta[fileId] entries are now Y.Map { path, mtime?, device? } instead of opaque FileMeta JSON objects
  • Tombstones: Y.Map { path, deletedAt } — same minimal shape, nested
  • Old flat entries remain readable and are upgraded lazily on the next write to that file
  • sys.schemaVersion is bumped to 3 when the first v3 client connects, without looping over metadata entries
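
The dual-shape read and lazy upgrade described above can be sketched as follows. This is a minimal illustration, not the code in src/sync/fileMeta.ts: the built-in Map stands in for Y.Map so the snippet runs without Yjs, and the names readPath, ensureNested and writeMtime are hypothetical.

```typescript
// Stand-in for Y.Map: the built-in Map offers the get/set calls used below.
type NestedMeta = Map<string, unknown>;

// Flat v2 shape: one opaque JSON object per file.
interface FlatMeta {
  path: string;
  mtime?: number;
  device?: string;
  deletedAt?: number;
}

type MetaEntry = FlatMeta | NestedMeta;
type MetaStore = Map<string, MetaEntry>;

function isNestedEntry(entry: MetaEntry): entry is NestedMeta {
  return entry instanceof Map;
}

// Dual-shape read: callers never need to know which shape is stored.
function readPath(store: MetaStore, fileId: string): string | undefined {
  const entry = store.get(fileId);
  if (entry === undefined) return undefined;
  const value = isNestedEntry(entry) ? entry.get("path") : entry.path;
  return typeof value === "string" ? value : undefined;
}

// Lazy upgrade: only the entry being written is converted to the nested shape.
function ensureNested(store: MetaStore, fileId: string): NestedMeta {
  const entry = store.get(fileId);
  if (entry !== undefined && isNestedEntry(entry)) return entry;
  const nested: NestedMeta = new Map<string, unknown>();
  if (entry !== undefined) {
    const fields: Array<keyof FlatMeta> = ["path", "mtime", "device", "deletedAt"];
    for (const field of fields) {
      const value = entry[field];
      // Absent fields are skipped; a key is never set to undefined.
      if (value !== undefined) nested.set(field, value);
    }
  }
  store.set(fileId, nested);
  return nested;
}

// A v3 write touches exactly one entry and one field.
function writeMtime(store: MetaStore, fileId: string, mtime: number): void {
  ensureNested(store, fileId).set("mtime", mtime);
}
```

Calling writeMtime on one file converts that entry only; every other flat entry in the store is left as it was, which is the property the lazy migration relies on.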

New src/sync/fileMeta.ts

Single authoritative interface for reading and writing metadata. Dual-shape decoders, read helpers (getMetaPath, getMetaMtime etc.), write helpers (createNestedActiveMeta, ensureNestedMetaEntry), incremental semantic diff, and shape statistics. No call site accesses metadata entries directly.

New src/sync/schema.ts

SCHEMA_VERSION = 3 in a pure, Obsidian-free module. Importable in tests and server code without the Obsidian dependency.

Semantic observer (observeMetaChanges)

Single shared meta.observeDeep handler on VaultSync. Uses Yjs event paths for O(k) incremental diff (k = affected file IDs) instead of O(N) full-map scan on every metadata change. Dispatches MetaChangeBatch { origin, isLocal, changes } to consumers.

DiskMirror and the witness tracker subscribe to observeMetaChanges — they no longer use shallow meta.observe(), which would have silently dropped all nested field mutations.
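
A rough sketch of how event paths can limit the diff to affected file IDs is shown below. It assumes a structural event type (DeepEventLike) with the path and keys properties that Yjs deep-observer events expose; the snapshot cache, the change kinds, and the function names are illustrative assumptions rather than the types in this PR.

```typescript
// Minimal structural view of a deep-observer event (assumed shape).
interface DeepEventLike {
  path: Array<string | number>; // path from the observed meta map to the changed type
  keys: Map<string, unknown>;   // keys changed on the event target
}

interface MetaSnapshot {
  path: string;
  deleted: boolean;
}

// Hypothetical semantic change kinds.
type MetaChange =
  | { kind: "create"; fileId: string; path: string }
  | { kind: "rename"; fileId: string; from: string; to: string }
  | { kind: "delete"; fileId: string; path: string }
  | { kind: "revive"; fileId: string; path: string };

// Collect the k affected file IDs without scanning the whole map.
function affectedFileIds(events: DeepEventLike[]): Set<string> {
  const ids = new Set<string>();
  for (const ev of events) {
    if (ev.path.length === 0) {
      // Event on the top-level meta map: the changed keys are file IDs.
      ev.keys.forEach((_change, fileId) => ids.add(fileId));
    } else {
      // Event inside a nested entry: the first path segment is the file ID.
      ids.add(String(ev.path[0]));
    }
  }
  return ids;
}

// Compare cached and current state for the affected IDs only.
function diffAffected(
  ids: Set<string>,
  cache: Map<string, MetaSnapshot>,
  read: (fileId: string) => MetaSnapshot | undefined,
): MetaChange[] {
  const changes: MetaChange[] = [];
  ids.forEach((fileId) => {
    const before = cache.get(fileId);
    const after = read(fileId);
    if (after === undefined) {
      cache.delete(fileId);
      return;
    }
    cache.set(fileId, after);
    if (before === undefined) {
      if (!after.deleted) changes.push({ kind: "create", fileId, path: after.path });
    } else if (!before.deleted && after.deleted) {
      changes.push({ kind: "delete", fileId, path: after.path });
    } else if (before.deleted && !after.deleted) {
      changes.push({ kind: "revive", fileId, path: after.path });
    } else if (!after.deleted && before.path !== after.path) {
      changes.push({ kind: "rename", fileId, from: before.path, to: after.path });
    }
    // An mtime-only update produces no semantic change.
  });
  return changes;
}
```

In this sketch the work per transaction is proportional to the number of touched entries, and a shallow observer would miss the nested-entry branch entirely because those events never reach the top-level map's own observers.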

DiskMirror

  • Subscribes to observeMetaChanges for remote nested delete/rename/revive
  • Skips local-origin batches (isLocal: true) — local metadata writes never feed back as remote file operations
  • consumeRemoteRename(newPath): boolean follows the consume-on-use pattern of the existing consumeDeleteSuppression: DiskMirror marks the rename as its own before calling app.fileManager.renameFile, and the vault rename handler consumes that mark
  • main.ts vault rename handler skips queueRename when consumeRemoteRename returns true — passive receiver renames never re-enter CRDT
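
The consume-on-use handshake and the local-batch skip can be illustrated with the sketch below. It is a simplified model under stated assumptions: RemoteRenameTracker, markRemoteRename, applyRemoteBatch and onVaultRename are hypothetical names, the batch is restricted to renames, and the disk rename is passed in as a callback instead of calling app.fileManager.renameFile.

```typescript
interface RenameChange {
  fileId: string;
  from: string;
  to: string;
}

// Simplified batch: only rename changes are modelled here.
interface MetaChangeBatch {
  origin: unknown;
  isLocal: boolean;
  changes: RenameChange[];
}

class RemoteRenameTracker {
  private readonly pending = new Set<string>();

  // Called before the disk rename so the mark exists when the vault event fires.
  markRemoteRename(newPath: string): void {
    this.pending.add(newPath);
  }

  // Returns true at most once per mark; Set.delete reports whether the path was present.
  consumeRemoteRename(newPath: string): boolean {
    return this.pending.delete(newPath);
  }
}

// Receiver side: apply remote renames to disk, ignoring locally originated batches.
function applyRemoteBatch(
  batch: MetaChangeBatch,
  tracker: RemoteRenameTracker,
  renameOnDisk: (from: string, to: string) => void,
): void {
  if (batch.isLocal) return;
  for (const change of batch.changes) {
    tracker.markRemoteRename(change.to);
    renameOnDisk(change.from, change.to);
  }
}

// Vault rename handler: a rename caused by the receiver must not re-enter the CRDT.
function onVaultRename(
  newPath: string,
  tracker: RemoteRenameTracker,
  queueRename: (newPath: string) => void,
): void {
  if (tracker.consumeRemoteRename(newPath)) return;
  queueRename(newPath);
}
```

Because the mark is removed when it is consumed, a later user-initiated rename to the same path is queued normally.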

Server

  • SERVER_MIN/MAX_SCHEMA_VERSION bumped to 3 — old v2 clients rejected
  • documentSummary debug response includes flatMetaEntries, nestedMetaEntries, invalidMetaEntries shape counters
  • countActivePathsInDoc / computeDocStats dual-read both v2 and v3 shapes — server can cold-boot a v2 persisted room safely
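
A dual-read summary over a mixed room might look roughly like the sketch below. The counter names mirror the documentSummary fields listed above, but the validity rule (an entry needs a string path) and the use of the built-in Map in place of Y.Map are assumptions made for illustration; summarizeMeta is a hypothetical name.

```typescript
interface MetaShapeSummary {
  flatMetaEntries: number;
  nestedMetaEntries: number;
  invalidMetaEntries: number;
  activePaths: number;
}

// Counts entry shapes and non-deleted paths in one pass over the metadata map.
function summarizeMeta(meta: Map<string, unknown>): MetaShapeSummary {
  const summary: MetaShapeSummary = {
    flatMetaEntries: 0,
    nestedMetaEntries: 0,
    invalidMetaEntries: 0,
    activePaths: 0,
  };
  meta.forEach((entry) => {
    let path: unknown;
    let deletedAt: unknown;
    let nested = false;
    if (entry instanceof Map) {
      // v3 shape (Map stands in for Y.Map here).
      nested = true;
      path = entry.get("path");
      deletedAt = entry.get("deletedAt");
    } else if (typeof entry === "object" && entry !== null) {
      // v2 shape: plain JSON object.
      const flat = entry as Record<string, unknown>;
      path = flat["path"];
      deletedAt = flat["deletedAt"];
    }
    if (typeof path !== "string") {
      // Assumed validity rule: every entry must carry a string path.
      summary.invalidMetaEntries += 1;
      return;
    }
    if (nested) summary.nestedMetaEntries += 1;
    else summary.flatMetaEntries += 1;
    if (deletedAt === undefined) summary.activePaths += 1;
  });
  return summary;
}
```

Reading both shapes in the same pass is what lets a server start from a persisted v2 room without a conversion step.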

Analyzer

orphan-after-rename rule gained a remoteOrigin exemption: passive receiver devices call handleRemoteRename which triggers a disk-level rename, emitting disk.rename.observed. This correctly has no crdt.file.renamed counterpart (the CRDT rename was on the other device). The remoteOrigin: true flag in the event data exempts it.
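
The exemption can be sketched as a filter over trace events. The event type names come from the description above; the TraceEvent shape, the matching by path, and the function name findOrphanRenames are assumptions for illustration and may differ from the analyzer's actual rule.

```typescript
// Assumed event shape for the sketch.
interface TraceEvent {
  type: string;
  path: string;
  data?: { remoteOrigin?: boolean };
}

// Flags disk renames that have no CRDT rename counterpart on this device,
// except those marked as originating from a remote rename.
function findOrphanRenames(events: TraceEvent[]): TraceEvent[] {
  const crdtRenamedPaths = new Set<string>();
  for (const ev of events) {
    if (ev.type === "crdt.file.renamed") crdtRenamedPaths.add(ev.path);
  }
  return events.filter(
    (ev) =>
      ev.type === "disk.rename.observed" &&
      ev.data?.remoteOrigin !== true &&
      !crdtRenamedPaths.has(ev.path),
  );
}
```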


Tests

| Suite | Assertions |
| --- | --- |
| file-meta-decode.ts | 112: decoder, type guards, read/write helpers |
| file-meta-lazy-write.ts | 34: no-storm proof, concurrent convergence, untouched entries stay flat |
| meta-observer-integration.ts | 40: nested mutations fire semantic changes; incremental diff; origin filtering |
| meta-v3-schema-gate-and-stats.ts | 47: server constants imported from real source; mixed metadata stats; SQL round-trip |
| meta-diskmirror-integration.ts | 61: real DiskMirror, spied handlers; remote nested delete/rename/revive trigger correct ops; local changes are ignored; consumeRemoteRename semantics |
| Full CI | 73 regression suites, 0 failed |

QA scenario — S15

Two-vault CDP scenario on ~/temenos + ~/temenos-b against the production server (SQL storage, schema v3):

| Phase | Result |
| --- | --- |
| 1. Create | hash match ✓ |
| 2. Rename | old path gone, new hash match ✓ |
| 3. Delete | file gone from B ✓ |
| 4. Revive | hash match ✓ |
| 5. mtime-only | B disk hash unchanged ✓ |
| 6. Schema version | A=3 B=3 ✓ |

Both analyzer passes: 0 hard failures. Exit 0.

Server post-run: flatMeta: 1403, nestedMeta: 5. Only the 5 files touched during the scenario were lazily converted; the other 1,403 entries remain flat.


Safety

  • Server cold-boots a v2 persisted room without crashing (dual-read verified)
  • No full-vault metadata migration on startup or connect
  • No distributed migration storm — touching 5 files in a 1700-entry room emits < 3KB of CRDT updates
  • Old clients rejected by server with update_required before the room DO is woken
  • Flat v2 entries remain valid forever — only touched by writes, never eagerly
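
The version gate from the list above could be modelled as below. The constant names and the update_required reason come from this PR; the unsupported_schema reason for clients newer than the server, the treatment of a missing version as outdated, and the handleConnect wrapper are assumptions added to make the sketch complete.

```typescript
const SERVER_MIN_SCHEMA_VERSION = 3;
const SERVER_MAX_SCHEMA_VERSION = 3;

type GateResult =
  | { ok: true }
  | { ok: false; reason: "update_required" | "unsupported_schema" };

function checkSchemaVersion(clientVersion: number | undefined): GateResult {
  // Assumption: a client that reports no version is treated as outdated.
  if (clientVersion === undefined || clientVersion < SERVER_MIN_SCHEMA_VERSION) {
    return { ok: false, reason: "update_required" };
  }
  // Hypothetical reason for clients newer than the server understands.
  if (clientVersion > SERVER_MAX_SCHEMA_VERSION) {
    return { ok: false, reason: "unsupported_schema" };
  }
  return { ok: true };
}

// The gate runs first, so a rejected client never wakes the room.
function handleConnect(clientVersion: number | undefined, wakeRoom: () => void): GateResult {
  const result = checkSchemaVersion(clientVersion);
  if (result.ok) wakeRoom();
  return result;
}
```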

Merge order

  1. Merge PR #51, "fix(server): migrate persistence from KV to native DO SQLite" (fix/sql-storage-migration), and let it bake
  2. Rebase this branch onto main (single commit, trivial)
  3. Version bump plugin + server
  4. Release notes: schema v3 is a breaking schema change; old plugin clients will be rejected by the updated server; metadata migration is lazy; no user action required except updating

Review checklist

  • No loop that converts every metadata entry
  • No raw .path, .mtime, .deletedAt access outside helper/tests
  • No entry.set(key, undefined)
  • No shallow-only meta.observe() dependency for nested changes
  • Server can cold-boot a v2 persisted room
  • Snapshot restore writes nested metadata
  • CI green
  • S15 QA green

Introduces schema v3 metadata model: file metadata entries are written
as nested Y.Maps instead of opaque JSON objects, giving field-level
CRDT resolution and eliminating whole-object tombstones on mtime updates.

## Core changes

### src/sync/fileMeta.ts (new)
Unified dual-shape helper module for v2 (flat) and v3 (nested Y.Map)
metadata. Provides type-safe decoders, read helpers, write helpers,
lazy conversion, incremental diff, and semantic change types.
Single authoritative interface — no call site accesses metadata directly.

### src/sync/schema.ts (new)
Pure, Obsidian-free SCHEMA_VERSION = 3 constant. Importable in tests
without dragging in the obsidian dependency.

### Metadata model
- All writes produce nested Y.Maps via ensureNestedMetaEntry + create helpers
- Reads dual-decode both flat (v2) and nested (v3) shapes everywhere
- Lazy on-write conversion: untouched flat entries remain flat indefinitely
- No eager migration, no distributed migration storm
- sys.schemaVersion bumped to 3 on first v3 client connect (markSchemaV3)
- migrateSchemaToV2 reverted to write flat v2 objects (was incorrectly
  writing nested maps, creating v3 shapes under v2 schema marker)

### Semantic observer (observeMetaChanges)
Single shared meta.observeDeep handler on VaultSync. Uses event paths
for O(k) incremental diff instead of O(N) full snapshot on every change.
Dispatches MetaChangeBatch with origin + isLocal to all subscribers.
DiskMirror and witness tracker consume semantic changes directly.

### DiskMirror
- Replaced shallow meta.observe with observeMetaChanges subscription
- Correctly handles nested field mutations (deletedAt, path, mtime)
- Skips local-origin batches (isLocal=true) to prevent local writes
  feeding back as remote file operations
- Records the new path in _pendingRemoteRenameNewPaths when handleRemoteRename
  runs, so the resulting disk.rename.observed event is tagged remoteOrigin
- Normalizes all paths from semantic change events before disk ops

### Server
- SERVER_MIN/MAX_SCHEMA_VERSION bumped to 3
- countActivePathsInDoc / computeDocStats dual-read v2 and v3 metadata
- documentSummary debug response includes flatMetaEntries, nestedMetaEntries,
  invalidMetaEntries shape counters
- readMetaPath / isMetaDeleted helper methods for dual-shape reads

### Analyzer (orphan-after-rename rule)
New remoteOrigin exemption: disk.rename.observed events with
remoteOrigin:true (set by main.ts when DiskMirror's pending rename set
contains the new path) are not flagged as orphans. Passive receiver
devices correctly produce disk renames without CRDT rename events —
the CRDT rename was initiated by the other device.

## Tests

### New test suites (430 assertions across 10 suites):
- tests/file-meta-decode.ts — 112: decoder, type guards, helpers
- tests/file-meta-lazy-write.ts — 34: no-storm proof, concurrent convergence
- tests/meta-observer-integration.ts — 40: nested mutations fire semantic
  changes; origin filtering proven local vs remote; incremental diff correct
- tests/meta-v3-schema-gate-and-stats.ts — 47: schema gate imports real
  server constants; mixed metadata stats; realistic vault round-trip
- tests/meta-diskmirror-integration.ts — 52: real DiskMirror integration
  with spied handlers; proves remote nested delete/rename/revive trigger
  correct disk ops; proves local changes are ignored

### Updated:
- tests/disk-mirror-observer.ts: added observeMetaChanges to fakeVaultSync
- tests/v2-offline-rename-regressions.mjs: use getMetaPath/getMetaDeletedAt
  helpers (restore now writes nested Y.Maps, flat property access broke)
- tests/run-regressions.mjs: added meta-diskmirror-integration to suite

## QA scenario (S15)

Two-vault CDP scenario on ~/temenos + ~/temenos-b against the deployed
kavin-yaos.ripplor.workers.dev server (SQL storage, schema v3):

  Phase 1 create:      hash match ✓ (19037d3bbde3)
  Phase 2 rename:      hash match ✓ (dc952595d28f), old path gone ✓
  Phase 3 delete:      file gone on B ✓
  Phase 4 revive:      hash match ✓ (7593fdbb0b82)
  Phase 5 mtime-only:  B disk hash unchanged (f8f98eccff4a = f8f98eccff4a) ✓
  Phase 6 schema:      schemaVersion A=3 B=3 ✓

Both analyzer passes: 0 hard failures. Exit 0.
Server post-run: flatMeta=1403, nestedMeta=5 (only touched entries converted).

Full CI: npm ci + npm ci --prefix server + npm run build +
npm run test:ci + npm run test:regressions (73 suites) +
npm --prefix server run typecheck — all clean.