feat(rtmg): accept register_user_lora WS message for user-trained packs #297
Open
hthillman wants to merge 1 commit into
Closes the loop between the pipelines registry (live in livepeer/pipelines#2693) and the rtmg pod: a connected client (the rtmg-vst plugin in a follow-up PR) sends a register_user_lora frame carrying presigned Tigris URLs plus the expected Ed25519 signing key id and sha256 digest, and the pod downloads, verifies, and registers the pack so the next enable_lora picks it up.

The trust chain matches what the orchestrator writes at training time: a canonical-JSON manifest signed with Ed25519, with a sidecar carrying { manifest_b64, sig_b64, kid }. The pod fetches its trusted-keys roster from app.daydream.live/api/loras/signing-public-key at module init, with a LORA_SIGNING_PUBLIC_KEYS_PEM env fallback for dev / offline boots.

Files:
- demos/realtime_motion_graph_web/user_loras.py (new): pure helpers download_pack, verify_pack (Ed25519 + sha256 + kid trust check), and materialize_pack (writes safetensors + metadata.json + trigger.txt into user_loras_dir, atomic rename).
- demos/realtime_motion_graph_web/ws_adapter.py: new dispatch elif next to enable_lora. Heavy I/O fires on a small ThreadPoolExecutor so the WS receive loop keeps consuming frames during the ~200 MB download. Errors surface as {type:"error", code:"register_user_lora_*"}.
- demos/realtime_motion_graph_web/protocol.py: register_user_lora CommandSpec so the command name passes the COMMAND_NAMES gate and the wire contract is in the published surface.
- acestep/streaming/session.py: Session.register_user_lora(path, name) calls engine_obj.register_lora (idempotent on stem) and publishes LoraCatalogUpdate to refresh every WS subscriber.
- acestep/paths.py: user_loras_dir() reads the ACESTEP_USER_LORAS_DIR env and defaults to models_dir()/user_loras (separate from the read-only baked catalog so operators can mount persistent storage there).
- acestep/lora_metadata.py: LoraMetadata.source field (None for stock, "user_pack" for runtime-registered). user_loras writes source="user_pack" into the sidecar so the catalog event carries it through to UI clients that want to render "My Styles" vs "Stock Styles" sections.
- pyproject.toml: explicit cryptography>=42 dep (likely transitive via huggingface_hub today; declaring it so a clean resolver run doesn't surprise us).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
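
The trust chain described above (kid trust check, Ed25519 signature over the manifest, sha256 digest) could be sketched roughly as follows. This is not the PR's `verify_pack`; it is a minimal illustration under stated assumptions: the function name `verify_pack_sketch`, the `PackVerificationError` type, the `expected_kid` parameter, standard (not URL-safe) base64, and a lowercase hex digest over the raw safetensors bytes are all hypothetical. Keys are duck-typed so the sketch runs on the standard library alone; with `cryptography`, the `trusted_keys` values would be `Ed25519PublicKey` instances, whose `verify(signature, data)` raises `InvalidSignature` on failure.

```python
import base64
import hashlib
import json
from typing import Any, Mapping, Protocol


class VerifyingKey(Protocol):
    """Anything exposing an Ed25519PublicKey-style verify(signature, data)."""

    def verify(self, signature: bytes, data: bytes) -> None:
        ...


class PackVerificationError(Exception):
    """Hypothetical error type used only by this sketch."""


def verify_pack_sketch(
    sidecar: Mapping[str, str],
    weights: bytes,
    expected_sha256: str,
    expected_kid: str,
    trusted_keys: Mapping[str, VerifyingKey],
) -> dict[str, Any]:
    """Return the parsed manifest, or raise PackVerificationError."""
    kid = sidecar["kid"]

    # 1. The signing key id must be the one the client announced
    #    (assumption) and must be in the trusted set.
    if kid != expected_kid or kid not in trusted_keys:
        raise PackVerificationError(f"untrusted kid: {kid!r}")

    # Assumption: sidecar fields use standard base64.
    manifest_bytes = base64.b64decode(sidecar["manifest_b64"])
    signature = base64.b64decode(sidecar["sig_b64"])

    # 2. Check the signature over the exact manifest bytes before parsing
    #    them (ordering is a choice made in this sketch).
    try:
        trusted_keys[kid].verify(signature, manifest_bytes)
    except Exception as exc:  # e.g. cryptography's InvalidSignature
        raise PackVerificationError("manifest signature did not verify") from exc

    manifest = json.loads(manifest_bytes)

    # 3. Assumption: the digest is lowercase hex sha256 of the weights file.
    actual = hashlib.sha256(weights).hexdigest()
    if actual != manifest.get("sha256") or actual != expected_sha256:
        raise PackVerificationError("sha256 mismatch")

    return manifest
```

Raising before anything touches disk or the catalog keeps a failed verification free of side effects, in line with the "no catalog mutation on failure" behaviour this PR describes.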
Summary
Closes the loop between the pipelines registry (live in livepeer/pipelines#2693) and the rtmg pod: a connected client sends a `register_user_lora` WS message carrying presigned Tigris URLs plus the expected Ed25519 signing key id and sha256 digest. The pod downloads, verifies, and registers the pack so the next `enable_lora` picks it up.

The rtmg-vst follow-up PR will send these messages from the JUCE plugin. This PR just lands the receive side so the wire contract is reviewable and the pod is ready to accept the new traffic.
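
To make the shape of the message concrete, here is a minimal sketch of how a receiver might validate an incoming frame before starting any download. It is not the actual wire contract, which lives in `protocol.py`. The field names `name`, `weights_url`, `signature_url`, and `expected_kid`, the `type` discriminator on the command frame, and the `bad_request` error suffix are all assumptions made for illustration. Only `expected_sha256` and the `{type:"error", code:"register_user_lora_*", message}` error shape come from this PR's description.

```python
import string
from dataclasses import dataclass
from typing import Any, Mapping, Union

# Hypothetical field names; see the lead-in for which ones are assumptions.
REQUIRED_FIELDS = (
    "name",
    "weights_url",
    "signature_url",
    "expected_kid",
    "expected_sha256",
)


@dataclass(frozen=True)
class RegisterUserLora:
    name: str
    weights_url: str
    signature_url: str
    expected_kid: str
    expected_sha256: str


def error_frame(suffix: str, message: str) -> dict[str, str]:
    """Build an error frame in the register_user_lora_* code namespace."""
    return {
        "type": "error",
        "code": f"register_user_lora_{suffix}",
        "message": message,
    }


def parse_register_user_lora(
    frame: Mapping[str, Any],
) -> Union[RegisterUserLora, dict[str, str]]:
    """Return a parsed request, or an error frame describing the problem."""
    if frame.get("type") != "register_user_lora":
        return error_frame("bad_request", "unexpected message type")

    # Every required field must be a non-empty string.
    missing = [
        field
        for field in REQUIRED_FIELDS
        if not isinstance(frame.get(field), str) or not frame[field]
    ]
    if missing:
        return error_frame(
            "bad_request", "missing or invalid fields: " + ", ".join(missing)
        )

    # Assumption: the digest travels as 64 hex characters.
    digest = frame["expected_sha256"]
    if len(digest) != 64 or not set(digest) <= set(string.hexdigits):
        return error_frame("bad_request", "expected_sha256 is not a hex sha256")

    # Assumption: presigned URLs are always https.
    for key in ("weights_url", "signature_url"):
        if not frame[key].startswith("https://"):
            return error_frame("bad_request", f"{key} must be an https URL")

    return RegisterUserLora(**{field: frame[field] for field in REQUIRED_FIELDS})
```

Rejecting malformed frames up front means the expensive download is only attempted for requests that could plausibly verify.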
Trust model
Matches what the orchestrator writes at training time (see pipelines#2693 + demon-public-demo#407):

- Manifest: canonical JSON `{ v:1, jobId, style, trigger, sha256, createdAt }` signed with Ed25519.
- Sidecar: `<style>.signature.json` carries `{ manifest_b64, sig_b64, kid }`.
- Trusted keys: fetched from `https://app.daydream.live/api/loras/signing-public-key` at module init (5s timeout), with a `LORA_SIGNING_PUBLIC_KEYS_PEM` env fallback for dev / offline boots. Cached for the process lifetime, so rotation requires a pod restart.

Verify chain inside `verify_pack`:

- `kid` in trusted set
- sha256 check against `manifest.sha256` and the `expected_sha256` from the WS body
- `Ed25519PublicKey.verify(signature, manifest_bytes)` succeeds

Any failure → no catalog mutation, `{type:"error", code:"register_user_lora_*", message:...}` to the calling client.

Files
- `demos/realtime_motion_graph_web/user_loras.py`: new. `download_pack`, `verify_pack`, `materialize_pack`. Trusted-keys cache.
- `demos/realtime_motion_graph_web/ws_adapter.py`: new dispatch `elif` next to `enable_lora`. Heavy I/O fires on a small `ThreadPoolExecutor` so the WS receive loop keeps consuming frames during the ~200 MB download.
- `demos/realtime_motion_graph_web/protocol.py`: `register_user_lora` CommandSpec so the command name passes the `COMMAND_NAMES` gate and the wire contract is in the published surface.
- `acestep/streaming/session.py`: `Session.register_user_lora(path, name)` calls `engine_obj.register_lora` (idempotent on filename stem) and publishes `LoraCatalogUpdate` to refresh every WS subscriber.
- `acestep/paths.py`: `user_loras_dir()` reads `ACESTEP_USER_LORAS_DIR`, defaults to `models_dir()/user_loras` (separate from the read-only baked catalog).
- `acestep/lora_metadata.py`: `LoraMetadata.source` field. `None` for stock, `"user_pack"` for runtime-registered. Surfaced into the `metadata` block on each `lora_catalog` entry so UIs can section the dropdown.
- `pyproject.toml`: explicit `cryptography>=42` dep (likely transitive via huggingface_hub today; declaring it so a clean resolver run doesn't surprise us).

Env required on the pod
- `LORA_PUBLIC_KEY_REGISTRY_URL`: […] `app.daydream.live/api/loras/signing-public-key` endpoint. Optional.
- `LORA_SIGNING_PUBLIC_KEYS_PEM`: `kid<TAB>PEM` pairs, blocks separated by blank lines. Fallback when the registry fetch fails.
- `ACESTEP_USER_LORAS_DIR`: defaults to `$ACESTEP_MODELS_DIR/user_loras`. Mount on persistent storage if you want packs to survive pod restarts.

Test plan
- `LORA_SIGNING_PUBLIC_KEYS_PEM` env and the registry URL reachable → `register_user_lora` accepts a valid pack.
- […] → `code:"verify_failed"`.
- `kid` lookup misses → `code:"verify_failed"`.
- `register_user_lora` twice → second is a no-op, catalog event still fires.
- `metadata.source == "user_pack"`.

Out of scope
- `register_user_lora` sender: separate PR against `rtmg-vst`.
- […] `rm` from `user_loras_dir` and restart for now.
- […] `register_user_lora` per session. A future PR may rate-limit or queue.

🤖 Generated with Claude Code