Skip to content

XIP-82 review revisions: typed newtypes, versioned-slot anti-poisoning, max_uses, confidential-components direction#140

Open
tylerhawkes wants to merge 2 commits into
tyler/xip-82-external-commit-invitesfrom
tyler/xip-82-review-fixes
Open

XIP-82 review revisions: typed newtypes, versioned-slot anti-poisoning, max_uses, confidential-components direction#140
tylerhawkes wants to merge 2 commits into
tyler/xip-82-external-commit-invitesfrom
tyler/xip-82-review-fixes

Conversation

@tylerhawkes

@tylerhawkes tylerhawkes commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

What

Review-round revisions to XIP-82 (External commit invites). Base is the original tyler/xip-82-external-commit-invites branch, so this PR's diff is exactly the delta from the reviewed XIP — not the whole document.

Addresses the points raised in review:

1 + 6 — Tampering / "prevent people from joining" → slot poisoning

The old service contract ("strictly-newer epoch overwrites") let anyone who can write the slot — without even holding the symmetric key — upload an epoch = u64::MAX blob and permanently brick the slot. Fixed:

  • Service contract rewritten: retain the last N uploads FIFO (RECOMMENDED N ≥ 4), never evict by the uploader-asserted epoch; epoch-jump cap demoted to an optional non-load-bearing hint.
  • Join-side flow rewritten to multi-candidate validate-and-select: AEAD-decrypt + MLS-validate every candidate, take the freshest valid one.
  • New Slot poisoning security subsection; dead fork-rejection references removed; threat-model + DoS rows updated.
  • Net: a permanent wedge degrades to transient, self-healing unavailability, with no service-held secrets.

2 / 3 / 4 — Stronger typing

Typed nominal newtypes instead of raw bytes: SymmetricKey {material}, GroupStateHash {digest}, ServicePointer {oneof https_url | opaque}. No algorithm enums — the versioned envelopes pin the AEAD and the ciphersuite pins the hash, so an in-message tag could only contradict them. group_state_hash's previously-handwaved preimage ("tree-hash or equivalent", a latent fork-check bug) is pinned to digest(GroupContext).

5 — Limit number of joiners

New uint32 max_uses on ExternalCommitPolicyV1 (0 = unlimited, 1 = single-use). Enforced by every validator by counting accepted external commits from the ordered commit stream since issuance — deterministic and convergent, and no change to the atomic external-commit shape (the joiner never touches the policy component). Enable/rotate resets the derived count.

7 — Confidential components (documented as v2 direction)

Documents a content-key (CCK) scheme that gates a subset of members from a subset of content, layered over MLS:

  • CCK held outside group state (not derivable from the blob); confidential values encrypted under it ride in the AppDataDictionary.
  • Distributed by member decision — in the Welcome (implicit grant on add) or over the app-stream (explicit grant for a self-joiner), no commit needed.
  • Re-encrypt only when a value changes — the key travels to members rather than re-keying values per add (the property an exporter-derived key can't give).
  • Optional per-group rotate-on-removal flag (in EXTERNAL_COMMIT_POLICY) trades re-encrypt churn for revoking a removed member's continued read via the public blob.
  • States the inherent limitation: hides content, not membership (the ratchet tree exposes member identities).

v1 ships no confidential components — this is forward-looking text only.

Notes

  • Test cases extended (now 15): max_uses single-use / bounded, slot poisoning by a non-key-holder, versioned-slot selection, forged-metadata blob.
  • The typed primitives + max_uses require a follow-up to xmtp/proto#334; the Reference-implementation section notes this and that the open libxmtp PRs get the consuming changes as separate follow-ups.
  • cspell clean; MD013/MD033 disabled in repo config.

Note

Revise XIP-82 spec with typed primitives, versioned-slot anti-poisoning, max_uses, and confidential-components direction

  • Replaces the ASCII overview in xip-82-external-commit-invites.md with a Mermaid sequence diagram and rewrites the end-to-end join flow covering candidate fetch, selection, pre-commit checks, and member-driven refresh.
  • Introduces typed wire primitives (SymmetricKey, GroupStateHash, ServicePointer) replacing raw bytes fields in ExternalInvitePayloadV1 and EncryptedGroupInfoBlobV1.
  • Adds max_uses and refresh_pointers fields to ExternalCommitPolicyV1, with enforcement via tagged admitted_via_external_group_id membership entries and lifecycle rules for revocation.
  • Expands receive-side validation: joiner uniqueness, delivery-service timestamp bounds, admitted_via tag equality, and max_uses enforcement by counting tagged membership entries.
  • Greatly expands security considerations covering slot poisoning mitigations, stolen-blob replay, removed-member implications, and joiner-side resource bounds.

Macroscope summarized 6ff3312.

Comment thread XIPs/xip-82-external-commit-invites.md Outdated
Comment thread XIPs/xip-82-external-commit-invites.md Outdated
Comment thread XIPs/xip-82-external-commit-invites.md
Comment thread XIPs/xip-82-external-commit-invites.md Outdated
Comment thread XIPs/xip-82-external-commit-invites.md Outdated
@macroscopeapp

macroscopeapp Bot commented Jun 8, 2026

Copy link
Copy Markdown

Approvability

Verdict: Needs human review

Unable to check for correctness in 6ff3312. The author does not own the main changed file (owned by @jhaaaa), and there are multiple unresolved medium-severity review comments identifying potential design inconsistencies in the specification. The designated code owner should review these substantial protocol specification changes.

You can customize Macroscope's approvability policy. Learn more.

@tylerhawkes tylerhawkes force-pushed the tyler/xip-82-review-fixes branch from 3021bf3 to 09756c8 Compare June 8, 2026 21:45
@tylerhawkes

Copy link
Copy Markdown
Contributor Author

Thanks for the review. Addressed all five findings in the latest push, and converted the overview to a mermaid sequenceDiagram.

🟠 High — max_uses convergence for new members. Correct and substantive — a member that joined after issuance can't replay history, so a commit-stream-derived count diverges. Reworked: the count is now materialized in group state. Each external committer tags its own GROUP_MEMBERSHIP entry with the external_group_id it joined under, and the use-count is the number of live entries carrying the active id — read straight from GroupInfo, so every member (including late joiners) converges. This keeps the atomic external-commit shape unchanged (the joiner still writes only its own entry). The tradeoff, now documented explicitly: semantics are concurrent (removing an invited member frees a slot), and the budget is scoped to external_group_id (rotating the slot resets it; a key-only rotation does not). Added a late-joiner convergence test case.

🟡 Medium — "unchanged on the wire" is false. Correct. Wrapping bytes in a submessage adds inner tag/length framing — a genuine wire-format change. Reworded to say so plainly, and noted the raw-bytes layout never shipped (Draft XIP; proto#334 is only on the unreleased feature/external-invite branch), so there's no compatibility constraint to honor.

🟡 Medium — duplicate coalescing. Changed rule 5 to MUST coalesce. Also expanded the Slot poisoning section, which previously implied all key-holder junk fails validation — it doesn't: a genuine-but-stale blob passes both AEAD and the consistency check, so a flood of distinct stale blobs can transiently evict the current one. Now documented as an availability-only, self-healing degradation (stale never validates → no bad state; bounded by FIFO + MUST-coalesce + rate-limiting + N), distinct from the permanent-wedge class which is structurally eliminated.

🟡 Medium — CCK over the app-stream leaks to all members. Correct. The v2 explicit-grant path now delivers the CCK encrypted specifically to the approved member (HPKE to their installation key(s) / a 1:1 channel), never as a group application message (which is encrypted to all current members and would leak to every other unapproved self-joiner). Still requires no commit.

🟢 Low — fork tie-breaker. Agree there's a loop risk, but a single deterministic pick would guarantee the loop (it re-selects the same losing fork on every re-fetch). Fixed with deterministic ordering (ascending group_state_hash) plus try-next-on-rejection — exclude a fork that fails validation and try the next; retry later only if all same-epoch forks are exhausted.

Flagging @jhaaaa per the ownership note — these are substantive protocol-spec revisions and should have an owner's eyes before merge.

Comment thread XIPs/xip-82-external-commit-invites.md
Comment thread XIPs/xip-82-external-commit-invites.md
Comment thread XIPs/xip-82-external-commit-invites.md Outdated
@tylerhawkes tylerhawkes force-pushed the tyler/xip-82-review-fixes branch from 09756c8 to 988edca Compare June 8, 2026 23:03
@tylerhawkes

Copy link
Copy Markdown
Contributor Author

Second round — fixed the three new findings in 988edca:

🟠 High — coalescing identity (line 421). Right, and subtle: keying coalescing on the plaintext (epoch, group_state_hash) is itself a wedge vector, since those fields are attacker-copyable onto garbage ciphertext and a "latest-wins" service would discard the genuine blob. Rule 5 now requires coalescing on the whole blob (epoch, group_state_hash, nonce, ciphertext); uploads sharing metadata but differing in ciphertext/nonce are distinct FIFO entries, and the joiner discards the undecryptable ones (the service can't — no key). The "retain a decryptable copy" alternative isn't available to a keyless service, noted inline.

🟡 Medium — SymmetricKey length (line 261). Right — I over-claimed the submessage enforces length; protobuf bytes doesn't. Restored the explicit requirement: validators/setters MUST check material is exactly 32 bytes (v1), in both the SymmetricKey definition and the policy field comment.

🟡 Medium — ServicePointer empty oneof (line 90). Right. Documented that exactly one location variant MUST be set; an empty oneof gives no fetch target and MUST fail closed as a parse error — and added that check to join-side flow step 1.

@@ -412,9 +479,10 @@ The `EXTERNAL_COMMIT_POLICY.allow_external_commit` master switch provides a sepa

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Low XIPs/xip-82-external-commit-invites.md:479

The max_uses count tracks only live GROUP_MEMBERSHIP entries with the active external_group_id, and rotating to a new external_group_id resets the budget. This means max_uses does not actually bound total group size from external invites—an admin can rotate the slot while previously invited members remain, then admit another max_uses members under the new slot. With max_uses = 1, one externally invited member can stay, the slot rotates, and a second external invite is accepted. The PR's goal of capping total externally invited members is not achieved; the mechanism only enforces a per-slot cap.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @XIPs/xip-82-external-commit-invites.md around line 479:

The `max_uses` count tracks only live `GROUP_MEMBERSHIP` entries with the **active** `external_group_id`, and rotating to a new `external_group_id` resets the budget. This means `max_uses` does not actually bound total group size from external invites—an admin can rotate the slot while previously invited members remain, then admit another `max_uses` members under the new slot. With `max_uses = 1`, one externally invited member can stay, the slot rotates, and a second external invite is accepted. The PR's goal of capping total externally invited members is not achieved; the mechanism only enforces a per-slot cap.

Comment thread XIPs/xip-82-external-commit-invites.md
Comment thread XIPs/xip-82-external-commit-invites.md Outdated
Comment thread XIPs/xip-82-external-commit-invites.md
Comment thread XIPs/xip-82-external-commit-invites.md Outdated
@tylerhawkes tylerhawkes force-pushed the tyler/xip-82-review-fixes branch 2 times, most recently from 41414bc to c5d00b7 Compare June 9, 2026 05:59
@tylerhawkes

Copy link
Copy Markdown
Contributor Author

Added an Invite redirection (known limitation) security subsection documenting a phishing/redirect risk and why it's hard to fix (c5d00b7).

The risk: the invite is a bearer capability — any holder of the QR (or any member) can both join and write refreshed blobs. So a holder can drop a blob wrapping a different, attacker-controlled group's GroupInfo into the slot, and a later scanner ends up in the attacker's group believing it's the advertised one.

Why the obvious fixes don't work, under the bearer model:

  • Signing the blob / per-invite key — "can refresh" and "can forge" are the same capability: every self-joiner needs the signing key to refresh the blob after joining, and the attacker joins via the same invite to obtain it. Withholding it instead breaks blob propagation (QR dies when the admin goes offline).
  • Binding to group_idgroup_id is opaque and forgeable: an attacker can create a second group with the same group_id but different members/data, so the check passes against their group. It only blocks the trivial variant.

So v1 does not cryptographically prevent redirection; it's documented as a known limitation for future work, with the interim mitigations being (a) treat the payload as a secret and (b) application-layer post-join verification (fingerprint cross-check / admin-identity challenge). Also corrected the now-stale "substitution is prevented" claims in Malicious service and Alternatives considered (AEAD only stops substitution by a party without the key), and added threat-model rows splitting keyless-substitution (prevented) from key-holder redirection (known limitation).

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium

// `symmetric_key` and `external_group_id` MUST be cleared (set to
// empty bytes) in the same proposal. Leaving stale coordinates in
// the group state after revoke would let a future re-enable
// accidentally revive a previously-distributed key.

After changing symmetric_key from raw bytes to the SymmetricKey submessage, the revoke invariant at lines 234-237 still instructs clearing it by "set to empty bytes". This is no longer well-typed for field 4, so different validator implementations could encode revocation inconsistently (symmetric_key absent vs symmetric_key {} vs empty material), causing divergent policy validation and potential state forks. Consider updating the comment to specify the exact wire encoding (absent field vs empty submessage vs empty material) and ensure libxmtp enforces that encoding uniformly.

-  //   * When `allow_external_commit` transitions to false (revoke):
-  //     `symmetric_key` and `external_group_id` MUST be cleared (set to
-  //     empty bytes) in the same proposal. Leaving stale coordinates in
+  //   * When `allow_external_commit` transitions to false (revoke):
+  //     `symmetric_key` and `external_group_id` MUST be cleared in the
+  //     same proposal. `symmetric_key` MUST be absent (not merely empty
+  //     `material`). Leaving stale coordinates in
   //     the group state after revoke would let a future re-enable
   //     accidentally revive a previously-distributed key.
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @XIPs/xip-82-external-commit-invites.md around lines 234-237:

After changing `symmetric_key` from raw `bytes` to the `SymmetricKey` submessage, the revoke invariant at lines 234-237 still instructs clearing it by "set to empty bytes". This is no longer well-typed for field 4, so different validator implementations could encode revocation inconsistently (`symmetric_key` absent vs `symmetric_key {}` vs empty `material`), causing divergent policy validation and potential state forks. Consider updating the comment to specify the exact wire encoding (absent field vs empty submessage vs empty `material`) and ensure `libxmtp` enforces that encoding uniformly.

Comment on lines +571 to +576
11. **Single active invited member (`max_uses = 1`)**: the first external-commit join is accepted by every member; a second concurrent external commit is rejected; after the first invited member is removed, a new external commit is admitted (the slot is freed — concurrent semantics).
12. **Bounded invite (`max_uses = N`)**: the commit that would create the (`N`+1)-th concurrent member tagged with the active `external_group_id` is rejected; rotating to a new `external_group_id` starts a fresh budget.
13. **Slot poisoning by a non-key-holder**: an attacker uploads a high-`epoch` garbage blob; the joiner discards it at AEAD-decrypt, selects the genuine blob, and joins successfully (slot not bricked).
14. **Versioned-slot selection**: the slot holds several valid blobs at different epochs; the joiner selects the highest-epoch one.
15. **Forged-metadata blob by a key-holder**: a blob whose plaintext `epoch` / `group_state_hash` disagree with the wrapped `GroupInfo` is discarded at the consistency check.
16. **`max_uses` convergence for a late joiner**: a member that joined *after* the invite was issued computes the same use-count from `GROUP_MEMBERSHIP` (entries tagged with the active `external_group_id`), not from replayed history, and reaches the same accept/reject decision as older members.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Low XIPs/xip-82-external-commit-invites.md:571

Lines 571-576 list test cases 11-16 (max_uses, versioned-slot selection, etc.) as "included" in the reference implementation, but line 580 states these features are only being "layered onto the open PRs as review follow-ups". This creates a contradiction: the document claims the tests exist while simultaneously admitting they do not, so implementers cannot verify the revised spec against the referenced code. Consider either removing the not-yet-implemented tests from the "included" list, or updating line 580 to clarify that the features are now present.

-11. **Single active invited member (`max_uses = 1`)**: the first external-commit join is accepted by every member; a second concurrent external commit is rejected; after the first invited member is removed, a new external commit is admitted (the slot is freed — concurrent semantics).
-12. **Bounded invite (`max_uses = N`)**: the commit that would create the (`N`+1)-th concurrent member tagged with the active `external_group_id` is rejected; rotating to a new `external_group_id` starts a fresh budget.
-13. **Slot poisoning by a non-key-holder**: an attacker uploads a high-`epoch` garbage blob; the joiner discards it at AEAD-decrypt, selects the genuine blob, and joins successfully (slot not bricked).
-14. **Versioned-slot selection**: the slot holds several valid blobs at different epochs; the joiner selects the highest-epoch one.
-15. **Forged-metadata blob by a key-holder**: a blob whose plaintext `epoch` / `group_state_hash` disagree with the wrapped `GroupInfo` is discarded at the consistency check.
-16. **`max_uses` convergence for a late joiner**: a member that joined *after* the invite was issued computes the same use-count from `GROUP_MEMBERSHIP` (entries tagged with the active `external_group_id`), not from replayed history, and reaches the same accept/reject decision as older members.
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @XIPs/xip-82-external-commit-invites.md around lines 571-576:

Lines 571-576 list test cases 11-16 (`max_uses`, versioned-slot selection, etc.) as "included" in the reference implementation, but line 580 states these features are only being "layered onto the open PRs as review follow-ups". This creates a contradiction: the document claims the tests exist while simultaneously admitting they do not, so implementers cannot verify the revised spec against the referenced code. Consider either removing the not-yet-implemented tests from the "included" list, or updating line 580 to clarify that the features are now present.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium

An external commit accepted under this XIP MUST consist of:
1. Exactly one `ExternalInit` proposal (RFC 9420 §12.4.3.2).
2. Zero or more by-value `Add` proposals, where every added `KeyPackage` carries the same `inbox_id` as the joiner's signing key. (This enforces "the joiner can add only their own installations.")
3. Exactly one `AppDataUpdate` proposal scoped to the joiner's own entry in the `GROUP_MEMBERSHIP` component. The entry MUST record the `external_group_id` the joiner was admitted under (this is what `max_uses` accounting counts). (This keeps the tree-level MLS membership and the AppData-level membership coupled in a single atomic commit — required by libxmtp's cross-layer invariants — and still touches only the joiner's own entry, so the atomic shape is unchanged.)

Requiring the joiner's GROUP_MEMBERSHIP entry to record the specific external_group_id they were admitted under breaks in-flight joins when an admin rotates external_group_id before commit validation. The joiner only knows the old slot id from their QR/blob, so they cannot populate the entry with the new active id, causing validators to reject the commit. This contradicts the stated stale-QR flow where commits should still validate and only blob re-upload is skipped.

-3. Exactly one `AppDataUpdate` proposal scoped to the joiner's own entry in the `GROUP_MEMBERSHIP` component. The entry MUST record the `external_group_id` the joiner was admitted under (this is what `max_uses` accounting counts). (This keeps the tree-level MLS membership and the AppData-level membership coupled in a single atomic commit — required by libxmtp's cross-layer invariants — and still touches only the joiner's own entry, so the atomic shape is unchanged.)
+3. Exactly one `AppDataUpdate` proposal scoped to the joiner's own entry in the `GROUP_MEMBERSHIP` component. (This keeps the tree-level MLS membership and the AppData-level membership coupled in a single atomic commit — required by libxmtp's cross-layer invariants.)
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @XIPs/xip-82-external-commit-invites.md around lines 403-407:

Requiring the joiner's `GROUP_MEMBERSHIP` entry to record the specific `external_group_id` they were admitted under breaks in-flight joins when an admin rotates `external_group_id` before commit validation. The joiner only knows the old slot id from their QR/blob, so they cannot populate the entry with the new active id, causing validators to reject the commit. This contradicts the stated stale-QR flow where commits should still validate and only blob re-upload is skipped.

@tylerhawkes tylerhawkes force-pushed the tyler/xip-82-review-fixes branch from c5d00b7 to b167b7a Compare June 9, 2026 08:21
@tylerhawkes

Copy link
Copy Markdown
Contributor Author

Fixed the CI markdown failure (MD032 — missing blank lines before the libxmtp:/proto:/openmls: lists) and the six new findings in b167b7a:

  • 🟠 High (GROUP_MEMBERSHIP has no field for the tag). Correct — GROUP_MEMBERSHIP maps inbox_id → uint64 with no per-member metadata. Documented the required additive schema change: each entry gains an optional admitted_via_external_group_id, set by the joiner in the same entry it already writes (atomic shape unchanged); the count is entries whose tag == the active external_group_id.
  • 🟠 High (step 8 / revocation). Step 8 now checks both external_group_id and symmetric_key, and is explicit that it's post-commit — it gates blob refresh, not joining. Key-only rotation does not revoke join ability (retained old-key blobs stay joinable until they age out / the epoch advances); the master switch is the only instant kill. Reconciled the rotation-flavor bullets and the Rotation section to match.
  • 🟡 Medium (coalescing omits expires_at_ns). Coalescing key is now all blob fields incl. expires_at_ns; metadata-only coalescing explicitly forbidden (covers the forge-expiry variant).
  • 🟡 Medium ("self-healing" overstated). Corrected: a flood of N distinct blobs evicts the genuine one and no join can succeed, so there's no joiner-driven re-upload — recovery needs an admin/member to actively re-upload. It's availability-only and no permanent wedge, but not automatic; rate-limiting + larger N are what bound it.
  • 🟡 Medium (revoke encoding for SymmetricKey). Pinned one canonical cleared encoding — symmetric_key absent (not empty submessage / not empty material) — in the message comment, field comment, and Revoke-atomicity invariant.
  • 🟢 Low (per-slot, not total). Stated plainly that max_uses is a per-invite/per-slot throttle, not a total-membership cap (rotation opens a fresh budget); a hard total cap is out of scope.

markdownlint + cspell both green locally.

Comment thread XIPs/xip-82-external-commit-invites.md Outdated
Comment thread XIPs/xip-82-external-commit-invites.md Outdated
Comment on lines +473 to +601
4. Parse the plaintext as a `VerifiableGroupInfo`.
5. Consistency: the GroupInfo's epoch equals `blob.epoch` **and** `digest(GroupContext)` equals `blob.group_state_hash.digest`. A blob whose key-holder lied about its metadata is discarded here.
4. If no candidate survives, abort: the slot is empty, stale, or poisoned. This is transient — a subsequent genuine upload repopulates the slot — so the client MAY retry later.
5. Among survivors, select the **highest-epoch** candidate. If several share the highest epoch with differing `group_state_hash` (concurrent forks), order them deterministically (e.g. ascending `group_state_hash`) and attempt them in that order; on a commit rejected at validation, exclude that fork and try the next. (A deterministic *single* pick is not sufficient — it would re-select the same losing fork on every re-fetch and loop forever.) If all same-epoch forks are exhausted, abort and retry later — a newer genuine blob may appear.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium XIPs/xip-82-external-commit-invites.md:473

Join-side flow step 5 aborts after exhausting same-epoch forks without falling back to lower-epoch survivors. If the highest-epoch candidate(s) are all validator-rejected forks, the joiner abandons the slot even when lower-epoch blobs are present and valid. Consider continuing iteration through sorted-by-descending-epoch survivors, attempting each distinct epoch until a commit succeeds or all candidates are exhausted.

-5. Among survivors, select the **highest-epoch** candidate. If several share the highest epoch with differing `group_state_hash` (concurrent forks), order them deterministically (e.g. ascending `group_state_hash`) and attempt them in that order; on a commit rejected at validation, exclude that fork and try the next. (A deterministic *single* pick is not sufficient — it would re-select the same losing fork on every re-fetch and loop forever.) If all same-epoch forks are exhausted, abort and retry later — a newer genuine blob may appear.
+5. Among survivors, select candidates by **descending epoch**, and within each epoch by deterministic `group_state_hash` order (e.g. ascending). Attempt each candidate in this order; on a commit rejected at validation, exclude that candidate and continue with the next. If all candidates are exhausted, abort and retry later — a newer genuine blob may appear.
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @XIPs/xip-82-external-commit-invites.md around line 473:

`Join-side flow` step 5 aborts after exhausting same-epoch forks without falling back to lower-epoch survivors. If the highest-epoch candidate(s) are all validator-rejected forks, the joiner abandons the slot even when lower-epoch blobs are present and valid. Consider continuing iteration through sorted-by-descending-epoch survivors, attempting each distinct epoch until a commit succeeds or all candidates are exhausted.

@tylerhawkes tylerhawkes force-pushed the tyler/xip-82-review-fixes branch from b167b7a to d2038ea Compare June 9, 2026 18:20
@tylerhawkes

Copy link
Copy Markdown
Contributor Author

Two corrections in d2038ea:

  • Re the High on GROUP_MEMBERSHIP: it's a TlsMap<InboxId, Bytes> where the per-member Bytes value is itself a protobuf — so it's already extensible, and admitted_via_external_group_id is a routine additive field on that message, not a structural change to the map (the finding assumed a uint64 value). Reworded the note accordingly.
  • Re the Low (per-slot vs total): the per-external_group_id scoping is intentionalmax_uses throttles a single invite/slot, by design. A cap on how many inboxes may join the group is a separate setting, not max_uses. Reworded to state this as design intent rather than a limitation.

markdownlint + cspell green; markdown CI passing.

Comment thread XIPs/xip-82-external-commit-invites.md Outdated
Comment thread XIPs/xip-82-external-commit-invites.md Outdated
Comment on lines +517 to +519
### Why the joiner selects the blob, not the service

The service is untrusted and keyless, so any authority it exercises over which blob "wins" is an authority an attacker can hijack. An earlier design ordered uploads by a plaintext `epoch` ("strictly newer wins"); but `epoch` is uploader-asserted, so a single forged `epoch = u64::MAX` upload — possible **without** the symmetric key — would mark every genuine upload stale and brick the slot permanently. The fix moves selection to the only party that can actually verify a blob: the joiner, who holds the key and runs MLS validation. The service's job shrinks to "retain the last N uploads, FIFO, and don't pretend to know which is real"; the joiner decrypts each candidate, checks it parses to a `GroupInfo` whose epoch and `GroupContext` digest match the blob metadata, and takes the freshest survivor. Poison from a non-key-holder fails AEAD; poison from a key-holder fails the consistency check; a permanent wedge degrades to transient unavailability (recoverable by an admin or member re-uploading — see "Slot poisoning").

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium XIPs/xip-82-external-commit-invites.md:517

Line 519 claims "poison from a key-holder fails the consistency check," but this contradicts the spec's own "Invite redirection (known limitation)" section. Any holder of the symmetric_key can forge a self-consistent blob for an attacker-controlled group where the epoch and GroupContext digest match the forged GroupInfo, passing the joiner's validation. The text documents a security property the protocol does not provide, which can mislead implementers into believing key-holder redirection is mitigated when it is not.

- Poison from a non-key-holder fails AEAD; poison from a key-holder fails the consistency check; a permanent wedge degrades to transient unavailability (recoverable by an admin or member re-uploading — see "Slot poisoning").
+ Poison from a non-key-holder fails AEAD; poison from a key-holder that uploads inconsistent metadata fails the consistency check, but a key-holder can still redirect to an attacker-controlled group with a fully self-consistent blob (see "Invite redirection (known limitation)"). A permanent wedge from non-key-holder poison degrades to transient unavailability (recoverable by an admin or member re-uploading — see "Slot poisoning").
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @XIPs/xip-82-external-commit-invites.md around lines 517-519:

Line 519 claims "poison from a key-holder fails the consistency check," but this contradicts the spec's own "Invite redirection (known limitation)" section. Any holder of the `symmetric_key` can forge a self-consistent blob for an attacker-controlled group where the epoch and `GroupContext` digest match the forged `GroupInfo`, passing the joiner's validation. The text documents a security property the protocol does not provide, which can mislead implementers into believing key-holder redirection is mitigated when it is not.

@tylerhawkes tylerhawkes force-pushed the tyler/xip-82-review-fixes branch from d2038ea to 124aba0 Compare June 9, 2026 19:37
Comment thread XIPs/xip-82-external-commit-invites.md
Comment thread XIPs/xip-82-external-commit-invites.md
Comment thread XIPs/xip-82-external-commit-invites.md Outdated
Comment thread XIPs/xip-82-external-commit-invites.md Outdated
@tylerhawkes tylerhawkes force-pushed the tyler/xip-82-review-fixes branch from 124aba0 to bdf6b54 Compare June 9, 2026 20:04
Comment thread XIPs/xip-82-external-commit-invites.md Outdated
Comment on lines +585 to +615

1. **Jitter.** Wait a uniformly random delay (RECOMMENDED 0–30 s).
2. **Re-check.** If the group's epoch advanced again during the wait, stop — the newer commit's refresh cycle supersedes this one.
3. **Check-before-write.** For each pointer in `refresh_pointers`: fetch the slot's candidates and validate them exactly as a joiner would (AEAD + consistency). If any candidate is valid at the current epoch, skip that slot — another member won the race. Otherwise wrap a fresh blob (fresh random nonce, effective `expires_at_ns`) and upload it.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium XIPs/xip-82-external-commit-invites.md:585

In Keeping the blob fresh step 3, the check-before-write logic stands down whenever it finds any candidate that is "valid at the current epoch" based only on AEAD + metadata consistency. A same-epoch blob from a different fork passes these checks even when its group_state_hash does not match the member's own current group state. Members on the winning fork will incorrectly stand down, leaving a losing-fork blob in the slot. Future joiners will select and attempt commits from that blob, which validators reject due to the hash mismatch, causing join failures until a later epoch coincidentally refreshes the slot.

- 3. **Check-before-write.** For each pointer in `refresh_pointers`: fetch the slot's candidates and validate them exactly as a joiner would (AEAD + consistency). If any candidate is valid at the current epoch, skip that slot — another member won the race. Otherwise wrap a fresh blob (fresh random nonce, effective `expires_at_ns`) and upload it.
+ 3. **Check-before-write.** For each pointer in `refresh_pointers`: fetch the slot's candidates and validate them exactly as a joiner would (AEAD + consistency). If any candidate is valid at the current epoch **and its `group_state_hash` matches the member's current group state**, skip that slot — another member on this fork won the race. Otherwise wrap a fresh blob (fresh random nonce, effective `expires_at_ns`) and upload it.
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @XIPs/xip-82-external-commit-invites.md around line 585:

In `Keeping the blob fresh` step 3, the check-before-write logic stands down whenever it finds *any* candidate that is "valid at the current epoch" based only on AEAD + metadata consistency. A same-epoch blob from a different fork passes these checks even when its `group_state_hash` does not match the member's own current group state. Members on the winning fork will incorrectly stand down, leaving a losing-fork blob in the slot. Future joiners will select and attempt commits from that blob, which validators reject due to the hash mismatch, causing join failures until a later epoch coincidentally refreshes the slot.

Comment thread XIPs/xip-82-external-commit-invites.md Outdated
@tylerhawkes tylerhawkes force-pushed the tyler/xip-82-review-fixes branch from bdf6b54 to 6e6fc7d Compare June 9, 2026 20:22
2. Fetch the candidate set from the resolved service slot (up to N `EncryptedGroupInfoBlob`s — see "Service contract"). Clients SHOULD bound this step: cap the number of candidates processed (RECOMMENDED: at least the service's retention N, e.g. 8) and cap the per-blob size — a malicious service must not be able to stream unbounded data through the decrypt loop. See "Joiner-side resource bounds and fetch hygiene".
3. For each candidate, discard it on the first failing check:
1. `blob.version` is recognized.
2. Not expired: `blob.expires_at_ns == 0 || now_ns < blob.expires_at_ns`.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium XIPs/xip-82-external-commit-invites.md:576

The Join-side flow candidate-expiry check at step 3.2 uses the joiner's local now_ns (line 576: blob.expires_at_ns == 0 || now_ns < blob.expires_at_ns), but validators reject commits based on delivery-service envelope timestamps (lines 560–561). A client with a slow clock accepts blobs already expired in the envelope-timestamp domain, builds a commit from them, and hits the "zombie join" this section claims to avoid: every member rejects the commit while the joiner believes it succeeded. Consider aligning the join-side check to use the same envelope-timestamp domain as validators, or at minimum documenting that local-clock skew creates this window of zombie-join risk.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @XIPs/xip-82-external-commit-invites.md around line 576:

The `Join-side flow` candidate-expiry check at step 3.2 uses the joiner's local `now_ns` (line 576: `blob.expires_at_ns == 0 || now_ns < blob.expires_at_ns`), but validators reject commits based on delivery-service envelope timestamps (lines 560–561). A client with a slow clock accepts blobs already expired in the envelope-timestamp domain, builds a commit from them, and hits the "zombie join" this section claims to avoid: every member rejects the commit while the joiner believes it succeeded. Consider aligning the join-side check to use the same envelope-timestamp domain as validators, or at minimum documenting that local-clock skew creates this window of zombie-join risk.

@tylerhawkes tylerhawkes force-pushed the tyler/xip-82-review-fixes branch from 6e6fc7d to 26f6259 Compare June 9, 2026 20:28
oneof location {
// RFC 3986 https URI. libxmtp parses and syntactically validates it.
string https_url = 1;
// Application-defined opaque bytes for non-URL transports (NFC tags,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium XIPs/xip-82-external-commit-invites.md:106

ServicePointer adds an opaque arm that is explicitly "opaque to libxmtp," but the spec later requires members to fetch and re-upload every refresh_pointer to keep slots fresh. When EXTERNAL_COMMIT_POLICY.refresh_pointers contains an opaque pointer, no interoperable fetch/upload procedure exists, so conforming implementations cannot refresh those invites — breaking the member-driven freshness mechanism.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @XIPs/xip-82-external-commit-invites.md around line 106:

`ServicePointer` adds an `opaque` arm that is explicitly "opaque to libxmtp," but the spec later requires members to fetch and re-upload every `refresh_pointer` to keep slots fresh. When `EXTERNAL_COMMIT_POLICY.refresh_pointers` contains an `opaque` pointer, no interoperable fetch/upload procedure exists, so conforming implementations cannot refresh those invites — breaking the member-driven freshness mechanism.

@tylerhawkes tylerhawkes force-pushed the tyler/xip-82-review-fixes branch from 26f6259 to 2af0112 Compare June 9, 2026 21:29
// a bound of 0 means "no bound" and drops out of the min):
//
// min(EXTERNAL_COMMIT_POLICY.expires_at_ns, // campaign end
// epoch_start_ns + EXTERNAL_COMMIT_POLICY.expire_in_ns)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium XIPs/xip-82-external-commit-invites.md:229

The expires_at_ns formula at line 229 uses epoch_start_ns + EXTERNAL_COMMIT_POLICY.expire_in_ns, but expire_in_ns is documented elsewhere in this spec as a duration since GroupInfo export. When an admin delays exporting GroupInfo and uses a short expire_in_ns, the joiner-side TTL becomes stricter than the validator-side bound: the joiner discards the blob early while validators would still accept it. This breaks the claimed 'zombie join' guarantee by causing false-negative joins that should succeed.

🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @XIPs/xip-82-external-commit-invites.md around line 229:

The `expires_at_ns` formula at line 229 uses `epoch_start_ns + EXTERNAL_COMMIT_POLICY.expire_in_ns`, but `expire_in_ns` is documented elsewhere in this spec as a duration **since GroupInfo export**. When an admin delays exporting GroupInfo and uses a short `expire_in_ns`, the joiner-side TTL becomes stricter than the validator-side bound: the joiner discards the blob early while validators would still accept it. This breaks the claimed 'zombie join' guarantee by causing false-negative joins that should succeed.

Comment thread XIPs/xip-82-external-commit-invites.md Outdated
Comment on lines +838 to +850
- **Fetch hygiene for `https_url`.** Clients MUST require the `https` scheme, MUST NOT follow redirects to other schemes, and MUST ignore credentials embedded in the URL. Scanning a QR and fetching immediately reveals the scanner's IP address (and scan time) to whoever controls the URL — before the user has decided anything. Clients SHOULD obtain user confirmation before the first fetch for an invite, and MAY route fetches through a proxy.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Medium XIPs/xip-82-external-commit-invites.md:838

The https_url fetch hygiene in the Joiner-side resource bounds and fetch hygiene section only forbids redirects to a different scheme, not to a different host. A malicious https:// URL can still 30x redirect the client to an attacker-controlled https:// origin, leaking the scanner's IP address and timing information to a third party and potentially serving attacker-controlled content. Consider prohibiting all cross-origin redirects, not just scheme changes.

Suggested change
- **Fetch hygiene for `https_url`.** Clients MUST require the `https` scheme, MUST NOT follow redirects to other schemes, and MUST ignore credentials embedded in the URL. Scanning a QR and fetching immediately reveals the scanner's IP address (and scan time) to whoever controls the URL — before the user has decided anything. Clients SHOULD obtain user confirmation before the first fetch for an invite, and MAY route fetches through a proxy.
- **Fetch hygiene for `https_url`.** Clients MUST require the `https` scheme, MUST NOT follow redirects to other schemes, and MUST ignore credentials embedded in the URL. Scanning a QR and fetching immediately reveals the scanner's IP address (and scan time) to whoever controls the URL — before the user has decided anything. Clients SHOULD obtain user confirmation before the first fetch for an invite, and MAY route fetches through a proxy.
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file @XIPs/xip-82-external-commit-invites.md around line 838:

The `https_url` fetch hygiene in the `Joiner-side resource bounds and fetch hygiene` section only forbids redirects to a different scheme, not to a different host. A malicious `https://` URL can still 30x redirect the client to an attacker-controlled `https://` origin, leaking the scanner's IP address and timing information to a third party and potentially serving attacker-controlled content. Consider prohibiting all cross-origin redirects, not just scheme changes.

tylerhawkes added a commit to xmtp/proto that referenced this pull request Jun 9, 2026
…fresh_pointers, AAD + effective expiry, admitted_via tag

Follow-up to #334, mirroring the revised XIP-82 (xmtp/XIPs#140). See
the PR body for the full breakdown.

Also extends the local-only AppDataUpdateData.V1 intent with repeated
additional_updates so writes whose invariants couple components
atomically (XIP-82 enable atomicity: policy + GROUP_MEMBERSHIP grant)
can land in one commit.
tylerhawkes added a commit to xmtp/proto that referenced this pull request Jun 9, 2026
…fresh_pointers, AAD + effective expiry, admitted_via tag

Follow-up to #334, mirroring the revised XIP-82 (xmtp/XIPs#140). See
the PR body for the full breakdown.

Also extends the local-only AppDataUpdateData.V1 intent with repeated
additional_updates so writes whose invariants couple components
atomically (XIP-82 enable atomicity: policy + GROUP_MEMBERSHIP grant)
can land in one commit.
…ing, max_uses cap, confidential-components direction
@tylerhawkes tylerhawkes force-pushed the tyler/xip-82-review-fixes branch from 65e305a to 0ba04f9 Compare June 15, 2026 20:02
@tylerhawkes tylerhawkes requested a review from a team as a code owner June 15, 2026 20:02
@tylerhawkes tylerhawkes force-pushed the tyler/xip-82-external-commit-invites branch from 1ac135e to edf4369 Compare June 15, 2026 20:02
allow_external_commit itself authorizes the joiner-self-entry insert (implied grant); external_committer_permissions reduced to a reserved deny-by-default surface. Rework blob refresh to committer-immediate (primary, no jitter) + member existence-check backstop + rare deep-check self-heal, justified by the group_state_hash unforgeability (witness) argument. Add service-contract rules for (epoch, group_state_hash) lookup and an existence query; align join-side step 10, the overview diagram, and the service-visible-metadata note.

Address the open Macroscope review threads: measure expire_in_ns from epoch start (not GroupInfo export); join-side selection iterates descending-epoch through all survivors (lower-epoch fallback); align the key-holder-poison wording with the redirection limitation; document the joiner local-clock vs envelope-timestamp skew; specify opaque refresh_pointers as application-mediated; clarify admitted_via is the GroupInfo's active id (epoch-bound, so rotation can't break in-flight joins); forbid cross-origin redirects; resolve the test-cases-vs-follow-ups contradiction.
@tylerhawkes tylerhawkes force-pushed the tyler/xip-82-review-fixes branch from 0ba04f9 to 6ff3312 Compare June 15, 2026 23:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants