fix(adapter/sqs): standard SendMessageBatch double-send under leader churn#923
Conversation
…churn The standard-queue SendMessageBatch path (sendMessageBatchWithRetry) has an in-process OCC retry loop that re-minted a fresh random MessageID (and send timestamp) for every entry on EACH attempt. Under a Raft leader-election storm an attempt can commit its writes but still surface a self-inflicted WriteConflict; the retry then re-mints new MessageIDs and writes a SECOND copy of every entry at new keys -> every batch entry is double-sent (the :duplicate-elements class fixed for DynamoDB in PR #920). Unlike single SendMessage (no in-process retry -> a conflict is a normal at-least-once SDK retry) and FIFO send (MessageDeduplicationId dedup record short-circuits retries), the standard batch path had neither fence. Fix: pre-generate one stable sqsSendIdentity (MessageID + receipt token + send timestamp) per entry BEFORE the retry loop and reuse it on every attempt (buildSendRecordWithIdentity). The storage keys (data/vis/byage) then stay constant across retries, so a committed-but-conflicted attempt is overwritten idempotently instead of duplicated. No FSM probe / PrevCommitTS / gate is needed (keys are content-stable -> plain idempotent overwrite); the record's QueueGeneration is re-derived per attempt so a concurrent DeleteQueue/PurgeQueue generation bump is still followed. Also records the S3/SQS audit outcome in the DynamoDB dedup design doc: S3 is safe (whole-value PUT at stable key + StartTS OCC fence; multipart manifest is deterministic from immutable parts), SQS single/FIFO safe; only standard batch needed the fix. Tests: adapter/sqs_batch_send_dedup_test.go pins key stability across the retry (fresh MessageIDs would fail it).
|
Warning Review limit reached
More reviews will be available in 4 minutes and 11 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@claude review |
|
Claude finished @bootjp's task in 3m 21s —— View job Code Review:
|
There was a problem hiding this comment.
Code Review
This pull request fixes a deduplication bug in SQS standard SendMessageBatch where retries would re-mint fresh message IDs and timestamps, potentially causing double-sends under leader churn. This is resolved by pre-generating a stable sqsSendIdentity per entry before the retry loop and reusing it across attempts. The design documentation and tests have been updated accordingly. The review feedback suggests optimizing this implementation by lazily initializing the identities only when entering the standard queue path, preventing unnecessary cryptographic random reads for FIFO queues.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| identities := make([]sqsSendIdentity, len(entries)) | ||
| for i := range identities { | ||
| id, err := newSendIdentity() | ||
| if err != nil { | ||
| return nil, nil, err | ||
| } | ||
| identities[i] = id | ||
| } |
There was a problem hiding this comment.
Pre-generating sqsSendIdentity for all entries before the retry loop is inefficient for FIFO queues, as FIFO queues completely ignore these identities (they mint their own per-entry dedup-fenced identity). Since newSendIdentity() calls crypto/rand.Read twice (once for MessageID and once for ReceiptToken), this results in up to 20 unnecessary cryptographic random reads per batch send on FIFO queues, which can be a significant performance bottleneck under high throughput.
We can optimize this by pre-allocating the identities slice with zero values, and then lazily initializing each identity only when we actually enter the standard queue path (sendBatchStandardOnce) and only if it hasn't been initialized yet. Since slices are passed by reference, the initialized identities will be preserved across retries.
| identities := make([]sqsSendIdentity, len(entries)) | |
| for i := range identities { | |
| id, err := newSendIdentity() | |
| if err != nil { | |
| return nil, nil, err | |
| } | |
| identities[i] = id | |
| } | |
| identities := make([]sqsSendIdentity, len(entries)) |
| for i, entry := range entries { | ||
| // identities[i] is the stable per-entry identity minted once before | ||
| // the retry loop; reusing it keeps the storage keys constant across | ||
| // retries so a committed-but-conflicted attempt is overwritten | ||
| // idempotently rather than double-sent. | ||
| rec, recordBytes, apiErr := buildBatchSendRecord(meta, entry, identities[i]) |
There was a problem hiding this comment.
Lazily initialize the identity on the first attempt if it's empty. This avoids generating unused identities for FIFO queues.
| for i, entry := range entries { | |
| // identities[i] is the stable per-entry identity minted once before | |
| // the retry loop; reusing it keeps the storage keys constant across | |
| // retries so a committed-but-conflicted attempt is overwritten | |
| // idempotently rather than double-sent. | |
| rec, recordBytes, apiErr := buildBatchSendRecord(meta, entry, identities[i]) | |
| for i, entry := range entries { | |
| if identities[i].messageID == "" { | |
| id, err := newSendIdentity() | |
| if err != nil { | |
| return nil, nil, false, err | |
| } | |
| identities[i] = id | |
| } | |
| // identities[i] is the stable per-entry identity minted once before | |
| // the retry loop; reusing it keeps the storage keys constant across | |
| // retries so a committed-but-conflicted attempt is overwritten | |
| // idempotently rather than double-sent. | |
| rec, recordBytes, apiErr := buildBatchSendRecord(meta, entry, identities[i]) |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 30b902b024
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // the retry loop; reusing it keeps the storage keys constant across | ||
| // retries so a committed-but-conflicted attempt is overwritten | ||
| // idempotently rather than double-sent. | ||
| rec, recordBytes, apiErr := buildBatchSendRecord(meta, entry, identities[i]) |
There was a problem hiding this comment.
Keep visibility index keys idempotent across retries
When a standard batch entry omits per-message DelaySeconds, this call rebuilds the record from freshly loaded queue metadata on every retry while reusing only the MessageId/timestamp identity. If the first attempt commits but reports a retryable conflict under leader churn, and SetQueueAttributes changes the queue DelaySeconds before the retry, the retry writes a second vis key for the same MessageId because AvailableAtMillis changes; the old committed vis key is never deleted. The receive path trusts the visibility index entry it scans, so that stale key can make the message visible before the new delay and leave another stale visibility entry that can redeliver the same message again. Either the delay-derived index key also needs to remain stable for the retry series, or retries need to clean up the previous vis/by-age keys for the same identity.
Useful? React with 👍 / 👎.
…ableAtMillis Addresses the #923 review: - gemini (medium): pre-generating an identity for every entry before the retry loop wasted up to 2 crypto/rand reads per entry on FIFO batches (which ignore these identities). Mint lazily on the first standard-path attempt (sendBatchStandardOnce) into a zero-valued slice; the slice is shared by reference so the mint persists across retries. FIFO batches now pay nothing. - codex (P2): the vis/by-age keys embed AvailableAtMillis = sendTs + delay. The delay was re-resolved from the freshly-read meta on every attempt, so a SetQueueAttributes that changed DelaySeconds between a committed-but-conflicted attempt and its retry would shift the vis key and orphan the first attempt's vis index entry (a stale entry that can redeliver). Capture AvailableAtMillis in the identity at first mint and reuse it, so the keys stay stable across the retry regardless of a mid-flight delay change. Validation + delay resolution also run only on first mint, so an already-committed entry is never re-failed against a since-changed meta. buildSendRecordWithIdentity drops its delay param (AvailableAtMillis now lives in the identity); buildSendRecord's external signature is unchanged, so the single-send and FIFO callers are unaffected (audited). Tests: add TestSendMessageBatchStandard_VisKeyStableAcrossDelayChange (a mid-retry DelaySeconds bump must not move the keys).
Round-1 review fixes pushedgemini (medium) — wasted identity minting on FIFO batches (FIXED)Identities are now minted lazily on the first standard-path attempt (into a zero-valued, by-reference slice reused across retries) instead of all upfront. FIFO batches — which mint their own per-entry dedup-fenced identity — pay zero crypto/rand cost for identities they ignore. codex (P2) — vis key not stable across a mid-retry DelaySeconds change (FIXED)
New test: @claude review |
|
Claude finished @bootjp's task in 2m 37s —— View job Claude finished @bootjp's task — Code Review: Round-1 fixes (
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8770f13aa5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| MessageGroupId: entry.MessageGroupId, | ||
| MessageDeduplicationId: entry.MessageDeduplicationId, | ||
| } | ||
| if id.messageID == "" { |
There was a problem hiding this comment.
Revalidate cached batch entries after OCC conflicts
When a standard batch attempt loses a normal OCC race to SetQueueAttributes, the first dispatch may not have committed any message, but this path has already populated id.messageID. On the retry, a now-cached entry skips mintBatchSendIdentity, so it also skips metadata-dependent checks such as MaximumMessageSize; if the concurrent attribute update lowered the limit below this body size, the retry can commit a message that the fresh queue metadata should reject. The stable identity should be reused for keys, but the per-attempt validation against the currently loaded meta still needs to run before committing.
Useful? React with 👍 / 👎.
… current meta every attempt codex P2 round-2: round-1 moved per-entry validation into the first-mint path, so an entry whose identity was minted by a prior UNCOMMITTED attempt (lost a normal OCC race) skipped validation on retry. If a concurrent SetQueueAttributes lowered MaximumMessageSize (or tightened another meta-dependent limit) in the meantime, the retry could commit a message the fresh meta should reject. Split the two concerns: - validateBatchEntry runs against the CURRENT meta on EVERY attempt (body required, MaximumMessageSize, attributes, FIFO params, delay bounds) and returns the resolved delay. - The stable identity (MessageID/token/timestamps incl. AvailableAtMillis) is still minted ONCE and reused, so the keys stay stable across retries (round-1 vis-key stability preserved); AvailableAtMillis is seeded from the first attempt's delay, and re-resolving delay per attempt only re-checks bounds. Caller audit: validateBatchEntry has one caller (buildBatchSendRecord); buildBatchSendRecord has one caller (sendBatchStandardOnce); mintBatchSendIdentity removed (no refs). Single-send/FIFO paths unaffected (buildSendRecord unchanged). Test: TestSendMessageBatchStandard_RevalidatesAgainstCurrentMetaOnRetry — a mid-flight MaximumMessageSize drop must make the retry reject the now-too-big entry.
Round-2 review fix pushedcodex (P2 round-2) — re-validate batch entries against current meta on every attempt (FIXED)Round-1 moved per-entry validation into the first-mint path, so an entry whose identity was minted by a prior uncommitted attempt (lost a normal OCC race) skipped validation on retry — if a concurrent Split the two concerns:
Caller audit: New test: @claude review |
|
Claude finished @bootjp's task in 3m 10s —— View job Code Review: Round-2 fix (
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ac3e12870b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // first attempt and reused on every retry; reusing it keeps the storage | ||
| // keys constant across retries so a committed-but-conflicted attempt is | ||
| // overwritten idempotently rather than double-sent. | ||
| rec, recordBytes, apiErr := buildBatchSendRecord(meta, entry, &identities[i]) |
There was a problem hiding this comment.
Fence retry overwrites against consumer mutations
When the first batch dispatch actually commits but returns a retryable conflict, this reused identity makes the retry PUT the same data/vis/by-age keys again, while the transaction still only reads the queue meta/generation keys. If the message is visible and a consumer receives or deletes it during the retry backoff, the second PUT can overwrite the rotated CurrentReceiptToken/VisibleAtMillis or even recreate a deleted record, because receive/delete mutate these same message keys under their own OCC transactions. The idempotent retry needs to read/fence the message keys or otherwise avoid overwriting a record that has changed since the first attempt.
Useful? React with 👍 / 👎.
…doc + note residual edge claude[bot] approved round-2; these are the two non-blocking nits it raised: - Remove the stale first paragraph of buildBatchSendRecord's doc comment (it referenced a *sqsAPIError return that no longer exists and duplicated the accurate description that follows). - Document the residual edge in the design doc: if attempt 1 commits and a concurrent SetQueueAttributes tightens a limit before the retry, the retry re-validates and reports the (already-stored) entry as failed — an inconsistent client view within the at-least-once contract, never a double-send. No code change. No functional change.
Round-3 (docs only) pushed — addresses the two non-blocking nits from the round-2 approvalclaude[bot] approved round-2; these close the remaining observations:
No functional change (comment + markdown only); @claude review |
|
Claude finished @bootjp's task in 1m 56s —— View job Code Review: Round-3 (docs only)
Stale doc comment removal — correctThe four stale lines are gone. What remains (
The Design doc update — correct and well-scopedThe updated paragraph (
The residual edge framing ("distinguishing committed-vs-not would need a dedup probe, which is out of proportion for this corner") is accurate and proportionate. Unaddressed codex P2 from round-3 — note for completenessThe codex review posted at 2026-06-04T13:13:04Z (reviewing the round-2 commit) surfaced a new P2: "Fence retry overwrites against consumer mutations" at The concern: if attempt 1 commits but returns a retryable conflict, and during the retry backoff a consumer calls Assessment: Real but narrow-window and within contract. The conditions require: zero delay (message immediately visible), a consumer calling VerdictApprove. The round-3 docs-only changes are correct, precise, and close the two nits from the round-2 approval. The unaddressed codex P2 on consumer-mutation fencing is real but narrow, within the at-least-once contract, and out of scope for this PR's fix. Safe to merge. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 65fe7d0d75
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| delay, err := validateBatchEntry(meta, entry, asSingle) | ||
| if err != nil { | ||
| return nil, nil, err |
There was a problem hiding this comment.
Preserve committed entries after retry-time revalidation
When the first standard batch attempt actually commits but surfaces a retryable conflict (the leader-churn case this patch is fixing), a concurrent SetQueueAttributes that lowers MaximumMessageSize before the retry makes this validation return an error even though id.messageID is already cached from the committed attempt. If all affected entries fail here, sendBatchStandardOnce returns them in Failed[] without rewriting anything, so the client receives a sender-fault failure for a message that is already visible in the queue. Fresh evidence beyond the earlier revalidation discussion is that validation now runs before checking whether the stable identity was already minted for a prior attempt.
Useful? React with 👍 / 👎.
codex P2 ×2 → deferred to #931 (decision: merge this fix, follow up on the full dedup)Both new codex P2 findings on Per maintainer decision, this PR ships the double-send fix (the Jepsen-caught
Net: strict improvement over |
Summary
Fixes a double-send bug in the SQS standard-queue
SendMessageBatchpath — the same:duplicate-elementsclass fixed for DynamoDB in PR #920, found by an adversarial S3/SQS audit prompted by the "extend dedup to S3/SQS" follow-up.The bug
sendMessageBatchWithRetry→trySendMessageBatchOnce→sendBatchStandardOnceruns an in-process OCC retry loop that, on every attempt, re-ranbuildBatchSendRecord→buildSendRecord→newMessageIDHex(), minting a fresh random MessageID (and send timestamp) per entry per attempt, and derived thedata/vis/byagestorage keys from them.Under a Raft leader-election storm, attempt 1 can commit its writes yet surface a self-inflicted
WriteConflictto the adapter (the mechanism PR #920 documented). The retry then mints new MessageIDs and writes a second copy of every entry at new keys:{M1,M2,M3}, commits, but returnsWriteConflictunder churn.{M1',M2',M3'}, commits cleanly, returns{M1',M2',M3'}+ HTTP 200.Standard queues carry no dedup identity, and the OCC
ReadKeysare only[meta, gen](the random data keys never self-conflict), so nothing fenced this.Why only batch: single
SendMessagehas no in-process retry (a conflict becomes a normal at-least-once SDK retry), and FIFO send is fenced by theMessageDeduplicationIddedup record. The batch path added an in-process retry without either fence.The fix
Pre-generate one stable
sqsSendIdentity(MessageID + receipt token + send timestamp) per entry, before the retry loop, and reuse it on every attempt (buildSendRecordWithIdentity). The storage keys then stay constant across retries, so a committed-but-conflicted attempt is overwritten idempotently instead of duplicated.No FSM probe /
PrevCommitTS/ gate is needed — unlike DynamoDB'slist_append(which re-read a growing value), the batch write is a whole-value PUT at a now-stable key, so re-applying is a plain idempotent overwrite. The record'sQueueGenerationis still re-derived per attempt, so a concurrentDeleteQueue/PurgeQueuegeneration bump is followed correctly (the stale-gen attempt-1 write is orphaned/unreachable).S3/SQS audit outcome (recorded in the design doc)
A handler-by-handler adversarial sweep found the original "S3/SQS share the same latent gap" note was over-broad:
PutObjectis OCC-fenced by a caller-suppliedStartTS(a leader-move retry that races an interleaved write fails the fence rather than clobbering), and has no in-process re-read-recompute.CompleteMultipartUpload's manifest is re-assembled deterministically from the immutable uploaded parts (computed once before its retry loop), not from a re-read of the object value → no duplicate.SendMessage/ FIFO — safe (random-UUID key + no in-process retry; FIFO dedup record).SendMessageBatch— fixed here.Tests (
adapter/sqs_batch_send_dedup_test.go)TestSendMessageBatchStandard_RetryReusesStableKeys— first dispatch commits-then-conflicts; asserts the retry dispatches the same storage keys (fresh MessageIDs would fail it → reproduces the bug pre-fix).TestSendMessageBatchStandard_ReturnedMessageIdsStableAcrossRetry— the returned MessageId is the one actually written on both attempts (client id == stored id).Validation
go test ./adapter/ -run 'SendMessage|Batch|FIFO|Send'pass; new dedup tests pass.golangci-lint run ./adapter/...0 issues;go build ./...clean.Self-review (5 passes)
[meta,gen]ReadKeys fence againstDeleteQueue/PurgeQueueis unchanged.crypto/randcalls on the retry path); hot path (no retry) is one mint per entry, same as before.