Fix int32 overflow in decompress() for frames > 256 MiB#21
Open
rasmusfaber wants to merge 1 commit into
Open
Conversation
5 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
fzstd.decompress()throwsinvalid zstd data(or silently corrupts output) on any zstd frame whose compressed size exceeds 256 MiB.Root cause: bit positions are computed as
bt << 3, wherebtis a byte offset that grows across blocks within a frame.<<coerces to int32, so atbt = 2**28it wraps:The paired
>> 3then reads the wrong bytes and decoding derails. Four sites insrc/index.tsuse this pattern; the streamingDecompressclass is unaffected because it resets the offset per chunk.Fix
Replace
x << 3withx * 8so the bit position is a float64 that doesn't wrap, and replace the pairedx >> 3withMath.floor(x / 8)(14 sites).x & 7stays as-is, int32 coercion preserves the low three bits.Math.floor(x / 8)is chosen overx / 8 | 0because| 0would have wrapped negative once the quotient reached 2³¹ (compressed byte offset >= 2 GiB), introducing a secondary overflow within the 4 GiBUint8Arrayceiling.Math.flooris correct for any frame size.Performance
No measurable regression. On a 200 MiB match-heavy frame, 21 alternated runs per build: +0.44% (inside noise). V8 lowers
Math.floor(x / 8)to the same integer shift as>> 3on SMI operands, and the surrounding inner loop is memory-bound.End-to-end check on a real 1.38 GiB JSON payload (1.03 GiB compressed): pre-fix throws after 2.6 s; post-fix decompresses in ~11 s, sha256 matches the original.
Test
tests/large_frame_overflow_test.tsround-trips a 500 MiB match-heavy corpus through a single frame > 256 MiB,:Fails before this change (
invalid zstd data), passes after. Runs in ~7 s.Requires
zstdon PATH.I am happy to remove the test again, but I thought it useful to demonstrate the problem and verify the fix.