
fix: COBS code-byte count: ceil(u/254) vs floor(u/254) + 1#222

Open
rayozzie wants to merge 1 commit into master from ray/cobs-lengths

Conversation

@rayozzie
Contributor

@zfields this could be crap, but it's Claude's analysis and fix.

The bug is not in the Notecard, but I have reverted the Notecard anyway, just to put it back to its conservative approach so it is more forgiving in case anyone else has an app-level bug like this.

Obviously you should look deeply and analyze. BTW, when you DO ultimately send code back to him, please ping me and I'll give you the right Notecard code to send him.

======================================================================

Background

The Notecard firmware's cobsGuaranteedFit() tells the host library the maximum number of unencoded binary bytes that will safely fit in the Notecard's I/O buffer after COBS encoding. For years, the Notecard used a simple, slightly conservative formula:

overhead = 1 + (buflen / 254) + 1
return buflen - overhead

This estimates the COBS overhead from the buffer size rather than from the (smaller) payload size, which systematically overestimates the number of code bytes by a small amount. For the Notecard's 256 KB I/O buffer (262,144 bytes), this formula reports max = 261,110.
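As a sanity check, the old formula can be evaluated directly. A minimal sketch (the function name is illustrative, not the actual firmware identifier):

```c
#include <stdint.h>

// Old conservative formula: estimates COBS overhead from the buffer
// size rather than the payload size, so it can only under-promise
// capacity, never over-promise it.
uint32_t cobsGuaranteedFitOld(uint32_t buflen) {
    uint32_t overhead = 1 + (buflen / 254) + 1;
    return buflen - overhead;
}
```

For the 262,144-byte buffer this yields 261,110, matching the figure above.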

A recent firmware optimization replaced this with a formula intended to compute the mathematically exact answer. For the same buffer it now reports max = 261,114 -- four bytes more than before.

A customer immediately reported that filling the buffer to this new maximum causes card.binary.get to fail. The old limit of 261,110 had always worked; 261,114 does not.

The bug: ceil(u/254) vs floor(u/254) + 1

COBS encoding partitions data into blocks of up to 254 bytes, prefixing each block with a "code byte" that indicates its length (or the distance to the next zero). The central question for buffer-size arithmetic is: how many code bytes does worst-case encoding produce for u data bytes?

The note-c library computed this as:

ceil(u / 254)

The correct answer is:

floor(u / 254) + 1

For most values these are identical. They diverge at exact multiples of 254:

ceil(254 / 254) = 1     WRONG
floor(254 / 254) + 1 = 2     CORRECT
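The divergence is easy to check mechanically. A sketch with illustrative helper names:

```c
#include <stdint.h>

// Two candidate code-byte counts for u data bytes (worst case, no zeros).
uint32_t codeBytesCeil(uint32_t u)       { return (u + 253) / 254; } // ceil(u/254)
uint32_t codeBytesFloorPlus1(uint32_t u) { return (u / 254) + 1; }   // floor(u/254)+1
```

The two agree everywhere except at exact multiples of 254, where ceil comes up one short.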

Why the +1? Walk through what the encoder actually does with 254 non-zero bytes:

1. Reserve initial code byte             (code = 1)
2. Copy 254 data bytes                   (code increments to 0xFF)
3. code == 0xFF: write it, reserve new   (first code byte emitted)
4. No data remains; loop exits
5. Write final code byte = 0x01          (second code byte emitted)

Output: [0xFF] [254 data bytes] [0x01]  =  256 bytes

The encoder ALWAYS writes a final code byte for the last block, even when the data ends exactly on a 254-byte boundary. That final code byte is the one that ceil() misses. ceil(254/254) = 1, but the encoder emitted two code bytes.
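The walk-through above corresponds to a minimal reference COBS encoder. This is a sketch, not the note-c implementation:

```c
#include <stddef.h>
#include <stdint.h>

// Minimal reference COBS encoder (sketch). Returns the encoded length;
// out must hold at least len + len/254 + 1 bytes.
size_t cobsEncode(const uint8_t *in, size_t len, uint8_t *out) {
    size_t codeIdx = 0;   // where the current block's code byte goes
    size_t wr = 1;        // next write position
    uint8_t code = 1;
    for (size_t i = 0; i < len; i++) {
        if (in[i] == 0) {
            out[codeIdx] = code;      // close the block at the zero
            codeIdx = wr++;
            code = 1;
        } else {
            out[wr++] = in[i];
            if (++code == 0xFF) {     // block full after 254 data bytes
                out[codeIdx] = code;  // first code byte emitted
                codeIdx = wr++;       // reserve a new code byte
                code = 1;
            }
        }
    }
    out[codeIdx] = code;              // final code byte: ALWAYS written
    return wr;
}

// Illustrative helper: encoded length of n non-zero bytes (n <= 512).
size_t encodedLenOfNonZero(size_t n) {
    static uint8_t in[512], out[520];
    for (size_t i = 0; i < n; i++) {
        in[i] = 0x55;
    }
    return cobsEncode(in, n, out);
}
```

Feeding it 254 non-zero bytes reproduces the 256-byte output shown above: one 0xFF code byte, the data, and the trailing 0x01 that ceil() fails to count.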

This error appeared in two note-c functions:

_cobsEncodedMaxLength(u): Used ceil(u/254) for the code byte count,
underestimating by 1 at exact multiples of 254. For u = 254 it
returned 256 instead of the correct 257.

_cobsGuaranteedFit(bufLen): The algebraic inverse, which solved for
the largest payload that fits in a given buffer. It used the same
flawed ceil-based derivation. At buffer sizes of the form 255k + 1
(i.e. 256, 511, 766, ...) it overestimated the safe payload by 1,
promising a fit that would actually overflow after encoding.
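The overestimate at 255k + 1 buffer sizes can be demonstrated with the old note-c inverse and the exact worst-case encoded size. A sketch, with the "+ 2" taken from the constraint u + floor(u/254) + 2 <= bufLen used later in this description:

```c
#include <stdint.h>

// Old (flawed) note-c inverse, as described above.
uint32_t cobsGuaranteedFitOldNoteC(uint32_t bufLen) {
    uint32_t e = bufLen - 1;
    return e - (e + 254) / 255;
}

// Exact worst-case encoded size for u data bytes: u data bytes,
// floor(u/254) + 1 code bytes, plus the trailing frame-delimiter byte.
uint32_t exactEncodedSize(uint32_t u) {
    return u + (u / 254) + 2;
}
```

For a 256-byte buffer the old inverse promises 254 payload bytes, but 254 bytes encode to 257 bytes: a one-byte overflow, exactly as described above.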

Why it was hidden, and why it surfaced

The Notecard's old conservative cobsGuaranteedFit reported max = 261,110 for a 262,144-byte buffer. The exact answer is 261,114. That four-byte gap meant the host never attempted to store a payload at a size where the ceil-vs-floor discrepancy in _cobsEncodedMaxLength could matter. The conservative formula had been quietly absorbing the note-c library's arithmetic error for years.

When the Notecard's formula was updated to report the exact maximum, the safety margin vanished. The host was now operating at the true edge of the buffer, where every byte of COBS overhead must be accounted for precisely. Any arithmetic error -- in the host library's buffer validation, in the encoding expansion calculation, or anywhere else in the pipeline -- would now manifest as a real failure instead of being silently absorbed by the four spare bytes.

The Notecard firmware has been reverted to the old conservative formula. It has a long track record, is obviously correct (it can only underestimate capacity, never overestimate), and its cost is negligible: roughly bufLen/64770 bytes of unused capacity, which is about 4 bytes per 256 KB.

The fix (note-c)

_cobsEncodedMaxLength: Replace the ceil-based code byte computation with floor + 1:

Old:  codeBytes = (length / 254) + ((length % 254) > 0)   // ceil
New:  codeBytes = (length / 254) + 1                       // floor + 1

_cobsGuaranteedFit: Replace the flawed inverse derivation. Given the constraint u + floor(u/254) + 2 <= bufLen:

Old:  e = bufLen - 1;  return e - floor((e + 254) / 255)
New:  t = bufLen - 2;  return t - floor((t + 1) / 255)

Both new formulas have been verified against a reference COBS encoder for all tested values from 0 to 262,144.
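The corrected formulas can be sketched as follows. This assumes bufLen >= 2 and that the encoded maximum includes the trailing frame-delimiter byte, matching the "+ 2" in the constraint above; the names mirror the note-c functions but this is an illustration, not the actual patch:

```c
#include <stdint.h>

// Exact worst-case encoded size: u data + (floor(u/254) + 1) code
// bytes + 1 frame-delimiter byte.
uint32_t cobsEncodedMaxLengthNew(uint32_t u) {
    return u + ((u / 254) + 1) + 1;
}

// Largest u such that u + floor(u/254) + 2 <= bufLen (bufLen >= 2).
uint32_t cobsGuaranteedFitNew(uint32_t bufLen) {
    uint32_t t = bufLen - 2;
    return t - (t + 1) / 255;
}
```

For the 262,144-byte buffer the new inverse yields 261,114, and at the 256-byte boundary case it yields 253 rather than the old overestimate of 254.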

Tests added

_cobsEncodedMaxLength: Tests at exact multiples of 254 (254, 508) that fail with ceil() but pass with floor()+1. Also tests at 253 and 255 to confirm non-boundary values are unaffected.

_cobsGuaranteedFit: Tests at 255k+1 buffer sizes (256, 511, 766) that fail with the old formula. Also tests at 257 (non-boundary) and 262,144 (the customer's actual buffer size) to anchor the expected behavior.

Round-trip consistency: For a range of buffer sizes -- including every 255k and 255k+1 value from 0 to 2,550, plus large values up to 262,144 -- the tests verify that _cobsEncodedMaxLength(_cobsGuaranteedFit(B)) never exceeds B. This ensures the two functions agree with each other and that the guaranteed-fit promise is upheld.
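The round-trip property can be sketched as a self-contained check, with the two fixed formulas inlined from the description above (illustrative helper, not the actual test code; requires buffer sizes of at least 2):

```c
#include <stdint.h>

// Verify that for every buffer size b in [lo, hi], encoding the
// guaranteed-fit payload never exceeds b.
int roundTripHolds(uint32_t lo, uint32_t hi) {
    for (uint32_t b = lo; b <= hi; b++) {
        uint32_t t = b - 2;
        uint32_t fit = t - (t + 1) / 255;       // _cobsGuaranteedFit(b)
        uint32_t enc = fit + (fit / 254) + 2;   // _cobsEncodedMaxLength(fit)
        if (enc > b) {
            return 0;
        }
    }
    return 1;
}
```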

@rayozzie rayozzie requested a review from zfields February 10, 2026 14:25