2 changes: 1 addition & 1 deletion benchmarks/history/README.md
@@ -18,7 +18,7 @@ iteration are stored alongside, and each number is the **median across several
 cores** with its spread recorded (see below).

 The matrix is **dense**: every message shape is measured against every release
-(v0.1.0–v0.7.1), not just from the release that first added it to the suite. A
+(v0.1.0–v0.8.0), not just from the release that first added it to the suite. A
 shape is a property of the protobuf schema, not of any buffa version — buffa
 v0.1.0 could always decode a `MediaFrame`, we just never asked it to — so the
 canonical shapes and datasets are fed to each release's own codegen and every
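
The "median across several cores with its spread recorded" summary in the hunk above can be sketched as follows. This is a minimal illustration, not the benchmark harness's code: the function names are hypothetical, and the spread definition used here, (max − min) / median, is an assumption, since the README excerpt does not say how spread is computed.

```rust
/// Median of per-core throughput samples (e.g. MiB/s), one sample per core.
/// Returns `None` for an empty slice.
fn median(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    // `total_cmp` gives a total order over f64, so sorting never panics on NaN.
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        // Even count: average the two middle samples.
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    })
}

/// Relative spread of the samples, defined here (as an assumption) as
/// (max - min) / median. Returns `None` for empty input or a zero median.
fn relative_spread(samples: &[f64]) -> Option<f64> {
    let med = median(samples)?;
    if med == 0.0 {
        return None;
    }
    let min = samples.iter().copied().fold(f64::INFINITY, f64::min);
    let max = samples.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    Some((max - min) / med)
}
```

Reporting the median rather than the mean keeps a single slow or fast core from shifting the headline number, while the spread shows how far an individual figure can be trusted.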
160 changes: 80 additions & 80 deletions benchmarks/history/REPORT.md

Large diffs are not rendered by default.

26 changes: 17 additions & 9 deletions benchmarks/history/annotations.md
@@ -2,7 +2,7 @@

 Why the numbers in [REPORT.md](REPORT.md) move. The data is a **dense,
 per-message-isolated, layout-normalized matrix**: every message shape is measured
-against every release (v0.1.0–v0.7.1), each built with only its own decoder
+against every release (v0.1.0–v0.8.0), each built with only its own decoder
 compiled, at the pinned toolchain (1.96.0), `lto=true, codegen-units=1`, and
 **64-byte block alignment** (`-Cllvm-args=-align-all-nofallthru-blocks=6`), median
 of 32 cores. See [DESIGN.md](DESIGN.md) for the system and [README.md](README.md)
@@ -24,25 +24,33 @@ below. The per-operation "Measurement spread" table in [REPORT.md](REPORT.md) an
 the per-benchmark spread in `runs/*.json` remain the place to check how far a given
 number can be trusted.

-## Headline cross-release findings (v0.1.0 → v0.7.1)
+## Headline cross-release findings (v0.1.0 → v0.8.0)

 A movement counts as real if it is large *for its operation* and **persists** across
-releases. Two findings stand, both now sitting on clean, flat baselines:
-
+releases. Three findings stand, each now sitting on a clean, flat baseline:
+
+- **`decode_view` +17–21% on string-heavy shapes at v0.8.0** — LogRecord 1429→1670,
+AnalyticsEvent 205→249 MiB/s. This is the `fast-utf8` default landing: UTF-8
+validation in `borrow_str` switched from `core::str::from_utf8` to
+`smoothutf8::verify_with_slack`, which skips the per-string tail copy whenever
+the wire buffer continues past the field. ApiResponse `decode_view` is flat under
+isolation (the strings there are dominated by other field kinds), and MediaFrame
+is +3% (bytes-dominated; little string work). The owned `decode`/`merge` paths
+move less because the same validation is a smaller fraction of cycles once
+per-field allocation enters.
 - **AnalyticsEvent `encode` −12% / `compute_size` −9%** — a real regression. A step
 down at v0.4.0 (encode 468→414, compute_size 1379→1262 MiB/s) that holds flat
-through v0.7.1. `compute_size` is the tightest operation and corroborates the
+through v0.8.0. `compute_size` is the tightest operation and corroborates the
 `encode` figure, so the deeply nested, repeated-submessage shape genuinely lost
 ground on the owned encode/size paths — the one result worth investigating.
 - **PackedTile `decode_view` +47% at v0.7.1** — flat (~175 MiB/s) from v0.1.0
 through v0.7.0, then a single-release jump to ~257 at v0.7.1, consistent with the
-packed-varint reserve work in that release. A 47% step is well clear of noise; but
-it is the latest release, so "persists" isn't confirmable yet.
+packed-varint reserve work in that release; v0.8.0 confirms the step persists.

-Everything else is flat across the eight releases — including all of `json_encode` /
+Everything else is flat across the nine releases — including all of `json_encode` /
 `json_decode`, which now hold steady at their fast value (LogRecord `json_encode`
 ~880 MiB/s at every release, vs a 19% flap before normalization). buffa's core
-paths did not regress; the reassuring headline is that eight releases of `decode`,
+paths did not regress; the reassuring headline is that nine releases of `decode`,
 `merge`, and the JSON paths hold steady once layout is controlled.

 ## Layout normalization — why, and what it costs
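
The headline percentages in the hunk above follow from the quoted MiB/s pairs, and the "large and persists" test can be written down directly. The sketch below is illustrative only: `percent_change` and `step_persists` are hypothetical names, and the tolerance used to decide that later releases "hold flat" is an assumption, since the annotations do not state a numeric threshold.

```rust
/// Percent change from `old` to `new` throughput.
fn percent_change(old: f64, new: f64) -> f64 {
    (new - old) / old * 100.0
}

/// A step that lands at release index `at` "persists" when at least one later
/// release exists and every later release stays within `tolerance_pct` of the
/// post-step level. The tolerance is an assumed parameter, not from the source.
fn step_persists(series: &[f64], at: usize, tolerance_pct: f64) -> bool {
    let Some(&level) = series.get(at) else {
        return false; // index out of range
    };
    let later = &series[at + 1..];
    // With no later release, persistence cannot be confirmed yet.
    !later.is_empty()
        && later
            .iter()
            .all(|&v| percent_change(level, v).abs() <= tolerance_pct)
}
```

With the figures quoted above, `percent_change(1429.0, 1670.0)` is about 16.9 and `percent_change(205.0, 249.0)` about 21.5 (the "+17–21%" range), `percent_change(468.0, 414.0)` is about −11.5, and `percent_change(175.0, 257.0)` is about 46.9. Requiring at least one later release in `step_persists` mirrors why the PackedTile step could only be confirmed once v0.8.0 was measured.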
117 changes: 62 additions & 55 deletions benchmarks/history/charts/compute_size.svg