Add a MemorySanitizer CI job for the offline engine self-tests#433

Merged
xroche merged 1 commit into master from ci/msan-job on Jun 27, 2026
xroche (Owner) commented Jun 26, 2026

MemorySanitizer (MSan) is the only sanitizer that catches reads of uninitialized memory, the class of bug behind #143: the size filter tested an uninitialized stack value and forbade files at random. ASan and UBSan miss that class entirely, so today nothing deterministically guards against a regression.
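For readers unfamiliar with this bug class, here is a minimal sketch of the pattern. It is not the code from #143: `parse_max_size`, `size_forbidden_buggy`, and `size_forbidden_fixed` are hypothetical names, and the `LLint` typedef only stands in for the project's 64-bit integer type. The buggy variant compares against a stack variable that one path never writes, so the result depends on whatever was on the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumption: stand-in for the project's 64-bit integer type. */
typedef long long LLint;

/* Hypothetical helper: parses a decimal size limit.
   On failure it returns false and does NOT write *out. */
static bool parse_max_size(const char *opt, LLint *out) {
  char *end;
  long long v;
  if (opt == NULL || *opt == '\0')
    return false;
  v = strtoll(opt, &end, 10);
  if (end == opt)
    return false; /* no digits consumed */
  *out = v;
  return true;
}

/* Buggy: ignores the parse result, so `limit` may be read uninitialized.
   Under MSan this is reported once the comparison decides a branch;
   ASan and UBSan do not track initialization and stay silent. */
bool size_forbidden_buggy(const char *opt, LLint size) {
  LLint limit;
  parse_max_size(opt, &limit);
  return size > limit;
}

/* Fixed: initialize the variable and honor the parse result. */
bool size_forbidden_fixed(const char *opt, LLint size) {
  LLint limit = 0;
  if (!parse_max_size(opt, &limit))
    return false; /* no limit configured: nothing is forbidden */
  return size > limit;
}

#ifdef DEMO_MAIN
int main(void) {
  assert(!size_forbidden_fixed("", 100));
  /* With no limit given, the buggy variant answers at random. */
  if (size_forbidden_buggy("", 100))
    puts("forbidden (by accident)");
  else
    puts("allowed (by accident)");
  return 0;
}
#endif
```

Building this sketch with `clang -fsanitize=memory -g -O1 -DDEMO_MAIN` and running it should make MSan report a use-of-uninitialized-value in the buggy path; the same build with `-fsanitize=address,undefined` is expected to run without a report.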

MSan treats any byte written by an uninstrumented library as uninitialized, so the job is scoped tightly: clang with a static link (the MSan runtime isn't injected into shared objects), --disable-https to drop openssl, and only the offline 01_engine-* self-tests minus the zlib-backed cache trio. Those tests push the hostile-input parsers (charset, mime, HTML, entities, IDNA, filters) straight through MSan. A full-crawl run would need instrumented builds of zlib and openssl, fragile infrastructure for paths ASan+UBSan already cover.

Validated locally: the 19 selected tests pass clean, and reintroducing #143's bug makes MSan pinpoint the uninitialized read and fail the test.

MSan is the only sanitizer that catches a read of uninitialized memory --
the class of #143, where the size filter tested an uninitialized stack
LLint and forbade files at random. ASan and UBSan let that through.

MSan reports any byte produced by an uninstrumented library as
uninitialized, so the job stays inside our own code: clang, a static link
(the MSan runtime is not injected into shared objects), --disable-https to
drop openssl, and only the offline 01_engine-* self-tests minus the
zlib-backed cache trio. Those self-tests drive the hostile-input parsers
(charset, mime, html, entities, idna, filters) straight through MSan.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: Xavier Roche <roche@httrack.com>
xroche merged commit 768756e into master on Jun 27, 2026
14 checks passed