[interp] Stop on first error in the binary reader fuzzer#2778
Conversation
|
|
||
| if (func_index >= func_types_.size()) { | ||
| PrintError("invalid function index: %" PRIindex, func_index); | ||
| return Result::Error; |
There was a problem hiding this comment.
Why is this needed for OnReturnCallExpr but not OnCallExpr?
There was a problem hiding this comment.
OnCallExpr never dereferences func_types_. It just emits the opcode with the raw func_index, so a stale index can't read out of bounds there. OnReturnCallExpr is different: it reads func_types_[func_index] to work out the drop/keep counts, and the same index feeds GetFuncOffset, so the stale index actually gets dereferenced. That's why only this one needed the guard.
| TEST_F(InterpTest, FunctionSectionTypeOutOfRange) { | ||
| // The function section declares a function whose type index is out of range, | ||
| // followed by a code section with a matching body. When reading in | ||
| // collect-all-errors mode (stop_on_first_error=false), OnFunction rejects the |
There was a problem hiding this comment.
Does this officially supported? As far as I remember this part is a testing utility, and never intended to support invalid code. It would require a lot of changes in the code, increasing its complexity heavily.
See: https://github.com/WebAssembly/wabt/blob/main/src/interp/binary-reader-interp.cc#L526
Supporting invalid wasm is useful for text<->binary conversions, but not really for executing it.
There was a problem hiding this comment.
Fair point, the interp reader isn't meant to execute invalid wasm, and I'm not trying to make it do that. None of these three sites are on the execution path. They're hit purely while reading, via read_binary_interp_fuzzer.cc, which sets stop_on_first_error=false. That's how I found them: fuzzing the reader in that mode under ASan.
The sequence is: OnFunction rejects the out-of-range type and skips the append, but the reader carries on, so the validator's func count and the reader's func_types_/module_.funcs are now out of step. The next callback then indexes past the end. BeginElemSegment and BeginDataSegment already bounds-check that exact mismatch, so this just adds the same guard at the three sites that were missing it, no new recovery machinery.
If you'd rather the harness set stop_on_first_error=true so the reader bails on the first error instead, that's fine by me too.
There was a problem hiding this comment.
I understand the problem. I think we should only support stop_on_first_error=true, since continuing the parsing does not add any user value to the interpreter (other than make it complex). But lets wait @sbc100 's opinion.
There was a problem hiding this comment.
Fine by me. The harness side is a one-liner: read_binary_interp_fuzzer.cc passes stop_on_first_error=false today, flip it to true and the parse bails in OnFunction the moment the validator rejects the type index, well before BeginFunctionBody/OnExport/OnReturnCallExpr ever run with the stale index. If that's the direction you'd both prefer, I'll drop the three reader-side guards and just change the harness. Happy to follow whatever you and @sbc100 settle on.
There was a problem hiding this comment.
Yes, maybe lets remove the stop_on_first_error=false feature completely.
There was a problem hiding this comment.
Done. Dropped the three reader-side guards and the two tests, and flipped read_binary_interp_fuzzer.cc to stop_on_first_error=true. That was the only caller driving the interp reader in collect-all-errors mode, so the OOB sites are no longer reachable from it.
On removing it completely: objdump still sets stop_on_first_error=false (with fail_on_custom_section_error=false) so it can dump as much of a corrupt module as possible, so the flag and the recovery path in binary-reader.cc need to stay for that. The interp reader's only use of it is now gone. Happy to scope a wider removal as a separate change if you want, but that one would touch objdump's behaviour.
There was a problem hiding this comment.
I see, yes I suppose its useful of objdump
There was a problem hiding this comment.
Input reading and validation are two different steps, and the core specification tests has assert_malformed and assert_invalid tests. So for those tools, which just reads and writes raw webassebly (but does not interpret it in any way), supporting well-formed but invalid webassembly should be supported. For example: IR, wat2wasm wasm2wat or objdump. On the contrary, the c writer should not, since it "interprets" the code. Some tools even try to support malformed webassembly, but I don't know whether it is useful.
The interp binary reader is only meant to read valid modules. In collect-all-errors mode (stop_on_first_error=false) the SharedValidator keeps counting declarations the reader has already rejected, so the reader's func_types_/module_.funcs vectors and the validator's counts drift apart and later callbacks index past the end. read_binary_interp_fuzzer.cc was the only caller running the interp reader in that mode. Stop on the first error like every other interp reader caller, so the reader is no longer driven over invalid modules.
5a7f61a to
901d707
Compare
The interp binary reader is only meant to read valid modules. Every
ReadBinaryInterpcaller runs withstop_on_first_error=trueexceptread_binary_interp_fuzzer.cc, which passedfalse.In collect-all-errors mode the
SharedValidatorkeeps counting declarations the reader has already rejected. For example a function whose type index is out of range:OnFunctionskips the append, but the validator still counts it. So the reader'sfunc_types_/module_.funcsvectors and the validator's counts drift apart, and later callbacks index past the end (BeginFunctionBody,OnExport,OnReturnCallExpr).Per the review, rather than adding reader-side guards for a mode the interp reader was never meant to support, this stops the fuzzer on the first error like every other caller, so the reader is no longer driven over invalid modules.