| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We had two JS files that could run a wasm file for fuzzing purposes:
* --emit-js-shell, which emitted a custom JS file that runs the wasm.
* scripts/fuzz_shell.js, which was a generic file that did the same.
Both of those load the wasm and then call the exports in order and print out
logging as it goes of their return values (if any), exceptions, etc. Then the
fuzzer compares that output to running the same wasm in another VM, etc. The
difference is that one was custom for the wasm file, and one was generic. Aside
from that they are similar and duplicated a bunch of code.
This PR improves things by removing 1 and using 2 in all places, that is, we
now use the generic file everywhere.
I believe we added 1 because we thought a generic file can't do all the
things we need, like know the order of exports and the types of return values,
but in practice there are ways to do those things: The exports are in fact
in the proper order (JS order of iteration is deterministic, thankfully), and
for the type we don't want to print type internals anyhow since that would
limit fuzzing --closed-world. We do need to be careful with types in JS (see
notes in the PR about the type of null) but it's not too bad. As for the types
of params, it's fine to pass in null for them all anyhow (null converts to a
number or a reference without error).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the idea was that we started with HANG_LIMIT = 10 or so, and we'd decrement
it by one in each potentially-recursive call and loop entry. When we reached 0 we'd start
to unwind the stack. Then, after we unwound it all the way, we'd reset HANG_LIMIT before
calling the next export.
That approach adds complexity that each "execution wrapper", like for JS or for --fuzz-exec,
had to manually reset HANG_LIMIT. That was done by calling an export. Calls to those
exports had to appear in various places, which is sort of a hack.
The new approach here does the following when the hang limit reaches zero: It resets
HANG_LIMIT, and it traps. The trap unwinds the call stack all the way out. When the next
export is called, it will have a fresh hang limit since we reset it before the trap.
This does have downsides. Before, we did not always trap when we hit the hang limit but
rather we'd emit something unreachable, like a return. The idea was that we'd leave the
current function scope at least, so we don't hang forever. That let us still execute a small
amount of code "on the way out" as we unwind the stack. I'm not sure it's worth the
complexity for that.
The advantages of this PR are to simplify the code, and also it makes more fuzzing
approaches easy to implement. I'd like to add a wasm-ctor-eval fuzzer, and having to add
hacks to call the hang limit init export in it would be tricky. With this PR, the execution
model is simple in the fuzzer: The exports are called one by one, in order, and that's it -
no extra magic execution needs to be done.
Also bump the hang limit from 10 to 100, just to give some more chance for code to run.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This matches #3747 which makes us not log out reference values, instead
we print just their types.
This also prints a type for non-reference things, replacing a previous
exception, which affects things like SIMD and BigInts, but those trap
anyhow at the JS boundary I believe (or did that change for SIMD?).
Anyhow, printing the type won't give a false "ok" when comparing wasm2js
output to the interpreter, assuming the interpreter prints out a value and
not just a type (which is the case). We could try to do better, but this
code is on the JS side, where we don't have the type - just a string
representation of it, which we'd need to parse etc.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main benefit here is comparing VMs, instead of just comparing
each VM to itself after opts. Comparing VMs is a little tricky since there
is room for nondeterminism with how results are printed and other
annoying things, which is why that didn't work well earlier.
With this PR I can run 10's of thousands of iterations without finding
any issues between v8 and the binaryen interpreter. That's after
fixing the various issues over the last few days as found by this:
#2760 #2757 #2750 #2752
Aside from that main benefit I ended up adding more improvements
to make it practical to do all that testing:
Randomize global fuzz settings like whether we allow NaNs and
out-of-bounds memory accesses. (This was necessary here since
we have to disable cross-VM comparisons if NaNs are enabled.)
Better logging of statistics like how many times each handler
was run.
Remove redundant FuzzExecImmediately handler (looks like
after past refactorings it was no longer adding any value).
Deterministic testcase handling: if you run e.g. fuzz_opt.py 42 it
will run one testcase and exactly the same one. If you run without
an argument it will run forever until it fails, and if it fails, it prints out
that ID so that you can easily reproduce it (I guess, on the same
binaryen + same python, not sure how python's deterministic
RNG changes between versions and builds).
Upgrade to Python 3.
|
|
|
|
|
|
|
|
|
| |
* make DE_NAN avoid creating nan literals in the first place
* add a reducer option `--denan` to not introduce nans in destructive reduction
* add a `Literal::isNaN()` method
* also remove the default exception logging from the fuzzer js glue, which is a source of non-useful VM differences (like nan nondeterminism)
* added an option `--no-fuzz-nans` to make it easy to avoid nans when fuzzing (without hacking the source and recompiling).
Background: trying to get fuzzing on jsc working despite this open issue: https://bugs.webkit.org/show_bug.cgi?id=175691
|
|
|
|
|
|
|
|
|
|
|
| |
The main fuzz_opt.py script compares JS VMs, and separately runs binaryen's fuzz-exec that compares the binaryen interpreter to itself (before and after opts). This PR lets us directly compare binaryen's interpreter output to JS VMs. This found a bunch of minor things we can do better on both sides, giving more fuzz coverage.
To enable this, a bunch of tiny fixes were needed:
* Add --fuzz-exec-before which is like --fuzz-exec but just runs the code before opts are run, instead of before and after.
* Normalize double printing (so JS and C++ print comparable things). This includes negative zero in JS, which we never printed properly til now.
* Various improvements to how we print fuzz-exec logging - remove unuseful things, and normalize the others across JS and C++.
* Properly legalize the wasm when --emit-js-wrapper (i.e., we will run the code from JS), and use that in the JS wrapper code.
|
|
|
|
|
|
|
|
|
| |
After we added logging to the fuzzer, we forgot to add to the JS glue code the necessary imports so it can be run there too.
Also adds legalization for the JS glue code imports and exports.
Also adds a missing validator check on imports having a function type (the fuzzing code was missing one).
Fixes #1842
|
|
This adds a new method of fuzzing, "translate to fuzz" which means we consider the input to be a stream of data that we translate into a valid wasm module. It's sort of like a random seed for a process that creates a random wasm module. By using the input that way, we can explore the space of valid wasm modules quickly, and it makes afl-fuzz integration easy.
Also adds a "fuzz binary" option which is similar to "fuzz execution". It makes wasm-opt not only execute the code before and after opts, but also write to binary and read from it, helping to fuzz the binary format.
|