forks/binaryen.git -

	Commit message	Author	Age	Files	Lines
*	Fuzzer: Remove --emit-js-shell logic and reuse fuzz_shell.js instead (#6310)	Alon Zakai	2024-02-20	1	-134/+0
	We had two JS files that could run a wasm file for fuzzing purposes:
	1. --emit-js-shell, which emitted a custom JS file that runs the wasm.
	2. scripts/fuzz_shell.js, which was a generic file that did the same.
	Both load the wasm, call the exports in order, and log return values (if any), exceptions, etc. as they go. The fuzzer then compares that output to running the same wasm in another VM, etc. The difference is that one was custom for the wasm file and the other was generic. Aside from that they were similar and duplicated a bunch of code.
	This PR improves things by removing 1 and using 2 in all places, that is, we now use the generic file everywhere.
	I believe we added 1 because we thought a generic file can't do all the things we need, like know the order of exports and the types of return values, but in practice there are ways to do those things: the exports are in fact in the proper order (JS order of iteration is deterministic, thankfully), and for the type we don't want to print type internals anyhow since that would limit fuzzing --closed-world. We do need to be careful with types in JS (see notes in the PR about the type of null) but it's not too bad. As for the types of params, it's fine to pass in null for them all anyhow (null converts to a number or a reference without error).
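	The generic-shell idea described above (call every export in order, log what comes back, then compare logs between runs) can be sketched in Python. This is only an analogy to what scripts/fuzz_shell.js does in JS: `run_exports`, the log line format, and the use of zero-argument callables as "exports" are all assumptions made for illustration.

```python
def run_exports(exports):
    """Call each export in order and return the log lines (hypothetical format)."""
    log = []
    # Python dicts preserve insertion order, which stands in for the
    # deterministic iteration order of exports in JS.
    for name, func in exports.items():
        log.append(f"calling {name}")
        try:
            result = func()
        except Exception:
            # Log only that an exception happened, not its details, so that
            # VM-specific error messages do not show up as differences.
            log.append("exception thrown")
            continue
        if result is not None:
            log.append(f"result {name} => {result}")
    return log


def logs_match(log_a, log_b):
    """Two runs agree when their logs are identical line by line."""
    return log_a == log_b
```

	Because the shell never inspects the module ahead of time, the same code works for any set of exports; two executions are compared simply by checking that their logs are equal.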
*	[Fuzzer] Simplify the hang limit mechanism (#5513)	Alon Zakai	2023-02-23	1	-2/+0
	Previously the idea was that we started with HANG_LIMIT = 10 or so, and we'd decrement it by one in each potentially-recursive call and loop entry. When we reached 0 we'd start to unwind the stack. Then, after we unwound it all the way, we'd reset HANG_LIMIT before calling the next export.
	That approach adds complexity in that each "execution wrapper", like for JS or for --fuzz-exec, had to manually reset HANG_LIMIT. That was done by calling an export. Calls to those exports had to appear in various places, which is sort of a hack.
	The new approach here does the following when the hang limit reaches zero: it resets HANG_LIMIT, and it traps. The trap unwinds the call stack all the way out. When the next export is called, it will have a fresh hang limit since we reset it before the trap.
	This does have downsides. Before, we did not always trap when we hit the hang limit but rather we'd emit something unreachable, like a return. The idea was that we'd leave the current function scope at least, so we don't hang forever. That let us still execute a small amount of code "on the way out" as we unwind the stack. I'm not sure it's worth the complexity for that.
	The advantages of this PR are that it simplifies the code and makes more fuzzing approaches easy to implement. I'd like to add a wasm-ctor-eval fuzzer, and having to add hacks to call the hang limit init export in it would be tricky. With this PR, the execution model is simple in the fuzzer: the exports are called one by one, in order, and that's it, with no extra magic execution needed.
	Also bump the hang limit from 10 to 100, just to give some more chance for code to run.
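	The reset-then-trap scheme described above can be modeled in a few lines of Python. This is a sketch under assumptions, not the code the fuzzer emits: `Instance`, `Trap`, `check_hang_limit`, and the two example exports are hypothetical names, and the exact point at which the real check fires may differ.

```python
HANG_LIMIT_START = 100  # the commit bumps the limit from 10 to 100


class Trap(Exception):
    """Stands in for a wasm trap, which unwinds the whole call stack."""


class Instance:
    def __init__(self):
        self.hang_limit = HANG_LIMIT_START

    def check_hang_limit(self):
        # Runs at each potentially-recursive call and loop entry.
        if self.hang_limit == 0:
            self.hang_limit = HANG_LIMIT_START  # reset first...
            raise Trap("hang limit reached")    # ...then trap
        self.hang_limit -= 1

    def infinite_loop(self):
        while True:
            self.check_hang_limit()

    def small_loop(self):
        count = 0
        for _ in range(5):
            self.check_hang_limit()
            count += 1
        return count


def call_exports(instance, names):
    """Call exports one by one, in order, with no extra reset step between them."""
    results = []
    for name in names:
        try:
            results.append((name, getattr(instance, name)()))
        except Trap:
            results.append((name, "trap"))
    return results
```

	The caller never touches the counter: the export that hangs traps, and the next export starts with a fresh limit because the reset happened before the trap.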
*	Make `Name` a pointer, length pair (#5122)	Thomas Lively	2022-10-11	1	-4/+4
	With the goal of supporting null characters (i.e. zero bytes) in strings. Rewrite the underlying interned `IString` to store a `std::string_view` rather than a `const char*`, reduce the number of map lookups necessary to intern a string, and present a more immutable interface.
	Most importantly, replace the `c_str()` method that returned a `const char*` with a `toString()` method that returns a `std::string`. This new method can correctly handle strings containing null characters. A `const char*` can still be had by calling `data()` on the `std::string_view`, although this usage should be discouraged.
	This change is NFC in spirit, although not in practice. It does not intend to support any particular new functionality, but it is probably now possible to use strings containing null characters in at least some cases. At least one parser bug is also incidentally fixed. Follow-on PRs will explicitly support and test strings containing nulls for particular use cases.
	The C API still uses `const char*` to represent strings. As strings containing nulls become better supported by the rest of Binaryen, this will no longer be sufficient. Updating the C and JS APIs to use pointer, length pairs is left as future work.
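	Why a pointer, length pair matters for embedded zero bytes can be shown with a small Python analogy. The helper names `c_str_view` and `sized_view` are hypothetical; they only imitate how a NUL-terminated reader and a length-carrying reader would see the same buffer.

```python
def c_str_view(buf):
    """What a NUL-terminated reader sees: the bytes before the first zero byte."""
    end = buf.find(b"\0")
    return buf if end == -1 else buf[:end]


def sized_view(buf, length):
    """What a (pointer, length) reader sees: exactly `length` bytes."""
    return buf[:length]


name = b"foo\0bar"
truncated = c_str_view(name)       # stops at the embedded zero byte
full = sized_view(name, len(name)) # keeps the whole name
```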
*	Preserve Function HeapTypes (#3952)	Thomas Lively	2021-06-30	1	-5/+5
	When using nominal types, `ref.func` of two functions with identical signatures but different HeapTypes will yield different types. To preserve these semantics, Functions need to track their HeapTypes, not just their Signatures. This PR replaces the Signature field in Function with a HeapType field and adds new utility methods to make it almost as simple to update and query the function HeapType as it was to update and query the Function Signature.
*	Fuzzing in JS VMs: Emit null for reference type params instead of 0 (#3774)	Alon Zakai	2021-04-06	1	-4/+8
	VMs will not convert a 0 or undefined from JS into a wasm null reference; the value passed must be null.
*	Fuzzing in JS VMs: Print types when we have nothing better (#3773)	Alon Zakai	2021-04-06	1	-1/+2
	This matches #3747 which makes us not log out reference values, instead we print just their types. This also prints a type for non-reference things, replacing a previous exception, which affects things like SIMD and BigInts, but those trap anyhow at the JS boundary I believe (or did that change for SIMD?). Anyhow, printing the type won't give a false "ok" when comparing wasm2js output to the interpreter, assuming the interpreter prints out a value and not just a type (which is the case). We could try to do better, but this code is on the JS side, where we don't have the type, just a string representation of it, which we'd need to parse etc.
*	cleanup to allow binaryen to be built in more strict environments (#3566)	walkingeyerobot	2021-02-16	1	-0/+4
*	Use const modifier when dealing with types (#3064)	Daniel Wirtz	2020-08-20	1	-1/+1
	`const` modifiers make the code clearer and more self-documenting.
*	Replace Type::expand() with an iterator-based approach (#3061)	Daniel Wirtz	2020-08-19	1	-1/+1
	This leads to simpler code and is a prerequisite for #3012, which makes it so that not all `Type`s are backed by vectors that `expand` could return.
*	Enable cross-VM fuzzing + related improvements to fuzz_opt.py (#2762)	Alon Zakai	2020-04-15	1	-6/+6
	The main benefit here is comparing VMs, instead of just comparing each VM to itself after opts. Comparing VMs is a little tricky since there is room for nondeterminism with how results are printed and other annoying things, which is why that didn't work well earlier. With this PR I can run tens of thousands of iterations without finding any issues between v8 and the binaryen interpreter. That's after fixing the various issues over the last few days as found by this: #2760 #2757 #2750 #2752
	Aside from that main benefit I ended up adding more improvements to make it practical to do all that testing:
	* Randomize global fuzz settings like whether we allow NaNs and out-of-bounds memory accesses. (This was necessary here since we have to disable cross-VM comparisons if NaNs are enabled.)
	* Better logging of statistics like how many times each handler was run.
	* Remove redundant FuzzExecImmediately handler (looks like after past refactorings it was no longer adding any value).
	* Deterministic testcase handling: if you run e.g. fuzz_opt.py 42 it will run one testcase and exactly the same one. If you run without an argument it will run forever until it fails, and if it fails, it prints out that ID so that you can easily reproduce it (I guess, on the same binaryen + same python, not sure how python's deterministic RNG changes between versions and builds).
	* Upgrade to Python 3.
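	The deterministic testcase handling described above comes down to seeding a random number generator from the ID given on the command line. A minimal sketch, assuming hypothetical helpers `pick_seed` and `make_testcase` (these are not taken from fuzz_opt.py, whose actual structure may differ):

```python
import random


def pick_seed(argv):
    """Return (seed, run_forever) for a command line like ['fuzz_opt.py', '42']."""
    if len(argv) > 1:
        # An explicit ID means: run exactly that one testcase.
        return int(argv[1]), False
    # No ID: start from an unpredictable seed and keep going until a failure.
    return random.SystemRandom().randrange(2**32), True


def make_testcase(seed, size=8):
    """Derive the random input bytes for a testcase purely from its seed."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(size))
```

	Since every random choice flows from the seed, printing the seed of a failing iteration is enough to reproduce it later, subject to the caveat in the commit message about RNG behavior across Python versions and builds.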
*	[NFC] Enforce use of `Type::` on type names (#2434)	Thomas Lively	2020-01-07	1	-1/+1
*	Remove FunctionType (#2510)	Thomas Lively	2019-12-11	1	-4/+4
	Function signatures were previously redundantly stored on Function objects as well as on FunctionType objects. These two signature representations had to always be kept in sync, which was error-prone and needlessly complex. This PR takes advantage of the new ability of Type to represent multiple value types by consolidating function signatures as a pair of Types (params and results) stored on the Function object.
	Since there are no longer module-global named function types, significant changes had to be made to the printing and emitting of function types, as well as their parsing and manipulation in various passes. The C and JS APIs and their tests also had to be updated to remove named function types.
*	Multivalue type creation and inspection (#2459)	Thomas Lively	2019-11-22	1	-1/+1
	Adds the ability to create multivalue types from vectors of concrete value types. All types are transparently interned, so their representation is still a single uint32_t. Types can be extracted into vectors of their component parts, and all the single value types expand into vectors containing themselves. Multivalue types are not yet used in the IR, but their creation and inspection functionality is exposed and tested in the C and JS APIs.
	Also makes common type predicates methods of Type and improves the ergonomics of type printing.
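	Interning as described above means that each distinct sequence of value types is stored once and referred to by a small integer. The sketch below illustrates the idea in Python; `TypeInterner` and its methods are hypothetical, and the IDs it assigns do not correspond to Binaryen's actual values.

```python
class TypeInterner:
    """Maps each distinct tuple of value types to one small integer ID."""

    def __init__(self, basic_types):
        self._ids = {}      # tuple of components -> ID
        self._tuples = []   # ID -> tuple of components
        for name in basic_types:
            # A single value type is interned as a one-element tuple.
            self.intern((name,))

    def intern(self, components):
        key = tuple(components)
        if key not in self._ids:
            self._ids[key] = len(self._tuples)
            self._tuples.append(key)
        return self._ids[key]

    def expand(self, type_id):
        """Return the component types; a single value type expands to itself."""
        return list(self._tuples[type_id])
```

	Interning the same component list twice yields the same ID, so comparing two types is just an integer comparison.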
*	clang-tidy braces changes (#2075)	Alon Zakai	2019-05-01	1	-1/+2
	Applies the changes in #2065, and temporarily disables the hook since it's too slow to run on a change this large. We should re-enable it in a later commit.
*	Apply format changes from #2048 (#2059)	Alon Zakai	2019-04-26	1	-11/+24
	Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
*	NaN fuzzing improvements (#1913)	Alon Zakai	2019-02-19	1	-1/+1
	* make DE_NAN avoid creating nan literals in the first place
	* add a reducer option `--denan` to not introduce nans in destructive reduction
	* add a `Literal::isNaN()` method
	* also remove the default exception logging from the fuzzer js glue, which is a source of non-useful VM differences (like nan nondeterminism)
	* added an option `--no-fuzz-nans` to make it easy to avoid nans when fuzzing (without hacking the source and recompiling).
	Background: trying to get fuzzing on jsc working despite this open issue: https://bugs.webkit.org/show_bug.cgi?id=175691
*	Compare binaryen fuzz-exec to JS VMs (#1856)	Alon Zakai	2019-01-10	1	-20/+28
	The main fuzz_opt.py script compares JS VMs, and separately runs binaryen's fuzz-exec, which compares the binaryen interpreter to itself (before and after opts). This PR lets us directly compare binaryen's interpreter output to JS VMs. This found a bunch of minor things we can do better on both sides, giving more fuzz coverage.
	To enable this, a bunch of tiny fixes were needed:
	* Add --fuzz-exec-before, which is like --fuzz-exec but just runs the code before opts are run, instead of before and after.
	* Normalize double printing (so JS and C++ print comparable things). This includes negative zero in JS, which we never printed properly until now.
	* Various improvements to how we print fuzz-exec logging: remove things that are not useful, and normalize the others across JS and C++.
	* Properly legalize the wasm when --emit-js-wrapper is used (i.e., we will run the code from JS), and use that in the JS wrapper code.
*	Fix fuzzing JS glue code (#1843)	Alon Zakai	2018-12-27	1	-1/+17
	After we added logging to the fuzzer, we forgot to add to the JS glue code the necessary imports so it can be run there too. Also adds legalization for the JS glue code imports and exports. Also adds a missing validator check on imports having a function type (the fuzzing code was missing one). Fixes #1842
*	Rename WasmType => Type (#1398)	Alon Zakai	2018-02-02	1	-2/+2
	Rename WasmType to Type. It's in the wasm:: namespace anyhow, and without the Wasm- prefix it fits in better alongside Index, Address, Expression, Module, etc.
*	New fuzzer (#1126)	Alon Zakai	2017-08-11	1	-0/+89
	This adds a new method of fuzzing, "translate to fuzz", which means we consider the input to be a stream of data that we translate into a valid wasm module. It's sort of like a random seed for a process that creates a random wasm module. By using the input that way, we can explore the space of valid wasm modules quickly, and it makes afl-fuzz integration easy.
	Also adds a "fuzz binary" option which is similar to "fuzz execution". It makes wasm-opt not only execute the code before and after opts, but also write to binary and read from it, helping to fuzz the binary format.
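	The "translate to fuzz" idea above, where arbitrary input bytes drive the construction of something that is always valid, can be illustrated with a toy generator. This is a simplified sketch, not Binaryen's generator: `ByteStream`, `make_expr`, the depth cap, and the tiny set of instructions are all assumptions chosen for brevity.

```python
class ByteStream:
    """Reads decisions from the input; yields zeros once the input runs out."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def finished(self):
        return self.pos >= len(self.data)

    def get(self):
        if self.finished():
            return 0
        byte = self.data[self.pos]
        self.pos += 1
        return byte


def make_expr(stream, depth=0):
    """Translate input bytes into an expression that is well-formed by construction."""
    # Fall back to a constant when the input is exhausted or nesting is deep,
    # so generation always terminates.
    if stream.finished() or depth >= 4:
        return f"(i32.const {stream.get()})"
    choice = stream.get() % 3
    if choice == 0:
        return f"(i32.const {stream.get()})"
    op = "i32.add" if choice == 1 else "i32.mul"
    left = make_expr(stream, depth + 1)
    right = make_expr(stream, depth + 1)
    return f"({op} {left} {right})"
```

	Any byte string, including an empty one, maps to some well-formed expression, and the same bytes always map to the same expression, which is what makes the input behave like a seed that a tool such as afl-fuzz can mutate freely.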