summaryrefslogtreecommitdiff
path: root/scripts/fuzz_opt.py
Commit message (Collapse)AuthorAgeFilesLines
* [threads] Ignore shared-array.wast in fuzzer initial contents (#6706)Thomas Lively2024-06-261-0/+1
|
* [threads] Validate shared-polymorphic instructions (#6702)Thomas Lively2024-06-251-0/+1
| | | | Such as `ref.eq`, `i31.get_{s,u}`, and `array.len`. Also validate that struct and array operations work on shared structs and arrays.
* [threads] Binary reading and writing of shared composite types (#6664)Thomas Lively2024-06-141-0/+4
| | | | Also update the parser so that implicit type uses are not matched with shared function types.
* Fuzzer: Stop testing with TurboFan as Turboshaft is rolling out and is ↵Alon Zakai2024-05-281-7/+0
| | | | faster (#6623)
* Fix binary emitting of br_if with a refined value by emitting a cast (#6510)Alon Zakai2024-05-161-1/+18
| | | | | | | | | | | | | | | | This makes us compliant with the wasm spec by adding a cast: we use the refined type for br_if fallthrough values, and the wasm spec uses the branch target. If the two differ, we add a cast after the br_if to make things match. Alternatively we could match the wasm spec's typing in our IR, but we hope the wasm spec will improve here, and so this is will only be temporary in that case. Even if not, this is useful because by using the most refined type in the IR we optimize in the best way possible, and only suffer when we emit fixups in the binary, but in practice those cases are very rare: br_if is almost always dropped rather than used, in real-world code (except for fuzz cases and exploits). We check carefully when a br_if value is actually used (and not dropped) and its type actually differs, and it does not already have a cast. The last condition ensures that we do not keep adding casts over repeated roundtripping.
* Fuzzer: Add another stringview subtyping error message to ignore (#6575)Alon Zakai2024-05-091-0/+1
|
* Fuzzer: Stop emitting nullable stringviews (#6574)Alon Zakai2024-05-081-0/+3
| | | | | | | | | | | | | As of https://chromium-review.googlesource.com/c/v8/v8/+/5471674 V8 requires stringviews to be non-nullable. It might be possible to make that change in our IR, or to remove views entirely, but for now this PR makes the fuzzer stop emitting nullable stringviews as a workaround to allow us to fuzz current V8. There are still rare corner cases where this pattern is emitted, that we have not tracked down, and so this also makes the fuzzer ignore the error for now.
* Re-enable fuzzing of text round trips (#6560)Thomas Lively2024-04-291-2/+1
| | | The bug that had been preventing this fuzzing no longer reproduces.
* [Strings] Add the string heaptype to core fuzzer places (#6527)Alon Zakai2024-04-231-1/+2
| | | | | | | With this we emit strings spontaneously (as opposed to just getting them from initial contents). The relevant -ttf test has been tweaked slightly to show the impact of this change: now there are some string.new/const in the output.
* Fuzzer: Do not run other VMs if the Binaryen interpreter hits a VM ↵Alon Zakai2024-04-101-10/+19
| | | | | | | | | limitation (#6483) The VM limitation might be an OOM (which can change due to opts) or an atomic wait (which can hang on proper VMs with support). We already avoided running VMs on the optimized wasm in this case, but we still ran them on the original wasm, which this changes, mainly to avoid that atomic wait situation.
* [NFC] Fix fuzzer counting of ignored runs due to many errors (#6444)Alon Zakai2024-03-271-4/+12
| | | | | | When the interpreter sees that most exports simply trap we mark the iteration as ignored. But we run the interpreter on the before wasm and also the after wasm, so we were incrementing that counter by 2 each time, which could be misleading.
* [Strings] Represent string values as WTF-16 internally (#6418)Thomas Lively2024-03-221-3/+0
| | | | | | | | | | | | | | | | WTF-16, i.e. arbitrary sequences of 16-bit values, is the encoding of Java and JavaScript strings, and using the same encoding makes the interpretation of string operations trivial, even when accounting for non-ascii characters. Specifically, use little-endian WTF-16. Re-encode string constants from WTF-8 to WTF-16 in the parsers, then back to WTF-8 in the writers. Update the constructor for string `Literal`s to interpret the string as WTF-16 and store a sequence of WTF-16 code units, i.e. 16-bit integers. Update `Builder::makeConstantExpression` accordingly to convert from the new `Literal` string representation back to a WTF-16 string. Update the interpreter to remove the logic for detecting non-ascii characters and bailing out. The naive implementations of all the string operations are correct now that our string encoding matches the JS string encoding.
* Update file name in INITIAL_CONTENTS_IGNORE (#6425)Thomas Lively2024-03-221-1/+1
| | | | The test file was renamed, but the fuzzer still used the old name in INITIAL_CONTENTS_IGNORE.
* [Strings] Implement stringview_wtf16.slice (#6404)Alon Zakai2024-03-191-1/+2
|
* Typed continuations: suspend instructions (#6393)Frank Emrich2024-03-191-0/+1
| | | | | | | | | | | | | | | | | | | | | This PR is part of a series that adds basic support for the [typed continuations/wasmfx proposal](https://github.com/wasmfx/specfx). This particular PR adds support for the `suspend` instruction for suspending with a given tag, documented [here](https://github.com/wasmfx/specfx/blob/main/proposals/continuations/Overview.md#instructions). These instructions are of the form `(suspend $tag)`. Assuming that `$tag` is defined with _n_ `param` types `t_1` to `t_n`, the instruction consumes _n_ arguments of types `t_1` to `t_n`. Its result type is the same as the `result` type of the tag. Thus, the folded textual representation looks like `(suspend $tag arg1 ... argn)`. Support for the instruction is implemented in both the old and the new wat parser. Note that this PR does not implement validation of the new instruction. This PR also fixes finalization of `cont.new`, `cont.bind` and `resume` nodes in those cases where any of their children are unreachable.
* Fuzzer: Fix up null outputs in wasm2js optimized builds (#6374)Alon Zakai2024-03-081-0/+15
| | | | | | | | This is fallout from #6310 where we moved to use fuzz_shell.js for all fuzzing purposes. That script doesn't know wasm types, all it has on the JS side is the number of arguments to a function, and it passes in null for them all regardless of their type. That normally works fine - null is cast to the right type upon use - but in wasm2js optimized builds we can remove casts, which can make that noticeable.
* Fuzzer: Standardize notation for exception prefixes (#6369)Alon Zakai2024-03-051-2/+8
| | | | | | | | | We had exception: in one and exception thrown: in another. Making those consistent allows fuzz_shell.js to print the exception after that prefix, which makes debugging easier sometimes. Also canonicalize tag names. Like funcref names, JS VMs print out the internal name, which can change after opts, so canonicalize it.
* Fuzzer: Ignore fuzz testcases that make VMs run out of stack (#6376)Alon Zakai2024-03-041-8/+19
| | | | | | | | | | | | | | | When the stack runs out is observable and optimizations can change it, so we must ignore such testcases. Also add some logic to help debug stuff like this, as suggested by tlively in the past, to add some metrics on the reasons we ignored a testcase. That emits something like this: (ignored 253 iters, for reasons {'too many errors vs calls': 230, '[host limit ': 20, 'uninitialized non-defaultable local': 3}) As a drive by make the metrics print wasm bytes/iter rather than by second (the former is easy to compute from the latter anyhow, and the latter is more interesting I think).
* Typed continuations: cont.bind instructions (#6365)Frank Emrich2024-03-041-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | This PR is part of a series that adds basic support for the [typed continuations/wasmfx proposal](https://github.com/wasmfx/specfx). This particular PR adds support for the `cont.bind` instruction for partially applying continuations, documented [here](https://github.com/wasmfx/specfx/blob/main/proposals/continuations/Overview.md#instructions). In short, these instructions are of the form `(cont.bind $ct_before $ct_after)` where `$ct_before` and `$ct_after` are related continuation types. They must only differ in the number of arguments, where `$ct_before` has _n_ additional parameters as compared to `$ct_after`, for some _n_ ≥ 0. The idea is that `(cont.bind $ct_before $ct_after)` then takes a reference to a continuation of type `$ct_before` as well as _n_ operands and returns a (reference to a) continuation of type `$ct_after`. Thus, the folded textual representation looks like `(cont.bind $ct_before $ct_after arg1 ... argn c)`. Support for the instruction is implemented in both the old and the new wat parser. Note that this PR does not implement validation of the new instruction.
* Fuzz V8 Turboshaft (#6360)Alon Zakai2024-02-281-1/+8
|
* Fuzzer: Separate arguments used to make the fuzz wasm from the opts we run ↵Alon Zakai2024-02-271-8/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | on it (#6357) Before FUZZ_OPTS was used both when doing --translate-to-fuzz/-ttf to generate the wasm from the random bytes and also when later running optimizations to generate a second wasm file for comparison. That is, we ended up doing this, if the opts were -O3: wasm-opt random.input -ttf -o a.wasm -O3 wasm-opt a.wasm -O3 -o b.wasm Now we have a pair a.wasm,b.wasm which we can test. However, we have run -O3 on both which is a little silly - the second -O3 might not actually have anything left to do, which would mean we compare the same wasm to itself. Worse, this is incorrect, as there are things we need to do only during the generation phase, like --denan. We need that in order to generate a valid wasm to test on, but it is "destructive" in itself: when removing NaNs (to avoid nondeterminism) if replaces them with 0, which is different. As a result, running --denan when generating the second wasm from the first could lead to different execution in them. This was always a problem, but became more noticable recently now that DeNaN modifies SIMD operations, as one optimization we do is to replace a memory.copy with v128.load + v128.store, and --denan will make sure the loaded value has no NaNs... To fix this, separate the generation and optimization phase. Instead of wasm-opt random.input -ttf -o a.wasm --denan -O3 wasm-opt a.wasm --denan -O3 -o b.wasm (note how --denan -O3 appears twice), do this: wasm-opt random.input -ttf -o a.wasm --denan wasm-opt a.wasm -O3 -o b.wasm (note how --denan appears in generation, and -O3 in optimization).
* Fuzzer: Handle negative i31s (#6341)Alon Zakai2024-02-231-1/+1
|
* Fuzzer: Ignore V8 errors on uninitialized non-defaultable locals (#6337)Alon Zakai2024-02-221-0/+9
| | | | | | | | | | See #5665 #5599, this is an existing issue and we have a workaround for it using --dce, but it does not always work. I seem to be seeing this in higher frequency since landing recent fuzzer improvements, so ignore it. There is some risk of us missing real bugs here (that we validate and V8 does not), but this is a validation error which is not as serious as a difference in behavior. And this is a long-standing issue that hasn't bitten us yet.
* Typed continuations: cont.new instructions (#6308)Frank Emrich2024-02-221-0/+1
| | | | | | | | | | | | | | | | | This PR is part of a series that adds basic support for the [typed continuations/wasmfx proposal](https://github.com/wasmfx/specfx). This particular PR adds support for the `cont.new` instruction for creating continuations, documented [here(https://github.com/wasmfx/specfx/blob/main/proposals/continuations/Overview.md#instructions). In short, these instructions are of the form `(cont.new $ct)` where `$ct` must be a continuation type. The instruction takes a single (nullable) function reference as its argument, which means that the folded representation of the instruction is of the form `(cont.new $ct (foo ...))`. Support for the instruction is implemented in both the old and the new wat parser. Note that this PR does not implement validation of the new instruction.
* Fuzzer: Adjust feature fuzzing frequency (#6305)Alon Zakai2024-02-221-14/+21
| | | | | | | | | We used to fuzz MVP 1/3, all 1/3, and a mixture 1/3, but that gives far too much priority to the MVP which is increasingly less important. It is also a good idea to give "all" more priority as that enables more initial content to run (the fuzzer will discard initial content if it doesn't validate with the features chosen in the current iteration). Also (NFC) rename POSSIBLE_FEATURE_OPTS to make the code easier to follow.
* Fuzzer: Allow using initial content with V8 (#6327)Alon Zakai2024-02-221-4/+3
| | | | | | | | | | | | | | | One problem was that spec testcases had exports with names that are not valid to write as JS exports.name. For example an export with a - in the name would end up as exports.foo-bar etc. Since #6310 that is fixed as we do not emit such JS (we use the generic fuzz_shell.js script which iterates over the keys in exports with exports[name]). Also fix a few trivial fuzzer issues that initial content uncovered: - Ignore a wat file with invalid utf-8. - Print string literals in the same way from JS as from C++. - Enable the stringref flag in V8. - Remove tag imports (the same as we do for global and function and other imports).
* Fuzzer: Match the logging of i31ref between JS and C++ (#6335)Alon Zakai2024-02-221-0/+4
| | | | | | | | | | | | | JS engines print i31ref as just a number, so we need a small regex to standardize the representation (similar to what we do for funcrefs on the code above). On the C++ side, make it actually print the i31ref rather than treat it like a generic reference (for whom we only print "object"). To do that we must unwrap an externalized i31 as necessary, and add a case for i31 in the printing logic. Also move that printing logic to its own function, as it was starting to get quite long.
* Fuzzer: Add a pass to prune illegal imports and exports for JS (#6312)Alon Zakai2024-02-201-3/+3
| | | | | | | | | | | | | | | | | | We already have passes to legalize i64 imports and exports, which the fuzzer will run so that we can run wasm files in JS VMs. SIMD and multivalue also pose a problem as they trap on the boundary. In principle we could legalize them as well, but that is substantial effort, so instead just prune them: given a wasm module, remove any imports or exports that use SIMD or multivalue (or anything else that is not legal for JS). Running this in the fuzzer will allow us to not skip running v8 on any testcase we enable SIMD and multivalue for. (Multivalue is allowed in newer VMs, so that part of this PR could be removed eventually.) Also remove the limitation on running v8 with multimemory (v8 now supports that).
* Fuzzer: Remove --emit-js-shell logic and reuse fuzz_shell.js instead (#6310)Alon Zakai2024-02-201-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | We had two JS files that could run a wasm file for fuzzing purposes: * --emit-js-shell, which emitted a custom JS file that runs the wasm. * scripts/fuzz_shell.js, which was a generic file that did the same. Both of those load the wasm and then call the exports in order and print out logging as it goes of their return values (if any), exceptions, etc. Then the fuzzer compares that output to running the same wasm in another VM, etc. The difference is that one was custom for the wasm file, and one was generic. Aside from that they are similar and duplicated a bunch of code. This PR improves things by removing 1 and using 2 in all places, that is, we now use the generic file everywhere. I believe we added 1 because we thought a generic file can't do all the things we need, like know the order of exports and the types of return values, but in practice there are ways to do those things: The exports are in fact in the proper order (JS order of iteration is deterministic, thankfully), and for the type we don't want to print type internals anyhow since that would limit fuzzing --closed-world. We do need to be careful with types in JS (see notes in the PR about the type of null) but it's not too bad. As for the types of params, it's fine to pass in null for them all anyhow (null converts to a number or a reference without error).
* Fuzzer: Remove Asyncify integration (#6309)Alon Zakai2024-02-141-71/+0
| | | | | | | | | | | Fuzzing Asyncify has a significant cost both in terms of the complexity in the fuzzer and the slowness of the fuzzing. In practice it was useful years ago when Asyncify was written but hasn't found anything for a while, and Asyncify is really deprecated given JSPI. For all those reasons, remove it from the fuzzer. We do still have lots of normal coverage of asyncify in lit tests, unit tests, and the Emscripten test suite. Removing this will also make future improvements to the fuzzer simpler.
* Fuzzer: Use a directory for important fuzz testcases (#6297)Alon Zakai2024-02-121-23/+14
| | | | | Users can put files in ./fuzz and they will be fuzzed with high priority. Docs in source and https://github.com/WebAssembly/binaryen/wiki/Fuzzing#helper-scripts
* StringLowering: Start to lower instructions (#6281)Alon Zakai2024-02-061-0/+1
|
* [EH] Support CFGWalker for new EH spec (#6235)Heejin Ahn2024-01-251-0/+1
| | | | | | | | This adds support `CFGWalker` for the new EH instructions (`try_table` and `throw_ref`). `CFGWalker` is used by many different passes, but in the same vein as #3494, this adds tests for `RedundantSetElimination` pass. `rse-eh.wast` file is created from translated and simplified version of `rse-eh-old.wast`, but many tests were removed because we don't have special `catch` block or `delegate` anymore.
* [EH] Add translator from old to new EH instructions (#6210)Heejin Ahn2024-01-231-0/+1
| | | | | | | | | | | | | | | | | | | | This translates the old Phase 3 EH instructions, which include `try`, `catch`, `catch_all`, `delegate`, and `rethrow`, into the new EH instructions, which include `try_table` (with `catch` / `catch_ref` / `catch_all` / `catch_all_ref`) and `throw_ref`, passed at the Oct 2023 CG meeting. This translator can be used as a standalone tool by users of the previous EH toolchain to generate binaries for the new spec without recompiling, and also can be used at the end of the Binaryen pipeline to produce binaries for the new spec while the end-to-end toolchain implementation for the new spec is in progress. While the goal of this pass is not optimization, this tries to a little better than the most naive implementation, namely by omitting a few instructions where possible and trying to minimize the number of additional locals, because this can be used as a standalone translator or the last stage of the pipeline while we can't post-optimize the results because the whole pipeline (-On) is not ready for the new EH.
* Typed continuations: resume instructions (#6083)Frank Emrich2024-01-111-0/+1
| | | | | This PR is part of a series that adds basic support for the [typed continuations proposal](https://github.com/wasmfx/specfx). This particular PR adds support for the `resume` instruction. The most notable missing feature is validation, which is not implemented, yet.
* [EH][test] Split EH tests into old and new spec (#6178)Heejin Ahn2023-12-131-0/+2
| | | | | | | | | This moves tests for the old EH spec to `exception-handling-old.wast` and moves the new `exnref` test into `exception-handling.wast`, onto which I plan to add more tests for the new EH spec. The primary reason for splitting the files is I plan to exclude the new EH test from the fuzzing while the new spec's implementation is in progress, and I don't want to exclude the old EH tests altogether.
* Add an arity immediate to tuple.extract (#6172)Thomas Lively2023-12-121-1/+1
| | | | | | | | Once support for tuple.extract lands in the new WAT parser, this arity immediate will let the parser determine how many values it should pop off the stack to serve as the tuple operand to `tuple.extract`. This will usually coincide with the arity of a tuple-producing instruction on top of the stack, but in the spirit of treating the input as a proper stack machine, it will not have to and the parser will still work correctly.
* Update `tuple.make` text format to include arity (#6169)Thomas Lively2023-12-121-1/+1
| | | | | | | | | | Previously, the number of tuple elements was inferred from the number of s-expression children of the `tuple.make` expression, but that scheme would not work in the new wat parser, where s-expressions are optional and cannot be semantically meaningful. Update the text format to take the number of tuple elements (i.e. the tuple arity) as an immediate. This new format will be able to be implemented in the new parser as follow-on work.
* Typed Continuations: Add cont type (#5998)Frank Emrich2023-10-241-0/+2
| | | | | | | | | This PR is part of a series that adds basic support for the [typed continuations proposal](https://github.com/wasmfx/specfx). This PR adds continuation types, of the form `(cont $foo)` for some function type `$foo`. The only notable changes affecting existing code are the following: - This is the first `HeapType` which has another `HeapType` (rather than, say, a `Type`) as its immediate child. This required fixes to certain traversals that have a flag for being at the toplevel of a type. - Some shared logic for parsing `HeapType`s has been factored out.
* Fuzzer: Add missing simplify-globals* passes (#6025)Alon Zakai2023-10-181-0/+2
|
* Fuzzer: Mark runs where over 50% of functions trap as ignored (#6007)Alon Zakai2023-10-131-3/+18
| | | | | | | | The number of ignored functions is logged out, so this can help us avoid getting into a situation where many testcases just trap most of the time rather than doing anything useful. 50% seems a reasonable cutoff. Even if 50% of functions trap, at least we are getting 50% that don't, so lots of useful work.
* Add an "unsubtyping" optimization (#5982)Thomas Lively2023-10-101-0/+1
| | | | | | | | | | | | | | Add a new pass that analyzes the module to find the minimal subtyping relation that is necessary to maintain the validity and semantics of the program and rewrites the types to use this minimal relation. Besides eliminating references to otherwise-unused intermediate types, this optimization should unlock significant additional optimizing power in other type optimizations that are constrained by having to maintain supertype validity, since after this new optimization there are fewer and more general supertypes. The analysis works by visiting each expression and module element to collect the subtypings that are required to maintain its validity, then, using that as a starting point, iteratively adding new subtypings required by type definitions and casts until reaching a fixed point.
* Add passes to finalize or unfinalize types (#5944)Alon Zakai2023-09-181-0/+2
| | | | | | | | | TypeFinalization finalizes all types that we can, that is, all private types that have no children. TypeUnFinalization unfinalizes (opens) all (private) types. These could be used by first opening all types, optimizing, and then finalizing, as that might find more opportunities. Fixes #5933
* Enable auto_initial_contents by default in fuzz_opt.py (#5943)Thomas Lively2023-09-141-3/+4
| | | | This setting is useful enough that there is basically no reason not to use it. Turn it on by default to save some typing when running the fuzzer.
* Add a simple tuple optimization pass (#5937)Alon Zakai2023-09-141-0/+1
| | | | | | | | | | | In some cases tuples are obviously not needed, such as when they are only used in local operations and make/extract. Such tuples are not used as return values or in control flow structures, so we might as well lower them to individual locals per lane, which other passes can optimize a lot better. I believe LLVM does the same with its own tuples: it lowers them as much as possible, leaving only necessary ones. Fixes #5923
* Do not prompt user on fuzz_opt.py --auto-initial-contents (#5907)Thomas Lively2023-08-291-8/+0
| | | | | Remove the prompt for user confirmation when using the --auto-initial-contents option with the fuzzer. It is not actionable, and it prevents me from going off and doing something else when I build and start the fuzzer in the same command.
* Rename multimemory flag (#5890)Ashley Nelson2023-08-211-4/+4
| | | Renaming the multimemory flag in Binaryen to match its naming in LLVM.
* GUFA: Add a version that casts all of our inferences (#5846)Alon Zakai2023-07-271-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GUFA refines existing casts, but does not add new casts for fear of increasing code size and adding more cast operations at runtime. This PR adds a version that does add all those casts, and it looks like at least code size improves rather than regresses, at least on J2Wasm and Kotlin. That is, this pass adds a lot more casts, but subsequent optimizations benefit enough to shrink overall code size. However, this may still not be worthwhile, as even if code size decreases we may end up doing more casts at runtime, and those casts might be hard to remove, e.g.: (call $foo (x) ;; inferred to be non-null ) (func $foo (param (ref null $A) => (call $foo (ref.cast $A (x) ;; add a cast here ) (func $foo (param (ref $A) ;; later pass refines here That new cast cannot be removed after we refine the function parameter. If the function never benefits from the fact that the input is non-null, then the cast is wasted work (e.g. if the function only compares the input to another value). To use this new pass, try --gufa-cast-all rather than --gufa. As with normal GUFA, running the full optimizer afterwards is important, and even more important in order to get rid of as many of the new casts as possible.
* CtorEval Fuzzer: Generalize export regex (#5778)Alon Zakai2023-06-221-1/+1
| | | | | | Just look for export names as "" with some other stuff in the middle. Missing from the old regex: spaces, parens, and probably more. Spaces and parens are used in the test suite, which is how this was noticed by the fuzzer.
* StackIR: Remove nops (#5746)Alon Zakai2023-05-301-0/+2
| | | | | | | No nop instruction is necessary in wasm, so in StackIR we can simply remove them all. Fixes #5745