summaryrefslogtreecommitdiff
path: root/scripts
Commit message (Collapse)AuthorAgeFilesLines
...
* Fuzzer: Ignore testcases that V8 OOMs on (#5733)Alon Zakai2023-05-181-0/+3
| | | | We already ignore OOMs in the interpreter. This adds the syntax for V8, which I saw an error on now (on an array.new of a massive size).
* Reintroduce wasm-merge (#5709)Alon Zakai2023-05-162-5/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We used to have a wasm-merge tool but removed it for a lack of use cases. Recently use cases have been showing up in the wasm GC space and elsewhere, as people are using more diverse toolchains together, for example a project might build some C++ code alongside some wasm GC code. Merging those wasm files together can allow for nice optimizations like inlining and better DCE etc., so it makes sense to have a tool for merging. Background: * Removal: #1969 * Requests: * wasm-merge - why it has been deleted #2174 * Compiling and linking wat files #2276 * wasm-link? #2767 This PR is a compete rewrite of wasm-merge, not a restoration of the original codebase. The original code was quite messy (my fault), and also, since then we've added multi-memory and multi-table which makes things a lot simpler. The linking semantics are as described in the "wasm-link" issue #2767 : all we do is merge normal wasm files together and connect imports and export. That is, we have a graph of modules and their names, and each import to a module name can be resolved to that module. Basically, like a JS bundler would do for JS, or, in other words, we do the same operations as JS code would do to glue wasm modules together at runtime, but at compile time. See the README update in this PR for a concrete example. There are no plans to do more than that simple bundling, so this should not really overlap with wasm-ld's use cases. This should be fairly fast as it works in linear time on the total input code. However, it won't be as fast as wasm-ld, of course, as it does build Binaryen IR for each module. An advantage to working on Binaryen IR is that we can easily do some global DCE after merging, and further optimizations are possible later.
* Prune trapping code during TrapsNeverHappen fuzzing (#5717)Alon Zakai2023-05-121-25/+68
| | | | | | | | | This removes the trapping export and all others after it. This avoids a potential infinite loop that can happen when fuzzing TNH, as if TNH is set and a trap happens then the optimizer can cause an iloop, and while that is valid, it would hang the fuzzer. We could check for a timeout, but it is faster and more robust to just remove the code we can't compare anyhow. This uses wasm-metadce to remove the exports from the failing one.
* [NFC] Refactor each of ArrayNewSeg and ArrayInit into subclasses for ↵Alon Zakai2023-05-041-4/+4
| | | | | | | | | | | Data/Elem (#5692) ArrayNewSeg => ArrayNewSegData, ArrayNewSegElem ArrayInit => ArrayInitData, ArrayInitElem Basically we remove the opcode and use the class type to differentiate them. This adds some code but it makes the representation simpler and more compact in memory, and it will help with #5690
* Fuzzer: Run --dce when GC is enabled (#5677)Alon Zakai2023-04-191-0/+9
| | | | | | | | | | | DCE at the end avoids issues with non-nullable local operations in unreachable code, which is still being discussed. This PR avoids fuzzer errors for now, but we should revert it when we have a proper fix. See * #5599 * #5665 * https://github.com/WebAssembly/function-references/issues/98
* Remove the --hybrid and --nominal command line options (#5669)Thomas Lively2023-04-141-5/+0
| | | | | After this change, the only type system usable from the tools will be the standard isorecursive type system. The nominal type system is still usable via the API, but it will be removed entirely in a follow-on PR.
* Implement array.fill, array.init_data, and array.init_elem (#5637)Thomas Lively2023-04-061-0/+3
| | | | | These complement array.copy, which we already supported, as an initial complete set of bulk array operations. Replace the WIP spec tests with the upstream spec tests, lightly edited for compatibility with Binaryen.
* Adjust fuzzer frequencies (#5612)Alon Zakai2023-03-311-6/+5
| | | | | | | | | | I ran CheckDeterminism at full throttle overnight (set to 1, and disabled all other things) and it found a bug, so we should focus on that more. Also ctor-eval as there is ongoing work there. I reduced a few other priorities of things that haven't seen bugs in a very long time and are not high priority.
* Switch fuzzer to hybrid typing (#5609)Alon Zakai2023-03-311-1/+1
| | | This is the default, and also used by J2Wasm.
* Fuzz partial-inlining-ifs (#5600)Alon Zakai2023-03-291-0/+4
|
* Use more than a single wasm page in Asyncify fuzzing (#5597)Alon Zakai2023-03-221-1/+3
| | | | | I saw a testcase fail on the internal assertion of the buffer being too small. Enlarge it to use as much of the memory we have anyhow to reduce that risk (we can use 15 pages instead of 1, without changing anything else).
* Fuzzer: Ignore infinite recursion in Asyncify handler (#5596)Alon Zakai2023-03-221-0/+17
|
* Fix misoptimization in TypeMerging (#5572)Thomas Lively2023-03-141-3/+1
| | | | | | | | | | | | TypeMerging previously tried to merge types with their supertypes and siblings in a single step, but this could cause a misoptimization in which a type was merged with its parent's sibling without being merged with its parent, breaking subtyping. Fix the bug by merging with supertypes and siblings separately. Since we now have multiple merging steps, also take the opportunity to run the sibling merging step multiple times to exploit more merging opportunities. Fixes #5556.
* Fuzzer: CompareVMs: Do not compare when hitting a host limitation (#5562)Alon Zakai2023-03-141-2/+18
| | | | | | | For example, we might hit an allocation limit in the wasm, but the optimized wasm might optimize that allocation out. So we need to ignore comparisons in such cases, as we cannot expect the output to be identical. We already do similar things for FuzzExec and #5560 adds it for TrapsNeverHappen; this adds it to CompareVMs.
* TrapsNeverHappen fuzzing: Handle a trap vs a host limitation (#5560)Alon Zakai2023-03-101-0/+8
| | | | | | | | | | If the program tries to allocate an infinite number of objects, but is prevented from doing that by a null pointer trap, then after we run with trapsNeverHappen the trap may fail to occur, and we'll hit the host limitation on allocations. As a result, we'd be comparing one run with a trap and one run that is meant to be ignored (as we ignore runs with host limitations), and before this PR we'd error as we would expect to find the normal output and not the "ignore this host limitation" marker.
* Integrate the heap type fuzzer into the main fuzzer (#5555)Alon Zakai2023-03-091-1/+3
| | | | | | | | | | | | | With this we generate random GC types that may be used in creating instructions later. We don't create many instructions yet, which will be the next step after this. Also add some trivial assertions in some places, that have helped debugging in the past. Stop fuzzing TypeMerging for now due to #5556 , which this PR uncovers.
* Fuzzer: Skip testcases that hit V8's array size limit (#5550)Alon Zakai2023-03-071-2/+8
|
* Fuzzer: Count the number of VM runs we ignore (#5538)Alon Zakai2023-03-011-1/+11
| | | | | If this number ever gets high then we would need to look into why we ignore so much. Right now we seem to end up ignoring much less than 1% which seems ok.
* Fuzzer: Ignore host limits (#5536)Alon Zakai2023-03-011-0/+6
| | | | | We can't just skip host limits (#5534) but must also ignore execution at that point, as optimizations can change the results if they change whether we reach a host limit.
* Parse and print `array.new_fixed` (#5527)Thomas Lively2023-02-281-1/+2
| | | | | | | | | This is a (more) standard name for `array.init_static`. (The full upstream name in the spec repo is `array.new_canon_fixed`, but I'm still hoping we can drop `canon` from all the instruction names and it doesn't appear elsewhere in Binaryen). Update all the existing tests to use the new name and add a test specifically to ensure the old name continues parsing.
* [NFC] Internally rename `ArrayInit` to `ArrayNewFixed` (#5526)Thomas Lively2023-02-281-1/+1
| | | | | | | | To match the standard instruction name, rename the expression class without changing any parsing or printing behavior. A follow-on PR will take care of the functional side of this change while keeping support for parsing the old name. This change will allow `ArrayInit` to be used as the expression class for the upcoming `array.init_data` and `array.init_elem` instructions.
* Fuzzing: Add wasm-ctor-eval fuzzing in fuzz_opt.py (#5523)Alon Zakai2023-02-241-4/+44
| | | | | | After the recent improvements and fixes this is now simple and the fuzzer found no more issues overnight for me. Also adjust some existing frequencies.
* [wasm-ctor-eval] Add v128 load/store support (#5512)Alon Zakai2023-02-231-0/+5
|
* [Strings] Add experimental string.hash instruction (#5480)Alon Zakai2023-02-031-0/+1
| | | See WebAssembly/stringref#60
* [Wasm GC] Add AbstractTypeRefining pass (#5461)Alon Zakai2023-02-031-0/+2
| | | | | | | | | | | | | | If a type hierarchy has abstract classes in the middle, that is, types that are never instantiated, then we can optimize casts and other operations to them. Say in Java that we have `AbstractList`, and it only has one subclass `IntList` that is ever created, then any place we have an `AbstractList` we must actually have an `IntList`, or a null. (Or, if no subtype is instantiated, then the value must definitely be a null.) The actual implementation does a type mapping, that is, it finds all places using an abstract type and makes them refer to the single instantiated subtype (or null). After that change, no references to the abstract type remain in the program, so this both refines types and also cleans up the type section.
* Add a CMake flag to enable Wasm exceptions in the BinaryenJS build (#5454)Derek Schuff2023-02-021-1/+1
|
* [Strings] Add experimental StringNew variants (#5459)Alon Zakai2023-01-261-4/+7
| | | | | | string.from_code_point makes a string from an int code point. string.new_utf8*_try makes a utf8 string and returns null on a UTF8 encoding error rather than trap.
* [Strings] Add string.compare (#5453)Alon Zakai2023-01-251-1/+2
| | | See WebAssembly/stringref#58
* [Wasm GC] Replace `HeapType::data` with `HeapType::struct_` (#5416)Thomas Lively2023-01-101-4/+0
| | | | | | `struct` has replaced `data` in the upstream spec, so update Binaryen's types to match. We had already supported `struct` as an alias for data, but now remove support for `data` entirely. Also remove instructions like `ref.is_data` that are deprecated and do not make sense without a `data` type.
* Represent ref.as_{func,data,i31} with RefCast (#5413)Thomas Lively2023-01-101-3/+3
| | | | | | | | | | | | | These operations are deprecated and directly representable as casts, so remove their opcodes in the internal IR and parse them as casts instead. For now, add logic to the printing and binary writing of RefCast to continue emitting the legacy instructions to minimize test changes. The few test changes necessary are because it is no longer valid to perform a ref.as_func on values outside the func type hierarchy now that ref.as_func is subject to the ref.cast validation rules. RefAsExternInternalize, RefAsExternExternalize, and RefAsNonNull are left unmodified. A future PR may remove RefAsNonNull as well, since it is also expressible with casts.
* Replace `RefIs` with `RefIsNull` (#5401)Thomas Lively2023-01-091-4/+4
| | | | | | | | | | | | | | | * Replace `RefIs` with `RefIsNull` The other `ref.is*` instructions are deprecated and expressible in terms of `ref.test`. Update binary and text parsing to parse those instructions as `RefTest` expressions. Also update the printing and emitting of `RefTest` expressions to emit the legacy instructions for now to minimize test changes and make this a mostly non-functional change. Since `ref.is_null` is the only `RefIs` instruction left, remove the `RefIsOp` field and rename the expression class to `RefIsNull`. The few test changes are due to the fact that `ref.is*` instructions are now subject to `ref.test` validation, and in particular it is no longer valid to perform a `ref.is_func` on a value outside of the `func` type hierarchy.
* Consolidate br_on* operations (#5399)Thomas Lively2023-01-061-12/+12
| | | | | | | | | | | | | | | | | | The `br_on{_non}_{data,i31,func}` operations are deprecated and directly representable in terms of the new `br_on_cast` and `br_on_cast_fail` instructions, so remove their dedicated IR opcodes in favor of representing them as casts. `br_on_null` and `br_on_non_null` cannot be consolidated the same way because their behavior is not directly representable in terms of `br_on_cast` and `br_on_cast_fail`; when the cast to null bottom type succeeds, the null check instructions implicitly drop the null value whereas the cast instructions would propagate it. Add special logic to the binary writer and printer to continue emitting the deprecated instructions for now. This will allow us to update the test suite in a separate future PR with no additional functional changes. Some tests are updated because the validator no longer allows passing non-func data to `br_on_func`. Doing so has not made sense since we separated the three reference type hierarchies.
* [Parser] Parse blocks (#5393)Thomas Lively2023-01-051-1/+1
| | | | | | | | | | | | | Parse both the folded and unfolded forms of blocks and structure the code to make supporting additional block instructions like if-else and try-catch relatively simple. Parsing block types is extra fun because they may implicitly define new signature heap types via a typeuse, but only if their types are not given by a single result type. To figuring out whether a new type may be introduced in all the relevant parsing stages, always track at least the arity of parsed results. The parser parses block labels, but more work will be required to support branch instructions that use them.
* Skip some new initial content in the fuzzer, with imported memories (#5383)Alon Zakai2023-01-031-0/+3
| | | | | Do not fuzz some new testcases that have imported memories. The fuzzer doesn't seem to have support for that (it errors when it tries to do operations on them, since the import hasn't been created).
* Work around bugs with open world type optimizations (#5367)Thomas Lively2022-12-201-69/+89
| | | | | | | | | | | | | | Since #5347 public types are never updated by type optimizations, but the optimization passes have not yet been updated to take that into account, so they are all buggy under an open world assumption. In #5359 we worked around many closed world validation errors in the fuzzer by treating --closed-world like a feature flag and checking whether it was necessary for fuzzer input, but that did not prevent the type optimization passes from running under an open world, so it did not work around all the potential issues. Work around the problem more thoroughly by not running any type optimization passes in the fuzzer without --closed-world. Also add logic to those passes to error out if they are run without --closed-world and update the tests accordingly.
* Fixed fuzzing of closed-world after #5347 (#5359)Alon Zakai2022-12-161-4/+19
| | | | | An initial content testcase may only work in open world, so check for that using the existing mechanism of checking if such testcases work with out feature flags.
* Fix OOB string_view read in generated parser code (#5349)Thomas Lively2022-12-141-7/+5
| | | | | | | | | The `op` string_view was intentionally created to point into the `buf` buffer so that reading past its end would still be safe, but some C++ standard library implementations assert when reading past the end of a string_view. Change the generated code to read out of `buf` instead to avoid those assertions. Fixes #5322. Fixes #5342.
* Add standard versions of WasmGC casts (#5331)Thomas Lively2022-12-071-9/+12
| | | | | | | We previously supported only the non-standard cast instructions introduced when we were experimenting with nominal types. Parse the names and opcodes of their standard counterparts and switch to emitting the standard names and opcodes. Port all of the tests to use the standard instructions, but add additional tests showing that the non-standard versions are still parsed correctly.
* [Wasm GC] Add TypeMerging pass (#5321)Alon Zakai2022-12-071-0/+1
| | | | | | | | This finds types that can be merged into their super: types that add no fields, and are not used in casts, etc. - so we might as well use the super. This complements TypeSSA, in that it can merge back the new types that TypeSSA created, if we never found a use for them. Without this, TypeSSA can bloat binary size quite a lot (I see 10-20%).
* [Wasm GC] Add TypeSSA pass (#5299)Alon Zakai2022-12-021-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This creates new nominal types for each (interesting) struct.new. That then allows type-based optimizations to be more precise, as those optimizations will track separate info for each struct.new, in effect. That is kind of like SSA, however, we do not handle merges. For example: x = struct.new $A (5); print(x.value); y = struct.new $A (11); print(y.value); // => // x = struct.new $A.x (5); print(x.value); y = struct.new $A.y (11); print(y.value); After the pass runs each of those struct.new creates a unique type, and type-based analysis can see that 5 or 11 are the only values written in that type (if nothing else writes there). This bloats the type section with the new subtypes, so it is best used with a pass to merge unneeded duplicate types, which a later PR will add. That later PR will exactly merge back in the types created here, which are nominally different but indistinguishable otherwise. This pass is not enabled by default. It's not clear yet where is the best place to do it, as it must be balanced by type merging, but it might be better to do multiple rounds of optimization between the two. Needs more investigation.
* [Wasm GC] Implement closed-world flag (#5303)Alon Zakai2022-11-301-0/+3
| | | | | | | | | | | | | With this change we default to an open world, that is, we do the safe thing by default: we no longer assume a closed world. Users that want a closed world must pass --closed-world. Atm we just do not run passes that assume a closed world. (We might later refine them to find which types don't escape and only optimize those.) The RemoveUnusedModuleElements is an exception in that the closed-world flag influences one part of its operation, but not the rest. Fixes #5292
* Unpin V8 (#5277)Thomas Lively2022-11-171-3/+1
| | | | | | | | | | | | | | * Do not compare reference values across executions Since we optimize assuming a closed world, optimizations can change the types and structure of GC data even in externally-visible ways. Because differences are expected, the fuzzer already did not compare reference-typed values from before and after optimizations when running with nominal typing. Update it to not compare these values under any type system. * Unpin V8 Our WasmGC output is no longer compatible with the previously pinned version and the issue that caused us to pin it in the first place has been resolved.
* [Wasm GC] Start an OptimizeCasts pass and reuse cast values there (#5263)Alon Zakai2022-11-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | (some.operation (ref.cast .. (local.get $ref)) (local.get $ref) ) => (some.operation (local.tee $temp (ref.cast .. (local.get $ref)) ) (local.get $temp) ) This can help cases where we cast for some reason but happen to not use the cast value in all places. This occurs in j2wasm in itable calls sometimes: The this pointer is is refined, but the itable may be done with an unrefined pointer, which is less optimizable. So far this is just inside basic blocks, but that is enough for the cast of itable calls and other common patterns I see.
* [Wasm GC] Add Monomorphize pass (#5238)Alon Zakai2022-11-111-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Monomorphization finds cases where we send more refined types to a function than it declares. In such cases we can copy the function and refine the parameters: // B is a subtype of A foo(new B()); function foo(x : A) { ..} => foo_B(new B()); // call redirected to refined copy function foo(x : A) { ..} // unchanged function foo_B(x : B) { ..} // refined copy This increases code size so it may not be worth it in all cases. This initial PR is hopefully enough to start experimenting with this on performance, and so it does not enable the pass by default. This adds two variations of monomorphization, one that always does it, and the default which is "careful": it sees whether monomorphizing lets the refined function actually be better than the original (say, by removing a cast). If there is no improvement then we do not make any changes. This saves a significant amount of code size - on j2wasm the careful version increases by 13% instead of 20% - but it does run more slowly obviously.
* Implement `array.new_data` and `array.new_elem` (#5214)Thomas Lively2022-11-071-0/+2
| | | | | | | | | In order to test them, fix the binary and text parsers to accept passive data segments even if a module has no memory. In addition to parsing and emitting the new instructions, also implement their validation and interpretation. Test the interpretation directly with wasm-shell tests adapted from the upstream spec tests. Running the upstream spec tests directly would require fixing too many bugs in the legacy text parser, so it will have to wait for the new text parser to be ready.
* Fix fuzzer default contents after #5212 (#5223)Thomas Lively2022-11-071-1/+1
| | | | | | | That PR renamed test/lit/optimize-instructions.wast to test/lit/optimize-instructions-mvp.wast. However, the fuzzer was explicitly adding the testto the list of important initial contents under the old name, so it was failing an assertion that the initial contents existed. Update the fuzzer to use the new test name.
* [Wasm GC] Fix GUFA on externalize/internalize (#5220)Alon Zakai2022-11-041-0/+1
| | | | | | | These operations emit a completely different type than their input, so they must be marked as roots, and not as things that flow values through them (because then we filter everything out as the types are not compatible). Fixes #5219
* Fuzzer: Skip fuzzing VMs when multi-memories are enabled (#5207)Alon Zakai2022-11-031-3/+3
|
* Fix fuzzer to ignore externalize/internalize (#5176)Alon Zakai2022-10-211-0/+2
| | | | | The fuzzer started to fail on the recent externalize/internalize test that was added in #5175 as we lack interpreter support. Move that to a separate file and ignore it in the fuzzer for now.
* [Parser] Parse loads and stores (#5174)Thomas Lively2022-10-211-24/+28
| | | | | | | | | | Add parsing functions for `memarg`s, the offset and align fields of load and store instructions. These fields are interesting because they are lexically reserved words that need to be further parsed to extract their actual values. On top of that, add support for parsing all of the load and store instructions. This required fixing a buffer overflow problem in the generated parser code and adding more information to the signatures of the SIMD load and store instructions. `SIMDLoadStoreLane` instructions are particularly interesting because they may require backtracking to parse correctly.