path: root/src/tools/wasm-reduce.cpp
Commit message | Author | Age | Files | Lines
* [wasm-reduce] Reduce struct.new arguments away when possible (#7118) (HEAD, main) | Alon Zakai | 2024-12-30 | 1 | -1/+20
  If all the fields of a struct.new are defaultable, see if replacing it with a struct.new_default preserves the behavior, and reduce that way if so.
  Also add a missing --closed-world to the --remove-unused-types invocation. Without that, it was erroring and not working, which I noticed when testing this. The test also checks that.
* [wasm-reduce] Add an option to save all interim working files as we reduce (#7154) | Alon Zakai | 2024-12-16 | 1 | -2/+29
  With this option, each time we reduce we save a file w.wasm.17 or such, incrementing that counter. This is useful when debugging the reducer, but might have more uses.
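The naming scheme described above can be sketched as follows. This is a minimal illustration, not the code from wasm-reduce.cpp: the helper name `interimFileName` and its parameters are assumptions made for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not taken from wasm-reduce.cpp): builds the name of an
// interim file by appending a counter to the working file name.
std::string interimFileName(const std::string& working, std::size_t counter) {
  return working + '.' + std::to_string(counter);
}
```

Calling `interimFileName("w.wasm", 17)` yields `w.wasm.17`; the caller would increment the counter after each successful reduction.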
* Improve fuzzing of both closed and open world styles of modules (#7090) | Alon Zakai | 2024-11-19 | 1 | -0/+1
  Before, we would simply not export a function that had, e.g., an anyref param. As a result, the modules were effectively "closed", which was good for testing full closed-world mode, but not for testing degrees of open world.
  To improve that, this PR allows the fuzzer to export such functions, and an "enclose world" pass is added that "closes" the wasm (makes it more compatible with closed-world). That pass is run 50% of the time, giving us coverage of both styles.
* Add a --preserve-type-order option (#6916) | Thomas Lively | 2024-09-10 | 1 | -3/+5
  Unlike other module elements, types are not stored on the `Module`. Instead, they are collected by traversing the IR before printing and binary writing. The code that collects the types tries to optimize the order of rec groups based on the number of times each type is used. As a result, the output order of types generally has no relation to the input order of types. In addition, most type optimizations rewrite the types into a single large rec group, and the order of types in that group is essentially arbitrary.

  Changes to the code for counting type uses, sorting types, or sorting rec groups can yield very large changes in the output order of types, producing test diffs that are hard to review and potentially harming the readability of tests by moving output types away from the corresponding input types.

  To help make test output more stable and readable, introduce a tool option that causes the order of output types to match the order of input types as closely as possible. It is implemented by having the parsers record the indices of the input types on the `Module` just like they already record the type names. The `GlobalTypeRewriter` infrastructure used by type optimizations associates the new types with the old indices just like it already does for names and also respects the input order when rewriting types into a large recursion group.

  By default, wasm-opt and other tools clear the recorded type indices after parsing the module, so their default behavior is not modified by this change.

  Follow-on PRs will use the new flag in more tests, which will generate large diffs but leave the tests in stable, more readable states that will no longer change due to other changes to the optimizing type sorting logic.
* [wasm-reduce] Do not crash on non-func element segments (#6778) | Thomas Lively | 2024-07-26 | 1 | -10/+5
  Generalize the code for simplifying element segments to handle more than just null and funcref elements.
* [StackIR] Run StackIR during binary writing and not as a pass (#6568) | Alon Zakai | 2024-05-09 | 1 | -2/+2
  Previously we had passes --generate-stack-ir, --optimize-stack-ir, --print-stack-ir that could be run like any other passes. After generating StackIR it was stashed on the function and invalidated if we modified BinaryenIR. If it wasn't invalidated then it was used during binary writing.

  This PR switches things so that we optionally generate, optimize, and print StackIR only during binary writing. It also removes all traces of StackIR from wasm.h - after this, StackIR is a feature of binary writing (and printing) logic only.

  This is almost NFC, but there are some minor noticeable differences:

  1. We no longer print "has StackIR" in the text format when we see it is there. It will not be there during normal printing, as it is only present during binary writing. (But --print-stack-ir still works as before; as mentioned above, it runs during writing.)
  2. --generate/optimize/print-stack-ir change from being passes to being flags that control that behavior instead. As passes, their order on the commandline mattered, while now it does not, and they only "globally" affect things during writing.
  3. The C API changes slightly, as there is no need to pass an "optimize" option to the StackIR APIs. Whether we optimize is handled by --optimize-stack-ir, which is set like other optimization flags on the PassOptions object, so we don't need the old option to those C APIs.

  The main benefit here is simplifying the code, so we don't need to think about StackIR in more places than just binary writing. That may also allow future improvements to our usage of StackIR.
* wasm-reduce: Improve tryToReduceCurrentToConst() (#6193) | Alon Zakai | 2024-01-02 | 1 | -9/+26
  Avoid replacing with the exact same thing in the case of RefNull and a default tuple.
  Also be more careful with handling of numbers. Before we exited immediately if we saw a number, but we can try to replace a number with a 0 or a 1, even if it was a number before. That is, we consider 1 simpler than e.g. 12345678, and 0 simpler than 1.
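The "simpler constant" ordering in this message (0 is simpler than 1, and 1 is simpler than anything else) can be illustrated with a small sketch. The function `simplerConstants` is hypothetical and only models the ordering; it is not the body of tryToReduceCurrentToConst().

```cpp
#include <cstdint>
#include <vector>

// Illustrative only: candidate replacements that are "simpler" than `value`
// under the ordering 0 < 1 < everything else. A value is never replaced by
// itself, which avoids the wasted work of substituting an identical constant.
std::vector<int64_t> simplerConstants(int64_t value) {
  if (value == 0) {
    return {}; // nothing is simpler than 0
  }
  if (value == 1) {
    return {0}; // only 0 is simpler than 1
  }
  return {0, 1}; // try 0 first, then 1
}
```

A reducer following this idea would try each candidate in order and keep the first one for which the test case stays interesting.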
* Asyncify: Improve comments (#5987) | Heejin Ahn | 2023-10-03 | 1 | -1/+1
  This fixes some outdated comments and typos in Asyncify and improves some other comments. It tries to make code comments more readable by making them more accurate and by using the three states (normal, unwinding, and rewinding) consistently.
  Drive-by fix: Typo fixes in SimplifyGlobals and a wasm-reduce option.
* Remove the --hybrid and --nominal command line options (#5669) | Thomas Lively | 2023-04-14 | 1 | -7/+0
  After this change, the only type system usable from the tools will be the standard isorecursive type system. The nominal type system is still usable via the API, but it will be removed entirely in a follow-on PR.
* wasm-reduce: Add more passes (#5667) | Alon Zakai | 2023-04-14 | 1 | -0/+6
* [NFC] Remove our bespoke `make_unique` implementation (#5613) | Thomas Lively | 2023-03-31 | 1 | -2/+2
  This code predates our adoption of C++14 and can now be removed in favor of `std::make_unique`, which should be more efficient.
* Use C++17's [[maybe_unused]]. NFC (#5309) | Sam Clegg | 2022-12-02 | 1 | -1/+0
* Remove equirecursive typing (#5240) | Thomas Lively | 2022-11-23 | 1 | -2/+3
  Equirecursive is no longer standards track and its implementation is extremely complex. Remove it.
* Make `Name` a pointer, length pair (#5122) | Thomas Lively | 2022-10-11 | 1 | -1/+1
  With the goal of supporting null characters (i.e. zero bytes) in strings. Rewrite the underlying interned `IString` to store a `std::string_view` rather than a `const char*`, reduce the number of map lookups necessary to intern a string, and present a more immutable interface.

  Most importantly, replace the `c_str()` method that returned a `const char*` with a `toString()` method that returns a `std::string`. This new method can correctly handle strings containing null characters. A `const char*` can still be had by calling `data()` on the `std::string_view`, although this usage should be discouraged.

  This change is NFC in spirit, although not in practice. It does not intend to support any particular new functionality, but it is probably now possible to use strings containing null characters in at least some cases. At least one parser bug is also incidentally fixed. Follow-on PRs will explicitly support and test strings containing nulls for particular use cases.

  The C API still uses `const char*` to represent strings. As strings containing nulls become better supported by the rest of Binaryen, this will no longer be sufficient. Updating the C and JS APIs to use pointer, length pairs is left as future work.
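The difference between a pointer, length pair and a null-terminated `const char*` can be shown with standard library types alone. The sketch below does not use Binaryen's `Name` or `IString`; the three helper functions are made up for the example.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// A three-byte string with an embedded zero byte: 'a', '\0', 'b'.
std::string makeNameWithNull() { return std::string("a\0b", 3); }

// Length as seen through a pointer, length pair: all bytes are counted.
std::size_t viewLength(std::string_view name) { return name.size(); }

// Length as seen through a null-terminated C string: counting stops at the
// first zero byte. Only safe here because the view comes from a std::string,
// whose buffer is null-terminated.
std::size_t cStringLength(std::string_view name) {
  return std::strlen(name.data());
}
```

For the string built by `makeNameWithNull()`, `viewLength` reports 3 while `cStringLength` reports 1, which is why the message discourages going back to `const char*` via `data()`.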
* Implement bottom heap types (#5115) | Thomas Lively | 2022-10-07 | 1 | -1/+1
  These types, `none`, `nofunc`, and `noextern`, are uninhabited, so references to them can only possibly be null. To simplify the IR and increase type precision, introduce new invariants that all `ref.null` instructions must be typed with one of these new bottom types and that `Literals` have a bottom type iff they represent null values. These new invariants require several additional changes.

  First, it is now possible that the `ref` or `target` child of a `StructGet`, `StructSet`, `ArrayGet`, `ArraySet`, or `CallRef` instruction has a bottom reference type, so it is not possible to determine what heap type annotation to emit in the binary or text formats. (The bottom types are not valid type annotations since they do not have indices in the type section.) To fix that problem, update the printer and binary emitter to emit unreachables instead of the instruction with undetermined type annotation. This is a valid transformation because the only possible value that could flow into those instructions in that case is null, and all of those instructions trap on nulls.

  That fix uncovered a latent bug in the binary parser in which new unreachables within unreachable code were handled incorrectly. This bug was not previously found by the fuzzer because we generally stop emitting code once we encounter an instruction with type `unreachable`. Now, however, it is possible to emit an `unreachable` for instructions that do not have type `unreachable` (but are known to trap at runtime), so we will continue emitting code. See the new test/lit/parse-double-unreachable.wast for details.

  Update other miscellaneous code that creates `RefNull` expressions and null `Literals` to maintain the new invariants as well.
* [NFC] wasm-reduce: Avoid wasted work on drops (#4850) | Alon Zakai | 2022-07-29 | 1 | -0/+7
  It was wasted work to see a drop and then check if we can replace it with a drop of its child, which is identical to the original state. This didn't cause any harm (we'd not reduce code size, and stop eventually) but it did slow us down.
* wasm-reduce: Apply commandline features (#4833) | Alon Zakai | 2022-07-26 | 1 | -3/+11
  This lets wasm-reduce --enable-FOO work. Usually this is not needed as we do enable all features by default, but sometimes it is nice to disable features (e.g. to avoid reducing into a testcase that uses something the original wasm did not use).
* Remove basic reference types (#4802) | Thomas Lively | 2022-07-20 | 1 | -30/+10
  Basic reference types like `Type::funcref`, `Type::anyref`, etc. made it easy to accidentally forget to handle reference types with the same basic HeapTypes but the opposite nullability. In principle there is nothing special about the types with shorthands except in the binary and text formats. Removing these shorthands from the internal type representation by removing all basic reference types makes some code more complicated locally, but simplifies code globally and encourages properly handling both nullable and non-nullable reference types.
* Fix more no-assertions warnings (#4765) | Alon Zakai | 2022-06-30 | 1 | -0/+1
* First class Data Segments (#4733) | Ashley Nelson | 2022-06-21 | 1 | -10/+4
  * Updating wasm.h/cpp for DataSegments
  * Updating wasm-binary.h/cpp for DataSegments
  * Removed link from Memory to DataSegments and updated module-utils, Metrics and wasm-traversal
  * checking isPassive when copying data segments to know whether to construct the data segment with an offset or not
  * Removing memory member var from DataSegment class as there is only one memory rn. Updated wasm-validator.cpp
  * Updated wasm-interpreter
  * First look at updating Passes
  * Updated wasm-s-parser
  * Updated files in src/ir
  * Updating tools files
  * Last pass on src files before building
  * added visitDataSegment
  * Fixing build errors
  * Data segments need a name
  * fixing var name
  * ran clang-format
  * Ensuring a name on DataSegment
  * Ensuring more datasegments have names
  * Adding explicit name support
  * Fix fuzzing name
  * Outputting data name in wasm binary only if explicit
  * Checking temp dataSegments vector to validateBinary because it's the one with the segments before we processNames
  * Pass on when data segment names are explicitly set
  * Ran auto_update_tests.py and check.py, success all around
  * Removed an errant semi-colon and corrected a counter. Everything still passes
  * Linting
  * Fixing processing memory names after parsed from binary
  * Updating the test from the last fix
  * Correcting error comment
  * Impl kripken@ comments
  * Impl tlively@ comments
  * Updated tests that remove data print when == 0
  * Ran clang format
  * Impl tlively@ comments
  * Ran clang-format
* Reducer: Support --hybrid (#4726) | Alon Zakai | 2022-06-14 | 1 | -0/+3
* wasm-reduce: Fix order in shrinkByReduction call (#4673) | Alon Zakai | 2022-05-17 | 1 | -1/+4
  The old code would short-circuit and not do anything after we managed any reduction in the loop here. That would end up doing entire iterations of the whole pipeline before removing another element segment, which could be slow.
* Remove externref (#4633) | Thomas Lively | 2022-05-04 | 1 | -5/+0
  Remove `Type::externref` and `HeapType::ext` and replace them with uses of anyref and any, respectively, now that we have unified these types in the GC proposal. For backwards compatibility, continue to parse `extern` and `externref` and maintain their relevant C API functions.
* wasm-reduce: Try to remove functions from a random place (#4612) | Alon Zakai | 2022-04-25 | 1 | -7/+32
  Previously we'd only try to remove functions from index 0, so we missed some opportunities. With this change we still go through all the functions if things go well, but we start from a deterministic random location in the vector.
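One way to picture "go through all the functions, but start somewhere other than index 0" is a wrapping traversal. The sketch below is an assumption about the general shape, not the code in this commit; `visitOrder` and its `start` parameter (standing in for whatever deterministic pseudo-random value the reducer picks) are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch: visit every index in [0, n) exactly once, beginning at
// `start % n` and wrapping around to 0 after the last index.
std::vector<std::size_t> visitOrder(std::size_t n, std::size_t start) {
  std::vector<std::size_t> order;
  if (n == 0) {
    return order; // nothing to visit
  }
  order.reserve(n);
  const std::size_t first = start % n;
  for (std::size_t i = 0; i < n; i++) {
    order.push_back((first + i) % n);
  }
  return order;
}
```

Because the starting point is derived deterministically, repeated runs on the same input visit functions in the same order, which keeps reductions reproducible.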
* [Wasm GC] Signature Pruning (#4545) | Alon Zakai | 2022-03-25 | 1 | -0/+1
  This adds a new signature-pruning pass that prunes parameters from signature types where those parameters are never used in any function that has that type. This is similar to DeadArgumentElimination but works on a set of functions, and it can handle indirect calls.

  Also move a little code from SignatureRefining into a shared place to avoid duplication of logic to update signature types.

  This pattern happens in j2wasm code, for example if all method functions for some virtual method just return a constant and do not use the this pointer.
* wasm-reduce: Add newer passes (#4502) | Alon Zakai | 2022-02-03 | 1 | -0/+4
  These might help reduction. Most newer passes, like say --type-refining, are not going to actually help by themselves without other passes, so those are not added (they get run in the -O2 etc. modes, which at least gives them a chance to help).

  DeadArgumentElimination: Might help by itself, if just removing arguments reduces code size. In some cases applying constants may increase code size, though, but the -optimizing variant helps there.

  GlobalTypeOptimization: This can remove type fields which can shrink the type section by a lot. This is the reason I realized I should open this PR, when I happened to notice that running that pass manually after reduction helped a lot more.

  SimplifyGlobals: Can remove unused globals, merge identical immutable ones, etc., all of which can help code size directly.
* Add categories to --help text (#4421) | Alon Zakai | 2022-01-05 | 1 | -0/+14
  The general shape of the --help output is now:

    ========================
    wasm-foo

    Does the foo operation
    ========================

    wasm-foo opts:
    --------------
      --foo-bar ..

    Tool opts:
    ----------
      ..

  The options are now in categories, with the more specific ones - most likely to be wanted by the user - first. I think this makes the list a lot less confusing. In particular, in wasm-opt all the opt passes are now in their own category.

  Also add a script to make it easy to update the help tests.
* Reducer: Apply --debug to all commands (#4275) | Alon Zakai | 2021-10-25 | 1 | -3/+4
  Do so by applying --debug to extraFlags right at the start. That global is used everywhere already. In particular, this PR removes manually adding -g in the first diff chunk here, and you can see extraFlags appears there already on the previous line.
* LocalCSE rewrite (#4079) | Alon Zakai | 2021-08-17 | 1 | -1/+2
  Technically this is not a new pass, but it is a rewrite almost from scratch. Local Common Subexpression Elimination looks for repeated patterns, stuff like this:

    x = (a + b) + c
    y = a + b
    =>
    temp = a + b
    x = temp + c
    y = temp

  The old pass worked on flat IR, which is inefficient, and was overly complicated because of that. The new pass uses a new algorithm that I think is pretty simple, see the detailed comment at the top.

  This keeps the pass enabled only in -O4, like before - right after flattening the IR. That is to make this as minimal a change as possible. Followups will enable the pass in the main pipeline, that is, we will finally be able to run it by default. (Note that to make the pass work well after flatten, an extra simplify-locals is added - the old pass used to do part of simplify-locals internally, which was one source of complexity. Even so, some of the -O4 tests have changes, due to minor factors - they are just minor orderings etc., which can be seen by inspecting the outputs before and after using e.g. --metrics)

  This plus some followup work leads to large wins on wasm GC output. On j2cl there is a common pattern of repeated struct.gets, so common that this pass removes 85% of all struct.gets, which makes the total binary 15% smaller. However, on LLVM-emitted code the benefit is minor, less than 1%.
* Support nominal typing in wasm-reduce (#4080) | Alon Zakai | 2021-08-16 | 1 | -3/+8
  Use ToolOptions there, which adds --nominal support. We must also pass --nominal to the sub-commands we run.
* Clean up and rewrite wasm-reduce element segment logic (#4015) | Alon Zakai | 2021-07-23 | 1 | -16/+19
  Practically NFC, but it does reorder some code a little. Previously we would find a "zero", then shrink segments, then use that zero - which might no longer be in the table. That seems weird, so this reorders that, but there should be no significant difference in the output.
  Also reduce the factor of 100 to 1, which in practice is important on one of the Dart GC benchmarks that has a huge number of table segments.
* wasm-reduce: Avoid a crash where function names change after tryToRemoveFunctions (#4013) | Alon Zakai | 2021-07-22 | 1 | -3/+4
  tryToRemoveFunctions() will reload the wasm from binary if it fails to optimize, and without the names section we don't have a guarantee on the names being the same after that. And then tryToEmptyFunctions would look for a name, and crash.
  In the reverse order there is no risk, as tryToEmptyFunctions does not reload the wasm from binary; it carefully undoes what it tried to do when it fails.
* Reduce more carefully when it looks like we are failing (#3996) | Alon Zakai | 2021-07-22 | 1 | -3/+3
  Instead of skipping to the end, move quickly towards the end. This is sometimes more efficient (as jumping from a big factor to a factor of 1 can skip over big opportunities to remove code all at once instead of one instruction at a time).
* Exponentially empty out function bodies when reducing (#3997) | Alon Zakai | 2021-07-20 | 1 | -45/+75
  This removes the code that did so one at a time, and instead adds it in a way that we can do it in an exponentially growing set of functions. On large testcases where other methods do not work, this is very useful.
  Also adjust the factor to do this 20x more often, which in practice is very useful too.
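The message does not spell out the exact schedule, so the following is only a sketch of one plausible reading: the batch of functions emptied together doubles after each success and falls back to a single function after a failure. The name `emptyInBatches` and the `tryEmpty` callback are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>

// Illustrative sketch, not the wasm-reduce implementation. `tryEmpty(begin,
// end)` is a hypothetical callback that empties functions [begin, end) and
// returns true if the test case is still interesting (undoing the change
// otherwise). Returns the number of functions that ended up emptied.
std::size_t emptyInBatches(
  std::size_t numFuncs,
  const std::function<bool(std::size_t, std::size_t)>& tryEmpty) {
  std::size_t emptied = 0;
  std::size_t batch = 1;
  std::size_t i = 0;
  while (i < numFuncs) {
    const std::size_t end = std::min(numFuncs, i + batch);
    if (tryEmpty(i, end)) {
      emptied += end - i;
      i = end;
      batch *= 2; // success: be more ambitious next time
    } else if (batch > 1) {
      batch = 1; // failure on a group: retry the same spot one at a time
    } else {
      i++; // a single function could not be emptied: skip it
    }
  }
  return emptied;
}
```

With eight functions where only index 3 must stay intact, this schedule needs seven attempts instead of eight and empties the other seven functions; the saving grows with the number of functions that can be emptied in a row.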
* Preserve Function HeapTypes (#3952) | Thomas Lively | 2021-06-30 | 1 | -4/+4
  When using nominal types, ref.func of two functions with identical signatures but different HeapTypes will yield different types. To preserve these semantics, Functions need to track their HeapTypes, not just their Signatures.
  This PR replaces the Signature field in Function with a HeapType field and adds new utility methods to make it almost as simple to update and query the function HeapType as it was to update and query the Function Signature.
* wasm-reduce: Always decrease the factor (#3849) | Alon Zakai | 2021-05-18 | 1 | -3/+9
  When things go well, the reducer shrinks the factor by 50% or more, but when things are slow it kept the factor unchanged. That is annoying in some cases where you really have no benefit from reduction until the factor gets small. So this at least reduces it by 10% in each iteration.
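The update rule described here can be approximated in a few lines. This is a simplified model under stated assumptions: the function `nextFactor`, the boolean `wentWell`, and the fixed 50% step are illustrative (the message says "50% or more" in the good case), and the floor of 1 is an assumption so that the factor never reaches zero.

```cpp
#include <algorithm>
#include <cstddef>

// Illustrative sketch: halve the factor after a productive iteration, and
// shrink it by 10% otherwise so that it always makes progress toward 1.
std::size_t nextFactor(std::size_t factor, bool wentWell) {
  const std::size_t shrunk = wentWell ? factor / 2 : factor * 9 / 10;
  return std::max<std::size_t>(1, shrunk);
}
```

Under this model a factor of 100 becomes 50 after a good iteration and 90 after a slow one, so even a long run of slow iterations eventually reaches small factors.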
* Reducer: Replace entire function bodies with either unreachable or nop (#3802) | Alon Zakai | 2021-04-12 | 1 | -5/+15
  Previously we just used unreachable. This also tries nop when it is possible, and sometimes that is better (if the code is called, a nop may be less intrusive of a change).
* Reducer: skip more functions when failing to remove them (#3718) | Alon Zakai | 2021-04-05 | 1 | -2/+2
  This avoids an annoying case where in each iteration we try to remove every function one by one and keep failing. Instead, we'll skip large numbers of them when the factor is large at least.
  Also shorten some unnecessary logging.
* Fix reduction of nondefaultable tuples (#3746) | Alon Zakai | 2021-03-29 | 1 | -1/+1
  There is a makeZeros right below that, which will assert on a nondefaultable type.
* Print parse errors in reducer and roundtrip (#3737) | Alon Zakai | 2021-03-25 | 1 | -1/+7
  Without this, crashes from things like #3736 simply get reported as "a parse exception was thrown" with no detail.
* [RT] Support expressions in element segments (#3666) | Abbas Mashayekh | 2021-03-24 | 1 | -31/+48
  This PR adds support for `ref.null t` as a valid element segment item. The abbreviated format of `(elem ... func $f $g...)` is kept in both printing and binary emitting if all items are `ref.func`s. Public APIs aren't updated in this PR.
* Reducer: Improve reduction of function bodies and the factor for text reduction (#3668) | Alon Zakai | 2021-03-09 | 1 | -6/+21
  The old code tried to call visitExpression from outside of a walk on the wasm, which works except that replaceCurrent does nothing as there is no current node. Perhaps it should assert if called outside of a walk? Might be an expensive check, but once we have no-assert builds maybe that's worthwhile.

  Replace that with a working check during the walk. Also limit the frequency of it (do it 1000x more often than a normal reduction, but not all the time like we used to).

  Also optimize the starting factor for text reduction. Text files are much larger for the same amount of IR, so the initial factor was far too high and inefficient.
* [reference-types] Support passive elem segments (#3572) | Abbas Mashayekh | 2021-03-05 | 1 | -80/+77
  Passive element segments do not belong to any table, so the link between Table and elem needs to be weaker; i.e. an elem may have a table in case of active segments, or simply be a collection of function references in case of passive/declarative segments.
  This PR takes Table::Segment out and turns it into a first class module element just like tables and functions. It also implements early support for parsing, printing, encoding and decoding passive/declarative elem segments.
* [reference-types] remove single table restriction in IR (#3517) | Abbas Mashayekh | 2021-02-09 | 1 | -1/+7
  Adds support for modules with multiple tables. Adds a field for the table name to `CallIndirect` and updates the C/JS APIs accordingly.
* Remove exnref and br_on_exn (#3505) | Heejin Ahn | 2021-01-22 | 1 | -5/+0
  This removes the `exnref` type and the `br_on_exn` instruction.
* [GC] Add dataref type (#3500) | Alon Zakai | 2021-01-21 | 1 | -0/+5
  This is not 100% of everything, but is enough to get tests passing, which includes full binary and text format support, getting all switches to compile without error, and some additions to InstrumentLocals.
* wasm-reduce: Fix setting of feature flags after loading (#3493) | Alon Zakai | 2021-01-15 | 1 | -2/+6
  We mistakenly did not set the flags to all, which meant that if the features section was not present, we'd not have the proper features set, leading to errors on writing.
* wasm-reduce: default to -all, and make it customizable (#3492) | Alon Zakai | 2021-01-15 | 1 | -11/+20
  This goes back to the downsides of #2813, but that seems unavoidable as without this, testcases without the features section but that use features did not work.
  This PR at least makes it easy to customize the flags sent to the commands.
  See also #3393 (comment)
* Reducer: Improve warning on scripts that ignore the input (#3490) | Alon Zakai | 2021-01-15 | 1 | -9/+20
  The risk the warning checks for is giving the reducer a script that ignores the input. To do so it runs the command on the input, and runs it on a garbage file, and checks if the result is different. However, if the script does immediately fail on the input - because the input is a crash testcase or such - then this does not work, as the result on a garbage input may be the same error.

  To avoid that, also check what happens on a trivial valid wasm as input. Only show the warning if the results on the original input, on a garbage wasm, and on a trivial wasm are all the same - in that case, likely the script really is ignoring the input.
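The decision rule in the last paragraph reduces to a three-way comparison. The sketch below assumes each run is summarized as a string (for example, captured output plus exit status); the function name `scriptLikelyIgnoresInput` and that representation are assumptions, not the reducer's actual data types.

```cpp
#include <string>

// Illustrative sketch: warn only when the script behaves identically on the
// original input, on a garbage file, and on a trivial valid wasm. If any of
// the three results differs, the script evidently reads its input.
bool scriptLikelyIgnoresInput(const std::string& onOriginal,
                              const std::string& onGarbage,
                              const std::string& onTrivial) {
  return onOriginal == onGarbage && onOriginal == onTrivial;
}
```

A crash testcase whose script fails the same way on garbage but succeeds on the trivial wasm would not trigger the warning under this rule, which is the case the message set out to fix.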
* [wasm-reduce] Improve support for reducing on text files (#3437) | Alon Zakai | 2020-12-14 | 1 | -4/+8
  Passing --detect-features there doesn't work (as there is no feature section).