path: root/src/tools/wasm-reduce.cpp
Commit message | Author | Age | Files | Lines
* Use C++17's [[maybe_unused]]. NFC (#5309)Sam Clegg2022-12-021-1/+0
|
* Remove equirecursive typing (#5240)Thomas Lively2022-11-231-2/+3
| | | | Equirecursive is no longer standards track and its implementation is extremely complex. Remove it.
* Make `Name` a pointer, length pair (#5122)Thomas Lively2022-10-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | With the goal of supporting null characters (i.e. zero bytes) in strings. Rewrite the underlying interned `IString` to store a `std::string_view` rather than a `const char*`, reduce the number of map lookups necessary to intern a string, and present a more immutable interface. Most importantly, replace the `c_str()` method that returned a `const char*` with a `toString()` method that returns a `std::string`. This new method can correctly handle strings containing null characters. A `const char*` can still be had by calling `data()` on the `std::string_view`, although this usage should be discouraged. This change is NFC in spirit, although not in practice. It does not intend to support any particular new functionality, but it is probably now possible to use strings containing null characters in at least some cases. At least one parser bug is also incidentally fixed. Follow-on PRs will explicitly support and test strings containing nulls for particular use cases. The C API still uses `const char*` to represent strings. As strings containing nulls become better supported by the rest of Binaryen, this will no longer be sufficient. Updating the C and JS APIs to use pointer, length pairs is left as future work.
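The difference between a `const char*` and a pointer, length pair can be seen in isolation. The following is a minimal standalone sketch in plain standard C++; it does not use Binaryen's `Name` or `IString`, and the helper names `lengthViaCString`, `lengthViaView`, and `toStringFromView` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Length as seen through a C-string interface: stops at the first zero byte.
std::size_t lengthViaCString(const std::string& s) {
  return std::strlen(s.c_str());
}

// Length as seen through a pointer, length pair: embedded zero bytes count.
std::size_t lengthViaView(const std::string& s) {
  return std::string_view(s).size();
}

// Copying out of a view with an explicit length keeps every byte.
std::string toStringFromView(std::string_view v) {
  std::string copy(v.data(), v.size());
  assert(copy.size() == v.size()); // nothing was cut off at a zero byte
  return copy;
}
```

With the five-byte string `std::string("ab\0cd", 5)`, the C-string view of the data ends after `ab`, while the `std::string_view` path preserves all five bytes.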
* Implement bottom heap types (#5115)Thomas Lively2022-10-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | These types, `none`, `nofunc`, and `noextern` are uninhabited, so references to them can only possibly be null. To simplify the IR and increase type precision, introduce new invariants that all `ref.null` instructions must be typed with one of these new bottom types and that `Literals` have a bottom type iff they represent null values. These new invariants require several additional changes. First, it is now possible that the `ref` or `target` child of a `StructGet`, `StructSet`, `ArrayGet`, `ArraySet`, or `CallRef` instruction has a bottom reference type, so it is not possible to determine what heap type annotation to emit in the binary or text formats. (The bottom types are not valid type annotations since they do not have indices in the type section.) To fix that problem, update the printer and binary emitter to emit unreachables instead of the instruction with undetermined type annotation. This is a valid transformation because the only possible value that could flow into those instructions in that case is null, and all of those instructions trap on nulls. That fix uncovered a latent bug in the binary parser in which new unreachables within unreachable code were handled incorrectly. This bug was not previously found by the fuzzer because we generally stop emitting code once we encounter an instruction with type `unreachable`. Now, however, it is possible to emit an `unreachable` for instructions that do not have type `unreachable` (but are known to trap at runtime), so we will continue emitting code. See the new test/lit/parse-double-unreachable.wast for details. Update other miscellaneous code that creates `RefNull` expressions and null `Literals` to maintain the new invariants as well.
* [NFC] wasm-reduce: Avoid wasted work on drops (#4850)Alon Zakai2022-07-291-0/+7
| | | | | | It was wasted work to see a drop and then check if we can replace it with a drop of its child, which is identical to the original state. This didn't cause any harm (we'd not reduce code size, and stop eventually) but it did slow us down.
* wasm-reduce: Apply commandline features (#4833)Alon Zakai2022-07-261-3/+11
| | | | | This lets wasm-reduce --enable-FOO work. Usually this is not needed as we do enable all features by default, but sometimes it is nice to disable features (e.g. to avoid reducing into a testcase that uses something the original wasm did not use).
* Remove basic reference types (#4802)Thomas Lively2022-07-201-30/+10
| | | | | | | | | Basic reference types like `Type::funcref`, `Type::anyref`, etc. made it easy to accidentally forget to handle reference types with the same basic HeapTypes but the opposite nullability. In principle there is nothing special about the types with shorthands except in the binary and text formats. Removing these shorthands from the internal type representation by removing all basic reference types makes some code more complicated locally, but simplifies code globally and encourages properly handling both nullable and non-nullable reference types.
* Fix more no-assertions warnings (#4765)Alon Zakai2022-06-301-0/+1
|
* First class Data Segments (#4733)Ashley Nelson2022-06-211-10/+4
| * Updating wasm.h/cpp for DataSegments
| * Updating wasm-binary.h/cpp for DataSegments
| * Removed link from Memory to DataSegments and updated module-utils, Metrics and wasm-traversal
| * checking isPassive when copying data segments to know whether to construct the data segment with an offset or not
| * Removing memory member var from DataSegment class as there is only one memory rn. Updated wasm-validator.cpp
| * Updated wasm-interpreter
| * First look at updating Passes
| * Updated wasm-s-parser
| * Updated files in src/ir
| * Updating tools files
| * Last pass on src files before building
| * added visitDataSegment
| * Fixing build errors
| * Data segments need a name
| * fixing var name
| * ran clang-format
| * Ensuring a name on DataSegment
| * Ensuring more datasegments have names
| * Adding explicit name support
| * Fix fuzzing name
| * Outputting data name in wasm binary only if explicit
| * Checking temp dataSegments vector to validateBinary because it's the one with the segments before we processNames
| * Pass on when data segment names are explicitly set
| * Ran auto_update_tests.py and check.py, success all around
| * Removed an errant semi-colon and corrected a counter. Everything still passes
| * Linting
| * Fixing processing memory names after parsed from binary
| * Updating the test from the last fix
| * Correcting error comment
| * Impl kripken@ comments
| * Impl tlively@ comments
| * Updated tests that remove data print when == 0
| * Ran clang format
| * Impl tlively@ comments
| * Ran clang-format
* Reducer: Support --hybrid (#4726)Alon Zakai2022-06-141-0/+3
|
* wasm-reduce: Fix order in shrinkByReduction call (#4673)Alon Zakai2022-05-171-1/+4
| | | | | | The old code would short-circuit and not do anything after we managed any reduction in the loop here. That would end up doing entire iterations of the whole pipeline before removing another element segment, which could be slow.
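The short-circuit problem described above can be reproduced in isolation. The sketch below is standalone C++, not the wasm-reduce source: `attempt`, `callsWhenShortCircuiting`, and `callsWhenAlwaysAttempting` are hypothetical names, and `attempt` stands in for a reduction step that always succeeds. It only illustrates how the operand order of `||` decides whether later attempts run at all.

```cpp
#include <cassert>

// Stand-in for one reduction attempt; counts how often it is invoked.
bool attempt(int& calls) {
  ++calls;
  return true; // pretend every attempt manages to reduce something
}

// Flag on the left: once `any` is true, `||` never evaluates attempt() again,
// so only the first round does any work.
int callsWhenShortCircuiting(int rounds) {
  assert(rounds >= 0);
  int calls = 0;
  bool any = false;
  for (int i = 0; i < rounds; ++i) {
    any = any || attempt(calls);
  }
  return calls;
}

// Attempt on the left: it is evaluated in every round, and the flag only
// accumulates the result.
int callsWhenAlwaysAttempting(int rounds) {
  assert(rounds >= 0);
  int calls = 0;
  bool any = false;
  for (int i = 0; i < rounds; ++i) {
    any = attempt(calls) || any;
  }
  return calls;
}
```

Over five rounds the first form performs one attempt and the second performs five.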
* Remove externref (#4633)Thomas Lively2022-05-041-5/+0
| | | | | | Remove `Type::externref` and `HeapType::ext` and replace them with uses of anyref and any, respectively, now that we have unified these types in the GC proposal. For backwards compatibility, continue to parse `extern` and `externref` and maintain their relevant C API functions.
* wasm-reduce: Try to remove functions from a random place (#4612)Alon Zakai2022-04-251-7/+32
| | | | | | Previously we'd only try to remove functions from index 0, so we missed some opportunities. With this change we still go through all the functions if things go well, but we start from a deterministic random location in the vector.
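One way to picture "start from a deterministic random location but still visit everything" is a wrap-around traversal. This is a sketch under assumptions: `visitOrder` and its `seed` parameter are hypothetical, and how wasm-reduce actually derives its starting index is not stated in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the indices [0, size) in visiting order, starting at a position
// derived from `seed` and wrapping around, so every index appears exactly
// once and the same seed always yields the same order.
std::vector<std::size_t> visitOrder(std::size_t size, std::size_t seed) {
  std::vector<std::size_t> order;
  if (size == 0) {
    return order;
  }
  std::size_t start = seed % size;
  for (std::size_t i = 0; i < size; ++i) {
    order.push_back((start + i) % size);
  }
  assert(order.size() == size);
  return order;
}
```

For five functions and a seed of 7, the traversal starts at index 2 and proceeds 2, 3, 4, 0, 1.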
* [Wasm GC] Signature Pruning (#4545)Alon Zakai2022-03-251-0/+1
| | | | | | | | | | | | | This adds a new signature-pruning pass that prunes parameters from signature types where those parameters are never used in any function that has that type. This is similar to DeadArgumentElimination but works on a set of functions, and it can handle indirect calls. Also move a little code from SignatureRefining into a shared place to avoid duplication of logic to update signature types. This pattern happens in j2wasm code, for example if all method functions for some virtual method just return a constant and do not use the this pointer.
* wasm-reduce: Add newer passes (#4502)Alon Zakai2022-02-031-0/+4
| These might help reduction. Most newer passes, like say --type-refining, are not going to actually help by themselves without other passes, so those are not added (they get run in the -O2 etc. modes, which at least gives them a chance to help).
| DeadArgumentElimination: Might help by itself, if just removing arguments reduces code size. In some cases applying constants may increase code size, though, but the -optimizing variant helps there.
| GlobalTypeOptimization: This can remove type fields which can shrink the type section by a lot. This is the reason I realized I should open this PR, when I happened to notice that running that pass manually after reduction helped a lot more.
| SimplifyGlobals: Can remove unused globals, merge identical immutable ones, etc., all of which can help code size directly.
* Add categories to --help text (#4421)Alon Zakai2022-01-051-0/+14
| The general shape of the --help output is now:
|
|   ========================
|   wasm-foo
|
|   Does the foo operation
|   ========================
|
|   wasm-foo opts:
|   --------------
|     --foo-bar
|     ..
|
|   Tool opts:
|   ----------
|     ..
|
| The options are now in categories, with the more specific ones - most likely to be wanted by the user - first. I think this makes the list a lot less confusing. In particular, in wasm-opt all the opt passes are now in their own category.
| Also add a script to make it easy to update the help tests.
* Reducer: Apply --debug to all commands (#4275)Alon Zakai2021-10-251-3/+4
| | | | | | Do so by applying --debug to extraFlags right at the start. That global is used everywhere already. In particular, this PR removes manually adding -g in the first diff chunk here, and you can see extraFlags appears there already on the previous line.
* LocalCSE rewrite (#4079)Alon Zakai2021-08-171-1/+2
| Technically this is not a new pass, but it is a rewrite almost from scratch. Local Common Subexpression Elimination looks for repeated patterns, stuff like this:
|
|   x = (a + b) + c
|   y = a + b
|   =>
|   temp = a + b
|   x = temp + c
|   y = temp
|
| The old pass worked on flat IR, which is inefficient, and was overly complicated because of that. The new pass uses a new algorithm that I think is pretty simple, see the detailed comment at the top.
| This keeps the pass enabled only in -O4, like before - right after flattening the IR. That is to make this as minimal a change as possible. Followups will enable the pass in the main pipeline, that is, we will finally be able to run it by default. (Note that to make the pass work well after flatten, an extra simplify-locals is added - the old pass used to do part of simplify-locals internally, which was one source of complexity. Even so, some of the -O4 tests have changes, due to minor factors - they are just minor orderings etc., which can be seen by inspecting the outputs before and after using e.g. --metrics)
| This plus some followup work leads to large wins on wasm GC output. On j2cl there is a common pattern of repeated struct.gets, so common that this pass removes 85% of all struct.gets, which makes the total binary 15% smaller. However, on LLVM-emitted code the benefit is minor, less than 1%.
* Support nominal typing in wasm-reduce (#4080)Alon Zakai2021-08-161-3/+8
| | | | | Use ToolOptions there, which adds --nominal support. We must also pass --nominal to the sub-commands we run.
* Clean up and rewrite wasm-reduce element segment logic (#4015)Alon Zakai2021-07-231-16/+19
| | | | | | | | | Practically NFC, but it does reorder some code a little. Previously we would find a "zero", then shrink segments, then use that zero - which might no longer be in the table. That seems weird, so this reorders that, but there should be no significant difference in the output. Also reduce the factor of 100 to 1, which in practice is important on one of the Dart GC benchmarks that has a huge number of table segments.
* wasm-reduce: Avoid a crash where function names change after tryToRemoveFunctions (#4013)Alon Zakai2021-07-221-3/+4
| | | | tryToRemoveFunctions() will reload the wasm from binary if it fails to optimize, and without the names section we don't have a guarantee on the names being the same after that. And then tryToEmptyFunctions would look for a name, and crash. In the reverse order there is no risk, as tryToEmptyFunctions does not reload the wasm from binary; it carefully undoes what it tried to do when it fails.
* Reduce more carefully when it looks like we are failing (#3996)Alon Zakai2021-07-221-3/+3
| | | | | | Instead of skipping to the end, move quickly towards the end. This is sometimes more efficient (as jumping from a big factor to a factor of 1 can skip over big opportunities to remove code all at once instead of one instruction at a time).
* Exponentially empty out function bodies when reducing (#3997)Alon Zakai2021-07-201-45/+75
| | | | | | | | | This removes the code that did so one at a time, and instead adds it in a way that we can do it in an exponentially growing set of functions. On large testcases where other methods do not work, this is very useful. Also adjust the factor to do this 20x more often, which in practice is very useful too.
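The idea of emptying an exponentially growing set of functions can be sketched as follows. This is an illustration only, not the wasm-reduce implementation: `emptyExponentially` is a hypothetical name, and `stillInteresting` is a hypothetical predicate standing in for "write the module and check that the test case still reproduces". The batch size doubles after each success and falls back to a single item after a failure.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Marks items as "emptied" in batches. A successful batch doubles the next
// batch size; a failed batch is retried one item at a time, and an item that
// fails on its own is skipped.
std::vector<bool> emptyExponentially(
  std::size_t n,
  const std::function<bool(const std::vector<bool>&)>& stillInteresting) {
  std::vector<bool> emptied(n, false);
  std::size_t i = 0;
  std::size_t batch = 1;
  while (i < n) {
    std::size_t end = std::min(n, i + batch);
    assert(end > i); // each batch covers at least one item
    std::vector<bool> candidate = emptied;
    for (std::size_t j = i; j < end; ++j) {
      candidate[j] = true;
    }
    if (stillInteresting(candidate)) {
      emptied = std::move(candidate); // keep the change and grow the batch
      i = end;
      batch *= 2;
    } else if (batch > 1) {
      batch = 1; // retry from the same position with a single item
    } else {
      ++i; // this item is needed; leave it alone and move on
    }
  }
  return emptied;
}
```

When every attempt succeeds, eight items are covered in four checks (batches of 1, 2, 4, and the remaining 1) rather than eight.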
* Preserve Function HeapTypes (#3952)Thomas Lively2021-06-301-4/+4
| | | | | | | | | When using nominal types, ref.func of two functions with identical signatures but different HeapTypes will yield different types. To preserve these semantics, Functions need to track their HeapTypes, not just their Signatures. This PR replaces the Signature field in Function with a HeapType field and adds new utility methods to make it almost as simple to update and query the function HeapType as it was to update and query the Function Signature.
* wasm-reduce: Always decrease the factor (#3849)Alon Zakai2021-05-181-3/+9
| | | | | | When things go well, the reducer shrinks the factor by 50% or more, but when things are slow it kept the factor unchanged. That is annoying in some cases where you really have no benefit from reduction until the factor gets small. So this at least reduces it by 10% in each iteration.
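The factor schedule described here can be written down as a small function. This is a sketch under assumptions: `nextFactor` is a hypothetical name, the exact 50% step on progress is a simplification of "50% or more", and the clamp to 1 is an assumption rather than something the commit message states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Computes the factor for the next iteration: halve it after a productive
// iteration, otherwise still shrink it by 10% so it never stays unchanged.
std::size_t nextFactor(std::size_t factor, bool madeProgress) {
  assert(factor >= 1);
  std::size_t next = madeProgress ? factor / 2 : (factor * 9) / 10;
  return std::max<std::size_t>(1, next); // assumed lower bound of 1
}
```

Because of integer division, any factor of 2 or more strictly decreases even on a slow iteration, for example from 100 to 90 or from 5 to 4.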
* Reducer: Replace entire function bodies with either unreachable or nop (#3802)Alon Zakai2021-04-121-5/+15
| | | | | Previously we just used unreachable. This also tries nop when it is possible, and sometimes that is better (if the code is called, a nop may be less intrusive of a change).
* Reducer: skip more functions when failing to remove them (#3718)Alon Zakai2021-04-051-2/+2
| | | | | | | This avoids an annoying case where in each iteration we try to remove every function one by one and keep failing. Instead, we'll skip large numbers of them when the factor is large at least. Also shorten some unnecessary logging.
* Fix reduction of nondefaultable tuples (#3746)Alon Zakai2021-03-291-1/+1
| | | | There is a makeZeros right below that, which will assert on a nondefaultable type.
* Print parse errors in reducer and roundtrip (#3737)Alon Zakai2021-03-251-1/+7
| | | | Without this, crashes from things like #3736 simply get reported as "a parse exception was thrown" with no detail.
* [RT] Support expressions in element segments (#3666)Abbas Mashayekh2021-03-241-31/+48
| | | | | | This PR adds support for `ref.null t` as a valid element segment item. The abbreviated format of `(elem ... func $f $g...)` is kept in both printing and binary emitting if all items are `ref.func`s. Public APIs aren't updated in this PR.
* Reducer: Improve reduction of function bodies and the factor for text reduction (#3668)Alon Zakai2021-03-091-6/+21
| | | | The old code tried to call visitExpression from outside of a walk on the wasm, which works except that replaceCurrent does nothing as there is no current node. Perhaps it should assert if called outside of a walk? Might be an expensive check, but once we have no-assert builds maybe that's worthwhile. Replace that with a working check during the walk. Also limit the frequency of it (do it 1000x more often than a normal reduction, but not all the time like we used to). Also optimize the starting factor for text reduction. Text files are much larger for the same amount of IR, so the initial factor was far too high and inefficient.
* [reference-types] Support passive elem segments (#3572)Abbas Mashayekh2021-03-051-80/+77
| | | | | | | | | | | Passive element segments do not belong to any table, so the link between Table and elem needs to be weaker; i.e. an elem may have a table in case of active segments, or simply be a collection of function references in case of passive/declarative segments. This PR takes Table::Segment out and turns it into a first class module element just like tables and functions. It also implements early support for parsing, printing, encoding and decoding passive/declarative elem segments.
* [reference-types] remove single table restriction in IR (#3517)Abbas Mashayekh2021-02-091-1/+7
| | | Adds support for modules with multiple tables. Adds a field for the table name to `CallIndirect` and updates the C/JS APIs accordingly.
* Remove exnref and br_on_exn (#3505)Heejin Ahn2021-01-221-5/+0
| | | This removes `exnref` type and `br_on_exn` instruction.
* [GC] Add dataref type (#3500)Alon Zakai2021-01-211-0/+5
| | | | | This is not 100% of everything, but is enough to get tests passing, which includes full binary and text format support, getting all switches to compile without error, and some additions to InstrumentLocals.
* wasm-reduce: Fix setting of feature flags after loading (#3493)Alon Zakai2021-01-151-2/+6
| | | | | We mistakenly did not set the flags to all, which meant that if the features section was not present, we'd not have the proper features set, leading to errors on writing.
* wasm-reduce: default to -all, and make it customizable (#3492)Alon Zakai2021-01-151-11/+20
| | | | | | | | | This goes back to the downsides of #2813, but that seems unavoidable as without this, testcases without the features section but that use features did not work. This PR at least makes it easy to customize the flags sent to the commands. See also #3393 (comment)
* Reducer: Improve warning on scripts that ignore the input (#3490)Alon Zakai2021-01-151-9/+20
| | | | | | | | | | | | | The risk the warning checks for is giving the reducer a script that ignores the input. To do so it runs the command on the input, and runs it on a garbage file, and checks if the result is different. However, if the script does immediately fail on the input - because the input is a crash testcase or such - then this does not work, as the result on a garbage input may be the same error. To avoid that, also check what happens on a trivial valid wasm as input. Only show the warning if the result on the original input, on a garbage wasm, and on a trivial wasm, are all the same - in that case, likely the script really is ignoring the input.
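The three-way comparison behind the warning reduces to a simple predicate. The sketch below is not taken from wasm-reduce: `scriptLikelyIgnoresInput` is a hypothetical name, and each argument is assumed to be the captured result of running the user's command on one of the three inputs.

```cpp
#include <string>

// Returns true only when the command produced the same result on the
// original input, on a garbage file, and on a trivial valid wasm. A crash
// testcase and a garbage file may legitimately give the same error, so two
// matching results alone are not enough to warn.
bool scriptLikelyIgnoresInput(const std::string& onOriginal,
                              const std::string& onGarbage,
                              const std::string& onTrivialWasm) {
  return onOriginal == onGarbage && onOriginal == onTrivialWasm;
}
```

A script that reports the same error for the original and the garbage file but succeeds on the trivial wasm does not trigger the warning.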
* [wasm-reduce] Improve support for reducing on text files (#3437)Alon Zakai2020-12-141-4/+8
| | | | Passing --detect-features there doesn't work (as there is no feature section).
* [Reducer] Don't error on compound types, just don't reduce them more for now (#3383)Alon Zakai2020-11-171-1/+4
|
* wasm-reduce: Don't try to replace a non-number (like a reference) with a Const (#3218)Alon Zakai2020-11-051-1/+6
| | | | Also don't assume numbers are 32-bit.
* wasm-reduce: When trying to remove a function, try to replace ref.func usages too (#3254)Alon Zakai2020-10-191-0/+5
|
* Refactor naming convention for functions handling tuples (#3196)Max Graey2020-10-091-2/+2
| | | When there are two versions of a function, one handling tuples and the other handling non-tuple values, the previous naming convention was to have "Single" in the name of the non-tuple handling function. This PR simplifies the convention and shortens function names by making the names plural for the tuple-handling version and singular for the non-tuple-handling version.
* GC: Integrate eqref and i31ref types (#3141)Daniel Wirtz2020-09-191-0/+10
| | | Adds the `eqref` and `i31ref` types to their respective code locations. Implements what can be implemented trivially and otherwise traps with a TODO for now. Integration of `eqref` is mostly complete due to it being nullable, just like `anyref`, but `i31ref` needs to remain disabled in the fuzzer because we are lacking the functionality to create trivial `i31ref` values, i.e. `(i31.new (i32.const 0))`, which is left for follow-ups to implement.
* Add anyref feature and type (#3109)Daniel Wirtz2020-09-101-0/+5
| | | Adds `anyref` type, which is enabled by a new feature `--enable-anyref`. This type is primarily used for testing that passes correctly handle subtype relationships so that the codebase will continue to be prepared for future subtyping. Since `--enable-anyref` is meaningless without also using `--enable-reference-types`, this PR also makes it a validation error to pass only the former (and similarly makes it a validation error to enable exception handling without enabling reference types).
* Update reference types (#3084)Daniel Wirtz2020-09-091-7/+2
| Align with the current state of the reference types proposal:
| * Remove `nullref`
| * Remove `externref` and `funcref` subtyping
| * A `Literal` of a nullable reference type can now represent `null` (previously was type `nullref`)
| * Update the tests and temporarily comment out those tests relying on subtyping
* Add new compound Signature, Struct and Array types (#3012)Daniel Wirtz2020-08-241-1/+1
| Extends the `Type` hash-consing infrastructure to handle type-parameterized and constructed types introduced in the typed function references and GC proposals. This should be a non-functional change since the new types are not used anywhere yet. Recursive type construction and canonicalization are also left as future work.
| Co-authored-by: Thomas Lively <tlively@google.com>
* Prepare for compound types that are single but not basic (#3046)Daniel Wirtz2020-08-171-5/+10
| As a follow-up to https://github.com/WebAssembly/binaryen/pull/3012#pullrequestreview-459686171 this PR prepares for the new compound Signature, Struct and Array types that are single but not basic. This includes:
| * Renames `Type::getSingle` to `Type::getBasic` (NFC). Previously, its name was not representing its implementation (`isSingle` excluded `none` and `unreachable` while `getSingle` didn't, i.e. `getSingle` really was `getBasic`). Note that a hypothetical `Type::getSingle` cannot return `ValueType` anyway (new compound types are single but don't map to `ValueType`), so I figured it's best to skip implementing it until we actually need it.
| * Marks locations where we are (still) assuming that all single types are basic types, as suggested in https://github.com/WebAssembly/binaryen/pull/3012#discussion_r465356708, but using a macro, so we get useful errors once we start implementing the new types and can quickly traverse the affected locations. The macro is added where
|   * there used to be a `switch (type.getSingle())` or similar that handled any basic type (NFC), but in the future will also have to handle single types that are not basic types.
|   * we are not dealing with `Unary`, `Binary`, `Load`, `Store` or `AtomicXY` instructions, since these don't deal with compound types anyway.
* Add a builder.makeConst helper template (#2971)Alon Zakai2020-07-211-2/+2
|
* Rename anyref to externref to match proposal change (#2900)Jay Phelps2020-06-101-5/+5
| | | | | | | anyref future semantics were changed to only represent opaque host values, and thus renamed to externref. [Chromium](https://bugs.chromium.org/p/v8/issues/detail?id=7748#c360) was just updated today (not yet released). I couldn't find a Mozilla bugzilla ticket mentioning externref so I don't immediately know if they've updated yet. https://github.com/WebAssembly/reference-types/pull/87