summaryrefslogtreecommitdiff
path: root/src/passes/RemoveUnusedModuleElements.cpp
Commit message (Collapse)AuthorAgeFilesLines
* RemoveUnusedModuleElements: Do not remove unused-but-trapping segments (#6242)Alon Zakai2024-01-251-6/+47
| | | | | | | | | | | | | | An out of bounds active segment traps during startup, which is an effect we must preserve. To avoid a regression here, ignore this in TNH mode (where the user assures us nothing will trap), and also check if a segment will trivially be in bounds and not trap (if so, it can be removed). Fixes the remove-unused-module-elements part of #6230 The small change to an existing testcase made a segment there be in bounds, to avoid this affecting it. Tests for this are in a new file.
* Remove empty _ARRAY/_VECTOR defines (NFC) (#6182)Heejin Ahn2023-12-141-3/+0
| | | | | | | `_VECTOR` or `_ARRAY` defines in `wasm-delegations-fields.def` are supposed to be defined in terms of their non-vector/array counterparts when undefined. This removes empty `_VECTOR`/`_ARRAY` defines when including `wasm-delegations-fields.def`, while adding definitions for `DELEGATE_GET_FIELD` in case it is missing.
* Partially revert #6026 (#6043)Alon Zakai2023-10-231-0/+5
| | | That optimization uncovered some LLVM and Binaryen bugs.
* RemoveUnusedModuleElements: Make exports skip trampolines (#6026)Alon Zakai2023-10-191-0/+61
| | | | | | | | | | | | | | | | | | | If we export a function that just calls another function, we can export that one instead. Then the one in the middle may be unused, function foo() { return bar(); } export foo; // can be an export of bar This saves a few bytes in rare cases, but probably more important is that it saves the trampoline, so if this is on a hot path, we save a call. Context: emscripten-core/emscripten#20478 (comment) In general this is not needed as inlining helps us out by inlining foo() into the caller (since foo is tiny, that always ends up happening). But exports are a case the inliner cannot handle, so we do it here.
* [NFC] RemoveUnusedModuleElements: Use delegations (#5961)Alon Zakai2023-09-191-93/+41
| | | | NFC, but fixes a current fuzz bug on table.fill not having an entry in this file. After this PR, there is no need for such entries.
* [NFC] Fix the use of "strict" in subtypes.h (#5804)Thomas Lively2023-07-061-1/+1
| | | | | | | | Previously we incorrectly used "strict" to mean the immediate subtypes of a type, when in fact a strict subtype of a type is any subtype excluding the type itself. Rename the incorrect `getStrictSubTypes` to `getImmediateSubTypes`, rename the redundant `getAllStrictSubTypes` to `getStrictSubTypes`, and rename the redundant `getAllSubTypes` to `getSubTypes`. Fixing the capitalization of "SubType" to "Subtype" is left as future work.
* [NFC] Track the kinds of items that names refer to in ↵Alon Zakai2023-05-051-0/+1
| | | | | | | | | | | | wasm-delegations-fields (#5690) This makes delegations-fields track Kinds. That is, rather than say a field is just a Name, we can say it is a name of kind Function. This allows users to track references to functions, tables, memories, etc., in a simple and generic way, avoiding duplicated code which we have atm. (In particular this will help wasm-merge in the future.) This also uses that functionality in two small places to show the benefits (see memory-utils.cpp and MemoryPacking.cpp).
* [NFC] Refactor each of ArrayNewSeg and ArrayInit into subclasses for ↵Alon Zakai2023-05-041-22/+11
| | | | | | | | | | | Data/Elem (#5692) ArrayNewSeg => ArrayNewSegData, ArrayNewSegElem ArrayInit => ArrayInitData, ArrayInitElem Basically we remove the opcode and use the class type to differentiate them. This adds some code but it makes the representation simpler and more compact in memory, and it will help with #5690
* [Wasm GC] Fuzz struct.get and array.get (#5651)Alon Zakai2023-04-101-4/+1
|
* Implement array.fill, array.init_data, and array.init_elem (#5637)Thomas Lively2023-04-061-0/+11
| | | | | These complement array.copy, which we already supported, as an initial complete set of bulk array operations. Replace the WIP spec tests with the upstream spec tests, lightly edited for compatibility with Binaryen.
* RemoveUnusedModuleElements: Add SIMD support for memory operations (#5632)Alon Zakai2023-04-051-0/+6
| | | | Fixes #5629
* Support multiple memories in RemoveUnusedModuleElements (#5604)Thomas Lively2023-04-041-101/+130
| | | | | | | | Add support for memory and data segment module elements and treat them uniformly with other module elements rather than as special cases. There is a cyclic dependency between memories (or tables) and their active segments because exported or accessed memories (or tables) keep their active segments alive, but active segments for imported memories (or tables) keep their memories (or tables) alive as well.
* Use Names instead of indices to identify segments (#5618)Thomas Lively2023-04-041-2/+1
| | | | | | | | | | All top-level Module elements are identified and referred to by Name, but for historical reasons element and data segments were referred to by index instead. Fix this inconsistency by using Names to refer to segments from expressions that use them. Also parse and print segment names like we do for other elements. The C API is partially converted to use names instead of indices, but there are still many functions that refer to data segments by index. Finishing the conversion can be done in the future once it becomes necessary.
* Do not treat `atomic.fence` as using a memory (#5603)Thomas Lively2023-03-291-1/+0
| | | | | | | | | * Do not treat `atomic.fence` as using a memory Update RemoveUnusedModuleElements so that it no longer keeps the memory alive due to an `atomic.fence` instruction and update validation to allow modules to use `atomic.fence` without a memory. * update wasm2js tests
* [NFC] Simplify initialization in RemoveUnusedModuleElements.cpp (#5601)Thomas Lively2023-03-291-25/+21
| | | Use copy-list-initialization to shorten the code and reduce visual redundancy.
* [Wasm GC] Fix RemoveUnusedModuleEffects struct field reads with ↵Alon Zakai2023-02-021-5/+31
| | | | | | | | | | | | | | | | | | call.without.effects (#5477) If we see (struct.new $A (call $call.without.effects ;; intrinsic (ref.func $getter) ) ) then we can ignore side effects in that call, as the intrinsic tells us to. However, that function is still called (it will be called normally, after intrinsics are lowered away), and we should not remove it. That is, such an intrinsic call can be removed, but it cannot be left as it is and also ignored as if it does not exist.
* Fix racing merge breakage (#5472)Alon Zakai2023-02-011-3/+1
|
* RemoveUnusedModuleElements: Support function subtyping (#5470)Alon Zakai2023-02-011-18/+32
| | | | When we see a call_ref or call_indirect of a heap type, we might be calling a subtype of that type.
* RemoveUnusedModuleElements: Optimize unread struct.new fields (#5445)Alon Zakai2023-02-011-3/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is similar to the existing optimization for function references: a RefFunc is just a reference, and we do not consider the function actually used unless some CallRef can actually call it. Here we optimize StructNew fields by tracking if they are read. Consider this object: (struct.new $object (global.get $vtable) ) Before this PR, when we saw that StructNew we'd make that global used, and process all of its contents, which look like this: (global $vtable (struct.new $object (ref.func $foo) ) ) With this PR we track which fields on the struct types are even read. If they are not then we do not add uses of the things they refer to. In this example, the global vtable would be referenced, but not used, and likewise the RefFunc inside it. The technical way we handle this is to walk StructNews carefully: when we see a field that has been read, we can just read it normally, but if it has not been read then we note it on the side, and only process it later if we see a read.
* RemoveUnusedModuleElements: Handle changes to tables (#5469)Alon Zakai2023-01-311-0/+5
| | | | | This is a long-standing bug - we ignored the possibility of table.set in this pass. We have few tests for this so it took a while for the fuzzer to notice it I suppose.
* [NFC] Refactor RemoveUsedModuleElements to clarify references and uses (#5444)Alon Zakai2023-01-311-196/+309
| | | | | | | | | | | | | | | | | | | This pass talked about reachability and uses and such, but it wasn't very clear on those things. This refactors it to clarify that we look for references and uses, and what that means. Specifically, a reference to something makes us keep it alive, as we need to refer to it; a use makes us also keep it identical in its contents so that when it is used no behavior changes. A function reference without a call_ref that can call it is an example of a reference without a use, which this pass already optimized. To make that more clear in the code, this refactors out the reference-finding logic into a new struct, ReferenceFinder. This also replaces the normal walking logic with a more manual traversal using ChildIterator. This is necessary to properly differentiate references from uses, but is not immediately useful in this PR except for clarity. It will be necessary in the next PR, which this prepares for, in which we'll optimize more reference-but-not-use things.
* [NFC] Work a little more efficiently in RemoveUnusedModuleElements (#5429)Alon Zakai2023-01-131-22/+33
| | | | | | Before, we'd potentially add a new item to the queue multiple times, then do nothing when popping it from the queue in the second and later times. With this PR we add a new item to the reachable set and to the queue at the same time, so items cannot appear more than once in the queue.
* [Wasm GC] Implement closed-world flag (#5303)Alon Zakai2022-11-301-7/+24
| | | | | | | | | | | | | With this change we default to an open world, that is, we do the safe thing by default: we no longer assume a closed world. Users that want a closed world must pass --closed-world. Atm we just do not run passes that assume a closed world. (We might later refine them to find which types don't escape and only optimize those.) The RemoveUnusedModuleElements is an exception in that the closed-world flag influences one part of its operation, but not the rest. Fixes #5292
* Switch from `typedef` to `using` in C++ code. NFC (#5258)Sam Clegg2022-11-151-1/+1
| | | | This is more modern and (IMHO) easier to read than that old C typedef syntax.
* Fix arithmetic in interpretation of ArrayNewSeg (#5251)Thomas Lively2022-11-141-2/+1
| | | | | | | | | | | The offset and size were previously being sign extended from 32 to 64 bits, which meant that negative sizes could make the bounds check pass and cause an exception to be thrown by an overly large allocation. Switch to using uint64_t from the start rather than mixing sizes and signs, and update the tests to reproduce the error more robustly in the absence of the fix. Also fix a bug in RemoveUnusedModuleElements triggered by the new test. Fixes #5249.
* Implement `array.new_data` and `array.new_elem` (#5214)Thomas Lively2022-11-071-1/+18
| | | | | | | | | In order to test them, fix the binary and text parsers to accept passive data segments even if a module has no memory. In addition to parsing and emitting the new instructions, also implement their validation and interpretation. Test the interpretation directly with wasm-shell tests adapted from the upstream spec tests. Running the upstream spec tests directly would require fixing too many bugs in the legacy text parser, so it will have to wait for the new text parser to be ready.
* Make `Name` a pointer, length pair (#5122)Thomas Lively2022-10-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | With the goal of supporting null characters (i.e. zero bytes) in strings. Rewrite the underlying interned `IString` to store a `std::string_view` rather than a `const char*`, reduce the number of map lookups necessary to intern a string, and present a more immutable interface. Most importantly, replace the `c_str()` method that returned a `const char*` with a `toString()` method that returns a `std::string`. This new method can correctly handle strings containing null characters. A `const char*` can still be had by calling `data()` on the `std::string_view`, although this usage should be discouraged. This change is NFC in spirit, although not in practice. It does not intend to support any particular new functionality, but it is probably now possible to use strings containing null characters in at least some cases. At least one parser bug is also incidentally fixed. Follow-on PRs will explicitly support and test strings containing nulls for particular use cases. The C API still uses `const char*` to represent strings. As strings containing nulls become better supported by the rest of Binaryen, this will no longer be sufficient. Updating the C and JS APIs to use pointer, length pairs is left as future work.
* Refactor interaction between Pass and PassRunner (#5093)Thomas Lively2022-09-301-1/+1
| | | | | | | | | | | | | | Previously only WalkerPasses had access to the `getPassRunner` and `getPassOptions` methods. Move those methods to `Pass` so all passes can use them. As a result, the `PassRunner` passed to `Pass::run` and `Pass::runOnFunction` is no longer necessary, so remove it. Also update `Pass::create` to return a unique_ptr, which is more efficient than having it return a raw pointer only to have the `PassRunner` wrap that raw pointer in a `unique_ptr`. Delete the unused template `PassRunner::getLast()`, which looks like it was intended to enable retrieving previous analyses and has been in the code base since 2015 but is not implemented anywhere.
* [Wasm GC] Support non-nullable locals in the "1a" form (#4959)Alon Zakai2022-08-311-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | An overview of this is in the README in the diff here (conveniently, it is near the top of the diff). Basically, we fix up nn locals after each pass, by default. This keeps things easy to reason about - what validates is what is valid wasm - but there are some minor nuances as mentioned there, in particular, we ignore nameless blocks (which are commonly added by various passes; ignoring them means we can keep more locals non-nullable). The key addition here is LocalStructuralDominance which checks which local indexes have the "structural dominance" property of 1a, that is, that each get has a set in its block or an outer block that precedes it. I optimized that function quite a lot to reduce the overhead of running that logic after each pass. The overhead is something like 2% on J2Wasm and 0% on Dart (0%, because in this mode we shrink code size, so there is less work actually, and it balances out). Since we run fixups after each pass, this PR removes logic to manually call the fixup code from various places we used to call it (like eh-utils and various passes). Various passes are now marked as requiresNonNullableLocalFixups => false. That lets us skip running the fixups after them, which we normally do automatically. This helps avoid overhead. Most passes still need the fixups, though - any pass that adds a local, or a named block, or moves code around, likely does. This removes a hack in SimplifyLocals that is no longer needed. Before we worked to avoid moving a set into a try, as it might not validate. Now, we just do it and let fixups happen automatically if they need to: in the common code they probably don't, so the extra complexity seems not worth it. Also removes a hack from StackIR. That hack tried to avoid roundtrip adding a nondefaultable local. But we have the logic to fix that up now, and opts will likely keep it non-nullable as well. Various tests end up updated here because now a local can be non-nullable - previous fixups are no longer needed. Note that this doesn't remove the gc-nn-locals feature. That has been useful for testing, and may still be useful in the future - it basically just allows nn locals in all positions (that can't read the null default value at the entry). We can consider removing it separately. Fixes #4824
* Mutli-Memories Support in IR (#4811)Ashley Nelson2022-08-171-7/+5
| | | | | | | This PR removes the single memory restriction in IR, adding support for a single module to reference multiple memories. To support this change, a new memory name field was added to 13 memory instructions in order to identify the memory for the instruction. It is a goal of this PR to maintain backwards compatibility with existing text and binary wasm modules, so memory indexes remain optional for memory instructions. Similarly, the JS API makes assumptions about which memory is intended when only one memory is present in the module. Another goal of this PR is that existing tests behavior be unaffected. That said, tests must now explicitly define a memory before invoking memory instructions or exporting a memory, and memory names are now printed for each memory instruction in the text format. There remain quite a few places where a hardcoded reference to the first memory persist (memory flattening, for example, will return early if more than one memory is present in the module). Many of these call-sites, particularly within passes, will require us to rethink how the optimization works in a multi-memories world. Other call-sites may necessitate more invasive code restructuring to fully convert away from relying on a globally available, single memory pointer.
* [GUFA] Handle GUFA + Intrinsics (#4839)Alon Zakai2022-08-011-1/+3
| | | | | | | | Like RemoveUnusedModuleElements, places that build graphs of function reachability must special-case the call-without-effects intrinsic. Without that, it looks like a call to an import. Normally a call to an import is fine - it makes us be super-pessimistic, as we think things escape all the way out - but in GC for now we are assuming a closed world, and so we end up broken. To fix that, properly handle the intrinsic case.
* First class Data Segments (#4733)Ashley Nelson2022-06-211-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Updating wasm.h/cpp for DataSegments * Updating wasm-binary.h/cpp for DataSegments * Removed link from Memory to DataSegments and updated module-utils, Metrics and wasm-traversal * checking isPassive when copying data segments to know whether to construct the data segment with an offset or not * Removing memory member var from DataSegment class as there is only one memory rn. Updated wasm-validator.cpp * Updated wasm-interpreter * First look at updating Passes * Updated wasm-s-parser * Updated files in src/ir * Updating tools files * Last pass on src files before building * added visitDataSegment * Fixing build errors * Data segments need a name * fixing var name * ran clang-format * Ensuring a name on DataSegment * Ensuring more datasegments have names * Adding explicit name support * Fix fuzzing name * Outputting data name in wasm binary only if explicit * Checking temp dataSegments vector to validateBinary because it's the one with the segments before we processNames * Pass on when data segment names are explicitly set * Ran auto_update_tests.py and check.py, success all around * Removed an errant semi-colon and corrected a counter. Everything still passes * Linting * Fixing processing memory names after parsed from binary * Updating the test from the last fix * Correcting error comment * Impl kripken@ comments * Impl tlively@ comments * Updated tests that remove data print when == 0 * Ran clang format * Impl tlively@ comments * Ran clang-format
* Handle call.without.effects in RemoveUnusedModuleElements (#4624)Alon Zakai2022-05-021-5/+39
| | | | | | | | | | | | | | | | | We assume a closed world atm in the GC space, but the call.without.effects intrinsic sort of breaks that: that intrinsic looks like an import, but we really need to care about what is sent to it even in a closed world: (call $call-without-effects (ref.func $target-keep) ) That reference cannot be ignored, as logically it is called just as if there were a call_ref there. This adds support for that, fixing the combination of #4621 and using call.without.effects. Also flip the vector of ref.func names to a set. I realized that in a very large program we might see the same name many times.
* RemoveUnusedModuleElements: Track CallRef/RefFunc more precisely (#4621)Alon Zakai2022-04-281-3/+91
| | | | | | | | | | | | | | | | | | If we see (ref.func $foo) that does not mean that $foo is reachable - we must also see a (call_ref ..) of the proper type. Only after seeing both should we mark the function as reachable, which this PR does. This adds some complexity as we need to track intermediate state as we go, since we could see the RefFunc before the CallRef or vice versa. We also need to handle the case of a RefFunc without a CallRef properly: We cannot remove the function, as the RefFunc must refer to it, but at least we can empty out the body since we know it is never reached. This removes an old wasm-opt test which is now superseded by a new lit test. On J2Wasm output this removes 3% of all functions, which account for 2.5% of total code size.
* Modernize code to C++17 (#3104)Max Graey2021-11-221-8/+8
|
* Add table.grow operation (#4245)Max Graey2021-10-181-0/+1
|
* Add table.size operation (#4224)Max Graey2021-10-081-0/+1
|
* Add table.set operation (#4215)Max Graey2021-10-071-0/+1
|
* Implement table.get (#4195)Alon Zakai2021-09-301-9/+10
| | | | Adds the part of the spec test suite that this passes (without table.set we can't do it all).
* [EH] Replace event with tag (#3937)Heejin Ahn2021-06-181-9/+9
| | | | | | | | | | | We recently decided to change 'event' to 'tag', and to 'event section' to 'tag section', out of the rationale that the section contains a generalized tag that references a type, which may be used for something other than exceptions, and the name 'event' can be confusing in the web context. See - https://github.com/WebAssembly/exception-handling/issues/159#issuecomment-857910130 - https://github.com/WebAssembly/exception-handling/pull/161
* RemoveUnusedModuleElements: The start function may be imported (#3884)Alon Zakai2021-05-131-1/+1
| | | | | Without this fix we can segfault, as it has no body. Fixes #3879
* ExtractFunction: Do not always remove the memory and table (#3877)Alon Zakai2021-05-111-0/+4
| | | | | | | | | Instead, run RemoveUnusedModuleElements, which does that sort of thing. That is, this pass just "extracts" the function by turning all others into imports, and then they should almost all be removable via RemoveUnusedModuleElements, depending on whether they are used in the table or not, whether the extracted function calls them, etc. Without this, we would error if a function was in the table, and so this fixes #3876
* [RT] Support expressions in element segments (#3666)Abbas Mashayekh2021-03-241-5/+4
| | | | | | This PR adds support for `ref.null t` as a valid element segment item. The abbreviated format of `(elem ... func $f $g...)` is kept in both printing and binary emitting if all items are `ref.func`s. Public APIs aren't updated in this PR.
* [reference-types] Support passive elem segments (#3572)Abbas Mashayekh2021-03-051-25/+42
| | | | | | | | | | | Passive element segments do not belong to any table, so the link between Table and elem needs to be weaker; i.e. an elem may have a table in case of active segments, or simply be a collection of function references in case of passive/declarative segments. This PR takes Table::Segment out and turns it into a first class module element just like tables and functions. It also implements early support for parsing, printing, encoding and decoding passive/declarative elem segments.
* [Wasm Exceptions] Scan catch events in RemoveUnusedModuleElements (#3582)Alon Zakai2021-02-181-23/+16
| | | | Also refactor away some annoying repeated code in that pass. visitTry is the only actual change.
* [reference-types] remove single table restriction in IR (#3517)Abbas Mashayekh2021-02-091-27/+40
| | | Adds support for modules with multiple tables. Adds a field for the table name to `CallIndirect` and updates the C/JS APIs accordingly.
* Remove exnref and br_on_exn (#3505)Heejin Ahn2021-01-221-6/+0
| | | This removes `exnref` type and `br_on_exn` instruction.
* Remove dead code and unused includes. NFC. (#3328)Sam Clegg2020-11-081-1/+0
| | | Specifically try to cleanup use of asm_v_wasm.h and asmjs constants.
* Refactor Host expression to MemorySize and MemoryGrow (#3137)Daniel Wirtz2020-09-171-5/+2
| | | Aligns the internal representations of `memory.size` and `memory.grow` with other more recent memory instructions by removing the legacy `Host` expression class and adding separate expression classes for `MemorySize` and `MemoryGrow`. Simplifies related APIs, but is also a breaking API change.
* Add support for reference types proposal (#2451)Heejin Ahn2019-12-301-0/+6
| | | | | | | | | | | | This adds support for the reference type proposal. This includes support for all reference types (`anyref`, `funcref`(=`anyfunc`), and `nullref`) and four new instructions: `ref.null`, `ref.is_null`, `ref.func`, and new typed `select`. This also adds subtype relationship support between reference types. This does not include table instructions yet. This also does not include wasm2js support. Fixes #2444 and fixes #2447.