summaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
...
* Fix MemoryPacking bug (#4579)Thomas Lively2022-04-051-2/+1
| | | | | | | | 247f4c20a1 introduced a bug that caused expressions that refer to data segments to be associated with the wrong segments in the presence of other segments that have no referring expressions at all. Fixes #4569. Fixes #4571.
* [Wasm GC] Fix unreachable local.gets of non-nullable locals in ↵Alon Zakai2022-04-051-1/+21
| | | | | | | | CoalesceLocals (#4574) Normally we just replace unreachable local.gets with a constant (0, or null), but if the local is non-nullable we can't do that. Fixes #4573
* Use LiteralUtils::canMakeZero before calling makeZero (#4568)Alon Zakai2022-04-013-9/+10
| | | | | Fixes #4562 Fixes #4564
* [NFC] Refactor Feature::All to match FeatureSet.setAll() (#4557)Alon Zakai2022-03-312-9/+13
| | | | | | | | | | | As we recently noted in #4555, that Feature::All and FeatureSet.setAll() are different is potentially confusing... I think the best thing is to make them identical. This does that, and adds a new Feature::AllPossible which is everything possible and not just the set of all features that are enabled by -all. This undoes part of #4555 as now the old/simpler code works properly.
* [Wasm GC] Fix stacky non-nullable tuples (#4561)Alon Zakai2022-03-311-7/+8
| | | | | #4555 fixed validation for such tuples, but we also did not handle them in "stacky" code using pops etc., due to a logic bug in the binary reading code.
* [MemoryPacking] Remove workaround for old V8 bug (#4558)Thomas Lively2022-03-311-10/+1
| | | | | | | V8 used to incorrectly parse the segment index on bulk memory instructions as a signed LEB rather than an unsigned LEB, which meant it only worked correctly with up to 63 data segments. That bug has been fixed for over two years now, so lift the maximum number of segments generated by memory packing from 63 to the specified max, 100k.
* [Wasm GC] Fix non-nullable tuples (#4555)Alon Zakai2022-03-304-7/+26
| | | | | | Apply the same logic to tuple fields as we do for all other fields, when checking whether a non-nullable value is valid. Fixes #4554
* [NFC][MemoryPacking] Avoid unnecessarily enormous memory allocations (#4556)Thomas Lively2022-03-301-26/+33
| | | | | | | | | | | The memory packing pass previously used vectors with a slot for each data segment to easily map segment indices to lists of "referrers," i.e. bulk memory expressions that referred to the segments. The parallel analysis pass to collect these referrers allocated one of these vectors for each function in a module. Unfortunately, for a particularly large module with over 200k functions and over 75k data segments, this resulted in hundreds of gigabytes of memory allocations. The vast majority of functions contain no referrers, so using a std::unordered_map makes much more sense than using a vector to perform this mapping.
* [Wasm GC] GlobalTypeOptimization: Remove fields from end based on subtypes ↵Alon Zakai2022-03-303-20/+74
| | | | | | | | | | | | | | (#4553) Previously we'd remove a field from a type if that field has no uses in any sub- or super-type. In that case we'd remove it from all the types at once. However, there is a case where we can remove a field only from a parent but not from its children, if the field is at the end: if A has fields {x, y, z} and its subtype B has fields {x, y, z, w}, and A pointers only access field y while B pointers access all the fields, then we can remove z from A. Removing it from the end is safe, and then B will not only add w as it did before but also add z. Note that we cannot remove x, because it is not at the end: removing it from just A but not B would shift the indexes, making them incompatible.
* [Wasm GC] Signature Pruning: Remove params passed constant values (#4551)Alon Zakai2022-03-281-13/+34
| | | | | This basically just adds a call to ParamUtils::applyConstantValues, however, we also need to be careful to not optimize in the presence of imports or exports, so this adds a boolean that indicates unoptimizability.
* Generalize PossibleConstantValues for immutable globals (#4549)Alon Zakai2022-03-283-42/+40
| | | | | | | | | | | | This moves more logic from ConstantFieldPropagation into PossibleConstantValues, that is, instead of handling the two cases of a Literal or a Name before calling PossibleConstantValues, move that code into the helper class. That way all users of PossibleConstantValues can benefit from it. In particular, this makes DeadArgumentElimination now support optimizing immutable globals, as well as ref.func and ref.null. (Changes to test/lit/passes/dae-gc-refine-params.wast are to avoid the new optimizations from kicking in, so that it still tests what it tested before.)
* [Wasm GC] Signature Pruning (#4545)Alon Zakai2022-03-258-22/+248
| | | | | | | | | | | | | This adds a new signature-pruning pass that prunes parameters from signature types where those parameters are never used in any function that has that type. This is similar to DeadArgumentElimination but works on a set of functions, and it can handle indirect calls. Also move a little code from SignatureRefining into a shared place to avoid duplication of logic to update signature types. This pattern happens in j2wasm code, for example if all method functions for some virtual method just return a constant and do not use the this pointer.
* [Docs] Add some comments on type updating in various places (#4548)Alon Zakai2022-03-252-7/+30
|
* [NFC] Move and generalize constant-parameter code from ↵Alon Zakai2022-03-253-38/+80
| | | | | | | DeadArgumentElimination (#4547) Similar to #4544, this moves the code to a utility function, and also slightly generalizes it to support a list of functions (and not just 1) and also a list of call_refs (and not just calls).
* [NFC] Refactor constant value finding code (#4546)Alon Zakai2022-03-252-116/+144
| | | | This just moves PossibleConstantValues to a new separate file (as a preparation for other passes using it too).
* [NFC] Move and generalize parameter-removing logic from ↵Alon Zakai2022-03-234-101/+267
| | | | | | | | | | | | DeadArgumentElimination (#4544) In preparation for removing dead arguments from all functions sharing a heap type (which seems useful for j2wasm output), first this PR refactors that code so it is reusable. This moves the code out of the pass into FunctionUtils, and also generalizes it slightly by supporting a set of functions and not just a single one, and receiving a list of call_refs and not just calls (no other changes to anything).
* [NFC] Refactor DeadArgumentElimination code to use LocalGraph (#4542)Alon Zakai2022-03-221-87/+24
| | | | | | | I believe this old code was written before we had LocalGraph. It does a flow analysis to see if the values arriving in parameters are actually used, but there is no reason to do it manually since we have LocalGraph now which makes that trivial, as it matches the values for each get, so we can just check if any get can read the param's incoming value.
* Remove nearly unused {Heap}Type::isCompound (#4541)Thomas Lively2022-03-222-5/+3
|
* Add support for extended-const proposal (#4529)Sam Clegg2022-03-1911-16/+62
| | | See https://github.com/WebAssembly/extended-const
* Fix errors when building in C++20 mode (#4528)Jakub Szewczyk2022-03-185-22/+16
| | | | | | | * use [[noreturn]] available since C++11 instead of compiler-specific attributes * replace deprecated std::is_pod with is_trivial&&is_standard_layout (also available since C++11/14) * explicitly capture this in [=] lambdas * extra const functions in FeatureSet, fix implicit cast warning by using the features field directly * Use CMAKE_CXX_STANDARD to ensure the C++ standard parameter is set on all targets, remove manual compiler flag workaround.
* [wasm2js] Support exports of Globals (#4523)magic-akari2022-03-174-10/+89
| | | | | | Export an object with a `.value` property like the wasm JS API does in browsers, and implement them with a getter and setter. Fixes #4522
* [memory64] Keep type of memory.size and memory.grow on copy (#4531)Clemens Backes2022-03-171-0/+2
| | | | | | | When copying a MemorySize or MemoryGrow instruction (e.g. for inlining), transfer the memory type also to the copy. Otherwise it will always be i32, even if memory64 should be used. This fixes issue #4530.
* MergeSimilarFunctions optimization pass (#4414)Yuta Saito2022-03-037-11/+682
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Merge similar functions that only differs constant values (like immediate operand of const and call insts) by parameterization. Performing this pass at post-link time can merge more functions across objects. Inspired by Swift compiler's optimization which is derived from LLVM's one: https://github.com/apple/swift/blob/main/lib/LLVMPasses/LLVMMergeFunctions.cpp https://github.com/llvm/llvm-project/blob/main/llvm/docs/MergeFunctions.rst The basic ideas here are constant value parameterization and direct callee parameterization by indirection. Constant value parameterization is like below: ;; Before (func $big-const-42 (result i32) [[many instr 1]] (i32.const 44) [[many instr 2]] ) (func $big-const-43 (result i32) [[many instr 1]] (i32.const 45) [[many instr 2]] ) ;; After (func $byn$mgfn-shared$big-const-42 (result i32) [[many instr 1]] (local.get $0) ;; parameterized!! [[many instr 2]] ) (func $big-const-42 (result i32) (call $byn$mgfn-shared$big-const-42 (i32.const 42) ) ) (func $big-const-43 (result i32) (call $byn$mgfn-shared$big-const-42 (i32.const 43) ) ) Direct callee parameterization is similar to the constant value parameterization, but it parameterizes callee function i by ref.func instead. Therefore it is enabled only when reference-types and typed-function-references features are enabled. I saw 1 ~ 2 % reduction for SwiftWasm binary and Ruby's wasm port using wasi-sdk, and 3 ~ 4.5% reduction for Unity WebGL binary when -Oz.
* [Wasm GC] Optimize static casts in br_on_cast* (#4520)Alon Zakai2022-02-251-2/+12
| | | | We were missing this particular case, which we can in fact handle when the cast is static.
* Include type names in error messages from building (#4517)Thomas Lively2022-02-181-1/+10
| | | | Instead of just reporting the type index that causes an error when building types, report the name of the responsible type when parsing the text format.
* Clarify in tools help message that -O == -Os. (#4516)t4lz2022-02-162-14/+24
| | | | | Introduce static consts with PassOptions Defaults. Add assertion to verify that the default options are the Os options. Also update the text in relevant tests.
* DeadArgumentElimination: Remove removable effects (#4514)Alon Zakai2022-02-101-3/+4
|
* Make IndexedTypeNameGenerator more powerful (#4511)Thomas Lively2022-02-092-5/+15
| | | | | Allow IndexedTypeNameGenerator to be configured with a custom prefix and also allow it to be parameterized with an explicit fallback generator. This allows multiple IndexedTypeNameGenerators to be composed together, for example.
* [wasm-split] Add an --asyncify option (#4513)Thomas Lively2022-02-093-0/+20
| | | | | | | Add an option for running the asyncify transformation on the primary module emitted by wasm-split. The idea is that the placeholder functions should be able to unwind the stack while the secondary module is asynchronously loaded, then once the placeholder functions have been patched out by the secondary module the stack should be rewound and end up in the correct secondary function.
* Fuzz structural type canonicalization (#4510)Thomas Lively2022-02-091-14/+308
| | | | | | | Add a new fuzz checker to wasm-type-fuzzer that builds copies of the originally built types, randomly selecting for each child type from all potential sources, including both the originally built types and the not-yet-built duplicate types. After building the new types, check that they are indeed identical to the old types, which means that nothing has gone wrong with canonicalization.
* Print recursion groups in wasm-fuzz-types (#4509)Thomas Lively2022-02-081-0/+17
|
* Eagerly canonicalize basic types (#4507)Thomas Lively2022-02-081-8/+21
| | | | | | | | | We were already eagerly canonicalizing basic HeapTypes when building types so the more complicated canonicalization algorithms would not have to handle noncanonical heap types, but we were not doing the same for Types. Equirecursive canonicalization was properly handling noncanonical Types everywhere, but isorecursive canonicalization was not. Rather than update isorecursive canonicalization in multiple places, remove the special handling from equirecursive canonicalization and canonicalize types in a single location.
* Generate heap type names when printing types (#4503)Thomas Lively2022-02-076-103/+330
| | | | | | | | | | | | | | | | | | | | | | The previous printing system in the Types API would print the full recursive structure of a Type or HeapType with special markers using de Bruijn indices to avoid infinite recursion and a separate special marker for when the size exceeded an arbitrary upper limit. In practice, the types printed by that system were not human readable, so all that complexity was not useful. Replace that system with a new system that always emits a HeapType name rather than recursing into the structure of inner HeapTypes. Add methods for printing Types and HeapTypes with custom HeapType name generators. Also add a new wasm-type-printing.h header with off-the-shelf type name generators that implement simple naming schemes sufficient for tests and the type fuzzer. Note that these new printing methods and the old printing methods they augment are not used for emitting text modules. Printing types as part of expressions and modules is handled by separate code in Print.cpp and the printing API modified in this PR is mostly used for debugging. However, the new printing methods are general enough that Print.cpp should be able to use them as well, so update the format used to print types in the modified printing system to match the text format in anticipation of making that change in a follow-up PR.
* Validate supertypes after isorecursive canonicalization (#4506)Thomas Lively2022-02-071-13/+22
| | | | | | | | | | | | | | Validating nominally declared supertypes depends on being able to test type equality, which in turn depends on the compared types having already been canonicalized. Previously we validated supertypes before canonicalization, so validation would fail in cases where it should have succeeded. Fix the bug by canonicalizing first. This means that the global type store can now end up holding invalid types, but those types will never be exposed to the user, so that's not a huge problem. Also fix an unrelated bug that was preventing the test from passing, which is that supertypes were being hashed and compared without the special logic for detecting self-references. This bug preventing the equivalent recursion groups in the test from being deduplicated.
* [NFC] Move type fuzzer method definitions out of line (#4505)Thomas Lively2022-02-041-91/+102
| | | | This makes it easier to get an overview of what methods exist by looking at the shorter struct definition.
* Isorecursive type fuzzing (#4501)Thomas Lively2022-02-043-26/+84
| | | | | | | | | | Add support for isorecursive types to wasm-fuzz-types by generating recursion groups and ensuring that children types are only selected from candidates through the end of the current group. For non-isorecursive systems, treat all the types as belonging to a single group so that their behavior is unchanged. Also fix two small bugs found by the fuzzer: LUB calculation was taking the wrong path for isorecursive types and isorecursive validation was not handling basic heap types properly.
* wasm-reduce: Add newer passes (#4502)Alon Zakai2022-02-031-0/+4
| | | | | | | | | | | | | | | | These might help reduction. Most newer passes, like say --type-refining, are not going to actually help by themselves without other passes, so those are not added (they get run in the -O2 etc. modes, which at least gives them a chance to help). DeadArgumentElimination: Might help by itself, if just removing arguments reduces code size. In some cases applying constants may increase code size, though, but the -optimizing variant helps there. GlobalTypeOptimization: This can remove type fields which can shrink the type section by a lot. This is the reason I realized I should open this PR, when I happened to notice that running that pass manually after reduction helped a lot more. SimplifyGlobals: Can remove unused globals, merge identical immutable ones, etc., all of which can help code size directly.
* CRTP topological sort utility (#4499)Thomas Lively2022-02-032-67/+152
| | | | | | | | Refactor the `TopologicalSortStack` into a `TopologicalSort` CRTP utility that manipulates the DFS stack internally instead of exposing it to the user. The only method users have to implement to use the utility is `pushPredecessors`, which should call the provided `push` method on all the predecessors of a given item. The public interface to `TopologicalSort` is an input iterator, so it can easily be used in range-based for loops.
* [Wasm GC] [ctor-eval] Evaluate and serialize GC data (#4491)Alon Zakai2022-02-031-4/+169
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This ended up simpler than I thought. We can simply emit global and local data as we go, creating globals as necessary to contain GC data, and referring to them using global.get later. That will ensure that data identity works (things referring to the same object in the interpreter will refer to the same object when the wasm is loaded). In more detail, each live GC item is created in a "defining global", a global that is immutable and of the precise type of that data. Then we just read from that location in any place that wants to refer to that data. That is, something like function foo() { var x = Bar(10); var y = Bar(20); var z = x; z.value++; // first object now contains 11 ... } will be evalled into something like var define$0 = Bar(11); // note the ++ has taken effect here var define$1 = Bar(20); function foo() { var x = define$0; var y = define$1; var z = define$0; ... } This PR should handle everything but "cycles", that is, GC data that at runtime ends up forming a loop. Leaving that for later work (not sure how urgent it is to fix).
* [Docs] Document wasm-ctor-eval (#4493)Alon Zakai2022-02-031-3/+2
|
* [Wasm GC] Fix TypeRefining corner case with uncreated types (#4500)Alon Zakai2022-02-031-0/+49
| | | | | | | | | | | | | | | | This pass ignores reads from structs - it only cares about writes (during a create or a struct.set). That makes sense since we want to refine the type of fields to more specific things based on what is actually written to them. However, a corner case was missed: If we ignore reads, the pass may "cleverly" optimize to something that is no longer valid to read from. How that happens is if there is no info at all for a type - no sets or news, so all we have is a read, which as mentioned before we ignore, so we think we have nothing at all for that type, and can do arbitrary stuff with it. But then the arbitrary replacement can be invalid to read from, say if it has fewer fields. To handle that, just emit an unreachable. If all we have is a get but no new then there cannot be an instance here at all. (That's only true in a closed world, of course, but this entire pass assumes that anyhow.)
* [NFC] Simplify TopologicalSortStack (#4498)Thomas Lively2022-02-031-13/+5
| | | | | | | Using a vector and lazily skipping finished items in `pop` rather than using a linked list and eagerly removing duplicated items makes the code simpler and more than twice as fast on my test case. The algorithmic complexity is unchanged because the work to skip duplicates remains linear in the number of duplicates added, it's just not spread out over the linear duplicate pushes any more.
* Isorecursive binary format (#4494)Thomas Lively2022-02-032-16/+53
| | | | | | | | | | | Write and parse recursion groups in binary type sections. Unlike in the text format, where we ignore recursion groups when not using isorecursive types, do not allow parsing binary recursion group when using other type systems. Doing so would produce incorrect results because recursions groups only count as single entries in the type system vector so we dynamically grow the TypeBuilder when we encounter them. That would change the mapping of later indices to types, and would change the meaning of previous type definitions that use those later indices. This is not a problem in the isorecursive system because in that system type definitions are not allowed to use later indices.
* Fix an assertion in the validator on call_ref heaptypes (#4496)Alon Zakai2022-02-031-4/+5
| | | | We can only call getHeapType if it is indeed a function type. Otherwise we should show the error and move on.
* Topological sorting of types in isorecursive output (#4492)Thomas Lively2022-02-026-97/+207
| | | | | | | | | | | | | | | | | | | Generally we try to order types by decreasing use count so that frequently used types get smaller indices. For the equirecursive and nominal systems, there are no contraints on the ordering of types, so we just have to sort them according to their use counts. For the isorecursive type system, however, there are a number of ordering constraints that have to be met for the type section to be valid. First, types in the same recursion group must be adjacent so they can be grouped together. Second, groups must be ordered topologically so that they only refer to types in themselves or prior groups. Update type ordering to produce a valid isorecursive output by performing a topological sort on the recursion groups. While performing the sort, prefer to visit and finish processing the most used groups first as a heuristic to improve the final ordering. Do not reorder types within groups, since doing so would change type identity and could affect the external interface of the module. Leave that reordering to an optimization pass (not yet implemented) that users can explicitly opt in to.
* Remove used wasm-emscripten-finalize option `--initial-stack-pointer` (#4490)Sam Clegg2022-02-011-8/+0
|
* Interpreter: Remove GlobalManager (#4486)Alon Zakai2022-01-312-110/+27
| | | | | | | | | | | | | | | | | | | | | | | | | GlobalManager is another class that added complexity in the interpreter logic, and did not help. In fact it hurts extensibility, as when one wants to extend the interpreter one has another class to customize, and it is templated on the main runner, so again as #4479 we end up with annoying template cycles. This simply removes that class. That makes the interpreter code strictly simpler. Applying that change to wasm-ctor-eval also ends up fixing a pre-existing bug, so this PR gets testing through that. The ctor-eval issue was that we did not extend the GlobalManager properly in the past: we checked for accesses on imported globals there, but not in the main class, i.e., not on global.get operations. Needing to do things in two places is an example of the previous complexity. The fix is simply to implement visitGlobalGet in one place, and remove all the GlobalManager logic added in ctor-eval, which then gets a lot simpler as well. The new imported-global-2.wast checks for that bug (a global.get of an import should stop us from evalling). Existing tests cover the other cases, like it being ok to read a non-imported global, etc. The existing test indirect-call3.wast required a slight change: There was a global.get of an imported global, which was ignored in the place it happened (an init of an elem segment); the new code checks all global.gets, so it now catches that.
* Isorecursive HeapType constructors (#4482)Thomas Lively2022-01-291-15/+65
| | | | | Update the HeapType constructors that take Signature, Structs, and Arrays to work properly with isorecursive typing. This is particularly important for the Signature constructor, which is used frequently throughout the code base.
* Isorecursive canonicalization (#4481)Thomas Lively2022-01-292-21/+416
| | | | | | | | | | | | | | | | | | | | | | | | | Isorecursive canonicalization is similar to equirecursive canonicalization in that it deduplicates types based on their structure, but unlike equirecursive canonicalization, isorecursive canonicalization does not need to minimize the type definition first. Another difference is that structures are deduplicated at the level of recursion groups rather than individual types, so we cannot reuse the existing `FiniteShapeHasher` and `FiniteShapeEquator` utilities. Instead, introduce a new `RecGroupStructure` wrapper that provides equality comparison and hashing very similar to the former utilities. Another feature of the isorecursive type system is that its constraints on the order of recursion groups means that they are already topologically sorted and can be incrementally canonicalized in a bottom-up manner. This incremental canonicalization means that the `RecGroupStructure` utility can assume that all child `HeapTypes` have already been canonicalized and can avoid traversing anything but the top-level type definitions in the wrapped recursion group. The only exception is self-references into the wrapped recursion group itself, which may not be canonical. That special case is detected and handled without any nontrivial extra work. Overall, the canonicalization algorithm traverses each type definition once to canonicalize its `HeapType` use sites, then once to hash its recursion group structure, then finally once to canonicalize its `Type` use sites in `globallyCanonicalize`, giving only linear time complexity.
* [NFC] Refactor ModuleInstanceBase+RuntimeExpressionRunner into a single ↵Alon Zakai2022-01-286-916/+894
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | class (#4479) As recently discussed, the interpreter code is way too complex. Trying to add ctor-eval stuff I need, I got stuck and ended up spending some time to get rid of some of the complexity. We had a ModuleInstanceBase class which was basically an instance of a module, that is, an execution of it. And internally we have RuntimeExpressionRunner which is a runner that integrates with the ModuleInstanceBase - basically, it uses the runtime info to execute code. For example, the MIB has globals info, and the RER would read it from there. But these two classes are really just one functionality - an execution of a module. We get rid of some complexity by removing the separation between them, ending up with a class that can run a module. One set of problems we avoid is that we can now extend the single class in a simple way. Before, we would need to extend both - and inform each other of those changes. That gets "fun" with CRTP which we use everywhere. In other words, each of the two classes depended on the other / would need to be templated on the other. Specifically, MIB.callFunction would need to be given the RER to run with, and so that would need to be templated on it. This ends up leading to a bunch more templating all around - all complexity that we just don't need. See the simplification to the wasm-ctor-eval for some of that (and even worse complexity would have been needed without this PR in the next steps for that tool to eval GC stuff). The final single class is now called ModuleRunner. Also fixes a pre-existing issue uncovered by this PR. We had the delegate target on the runner, but it should be tied to a function scope. This happened to not be a problem if one always created a new runner for each scope, but this PR makes the runner longer-lived, so the stale data ended up mattering. The PR moves that data to the proper place. Note: Diff without whitespace is far, far smaller.