path: root/test/lit/passes
Commit message | Author | Age | Files | Lines
* Update StackCheck for memory64 (#4636) | Sam Clegg | 2022-05-04 | 1 | -0/+93
* Remove externref (#4633) | Thomas Lively | 2022-05-04 | 3 | -39/+29
    Remove `Type::externref` and `HeapType::ext` and replace them with uses of anyref and any, respectively, now that we have unified these types in the GC proposal. For backwards compatibility, continue to parse `extern` and `externref` and maintain their relevant C API functions.
* Update nominal type ordering (#4631) | Thomas Lively | 2022-05-03 | 8 | -38/+56
    V8 requires that supertypes come before subtypes when it parses isorecursive (i.e. standards-track) type definitions. Since 2268f2a we are emitting nominal types using the standard isorecursive format, so respect the ordering requirement.
* Handle call.without.effects in RemoveUnusedModuleElements (#4624) | Alon Zakai | 2022-05-02 | 1 | -0/+112
    We assume a closed world atm in the GC space, but the call.without.effects intrinsic sort of breaks that: that intrinsic looks like an import, but we really need to care about what is sent to it even in a closed world:

      (call $call-without-effects
        (ref.func $target-keep)
      )

    That reference cannot be ignored, as logically it is called just as if there were a call_ref there. This adds support for that, fixing the combination of #4621 and using call.without.effects.

    Also flip the vector of ref.func names to a set. I realized that in a very large program we might see the same name many times.
* RemoveUnusedModuleElements: Track CallRef/RefFunc more precisely (#4621) | Alon Zakai | 2022-04-28 | 1 | -0/+226
    If we see (ref.func $foo) that does not mean that $foo is reachable - we must also see a (call_ref ..) of the proper type. Only after seeing both should we mark the function as reachable, which this PR does.

    This adds some complexity as we need to track intermediate state as we go, since we could see the RefFunc before the CallRef or vice versa. We also need to handle the case of a RefFunc without a CallRef properly: We cannot remove the function, as the RefFunc must refer to it, but at least we can empty out the body since we know it is never reached.

    This removes an old wasm-opt test which is now superseded by a new lit test.

    On J2Wasm output this removes 3% of all functions, which account for 2.5% of total code size.
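The order-independent tracking described in this message can be illustrated with a small Python model. This is only a sketch under stated assumptions: the class and method names are hypothetical, function types are plain strings, and subtyping between function types is ignored. It is not the pass's actual code.

```python
from collections import defaultdict


class Reachability:
    """Toy model: a function named by ref.func becomes reachable only once a
    call_ref of its type has also been seen, in either order."""

    def __init__(self):
        self.called_types = set()               # types seen in some call_ref
        self.uncalled_refs = defaultdict(set)   # type -> funcs with ref.func only
        self.reachable = set()

    def note_ref_func(self, func, func_type):
        if func_type in self.called_types:
            self.reachable.add(func)
        else:
            # Remember the reference until a matching call_ref shows up.
            self.uncalled_refs[func_type].add(func)

    def note_call_ref(self, func_type):
        self.called_types.add(func_type)
        # Everything that was waiting on this type is now reachable.
        self.reachable |= self.uncalled_refs.pop(func_type, set())

    def body_can_be_emptied(self):
        """Functions that are referenced but can never be called."""
        result = set()
        for funcs in self.uncalled_refs.values():
            result |= funcs
        return result
```

Functions left in `body_can_be_emptied()` correspond to the "RefFunc without a CallRef" case: they must stay in the module, but their bodies are never executed.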
* OptimizeInstructions: Refinalize after a cast removal (#4611) | Alon Zakai | 2022-04-25 | 1 | -3/+55
    Casts can replace a type with a subtype, which normally has no downsides, but in a corner case of struct types it can lead to us needing to refinalize higher up too, see details in the comment.

    We have avoided any Refinalize calls in OptimizeInstructions, but the case handled here requires it sadly. I considered moving it to another pass, but this is a peephole optimization so there isn't really a better place.
* [NominalFuzzing] SignatureRefining: Ignore exported functions (#4601) | Alon Zakai | 2022-04-22 | 1 | -0/+31
    This hits the fuzzer when it tries to call reference exports with a null.
* [NominalFuzzing] Fix getHeapTypeCounts() on unreachable casts (#4609) | Alon Zakai | 2022-04-22 | 1 | -0/+24
    The cast instruction may be unreachable but the intended type for the cast still needs to be collected. Otherwise we end up with problems both during optimizations that look at heap types and in printing (which will use the heap type in code but not declare it).

    Diff without whitespace is much smaller: this just moves code around so that we can use a template to avoid code duplication. The actual change is just to scan ->intendedType unconditionally, and not ignore it if the cast is unreachable.
* [NominalFuzzing] GTO: trap on null ref in removed struct.set (#4607) | Alon Zakai | 2022-04-21 | 1 | -3/+9
    When a field has no reads, we remove all its writes, but we did this:

      (struct.set $foo A B)
      =>
      (drop A) (drop B)

    We also need to trap if A, the reference, is null, which this PR fixes:

      (struct.set $foo A B)
      =>
      (drop (ref.as_non_null A)) (drop B)
* [NominalFuzzing] MergeSimilarFunctions: handle nominal types properly (#4602) | Alon Zakai | 2022-04-21 | 1 | -0/+323
    This fixes two bugs: First, we need to compare the nominal types of function constants when looking for constants to "merge", not just their structure. Second, when creating the new function we must use the proper type of those constants, and not just another type.
* Rename asyncify-side-module to asyncify-relocatable (#4596) | かめのこにょこにょこ | 2022-04-18 | 1 | -1/+1
    Related: emscripten-core/emscripten#15893 (comment)

    The --pass-arg=asyncify-side-module option will be used not only from side modules, but also from main modules.
* [Inlining] Preserve return_calls when possible (#4589) | Thomas Lively | 2022-04-11 | 1 | -0/+62
    We can preserve return_calls in inlined functions when the inlined call site is itself a return_call, since the call result types must transitively match in that case. This solves a problem where the previous inlining logic could introduce stack exhaustion by downgrading recursive return_calls to normal calls.

    Fixes #4587.
* Fix MemoryPacking bug (#4579) | Thomas Lively | 2022-04-05 | 1 | -0/+26
    247f4c20a1 introduced a bug that caused expressions that refer to data segments to be associated with the wrong segments in the presence of other segments that have no referring expressions at all.

    Fixes #4569.
    Fixes #4571.
* [Wasm GC] Fix unreachable local.gets of non-nullable locals in CoalesceLocals (#4574) | Alon Zakai | 2022-04-05 | 2 | -1/+26
    Normally we just replace unreachable local.gets with a constant (0, or null), but if the local is non-nullable we can't do that.

    Fixes #4573
* Use LiteralUtils::canMakeZero before calling makeZero (#4568) | Alon Zakai | 2022-04-01 | 2 | -4/+91
    Fixes #4562
    Fixes #4564
* Port memory-packing tests to lit (#4559) | Thomas Lively | 2022-04-01 | 2 | -0/+2286
* [Wasm GC] GlobalTypeOptimization: Remove fields from end based on subtypes (#4553) | Alon Zakai | 2022-03-30 | 1 | -0/+154
    Previously we'd remove a field from a type if that field has no uses in any sub- or super-type. In that case we'd remove it from all the types at once.

    However, there is a case where we can remove a field only from a parent but not from its children, if the field is at the end: if A has fields {x, y, z} and its subtype B has fields {x, y, z, w}, and A pointers only access field y while B pointers access all the fields, then we can remove z from A. Removing it from the end is safe, and then B will not only add w as it did before but also add z.

    Note that we cannot remove x, because it is not at the end: removing it from just A but not B would shift the indexes, making them incompatible.
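The A/B example in this message can be worked through with a short Python sketch. The function name and the list/set representation of fields are assumptions made for illustration, and the sketch only considers accesses made through the parent type itself.

```python
def removable_suffix(fields, used):
    """Return the trailing fields of `fields` that are never accessed.

    Only a suffix is removable: dropping an earlier field from the parent
    alone would shift the indexes its subtypes rely on.
    """
    keep = len(fields)
    while keep > 0 and fields[keep - 1] not in used:
        keep -= 1
    return fields[keep:]


# The example from the message: A = {x, y, z}, accessed only via y.
a_fields = ["x", "y", "z"]
a_removed = removable_suffix(a_fields, used={"y"})
a_after = a_fields[: len(a_fields) - len(a_removed)]
```

Here `a_removed` is `["z"]` and `a_after` is `["x", "y"]`; `x` stays even though it is unused, because it is not at the end. The subtype B keeps all four fields, now declaring both `z` and `w` itself.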
* [Wasm GC] Signature Pruning: Remove params passed constant values (#4551) | Alon Zakai | 2022-03-28 | 1 | -2/+193
    This basically just adds a call to ParamUtils::applyConstantValues, however, we also need to be careful to not optimize in the presence of imports or exports, so this adds a boolean that indicates unoptimizability.
* Generalize PossibleConstantValues for immutable globals (#4549) | Alon Zakai | 2022-03-28 | 3 | -15/+314
    This moves more logic from ConstantFieldPropagation into PossibleConstantValues, that is, instead of handling the two cases of a Literal or a Name before calling PossibleConstantValues, move that code into the helper class. That way all users of PossibleConstantValues can benefit from it. In particular, this makes DeadArgumentElimination now support optimizing immutable globals, as well as ref.func and ref.null.

    (Changes to test/lit/passes/dae-gc-refine-params.wast are to avoid the new optimizations from kicking in, so that it still tests what it tested before.)
* [Wasm GC] Signature Pruning (#4545) | Alon Zakai | 2022-03-25 | 1 | -0/+597
    This adds a new signature-pruning pass that prunes parameters from signature types where those parameters are never used in any function that has that type. This is similar to DeadArgumentElimination but works on a set of functions, and it can handle indirect calls.

    Also move a little code from SignatureRefining into a shared place to avoid duplication of logic to update signature types.

    This pattern happens in j2wasm code, for example if all method functions for some virtual method just return a constant and do not use the this pointer.
* Fix merge-similar-functions test expections (#4534) | Alon Zakai | 2022-03-21 | 1 | -119/+169
* [memory64] Keep type of memory.size and memory.grow on copy (#4531) | Clemens Backes | 2022-03-17 | 1 | -0/+44
    When copying a MemorySize or MemoryGrow instruction (e.g. for inlining), transfer the memory type also to the copy. Otherwise it will always be i32, even if memory64 should be used.

    This fixes issue #4530.
* MergeSimilarFunctions optimization pass (#4414) | Yuta Saito | 2022-03-03 | 2 | -0/+537
    Merge similar functions that differ only in constant values (like immediate operands of const and call insts) by parameterization. Performing this pass at post-link time can merge more functions across objects. Inspired by the Swift compiler's optimization, which is derived from LLVM's:
    https://github.com/apple/swift/blob/main/lib/LLVMPasses/LLVMMergeFunctions.cpp
    https://github.com/llvm/llvm-project/blob/main/llvm/docs/MergeFunctions.rst

    The basic ideas here are constant value parameterization and direct callee parameterization by indirection.

    Constant value parameterization is like below:

      ;; Before
      (func $big-const-42 (result i32)
        [[many instr 1]]
        (i32.const 42)
        [[many instr 2]]
      )
      (func $big-const-43 (result i32)
        [[many instr 1]]
        (i32.const 43)
        [[many instr 2]]
      )

      ;; After
      (func $byn$mgfn-shared$big-const-42 (result i32)
        [[many instr 1]]
        (local.get $0) ;; parameterized!!
        [[many instr 2]]
      )
      (func $big-const-42 (result i32)
        (call $byn$mgfn-shared$big-const-42
          (i32.const 42)
        )
      )
      (func $big-const-43 (result i32)
        (call $byn$mgfn-shared$big-const-42
          (i32.const 43)
        )
      )

    Direct callee parameterization is similar to the constant value parameterization, but it parameterizes the callee function by ref.func instead. Therefore it is enabled only when reference-types and typed-function-references features are enabled.

    I saw a 1-2% reduction for the SwiftWasm binary and Ruby's wasm port using wasi-sdk, and a 3-4.5% reduction for a Unity WebGL binary with -Oz.
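Constant value parameterization, as described in this message, can be modelled in a few lines of Python. This is a toy sketch and not the pass's implementation: instructions are hypothetical `(opcode, immediate)` tuples, only `i32.const` is parameterized, and the new parameter indexes start at 0 as if the original functions had no parameters.

```python
def parameterize(body_a, body_b):
    """Merge two instruction lists that differ only in i32.const immediates.

    Returns (shared_body, args_a, args_b), or None when the bodies differ in
    anything other than constant values.
    """
    if len(body_a) != len(body_b):
        return None
    shared, args_a, args_b = [], [], []
    for ins_a, ins_b in zip(body_a, body_b):
        if ins_a == ins_b:
            shared.append(ins_a)
        elif ins_a[0] == ins_b[0] == "i32.const":
            # Replace the differing constant with a read of a new parameter.
            shared.append(("local.get", len(args_a)))
            args_a.append(ins_a[1])
            args_b.append(ins_b[1])
        else:
            return None
    return shared, args_a, args_b
```

Each original function then becomes a thin wrapper that calls the shared body with its own argument list, which is the shape of the "After" listing in the message.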
* [Wasm GC] Optimize static casts in br_on_cast* (#4520) | Alon Zakai | 2022-02-25 | 1 | -1/+104
    We were missing this particular case, which we can in fact handle when the cast is static.
* DeadArgumentElimination: Remove removable effects (#4514) | Alon Zakai | 2022-02-10 | 1 | -0/+34
* [Wasm GC] Fix TypeRefining corner case with uncreated types (#4500) | Alon Zakai | 2022-02-03 | 1 | -0/+87
    This pass ignores reads from structs - it only cares about writes (during a create or a struct.set). That makes sense since we want to refine the type of fields to more specific things based on what is actually written to them. However, a corner case was missed: If we ignore reads, the pass may "cleverly" optimize to something that is no longer valid to read from.

    How that happens is if there is no info at all for a type - no sets or news, so all we have is a read, which as mentioned before we ignore, so we think we have nothing at all for that type, and can do arbitrary stuff with it. But then the arbitrary replacement can be invalid to read from, say if it has fewer fields.

    To handle that, just emit an unreachable. If all we have is a get but no new then there cannot be an instance here at all. (That's only true in a closed world, of course, but this entire pass assumes that anyhow.)
* [OptimizeInstructions] Combine some relational ops joined Or/And (Part 7-8) (#4399) | Max Graey | 2022-01-26 | 1 | -0/+102
    Final part of #4265

      (i32(x) >= 0) & (i32(y) >= 0) ==> i32(x | y) >= 0
      (i64(x) >= 0) & (i64(y) >= 0) ==> i64(x | y) >= 0
      (i32(x) == -1) & (i32(y) == -1) ==> i32(x & y) == -1
      (i64(x) == -1) & (i64(y) == -1) ==> i64(x & y) == -1
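The rewrites listed in this series (Parts 1 through 8) can be sanity-checked by brute force. The Python sketch below checks them exhaustively on 8-bit two's-complement values; the helper names are hypothetical, and the narrow width is an assumption chosen to keep the run short. The rules only depend on the sign bit, on all bits being zero, or on all bits being one, so the same reasoning carries over to i32 and i64.

```python
import itertools


def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def check_rules(bits=8):
    """Exhaustively check each rewrite on `bits`-wide signed integers."""
    lo, hi = -(1 << (bits - 1)), 1 << (bits - 1)
    for x, y in itertools.product(range(lo, hi), repeat=2):
        x_or_y = to_signed(x | y, bits)
        x_and_y = to_signed(x & y, bits)
        assert ((x == 0) and (y == 0)) == (x_or_y == 0)      # Part 1
        assert ((x != 0) or (y != 0)) == (x_or_y != 0)       # Parts 2-3
        assert ((x < 0) or (y < 0)) == (x_or_y < 0)          # Part 3
        assert ((x < 0) and (y < 0)) == (x_and_y < 0)        # Part 4
        assert ((x >= 0) or (y >= 0)) == (x_and_y >= 0)      # Parts 5-6
        assert ((x != -1) or (y != -1)) == (x_and_y != -1)   # Parts 5-6
        assert ((x >= 0) and (y >= 0)) == (x_or_y >= 0)      # Parts 7-8
        assert ((x == -1) and (y == -1)) == (x_and_y == -1)  # Parts 7-8
    return True
```

Calling `check_rules()` walks all 65,536 pairs and raises `AssertionError` if any rule fails.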
* [OptimizeInstructions] Combine some relational ops joined Or/And (Part 5-6) (#4372) | Max Graey | 2022-01-20 | 1 | -0/+102
      (i32(x) >= 0) | (i32(y) >= 0) ==> i32(x & y) >= 0
      (i64(x) >= 0) | (i64(y) >= 0) ==> i64(x & y) >= 0
      (i32(x) != -1) | (i32(y) != -1) ==> i32(x & y) != -1
      (i64(x) != -1) | (i64(y) != -1) ==> i64(x & y) != -1
* Allow import mutable globals used in Asyncify pass (#4427) | かめのこにょこにょこ | 2022-01-14 | 1 | -0/+106
    This PR is part of the solution to emscripten-core/emscripten#15594.

    emscripten Asyncify won't work properly in side modules, because the globals, __asyncify_state and __asyncify_data, are not synchronized between main-module and side-modules.

    A new pass arg, asyncify-side-module, is added to make __asyncify_state and __asyncify_data imported in the instrumented wasm.
* Revert "[OptimizeInstructions] Optimize zero sized bulk memory ops even without "ignoreImplicitTraps" (#4295)" (#4459) | Thomas Lively | 2022-01-14 | 1 | -58/+24
    This reverts commit 5cf3521708cfada341285414df2dc7366d7e5454.
* [OptimizeInstructions] Optimize zero sized bulk memory ops even without "ignoreImplicitTraps" (#4295) | Max Graey | 2022-01-12 | 1 | -24/+58
* [EH][GC] Fix nested pop after removing ref.cast (#4407) | Heejin Ahn | 2021-12-28 | 1 | -0/+58
    `ref.cast` can be statically removed when the ref's type is a subtype of the intended RTT type and either of `--ignore-implicit-traps` or `--traps-never-happen` is given:
    https://github.com/WebAssembly/binaryen/blob/083ab9842ec3d4ca278c95e1a33112ae7cd4d9e5/src/passes/OptimizeInstructions.cpp#L1603-L1624

    Some more context:
    https://github.com/WebAssembly/binaryen/pull/4097#discussion_r694456784

    But this can create a block in which a `pop` is nested, which makes the `catch` invalid. The test in this PR is the same as the example given by @kripken in #4237.

    This calls the fixup function `EHUtils::handleBlockNestedPops` at the end of the pass to fix this. Also, because this pass creates a lot of blocks in other patterns, I think it is possible there can be other patterns to cause this kind of `pop` nesting.
* [EH] Handle nested pops after inlining (#4404) | Heejin Ahn | 2021-12-20 | 1 | -3/+44
    Inlining creates additional `block`s at inlined call sites, which can be inside a `catch`. For example:

    ```wast
    (try
      (do)
      (catch $tag
        (call $callee
          (pop i32)
        )
      )
    )
    ```

    After inlining, this becomes

    ```wast
    (try
      (do)
      (catch $tag
        (block $__inlined_func$callee
          (local.set $0
            (pop i32) ;; Invalid!!
          )
          (nop)
        )
      )
    )
    ```

    Now the `pop` is nested in a `block`, which makes this invalid. This PR runs `EHUtils::handleBlockNestedPops` at the end to assign the `pop` to a local right after the `catch`, making the code valid again:

    ```wast
    (try
      (do)
      (catch $tag
        (local.set $new ;; New local to store `pop` result
          (pop i32)
        )
        (block $__inlined_func$callee
          (local.set $0
            (local.get $new)
          )
          (nop)
        )
      )
    )
    ```
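The fixup shown in the three listings above can be mimicked on a toy expression tree. The Python sketch below is an illustration only: expressions are hypothetical nested tuples, the function name is made up, and it handles just the block case from this message, ignoring the rules for `loop`, `try` and `if` that the real utility has to respect.

```python
def hoist_pop(catch_body, new_local="$new"):
    """Move a pop nested inside a block to the start of the catch body.

    `catch_body` is a list of tuple expressions such as
    ("local.set", "$0", ("pop", "i32")). Returns a new list.
    """
    hoisted = []

    def rewrite(expr, inside_block):
        if not isinstance(expr, tuple):
            return expr  # opcode names, labels and other immediates
        if expr[0] == "pop" and inside_block and not hoisted:
            hoisted.append(expr)
            return ("local.get", new_local)
        nested = inside_block or expr[0] == "block"
        return tuple(rewrite(child, nested) for child in expr)

    body = [rewrite(expr, False) for expr in catch_body]
    if hoisted:
        # Store the pop result in a fresh local right after the catch.
        body.insert(0, ("local.set", new_local, hoisted[0]))
    return body
```

A `pop` that is not inside a block is left alone, matching the first listing, where the code is already valid.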
* [Wasm GC] Refine results in SignatureRefining (#4380) | Alon Zakai | 2021-12-14 | 1 | -0/+134
    Similar to what DeadArgumentElimination does for individual functions, this can refine the results of a set of functions all using the same heap type, when they all return something more specific. After this PR SignatureRefining can refine both params and results and is basically complete.
* [OptimizeInstructions] Combine some relational ops joined Or/And (Part 4) (#4339) | Max Graey | 2021-12-14 | 1 | -0/+32
      (i32(x) < 0) & (i32(y) < 0) ==> i32(x & y) < 0
      (i64(x) < 0) & (i64(y) < 0) ==> i64(x & y) < 0
* [Precompute][SIMD] Enable constant folding for simd (#4381) | Max Graey | 2021-12-13 | 1 | -1/+14
* SimplifyGlobals: Handle nested read-only-to-write patterns (#4365) | Alon Zakai | 2021-12-08 | 1 | -0/+269
    The general pattern is

      if (!global) {
        global = 1
      }

    This PR generalizes that to handle nested appearances,

      if ({
        if (!global) {
          global = 1
        }
        !global
      }) {
        global = 1
      }

    With this I can finally see no more "once" global operations on the hottest function in the currently slowest j2wasm benchmark ("filter").

    Also added a failing testcase for something we do not handle yet.
* [EH] Rename catch-pop-fixup.wast (#4371) | Heejin Ahn | 2021-12-06 | 1 | -0/+0
    All EH tests in test/lit/passes currently have the suffix `-eh`, so I think it's better to be consistent for this one.
* [EH] Support try-delegate in EffectAnalyzer (#4368) | Heejin Ahn | 2021-12-06 | 2 | -10/+173
    This adds support for try-delegate in `EffectAnalyzer`. Without this support, the expression below has been incorrectly classified as "cannot throw", because the previous code considered everything inside `try`-`catch_all` as "cannot throw". This is not the case when there is a `delegate` that can bypass the `catch_all`.

    ```wasm
    try $l0
      try
        try
          throw $e
        delegate $l0
      catch_all
      end
    end
    ```
* [OptimizeInstructions] Combine some relational ops joined Or/And (Part 3) (#4338) | Max Graey | 2021-12-04 | 1 | -0/+85
      (i32(x) < 0) | (i32(y) < 0) ==> i32(x | y) < 0
      (i32(x) != 0) | (i32(y) != 0) ==> i32(x | y) != 0

    Likewise for i64.
* SimplifyGlobals: Ignore irrelevant effects in read-only-to-write (#4363) | Alon Zakai | 2021-12-02 | 1 | -40/+189
    Previously this pass would see something like this and fail:

      if (foo() + global) {
        global = 1;
      }

    The call to foo() has side effects, so we did not optimize. However, in such a case the side effects are safe: they happen anyhow, regardless of the global that we are optimizing. That is, "global" is read only to be written, even though other things also influence the decision to write it. But "global" is not used in a way that is observable: we can remove it, and nothing will notice (except for things getting smaller/faster).

    In other words, this PR will let us optimize the above example, while it also needs to avoid optimizing the dangerous cases, like this:

      if (foo(global)) {
        global = 1;
      }

    Here "global" flows into a place that notices its value and may use it aside from deciding to write that global.

    A common case where we want to optimize is combined ifs,

      if (foo()) {
        if (global) {
          global = 1;
        }
      }

    which the optimizer turns into

      if (foo() & global) {
        global = 1;
      }

    With this PR we can handle those things too.

    This lets us optimize out some important globals in j2wasm like the initializer boolean for the Math object, reducing total code size by about 0.5%.
* Handle try in Flatten pass (#2567) | Heejin Ahn | 2021-11-29 | 1 | -0/+234
    This adds handling of try in the Flatten pass.
* CoalesceLocals: Use ValueNumbering (#4355) | Alon Zakai | 2021-11-24 | 2 | -20/+162
    This removes the old hardcoded value numbering in that pass and makes it use the new code that was split into helper code. The immediate benefit of this is to make the code aware of identical constants: if two locals have the same constant then they do not interfere. Future improvements to numbering will also automatically help here.

    This changes some constants in existing tests so that they keep testing what they were testing before, and adds new tests for the new benefit here.

    This implements a proposed TODO from #4314
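The "identical constants do not interfere" point can be illustrated with a toy value-numbering table in Python. The class, its methods and the `interferes` helper are hypothetical names for this sketch and do not mirror the helper code the message refers to.

```python
import itertools


class ValueNumbering:
    """Toy value numbering: equal constants share a number, anything else is unique."""

    def __init__(self):
        self._constants = {}
        self._counter = itertools.count()

    def constant(self, literal):
        # Key on the type name too, so that 1 and 1.0 get different numbers.
        key = (type(literal).__name__, literal)
        if key not in self._constants:
            self._constants[key] = next(self._counter)
        return self._constants[key]

    def unique(self):
        """A number for a value we know nothing about."""
        return next(self._counter)


def interferes(value_a, value_b, simultaneously_live):
    """Two locals conflict only if both are live and may hold different values."""
    return simultaneously_live and value_a != value_b
```

Two locals that are live at the same time but both hold the constant 10 receive the same number, so `interferes` reports no conflict and they could share one local index.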
* SimplifyGlobals: If all writes write the initial value, they are unneeded (#4356) | Alon Zakai | 2021-11-23 | 1 | -0/+143
* Add fixup function for nested pops in catch (#4348) | Heejin Ahn | 2021-11-22 | 1 | -0/+396
    This adds `EHUtils::handleBlockNestedPops`, which can be called at the end of passes that may put `pop`s inside `block`s. This method assumes there exists a `pop` in a first-descendant line, even though it can be nested within a block. This allows a `pop` to be nested within a `block` or a `try`, but not a `loop`, since that means the `pop` can run multiple times. In case of `if`, `pop` can exist only in its condition; if a `pop` is in its true or false body, that's not in the first-descendant line.

    This can be useful when optimization passes create blocks to do transformations. Wrapping expressions with a block does not change semantics most of the time, but if pops happen to be inside a block generated by those passes, they can result in invalid binaries.

    To test this, this adds `passes/test_passes.cpp`, which is intended to contain multiple test passes that each test one or more utility functions separately. Without this kind of pass, it is hard to test various cases in which nested `pop`s can be generated in existing passes. This PR also adds `PassRegistry::registerTestPass`, which registers a pass that's intended only for internal testing and does not show up in `wasm-opt --help`.

    Fixes #4237.
* [Wasm GC] Signature Refining pass (#4326) | Alon Zakai | 2021-11-19 | 1 | -0/+492
    This is fairly short and simple after the recent refactorings. This basically just finds all uses of each signature/function type, and then sees if it receives more specific types as params. It then rewrites the types if so.

    This just handles arguments so far, and not return types.

    This differs from DeadArgumentElimination's refineArguments() in that that pass modifies each function by itself, changing the type of the function as needed. That is only valid if the type is not observable, that is, if the function is called indirectly then DAE ignores it. This pass will work on the types themselves, so it considers all functions sharing a type as a whole, and when it upgrades that type it ends up affecting them all.

    This finds optimization opportunities on 4% of the total signature types in j2wasm. Those lead to some benefits in later opts, but the effect is not huge.
* [Wasm GC] Global Refining pass (#4344) | Alon Zakai | 2021-11-18 | 1 | -0/+119
    Fairly simple, this uses the existing infrastructure to find opportunities to refine the type of a global variable.

    This is a common pattern in j2wasm for example, where a global begins as a null of $java.lang.Object (the least specific type) but it is in practice always assigned an object of some specific type.
* [Wasm GC] Update nulls to allow finding better LUBs (#4340) | Alon Zakai | 2021-11-18 | 4 | -108/+607
    It is common in GC code to have stuff like this:

      x = null;
      ..
      x = Data();

    Nulls in wasm have a type, and if that initial null has say anyref then before this PR we would keep the type of x as anyref. However, while nulls have types, all null values are identical, and so we can in fact change x's type to a nullable reference of Data, by also changing the null's type to something more specific.

    LUBFinder now has an API that can return the best possible LUB so far, and that can be told to update nulls if we decide that the new LUB is worth using. This updates the passes using LUBFinder to use the new API. Note how TypeRefining becomes simpler because the special logic it had in a subclass of LUBFinder is now part of the main class (it used to remember if there was a null default; LUBFinder now handles both a null default as well as other nulls).

    This requires some changes to existing tests to prevent them from optimizing using nulls in ways that end up not testing the original intent. Specifically the dae-gc-refine-params.wast now has calls to get a null of a type, instead of just having a ref.null of that type (which could be optimized now). And dae-gc-refine-return uses locals instead of ref.nulls.
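The effect of letting nulls take a more specific type can be sketched in Python. Everything here is an assumption made for illustration: the `PARENT` table is a made-up single-inheritance hierarchy, `best_lub` is a hypothetical name, and nullability itself is not modelled. It is not the LUBFinder API.

```python
# Hypothetical type hierarchy: child -> parent (None marks the top type).
PARENT = {"Data": "Object", "Other": "Object", "Object": "any", "any": None}


def ancestors(type_name):
    """The type itself followed by all of its supertypes, most specific first."""
    chain = []
    while type_name is not None:
        chain.append(type_name)
        type_name = PARENT[type_name]
    return chain


def lub(a, b):
    """Least upper bound of two types in the toy hierarchy."""
    seen = set(ancestors(a))
    for candidate in ancestors(b):
        if candidate in seen:
            return candidate
    raise ValueError("no common supertype")


def best_lub(values):
    """LUB of written values, where nulls do not constrain the result.

    `values` is a list of (kind, type) pairs with kind "null" or "value".
    Null types are only used when nothing but nulls was written.
    """
    non_null = [t for kind, t in values if kind != "null"]
    nulls = [t for kind, t in values if kind == "null"]
    chosen = non_null or nulls
    if not chosen:
        return None
    result = chosen[0]
    for t in chosen[1:]:
        result = lub(result, t)
    return result
```

For the `x = null; x = Data();` example, `best_lub([("null", "any"), ("value", "Data")])` yields `"Data"`, after which the null would be retyped to match.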
* [OptimizeInstructions] Combine some relational ops joined Or/And (Part 2) (#4336) | Max Graey | 2021-11-16 | 2 | -6/+35
      (i32(x) != 0) | (i32(y) != 0) ==> i32(x | y) != 0
      (i64(x) != 0) | (i64(y) != 0) ==> i64(x | y) != 0
* [OptimizeInstructions] Combine some relational ops joined Or/And (Part 1) (#4333) | Max Graey | 2021-11-16 | 1 | -0/+31
      (i32(x) == 0) & (i32(y) == 0) ==> i32(x | y) == 0
      (i64(x) == 0) & (i64(y) == 0) ==> i64(x | y) == 0