summaryrefslogtreecommitdiff
path: root/test
Commit message (Collapse)AuthorAgeFilesLines
* OptimizeInstructions: Handle trivial ref.cast and ref.test (#4097)Alon Zakai2021-08-245-90/+324
| | | | | If the types are completely incompatible, we know the cast will fail. However, ref.cast does allow a null to pass through, which makes it a little more complicated.
* wasm-split: accept file in keep-funcs/split-funcs (#4053)Aleksander Guryanov2021-08-236-2/+39
|
* [Wasm GC] Nulls compare equal regardless of type (#4094)Alon Zakai2021-08-191-0/+11
|
* Enable LocalCSE by default (#4089)Alon Zakai2021-08-196-130/+147
| | | | | | | | | | | | Enable it in -O3 and -Os and higher. This helps very little on output from LLVM, but also it does not alter compile times much anyhow. On code that has not been run through an optimizing compiler already, this can help quite a lot, e.g., 15% of code size on some wasm GC samples. This will not normally help with speed, as optimizing VMs do such things anyhow. However, this can help baseline compilers and interpreters and so forth.
* [Wasm GC] Effects: Differentiate Struct and Array types (#4088)Alon Zakai2021-08-181-2/+53
| | | | | | | | | | | This allows common patterns in J2CL to be optimized, where we write to various array indices and get the values or the reference from a struct. It would be nice to do even better here, and look at actually specific types, but I think we should be careful to keep the runtime constant. That seems hard to do if we accumulate a list of types and do Type::isSubType on them etc. But maybe someone has a better idea than this PR?
* Add TrapsNeverHappen to SideEffects's API (#4086)Max Graey2021-08-172-1/+3
|
* LocalCSE: ignore traps (#4085)Alon Zakai2021-08-172-9/+50
| | | | | | | | | | | | | | | | | | | | If we replace A A A with (local.set A) (local.get) (local.get) then it is ok for A to trap (so long as it does so deterministically), as if it does trap then the first appearance will do so, and the others not be reached anyhow. This helps GC code as often there are repeated struct.gets and such that may trap.
* TrapsNeverHappen mode (#4059)Alon Zakai2021-08-173-0/+149
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The goal of this mode is to remove obviously-unneeded code like (drop (i32.load (local.get $x))) In general we can't remove it, as the load might trap - we'd be removing a side effect. This is fairly rare in general, but actually becomes quite annoying with wasm GC code where such patterns are more common, and we really need to remove them. Historically the IgnoreImplicitTraps option was meant to help here. However, in practice it did not quite work well enough for most production code, as mentioned e.g. in #3934 . TrapsNeverHappen mode is an attempt to fix that, based on feedback from @askeksa in that issue, and also I believe this implements an idea that @fitzgen mentioned a while ago (sorry, I can't remember where exactly...). So I'm hopeful this will be generally useful and not just for GC. The idea in TrapsNeverHappen mode is that traps are assumed to not actually happen at runtime. That is, if there is a trap in the code, it will not be reached, or if it is reached then it will not trap. For example, an (unreachable) would be assumed to never be reached, which means that the optimizer can remove it and any code that executes right before it: (if (..condition..) (block (..code that can be removed, if it does not branch out..) (..code that can be removed, if it does not branch out..) (..code that can be removed, if it does not branch out..) (unreachable))) And something like a load from memory is assumed to not trap, etc., which in particular would let us remove that dropped load from earlier. This mode should be usable in production builds with assertions disabled, if traps are seen as failing assertions. That might not be true of all release builds (maybe some use traps for other purposes), but hopefully in some. That is, if traps are like assertions, then enabling this new mode would be like disabling assertions in release builds and living with the fact that if an assertion would have been hit then that is "undefined behavior" and the optimizer might have removed the trap or done something weird. TrapsNeverHappen (TNH) is different from IgnoreImplicitTraps (IIT). The old IIT mode would just ignore traps when computing effects. That is a simple model, but a problem happens with a trap behind a condition, like this: if (x != 0) foo(1 / x); We won't trap on integer division by zero here only because of the guarding if. In IIT, we'd compute no side effects on 1 / x, and then we might end up moving it around, depending on other code in the area, and potentially out of the if - which would make it happen unconditionally, which would break. TNH avoids that problem because it does not simply ignore traps. Instead, there is a new hasUnremovableSideEffects() method that must be opted-in by passes. That checks if there are no side effects, or if there are, if we can remove them - and we know we can remove a trap if we are running under TrapsNeverHappen mode, as the trap won't happen by assumption. A pass must only use that method where it is safe, that is, where it would either remove the side effect (in which case, no problem), or if not, that it at least does not move it around (avoiding the above problem with IIT). This PR does not implement all optimizations possible with TNH, just a small initial set of things to get started. It is already useful on wasm GC code, including being as good as IIT on removing unnecessary casts in some cases, see the test suite updates here. Also, a significant part of the 18% speedup measured in #4052 (comment) is due to my testing with this enabled, as otherwise the devirtualization there leaves a lot of unneeded code.
* LocalCSE rewrite (#4079)Alon Zakai2021-08-175-1282/+775
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Technically this is not a new pass, but it is a rewrite almost from scratch. Local Common Subexpression Elimination looks for repeated patterns, stuff like this: x = (a + b) + c y = a + b => temp = a + b x = temp + c y = temp The old pass worked on flat IR, which is inefficient, and was overly complicated because of that. The new pass uses a new algorithm that I think is pretty simple, see the detailed comment at the top. This keeps the pass enabled only in -O4, like before - right after flattening the IR. That is to make this as minimal a change as possible. Followups will enable the pass in the main pipeline, that is, we will finally be able to run it by default. (Note that to make the pass work well after flatten, an extra simplify-locals is added - the old pass used to do part of simplify-locals internally, which was one source of complexity. Even so, some of the -O4 tests have changes, due to minor factors - they are just minor orderings etc., which can be seen by inspecting the outputs before and after using e.g. --metrics) This plus some followup work leads to large wins on wasm GC output. On j2cl there is a common pattern of repeated struct.gets, so common that this pass removes 85% of all struct.gets, which makes the total binary 15% smaller. However, on LLVM-emitted code the benefit is minor, less than 1%.
* [Wasm GC] ConstantFieldPropagation: Ignore copies (#4084)Alon Zakai2021-08-161-0/+161
| | | | | | | | When looking for all values written to a field, we can ignore values that are loaded from that same field, i.e., are copied from something already present there. Such operations never introduce new values. This helps by a small but non-zero amount on j2cl.
* Support nominal typing in wasm-reduce (#4080)Alon Zakai2021-08-161-25/+114
| | | | | Use ToolOptions there, which adds --nominal support. We must also pass --nominal to the sub-commands we run.
* [Wasm GC] Fix OptimizeInstructions on folding of identical code with nominal ↵Alon Zakai2021-08-161-0/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | typing (#4069) (if (result i32) (local.get $x) (struct.get $B 1 (ref.null $B) ) (struct.get $C 1 (ref.null $C) ) ) With structural typing it is safe to turn this into this: (struct.get $A 1 (if (result (ref $A)) (local.get $x) (ref.null $B) (ref.null $C) ) ) Here $A is the LUB of the others. This works since $A must have field 1 in it. But with nominal types it is possible that the LUB in fact does not have that field, and we would not validate. This actually seems like a more general issue that might happen with other things, even though atm perhaps it can't. For simplicity, avoid this pattern in both nominal and structural typing, to avoid making a difference between them.
* Add nominal typing mode to a test in prep for #4069 (#4081)Alon Zakai2021-08-131-1/+513
|
* [JS/C API] Expose zeroFilledMemory option for JS and C API (#4071)Max Graey2021-08-132-0/+59
|
* Add a shallowHash() method (#4077)Alon Zakai2021-08-121-2/+31
| | | | | | This adds and tests the new method. It will be used in a new pass later, where computing shallow hashes allows it to be done in linear time. 99% of the diff is whitespace.
* [Wasm GC] Fix LocalSubtyping on unreachable sets with incompatible values ↵Alon Zakai2021-08-111-0/+49
| | | | | | | | (#4051) We ignore sets in unreachable code, but their values may not be compatible with a new type we specialize a local for. That is, the validator cares about unreachable sets, while logically we don't need to, and this pass doesn't. Fix up such unreachable sets at the end.
* SimplifyGlobals: Optimize away globals that are only read in order to write ↵Alon Zakai2021-08-101-0/+240
| | | | | | | | | | | | | | | | | | themselves (#4070) If the only uses of a global are if (global == 0) { global = 1; } Then we do not need that global: while it has both reads and writes, the value in the global does not cause anything observable. It is read, but only to write to itself, and nothing else. This happens in real-world code from j2cl quite a lot, as they have an initialization pattern with globals, and in some cases we can optimize away the work done in the initialization, leaving only the globals in this pattern.
* Improve optimization of call_ref into direct calls (#4068)Alon Zakai2021-08-103-26/+206
| | | | | | | | | | | | | First, move the tiny pattern of call-ref-of-ref-func from Directize into OptimizeInstructions. This is important because Directize is a global optimization pass - it looks at the table to see if a CallIndirect can be turned into a direct call. We only run global passes at the end of the pipeline, but we don't need any global data for call-ref of a ref-func, and OptimizeInstructions is the place for such patterns. Second, extend that to also handle fallthrough values. This is less simple, but as call_ref is so inefficient, it's worth doing all we can.
* Improve inlining limits for recursion (#4067)Alon Zakai2021-08-101-1/+203
| | | | | | | | | | | | | | Previously we would keep doing iterations of inlining until we hit the number of functions in the module (at which point, we would definitely know we are recursing). This prevents infinite recursion, but it can take a very very long time to notice that in a huge module with one tiny recursive function and 100,000 other ones. To do better than that, track how many times we've inlined into a function. After a fixed number of such inlinings, stop. Aside from avoding very slow compile times, such infinite recursion likely is not that beneficial to do a great many times, so anyhow it is best to stop after a few iterations.
* [Wasm GC] RefEq(x, null) => RefIsNull(x) (#4066)Alon Zakai2021-08-091-0/+41
|
* Fix BrOn logic in RemoveUnusedBrs (#4062)Alon Zakai2021-08-091-0/+31
| | | | | | | | | | | This only moves code around. visitBrOn was in the main part of the pass, which was incorrect as it could interfere with other work being done there. Specifically, we have a stack of Ifs there, and if we replace a BrOn with an If, an assertion was hit. To fix this, run it like sinkBlocks(), in a separate interleaved phase. This fixes a bug reported by askeksa-google here: https://github.com/WebAssembly/gc/issues/226#issuecomment-868739853
* [Wasm GC] Track struct.new and struct.set separately in ↵Alon Zakai2021-08-091-31/+266
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | ConstantFieldPropagation (#4064) Previously we tracked them in the same way. That means that we did the same when seeing if either a struct.new or a struct.set can write to the memory that is read by a struct.get, where the rule is that if either type is a subtype of the other then they might. But with struct.new we know the precise type, which means we can do better. Specifically, if we see a new of type B, then only a get of a supertype of B can possibly read that data: it is not possible for our struct of type B to appear in a location that requires a subtype of B. Conceptually: A = type struct B = type extends A C = type extends B x = struct.new<B> struct.get<A>(y) // x might appear here, as it can be assigned to a // variable y of a supertype struct.get<C>(y) // x cannot appear here This allows more devirtualization. It is a followup for #4052 that implements a TODO from there. The diff without whitespace is simpler.
* [OptimizeForJS] Fix use of localized variable in popcnt(x) == 1 rule (#4057)Max Graey2021-08-051-0/+29
| | | | Localizer may have added a new tee and a local, and we should apply the tee in the right place.
* [Wasm GC] Constant Field Propagation (#4052)Alon Zakai2021-08-053-43/+1481
| | | | | | | | | | | | | | | | | | | | | | | | | | | | A field in a struct is constant if we can see that in the entire program we only ever write the same constant value to it. For example, imagine a vtable type that we construct with the same funcrefs each time, then (if we have no struct.sets, or if we did, and they had the same value), we could replace a get with that constant value, since it cannot be anything else: (struct.new $T (i32.const 10) (rtt)) ..no other conflicting values.. (struct.get $T 0) => (i32.const 10) If the value is a function reference, then this may allow other passes to turn what was a call_ref into a direct call and perhaps also get inlined, effectively a form of devirtualization. This only works in nominal typing, as we need to know the supertype of each type. (It could work in theory in structural, but we'd need to do hard work to find all the possible supertypes, and it would also become far less effective.) This deletes a trivial test for running -O on GC content. We have many more tests for GC at this point, so that test is not needed, and this PR also optimizes the code into something trivial and uninteresting anyhow.
* [Wasm GC] Fix printing of more unreachable struct/array instructions (#4049)Alon Zakai2021-08-034-4/+44
| | | | | | We were missing StructNew and ArrayLen. Also refactor the helper method to allow for shorter code in each caller.
* Fix a bug in nominal LUB calculation (#4046)Thomas Lively2021-08-021-0/+14
| | | The wrong variable was null checked, leading to segfaults.
* [Wasm GC] Handle unreachability in LocalSubtyping (#4044)Alon Zakai2021-08-021-0/+35
|
* Test GC lit tests with --nominal as well (#4043)Thomas Lively2021-08-0225-82/+556
| | | | | | | | | | | Add a new run line to every list test containing a struct type to run the test again with nominal typing. In cases where tests do not use any struct subtyping, this does not change the test output. In cases where struct subtyping is used, a new check prefix is introduced to capture the difference that `(extends ...)` clauses are emitted in nominal mode but not in equirecursive mode. There are no other test differences. Some tests are cleaned up along the way. Notably, O_all-features{,-ignore-implicit-traps}.wast is consolidated to a single file.
* [JS] Add a new OptimizeForJS pass (#4033)Max Graey2021-08-022-0/+57
| | | | | Add a new OptimizeForJS pass which contains rewriting rules specific to JavaScript. LLVM usually lowers x != 0 && (x & (x - 1)) == 0 (isPowerOf2) to popcnt(x) == 1 which is ok for wasm and other targets but is quite expensive for JavaScript. In this PR we lower the popcnt pattern back to the isPowerOf2 pattern.
* Properly print struct field names with --nominal (#4042)Thomas Lively2021-07-311-0/+30
| | | | The field name lookup was previously failing because the lookup used a fresh struct type rather than using the intended struct type.
* Port test/passes/d* to lit (#4008)Thomas Lively2021-07-3019-2360/+2387
| | | | But skip duplicate elimination tests until we can consider better inserting checks for removed items.
* Do not crash in ExtractFunction if an export already exists (#4040)Alon Zakai2021-07-301-0/+2
| | | | | | | | We just cleared the list of exports, but the exportMap was still populated, so the data was in an inconsistent state. This fixes the case of running --extract-function multiple times on a file (which is usually not useful, unless one of the passes you are debugging adds new functions to the file - which some do).
* [Wasm GC] Optimize repeated identical ref.casts (#4039)Alon Zakai2021-07-291-8/+129
|
* [Wasm GC] Handle call_ref in DeadArgument return refinement (#4038)Alon Zakai2021-07-291-0/+61
|
* Refactor DeadArgumentElimination GC tests into separate files. NFC (#4037)Alon Zakai2021-07-283-633/+656
| | | Test content is only moved around.
* [Wasm GC] Refine return types of tail calling functions in ↵Alon Zakai2021-07-281-3/+109
| | | | | DeadArgumentElimination (#4036) To do this we need to look at tail calls and not just returns.
* [Wasm GC] Handle uses of default values in LocalSubtyping (#4024)Alon Zakai2021-07-282-1/+64
| | | | | | It is ok to use the default value of a reference even if we refine the type, as it would be a more specifically-typed null, and all nulls compare the same. However, if the default is used then we *cannot* alter the type to be non-nullable, as then we'd use a null where that is not allowed.
* [Wasm GC] Allow tail call subtyping in DeadArgumentElimination (#4035)Alon Zakai2021-07-281-4/+4
| | | | Partially reverts #4025, removing the code and updates the test to show we do the optimization.
* [Simd] Implement extra convert, trunc, demote and promote ops for ↵Max Graey2021-07-281-0/+23
| | | | interpreter (#4023)
* Support subtyping in tail calls (#4032)Thomas Lively2021-07-282-11/+78
| | | | | | | | | The tail call spec does not include subtyping because it is based on the upstream spec, which does not contain subtyping. However, there is no reason that subtyping shouldn't apply to tail calls like it does for any other call or return. Update the validator to allow subtyping and avoid a related null pointer dereference while we're at it. Do not run the test in with --nominal because it is buggy in that mode.
* [Wasm GC] DeadArgumentElimination: Update tees after refining param types ↵Alon Zakai2021-07-281-0/+36
| | | | (#4031)
* Parse and print pops of compound types (#4030)Thomas Lively2021-07-271-0/+42
| | | This is necessary when using GC and EH together, for instance.
* [Wasm GC] DeadArgumentElimination: Do not refine return types of tail ↵Alon Zakai2021-07-271-0/+17
| | | | | | callees (#4025) If there is a tail call, we can't change the return type of the function, as it must match in the functions doing a tail call of it.
* [Wasm GC] Refine return types (#4020)Alon Zakai2021-07-261-1/+249
| | | | | | Corresponds to #4014 which did the same for parameter types. This sees whether the return types actually returned from a function allow us to use a more specific type for the function's return. If so, we update that type, as well as calls to the function.
* [Simd] Add extending pairwise adds to interpreter (#4022)Max Graey2021-07-261-1/+29
|
* [SIMD] Add extend + mul simd operations to interpreter (#4021)Max Graey2021-07-261-1/+100
|
* [Simd] Add extension from i32x4 to i64x2 ops to interpreter (#4016)Max Graey2021-07-261-0/+4
|
* [Wasm GC] Handle nondefaultable types in LocalSubtyping (#4019)Alon Zakai2021-07-231-0/+21
| | | | | The pass handled non-nullability, but another case is a tuple with nullable values in it that is assigned non-nullable values, and in general, other stuff that is nondefaultable (but not non-nullable). Ignore those.
* [Wasm GC] Fix Heap2Local + non-nullable locals (#4017)Alon Zakai2021-07-231-30/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | Given (local $x (ref $foo)) (local.set $x ..) (.. (local.get $x)) If we remove the local.set but not the get, then we end up with (local $x (ref $foo)) (.. (local.get $x)) It looks like the local.get reads the initial value of a non-nullable local, which is not allowed. In practice, this would crash in precompute-propagate which would try to propagate the initial value to the get. Add an assertion there with a clear message, as until we have full validation of non-nullable locals (and the spec for that is in flux), that pass is where bugs will end up being noticed. To fix this, replace the get as well. We can replace it with a null for simplicity; it will never be used anyhow. This also uncovered a small bug with reached not containing all the things we reached - it was missing local.gets.
* [Wasm GC] Local-Subtyping pass (#3765)Alon Zakai2021-07-234-3/+248
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If a local is say anyref, but all values assigned to it are something more specific like funcref, then we can make the type of the local more specific. In principle that might allow further optimizations, as the local.gets of that local will have a more specific type that their users can see, like this: (import .. (func $get-funcref (result funcref))) (func $foo (result i32) (local $x anyref) (local.set $x (call $get-funcref)) (ref.is_func (local.get $x)) ) => (func $foo (result i32) (local $x funcref) ;; updated to a subtype of the original (local.set $x (call $get-funcref)) (ref.is_func (local.get $x)) ;; this can now be optimized to "1" ) A possible downside is that using more specific types may not end up allowing optimizations but may end up increasing the size of the binary (say, replacing lots of anyref with various specific types that compress more poorly; also, for recursive types the LUB may be a unique type appearing nowhere else in the wasm). We should investigate the code size factors more later.