| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We normally like to move brs after ifs into the if, when in a loop:
(loop $loop
(if
..
(unreachable)
(code)
)
(br $loop)
)
=>
(loop $loop
(if
..
(unreachable)
(block
(code)
(br $loop) ;; moved in
)
)
)
However this may be invalid to do if the if condition is unreachable, as
then one arm may be concrete (`code` in the example could be an `i32`,
for example). As this is dead code anyhow, leave it for DCE.
|
|
|
|
|
|
|
|
|
|
| |
RemoveUnusedBrs sinks blocks into If arms when those arms contain
branches to the blocks and the other arm and condition do not. Now that
we type Ifs with unreachable conditions as unreachable, it is possible
for the If arms to have a different type than the block that would be
sunk, so sinking the block would produce invalid IR. Fix the problem by
never sinking blocks into Ifs with unreachable conditions.
Fixes #7128.
|
|
|
|
| |
Replacing an if with a select may have refined the type. Without this fix,
the sharper stale type checks complain.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We previously allowed valid expressions to have stale types as long as
those stale types were supertypes of the most precise possible types for
the expressions. Allowing stale types like this could mask bugs where we
failed to propagate precise type information, though.
Make validation stricter by requiring all expressions except for control
flow structures to have the most precise possible types. Control flow
structures are exempt because many passes that can refine types wrap the
refined expressions in blocks with the old type to avoid the need for
refinalization. This pattern would be broken and we would need to
refinalize more frequently without this exception for control flow
structures.
Now that all non-control flow expressions must have precise types,
remove functionality relating to building select instructions with
non-precise types. Since finalization of selects now always calculates a
LUB rather than using a provided type, remove the type parameter from
BinaryenSelect in the C and JS APIs.
Now that stale types are no longer valid, fix a bug in TypeSSA where it
failed to refinalize module-level code. This bug previously would not
have caused problems on its own, but the stale types could cause
problems for later runs of Unsubtyping. Now the stale types would cause
TypeSSA output to fail validation.
Also fix a bug where Builder::replaceWithIdenticalType was in fact
replacing with refined types.
Fixes #7087.
|
|
|
|
|
|
|
|
| |
unrefine the output (#7036)
Paradoxically, when a BrOn's castType is refined, its own type (the type it flows out)
can get un-refined: making the castType non-nullable means nulls no longer
flow on the branch, so they may flow out directly, making the BrOn nullable.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows
(block $out (result i32)
(try_table (catch..)
..
(br $out
(i32.const 42)
)
)
)
=>
(block $out (result i32)
(try_table (result i32) (catch..) ;; add a result
..
(i32.const 42) ;; remove the br around the value
)
)
|
|
|
|
|
|
|
|
| |
#6980 was missing the logic to reset flows after replacing a throw. The process
of replacing the throw introduces new code and in particular a drop, which
blocks branches from flowing to their targets.
In the testcase here, the br was turned into nop before this fix.
|
|
|
|
| |
We ignored legacy Trys in #6980, but they can also catch.
|
|
|
|
|
|
|
|
| |
This change is NFC on all things we previously optimized, but also makes
us optimize TryTable, BrOn, etc., by replacing hard-coded logic for Break
with generic code.
Also simplify the code there a little - we didn't really need ControlFlowWalker.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
E.g.
(try_table (catch_all $catch)
(throw $e)
)
=>
(try_table (catch_all $catch)
(br $catch)
)
This can then allow other passes to remove the TryTable, if no throwing
things remain.
|
|
|
|
| |
As the name of a class, uppercase seems better here.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We marked various expressions as having cost "Unacceptable", fixed at 100, to
ensure we never moved them out from an If arm, etc. Giving them such a high
cost avoids that problem - the cost is higher than the limit we have for moving
code from conditional to unconditional execution - but it also means the total
cost is unrealistic. For example, a function with one such instruction + an add
(cost 1) would end up with cost 101, and removing the add would look
insignificant, which causes issues for things that want to compare costs
(like Monomorphization).
To fix this, adjust some costs. The main change here is to give casts a cost of 5.
I measured this in depth, see the attached benchmark scripts, and it looks
clear that in both V8 and SpiderMonkey the cost of a cast is high enough to
make it not worth turning an if with ref.test arm into a select (which would
always execute the test).
Other costs adjusted here matter a lot less, because they are on operations
that have side effects and so the optimizer will anyhow not move them from
conditional to unconditional execution, but I tried to make them a bit more
realistic while I was removing "Unacceptable":
* Give most atomic operations the 10 cost we've been using for atomic loads/
stores. Perhaps wait and notify should be slower, however, but it seems like
assuming fast switching might be more relevant.
* Give growth operations a cost of 20, and throw operations a cost of 10. These
numbers are entirely made up as I am not even sure how to measure them in
a useful way (but, again, this should not matter much as they have side
effects).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#5989)
Fixes #5983: The testcase from there is used here in a new testcase
remove-unused-brs_levels in which we check if we are willing to unconditionally
do a division operation. Turning an if with an arm that does a division into a
select, which always does the division, is almost 5x slower, so we should probably
be extremely careful about doing that.
I took some measurements and have some suggestions for changes in this PR:
* Raise the cost of div/rem to what I measure on my machine, which is 5x slower
than an add, or worse.
* For some reason we added the if arms rather than take the max of them, so
fix that. This does not help the issue, but was confusing.
* Adjust TooCostlyToRunUnconditionally in the pass from 9 to 8 (this helps
balance the last point).
* Use half that value when not optimizing for size. That is, we allow only 4 extra
unconditional work normally, and 8 in -Os, and when -Oz then we allow any
extra amount.
Aside from the new testcases, some existing ones changed. They all appear to
change in a reasonable way, to me.
We should perhaps go even further than this, and not even run a division
unconditionally in -Os, but I wasn't sure it makes sense to go that far as
other benchmarks may be affected. For now, this makes the benchmark in
#5983 run at full speed in -O3 or -Os, and it remains slow in -Oz. The
modified version of the benchmark that only divides in the if (no other
operations) is still fast in -O3, but it become slow in -Os as we do turn that
if into a select (but again, I didn't want to go that far as to overfit on that one
benchmark).
|
|
|
|
|
| |
Replace i31.new with ref.i31 in the printer, tests, and source code. Continue
parsing i31.new for the time being to allow a graceful transition. Also update
the JS API to reflect the new instruction name.
|
|
|
|
|
|
|
|
| |
The improvements to RemoveUnusedBrs in #5887 also introduced a regression where
the pass did not correctly handle unreachable fallthrough values and crashed
with an assertion failure. Fix the problem by returning early when a fallthrough
value is unreachable and add a regression test.
Fixes #5892.
|
|
|
|
|
|
|
| |
Optimize both the known-null and known-non-null cases for BrOnNull and
BrOnNonNull and optimize for more cast behaviors such as SuccessOnlyIfNonNull
and Unreachable for BrOnCast and BrOnCastFail. Leave optimizing
SuccessOnlyIfNull to future work, since that's more complicated. Use type
information from fallthrough values to inform all the optimizations.
|
|
|
|
|
| |
Remove old, experimental instructions and type encodings that will not be
shipped as part of WasmGC. Updating the encodings and text format to match the
final spec is left as future work.
|
|
|
|
|
|
|
|
|
|
|
| |
Casting (ref nofunc) to (ref func) seems like it can succeed based on the rule
of "if it's a subtype, it can cast ok." But the fuzzer found a corner case where that
leads to a validation error (see testcase).
Refactor the cast evaluation logic to handle uninhabitable refs directly, and
return Unreachable for them (since the cast cannot even be reached).
Also reorder the rule checks there to always check for a non-nullable cast
of a bottom type (which always fails).
|
|
|
|
|
|
|
|
| |
Since we refactored all the old kind-checking instructions to be represented as
general cast instructions, `GCTypeUtils::evaluateKindCheck` had become a
vestigial wrapper around `GCTypeUtils::evaluateCastCheck` that was only used in
RemoveUnusedBrs. Remove `evaluateKindCheck` and use `evaluateCastCheck` in
RemoveUnusedBrs without changing any functionality. A future PR may use the
extra information from `evaluateCastCheck` to further optimize branching casts.
|
|
|
|
|
|
|
|
|
| |
As well as br_on_cast_fail null. Unlike the existing br_on_cast* instructions,
these new instructions treat the cast as succeeding when the input is a null.
Update the internal representation of the cast type in `BrOn` expressions to be
a `Type` rather than a `HeapType` so it will include nullability information.
Also update and improve `RemoveUnusedBrs` to handle the new instructions
correctly and optimize in more cases.
|
|
|
|
| |
This is more modern and (IMHO) easier to read than that old C typedef
syntax.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously only WalkerPasses had access to the `getPassRunner` and
`getPassOptions` methods. Move those methods to `Pass` so all passes can use
them. As a result, the `PassRunner` passed to `Pass::run` and
`Pass::runOnFunction` is no longer necessary, so remove it.
Also update `Pass::create` to return a unique_ptr, which is more efficient than
having it return a raw pointer only to have the `PassRunner` wrap that raw
pointer in a `unique_ptr`.
Delete the unused template `PassRunner::getLast()`, which looks like it was
intended to enable retrieving previous analyses and has been in the code base
since 2015 but is not implemented anywhere.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes what looks like it might be a regression in #4943. It's not actually
an issue since it just affects wat files, but it did uncover an existing
inefficiency. The situation is this:
(block
..
(br $somewhere)
(nop)
)
Removing such a nop is always helpful, as the pass might see that that
br goes to where control flow is going anyhow, and the nop would
confuse it. We used to remove such nops only when the block had a name,
which is why wat testcases looks optimal, but we were actually doing the
less-efficient thing on real-world code. It was a minor inefficiency, though, as
the nop is quickly removed by later passes anyhow. Still, the fix is trivial (to
always remove such nops, regardless of a name on the block or not).
|
|
|
|
|
| |
Casts involve branches in the VM, so adding a cast in return for removing a branch
(like If=>Select) is not beneficial. We don't want to ever do any more casts than we
already are.
|
|
|
|
|
| |
The BrOn logic there is incremental in optimizing and updating types, and so
we cannot assume that at every point in the middle the types are fully
updated.
|
|
|
|
| |
This makes Binaryen match LLVM on a real-world case, which is probably
the safest heuristic to use.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
if (A) {
if (B) {
C
}
}
=>
if (A ? B : 0) {
C
}
when B has no side effects, and is fast enough to consider running
unconditionally. In that case, we replace an if with a select and a
zero, which is the same size, but should be faster and may be
further optimized.
As suggested in #4168
|
|
|
|
|
|
|
|
|
|
|
|
| |
tablify() attempts to turns a sequence of br_ifs into a single
br_table. This PR adds some flexibility to the specific pattern it
looks for, specifically:
* Accept i32.eqz as a comparison to zero, and not just to look
for i32.eq against a constant.
* Allow the first condition to be a tee. If it is, compare later
conditions to local.get of that local.
This will allow more br_tables to be emitted in j2cl output.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Knowing the module will allow us to do more analysis in the
effect analyzer. For now, this just refactors the code to allow
providing a module instead of features, and to infer the
features from the module. This actually shortens the code in
most places which is nice (just pass module instead of
module->features).
This modifies basically all callers to use the new module
form, except for the fallthrough logic. That would require
some more refactoring, so to keep this PR reasonably small
that is not yet done.
|
|
|
|
|
|
|
|
|
|
|
| |
This only moves code around. visitBrOn was in the main part of the pass,
which was incorrect as it could interfere with other work being done there.
Specifically, we have a stack of Ifs there, and if we replace a BrOn with
an If, an assertion was hit. To fix this, run it like sinkBlocks(), in a separate
interleaved phase.
This fixes a bug reported by askeksa-google here:
https://github.com/WebAssembly/gc/issues/226#issuecomment-868739853
|
|
|
| |
Similar to #4005 but on OptimizeInstructions instead of RemoveUnusedBrs.
|
|
|
|
|
| |
The spec disallows that.
Fixes #3990
|
|
|
|
|
|
| |
They are basically the flip versions. The only interesting part in the impl is that their
returned typed and sent types are different.
Spec: https://docs.google.com/document/d/1DklC3qVuOdLHSXB5UXghM_syCh-4cMinQ50ICiXnK3Q/edit
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing restructuring code could turn a block+br_if into an if in
simple cases, but it had some TODOs that I noticed were helpful on
GC benchmarks.
One limitation was that we need to reorder the condition and the value,
(block
(br_if
(value)
(condition)
)
(...)
)
=>
(if
(condition)
(value)
(...)
)
The old code checked for side effects in the condition. But it is ok for it
to have side effects if they can be reordered with the value (for example,
if the value is a constant then it definitely does not care about side effects
in the condition).
The other missing TODO is to use a select when we can't use an if:
(block
(drop
(br_if
(value)
(condition)
)
)
(...)
)
=>
(select
(value)
(...)
(condition)
)
In this case we do not reorder the condition and the value, but we do
reorder the condition with the rest of the block.
|
|
|
|
|
|
|
|
|
|
|
|
| |
#3719 optimized the case where the kind is what we want, like br_on_func of
a function. This handles the opposite case, where we know the kind is wrong, and
so the break is not taken.
This seems equally useful for "polymorphic" code that does a bunch of checks
and routes to different code for each case, as after inlining or other optimizations
we may see which paths are taken and which are not.
Also refactor the checking code to a shared location as RefIs/As will also want to
use it.
|
|
|
|
| |
The type may prove the value is not null, and may also show it to be
of the type we are casting to. In that case, we can simplify things.
|
|
|
|
|
| |
We have the if's type, and when replacing it with a select, can use that
type. This could be more efficient. It also avoids a current crash
after the removal of LUBs, but it's worth doing regardless of that.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I'm not entirely sure how LUB removal made this noticeable, as it seems
to be a pre-existing bug. However, somehow before #3669 it was not
noticable - perhaps the finalize code worked around it.
The bug is that RemoveUnusedBrs was moving code around and
finalizing the parent before the child. The correct pattern is always to
work from the children outwards, as otherwise the parent is trying to
finalize itself based on non-finalized children.
The fix is to just not finalize in the stealSlice method. The caller can
do it after finishing any other work it has. As part of this refactoring,
move stealSlice into the single pass that uses it; aside from that being
more orderly, this method is really not a general-purpose tool, it is
quite specific to what RemoveUnusedBrs does, and it might easily
be used incorrectly elsewhere.
|
|
|
|
|
| |
Selectify turns an if-else into a select where possible. Previously we abandoned
hope if any part of the if had a side effect. But it's fine for the condition to have a
side effect, so long as moving it to the end doesn't invalidate the arms.
|
|
|
|
|
|
| |
(#3220)
We turn a br_table with a single target into a br, but we reverse the order of the
condition and the value when doing so, which we forgot to take into account.
|
|
|
|
|
| |
Extends the `Type` hash-consing infrastructure to handle type-parameterized and constructed types introduced in the typed function references and GC proposals. This should be a non-functional change since the new types are not used anywhere yet. Recursive type construction and canonicalization is also left as future work.
Co-authored-by: Thomas Lively <tlively@google.com>
|
| |
|
|
|
|
|
| |
RemoveUnusedBrs produces selects for some patterns, but selects of
multivalue types are not valid. This change checks that types are not
tuple types before producing selects.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds EH support to `EffectAnalyzer`. Before `throw` and `rethrow`
conservatively set property. Now `EffectAnalyzer` has a new property
`throws` to represent an expression that can throw, and expression that
can throw sets `throws` correctly.
When EH is enabled, any calls can throw too, so we cannot reorder them
with another expression with any side effects, meaning all calls should
be treated in the same way as branches when evaluating `invalidate`.
This prevents many reorderings, so this patch sets `throws` for calls
only when the exception handling features is enabled. This is also why I
passed `--disable-exception-handling` to `wasm2js` tests. Most of code
changes outside of `EffectAnalyzer` class was made in order to pass
`FeatureSet` to it.
`throws` isn't always set whenever an expression contains a throwable
instruction. When an throwable instruction is within an inner try, it
will be caught by the corresponding inner catch, so it does not set
`throws`.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the current spec, `local.tee`'s return type should be the
same as its local's type. (Discussions on whether we should change this
rule is going on in WebAssembly/reference-types#55, but here I will
assume this spec does not change. If this changes, we should change many
parts of Binaryen transformation anyway...)
But currently in Binaryen `local.tee`'s type is computed from its
value's type. This didn't make any difference in the MVP, but after we
have subtype relationship in #2451, this can become a problem. For
example:
```
(func $test (result funcref) (local $0 anyref)
(local.tee $0
(ref.func $test)
)
)
```
This shouldn't validate in the spec, but this will pass Binaryen
validation with the current `local.tee` implementation.
This makes `local.tee`'s type computed from the local's type, and makes
`LocalSet::makeTee` get a type parameter, to which we should pass the
its corresponding local's type. We don't embed the local type in the
class `LocalSet` because it may increase memory size.
This also fixes the type of `local.get` to be the local type where
`local.get` and `local.set` pair is created from `local.tee`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
That was needed for super-old wasm type system, where we allowed
(block $x
(br_if $x
(unreachable)
(nop)
)
)
That is, we differentiated "taken" branches from "named" ones (just
referred to by name, but not actually taken as it's in unreachable code).
We don't need to differentiate those any more. Remove the ReFinalize
code that considered it, and also remove the named/taken distinction in
other places.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds the ability to create multivalue types from vectors of concrete value
types. All types are transparently interned, so their representation is still a
single uint32_t. Types can be extracted into vectors of their component parts,
and all the single value types expand into vectors containing themselves.
Multivalue types are not yet used in the IR, but their creation and inspection
functionality is exposed and tested in the C and JS APIs.
Also makes common type predicates methods of Type and improves the ergonomics of
type printing.
|
|
|
|
|
|
|
| |
An if whose body is a br_if can be turned into a br_if of a combined condition (if side effects allow it).
The naive size in bytes is identical between the patterns, but the select may avoid a hardware branch, and also the select may be further optimized. On the benchmark suite this helps every single benchmark, but by quite small amounts (e.g. 100 bytes on sqlite, which is 1MB).
This was noticed in emscripten-core/emscripten#8941
|