diff options
author | Alon Zakai <azakai@google.com> | 2021-08-17 11:10:58 -0700 |
---|---|---|
committer | GitHub <noreply@github.com> | 2021-08-17 11:10:58 -0700 |
commit | d1bea49163be364d6f8b277b35510555211374b5 (patch) | |
tree | 8ee45af56492b0425833009ddc1a69640b6adc1e /src | |
parent | ef654c63819a7d00fd0cc4181f170111fa4c15d2 (diff) | |
download | binaryen-d1bea49163be364d6f8b277b35510555211374b5.tar.gz binaryen-d1bea49163be364d6f8b277b35510555211374b5.tar.bz2 binaryen-d1bea49163be364d6f8b277b35510555211374b5.zip |
TrapsNeverHappen mode (#4059)
The goal of this mode is to remove obviously-unneeded code like
(drop
(i32.load
(local.get $x)))
In general we can't remove it, as the load might trap - we'd be removing
a side effect. This is fairly rare in general, but actually becomes quite
annoying with wasm GC code where such patterns are more common,
and we really need to remove them.
Historically the IgnoreImplicitTraps option was meant to help here. However,
in practice it did not quite work well enough for most production code, as
mentioned e.g. in #3934 . TrapsNeverHappen mode is an attempt to fix that,
based on feedback from @askeksa in that issue, and also I believe this
implements an idea that @fitzgen mentioned a while ago (sorry, I can't
remember where exactly...). So I'm hopeful this will be generally useful
and not just for GC.
The idea in TrapsNeverHappen mode is that traps are assumed to not
actually happen at runtime. That is, if there is a trap in the code, it will
not be reached, or if it is reached then it will not trap. For example, an
(unreachable) would be assumed to never be reached, which means
that the optimizer can remove it and any code that executes right before
it:
(if
(..condition..)
(block
(..code that can be removed, if it does not branch out..)
(..code that can be removed, if it does not branch out..)
(..code that can be removed, if it does not branch out..)
(unreachable)))
And something like a load from memory is assumed to not trap, etc.,
which in particular would let us remove that dropped load from earlier.
This mode should be usable in production builds with assertions
disabled, if traps are seen as failing assertions. That might not be true
of all release builds (maybe some use traps for other purposes), but
hopefully in some. That is, if traps are like assertions, then enabling
this new mode would be like disabling assertions in release builds
and living with the fact that if an assertion would have been hit then
that is "undefined behavior" and the optimizer might have removed
the trap or done something weird.
TrapsNeverHappen (TNH) is different from IgnoreImplicitTraps (IIT).
The old IIT mode would just ignore traps when computing effects.
That is a simple model, but a problem happens with a trap behind
a condition, like this:
if (x != 0) foo(1 / x);
We won't trap on integer division by zero here only because of the
guarding if. In IIT, we'd compute no side effects on 1 / x, and then
we might end up moving it around, depending on other code in
the area, and potentially out of the if - which would make it happen
unconditionally, which would break.
TNH avoids that problem because it does not simply ignore traps.
Instead, there is a new hasUnremovableSideEffects() method
that must be opted-in by passes. That checks if there are no side
effects, or if there are, if we can remove them - and we know we can
remove a trap if we are running under TrapsNeverHappen mode,
as the trap won't happen by assumption. A pass must only use that
method where it is safe, that is, where it would either remove the
side effect (in which case, no problem), or if not, that it at least does
not move it around (avoiding the above problem with IIT).
This PR does not implement all optimizations possible with
TNH, just a small initial set of things to get started. It is already
useful on wasm GC code, including being as good as IIT on removing
unnecessary casts in some cases, see the test suite updates here.
Also, a significant part of the 18% speedup measured in
#4052 (comment)
is due to my testing with this enabled, as otherwise the devirtualization
there leaves a lot of unneeded code.
Diffstat (limited to 'src')
-rw-r--r-- | src/ir/effects.h | 33 | ||||
-rw-r--r-- | src/pass.h | 30 | ||||
-rw-r--r-- | src/passes/OptimizeInstructions.cpp | 26 | ||||
-rw-r--r-- | src/passes/Vacuum.cpp | 21 | ||||
-rw-r--r-- | src/tools/optimization-options.h | 8 |
5 files changed, 105 insertions, 13 deletions
diff --git a/src/ir/effects.h b/src/ir/effects.h index 2de5c54f6..f5bb0e6cb 100644 --- a/src/ir/effects.h +++ b/src/ir/effects.h @@ -30,6 +30,7 @@ public: FeatureSet features, Expression* ast = nullptr) : ignoreImplicitTraps(passOptions.ignoreImplicitTraps), + trapsNeverHappen(passOptions.trapsNeverHappen), debugInfo(passOptions.debugInfo), features(features) { if (ast) { walk(ast); @@ -37,6 +38,7 @@ public: } bool ignoreImplicitTraps; + bool trapsNeverHappen; bool debugInfo; FeatureSet features; @@ -143,10 +145,37 @@ public: return globalsRead.size() || readsMemory || readsHeap || isAtomic || calls; } - bool hasSideEffects() const { + bool hasNonTrapSideEffects() const { return localsWritten.size() > 0 || danglingPop || writesGlobalState() || - trap || throws || transfersControlFlow(); + throws || transfersControlFlow(); } + + bool hasSideEffects() const { return trap || hasNonTrapSideEffects(); } + + // Check if there are side effects, and they are of a kind that cannot be + // removed by optimization passes. + // + // The difference between this and hasSideEffects is subtle, and only related + // to trapsNeverHappen - if trapsNeverHappen then any trap we see is removable + // by optimizations. In general, you should call hasSideEffects, and only call + // this method if you are certain that it is a place that would not perform an + // unsafe transformation with a trap. Specifically, if a pass calls this + // and gets the result that there are no unremovable side effects, then it + // must either + // + // 1. Remove any side effects present, if any, so they no longer exists. + // 2. Keep the code exactly where it is. + // + // If instead of 1&2 a pass kept the side effect and also reordered the code + // with other things, then that could be bad, as the side effect might have + // been behind a condition that avoids it occurring. + // + // TODO: Go through the optimizer and use this in all places that do not move + // code around. + bool hasUnremovableSideEffects() const { + return hasNonTrapSideEffects() || (trap && !trapsNeverHappen); + } + bool hasAnything() const { return hasSideEffects() || accessesLocal() || readsMemory || accessesGlobal(); diff --git a/src/pass.h b/src/pass.h index 0be45e2d1..08237b271 100644 --- a/src/pass.h +++ b/src/pass.h @@ -99,6 +99,36 @@ struct PassOptions { InliningOptions inlining; // Optimize assuming things like div by 0, bad load/store, will not trap. bool ignoreImplicitTraps = false; + // Optimize assuming a trap will never happen at runtime. This is similar to + // ignoreImplicitTraps, but different: + // + // * ignoreImplicitTraps simply ignores the side effect of trapping when it + // computes side effects, and then passes work with that data. + // * trapsNeverHappen assumes that if an instruction with a possible trap is + // reached, then it does not trap, and an (unreachable) - that always + // traps - is never reached. + // + // The main difference is that in trapsNeverHappen mode we will not move + // around code that might trap, like this: + // + // (if (condition) (code)) + // + // If (code) might trap, ignoreImplicitTraps ignores that trap, and it might + // end up moving (code) to happen before the (condition), that is, + // unconditionally. trapsNeverHappen, on the other hand, does not ignore the + // side effect of the trap; instead, it will potentially remove the trapping + // instruction, if it can - it is always safe to remove a trap in this mode, + // as the traps are assumed to not happen. Where it cannot remove the side + // effect, it will at least not move code around. + // + // A consequence of this difference is that code that puts a possible trap + // behind a condition is unsafe in ignoreImplicitTraps, but safe in + // trapsNeverHappen. In general, trapsNeverHappen is safe on production code + // where traps are either fatal errors or assertions, and it is assumed + // neither of those can happen (and it is undefined behavior if they do). + // + // TODO: deprecate and remove ignoreImplicitTraps. + bool trapsNeverHappen = false; // Optimize assuming that the low 1K of memory is not valid memory for the // application to use. In that case, we can optimize load/store offsets in // many cases. diff --git a/src/passes/OptimizeInstructions.cpp b/src/passes/OptimizeInstructions.cpp index 43d0d912f..de7400ae0 100644 --- a/src/passes/OptimizeInstructions.cpp +++ b/src/passes/OptimizeInstructions.cpp @@ -1201,7 +1201,7 @@ struct OptimizeInstructions void visitRefEq(RefEq* curr) { // Identical references compare equal. - if (areConsecutiveInputsEqual(curr->left, curr->right)) { + if (areConsecutiveInputsEqualAndRemovable(curr->left, curr->right)) { replaceCurrent( Builder(*getModule()).makeConst(Literal::makeOne(Type::i32))); return; @@ -1270,7 +1270,7 @@ struct OptimizeInstructions auto passOptions = getPassOptions(); - if (passOptions.ignoreImplicitTraps) { + if (passOptions.ignoreImplicitTraps || passOptions.trapsNeverHappen) { // A ref.cast traps when the RTTs do not line up, which can be of one of // two sorts of issues: // 1. The value being cast is not even a subtype of the cast type. In @@ -1281,7 +1281,7 @@ struct OptimizeInstructions // fit. That indicates a difference between RTT subtyping and static // subtyping. That is, the type may be right but the chain of rtt.subs // is not. - // If we ignore implicit traps then we would like to assume that neither + // If we ignore a possible trap then we would like to assume that neither // of those two situations can happen. However, we still cannot do // anything if point 1 is a problem, that is, if the value is not a // subtype of the cast type, as we can't remove the cast in that case - @@ -1466,15 +1466,23 @@ private: // problem here (and which is the case we care about in this pass, which does // simple peephole optimizations - all we care about is a single instruction // at a time, and its inputs). - bool areConsecutiveInputsEqual(Expression* left, Expression* right) { + // + // This also checks that the inputs are removable. + bool areConsecutiveInputsEqualAndRemovable(Expression* left, + Expression* right) { // First, check for side effects. If there are any, then we can't even - // assume things like local.get's of the same index being identical. + // assume things like local.get's of the same index being identical. (It is + // also ok to have side effects here, if we can remove them, as we are also + // checking if we can remove the two inputs anyhow.) auto passOptions = getPassOptions(); auto features = getModule()->features; - if (EffectAnalyzer(passOptions, features, left).hasSideEffects() || - EffectAnalyzer(passOptions, features, right).hasSideEffects()) { + if (EffectAnalyzer(passOptions, features, left) + .hasUnremovableSideEffects() || + EffectAnalyzer(passOptions, features, right) + .hasUnremovableSideEffects()) { return false; } + // Ignore extraneous things and compare them structurally. left = Properties::getFallthrough(left, passOptions, features); right = Properties::getFallthrough(right, passOptions, features); @@ -2717,7 +2725,7 @@ private: Expression* optimizeMemoryCopy(MemoryCopy* memCopy) { PassOptions options = getPassOptions(); - if (options.ignoreImplicitTraps) { + if (options.ignoreImplicitTraps || options.trapsNeverHappen) { if (ExpressionAnalyzer::equal(memCopy->dest, memCopy->source)) { // memory.copy(x, x, sz) ==> {drop(x), drop(x), drop(sz)} Builder builder(*getModule()); @@ -2734,7 +2742,7 @@ private: switch (bytes) { case 0: { - if (options.ignoreImplicitTraps) { + if (options.ignoreImplicitTraps || options.trapsNeverHappen) { // memory.copy(dst, src, 0) ==> {drop(dst), drop(src)} return builder.makeBlock({builder.makeDrop(memCopy->dest), builder.makeDrop(memCopy->source)}); diff --git a/src/passes/Vacuum.cpp b/src/passes/Vacuum.cpp index 25b0116b3..14a464297 100644 --- a/src/passes/Vacuum.cpp +++ b/src/passes/Vacuum.cpp @@ -253,9 +253,8 @@ struct Vacuum : public WalkerPass<ExpressionStackWalker<Vacuum>> { } } } else { - // no else + // This is an if without an else. If the body is empty, we do not need it. if (curr->ifTrue->is<Nop>()) { - // no nothing replaceCurrent(Builder(*getModule()).makeDrop(curr->condition)); } } @@ -281,6 +280,24 @@ struct Vacuum : public WalkerPass<ExpressionStackWalker<Vacuum>> { replaceCurrent(set); return; } + + // If the value has no side effects, or it has side effects we can remove, + // do so. This basically means that if noTrapsHappen is set then we can + // use that assumption (that no trap actually happens at runtime) and remove + // a trapping value. + // + // TODO: A complete CFG analysis for noTrapsHappen mode, removing all code + // that definitely reaches a trap, *even if* it has side effects. + // + // Note that we check the type here to avoid removing unreachable code - we + // leave that for DCE. + if (curr->type == Type::none && + !EffectAnalyzer(getPassOptions(), getModule()->features, curr) + .hasUnremovableSideEffects()) { + ExpressionManipulator::nop(curr); + return; + } + // if we are dropping a block's return value, we might be able to remove it // entirely if (auto* block = curr->value->dynCast<Block>()) { diff --git a/src/tools/optimization-options.h b/src/tools/optimization-options.h index b7db5d531..58a8e3980 100644 --- a/src/tools/optimization-options.h +++ b/src/tools/optimization-options.h @@ -179,6 +179,14 @@ struct OptimizationOptions : public ToolOptions { [this](Options*, const std::string&) { passOptions.ignoreImplicitTraps = true; }) + .add("--traps-never-happen", + "-tnh", + "Optimize under the helpful assumption that no trap is reached at " + "runtime (from load, div/mod, etc.)", + Options::Arguments::Zero, + [this](Options*, const std::string&) { + passOptions.trapsNeverHappen = true; + }) .add("--low-memory-unused", "-lmu", "Optimize under the helpful assumption that the low 1K of memory is " |