author | Alon Zakai <azakai@google.com> | 2021-11-09 17:01:47 -0800 |
---|---|---|
committer | GitHub <noreply@github.com> | 2021-11-10 01:01:47 +0000 |
commit | c30152666af8db5ef8634165a5a3ff192d3a6c98 | |
tree | 07330c059c5861480db31e396c188879b6800beb /test/wasm2js/unary-ops.2asm.js.opt | |
parent | b260f1cc65096a7784da8ef8ad25a067e0480e5b | |
CoalesceLocals: Rewrite the algorithm to be linear and to ignore copies (#4314)
The old algorithm can be summarized as: In each basic block, start at the beginning.
Each pair of live locals there might interfere with each other, as they might arrive from
different entry blocks with different values. Afterwards, go through the block and find
overlapping live ranges, and mark interferences there as well.
This is non-linear because at the start of the block we do a double-loop over all
pairs of live locals, which in general can be O(N^2), where N is the number of locals. It also
has the downside of ignoring copies: if two locals have overlapping live ranges but
they must have identical values on those ranges, they do not actually interfere,
for example:
  x = 10;
  y = x;
  .. // live ranges overlap here
  foo(x, y); // live ranges end here.
We can ignore this overlap since the copy shows they are identical there, but the
pass did not take this into account. To some extent other passes can remove such
copies (SimplifyLocals, MergeLocals, RedundantSetElimination), but in general
this was a weak spot for the optimizer.
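To make the weak spot concrete, here is a minimal sketch of an overlap-only
check on a single block. It is not the code the old pass used: the action
format (`("set", local)` / `("get", local)`) and the function name
`overlap_interferences` are made up for illustration, and `y = x` is modeled
as a get of x followed by a set of y.

```python
def overlap_interferences(actions):
    """Mark two locals as interfering whenever their live ranges overlap.

    `actions` is a hypothetical straight-line block, in execution order,
    made of ("set", local) and ("get", local) tuples.
    """
    live = set()
    interferences = set()
    # Walk backward: a get begins a live range (seen from the end),
    # and a set ends it.
    for kind, local in reversed(actions):
        if kind == "set":
            live.discard(local)
        elif local not in live:
            # A new live range starts here; it overlaps everything live.
            for other in live:
                interferences.add(frozenset((local, other)))
            live.add(local)
    return interferences


# x = 10; y = x; foo(x, y);
copy_example = [
    ("set", "x"),
    ("get", "x"), ("set", "y"),  # y = x
    ("get", "x"), ("get", "y"),  # foo(x, y)
]
print(overlap_interferences(copy_example))
```

The check reports x and y as interfering, even though the copy guarantees
they hold the same value wherever both are live.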
I realized there is a solution to both these problems: In Wasm, given that we have
a default value for all locals, if a local is live at the start of a block then it must be
live at the end of all the blocks reaching it. That is so because the liveness will
extend backwards all the way to some set of the local, possibly all the way to
the zero-initialization at the start of the function, and it extends that way through
all predecessor blocks. A consequence of this is that there are no interferences
between locals that only occur during a merge: The live ranges include the
predecessor blocks, and theirs, and so forth, until we reach a block where one
of the locals is assigned a value different than the other. That is a necessary and
sufficient condition for interference, and therefore when processing a block we
only need to look at its contents, and can ignore the merging of control flow,
which allows us to be linear.
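The liveness property can be checked on a small example. The following is a
sketch of a standard backward liveness fixpoint, assuming a hypothetical CFG
encoding (`blocks` maps a block name to its actions, `successors` maps it to
its successor names); none of these names come from the Binaryen source.

```python
def block_liveness(blocks, successors):
    """Compute the sets of locals live at the start and end of each block."""
    live_in = {name: set() for name in blocks}
    live_out = {name: set() for name in blocks}
    changed = True
    while changed:
        changed = False
        for name, actions in blocks.items():
            # Live at the end: anything live at the start of a successor.
            out = set()
            for succ in successors.get(name, ()):
                out |= live_in[succ]
            # Walk the block backward to get what is live at its start.
            live = set(out)
            for kind, local in reversed(actions):
                if kind == "get":
                    live.add(local)
                else:
                    live.discard(local)
            if out != live_out[name] or live != live_in[name]:
                live_out[name], live_in[name] = out, live
                changed = True
    return live_in, live_out


# A diamond: only the left arm sets y, and the merge block reads x and y.
blocks = {
    "entry": [("set", "x")],
    "left": [("set", "y")],
    "right": [],
    "merge": [("get", "x"), ("get", "y")],
}
successors = {"entry": ["left", "right"], "left": ["merge"], "right": ["merge"]}
live_in, live_out = block_liveness(blocks, successors)
```

Here y is live at the start of "merge", so it is live at the end of both
"left" and "right"; through "right" its liveness extends all the way back to
the function entry, where Wasm provides the zero-initialization.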
More details on this and on the new algorithm in comments in the source, but
the basic idea is that it simply goes through each block in a linear way, finding
which values are assigned to each local (using a numbering of unique values),
and noting which are live at each time. If two locals are live and one is assigned
a value that is not the same as the value in the other, mark them as interfering.
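As a rough illustration of that idea (the real implementation is in the
source and differs in detail), here is a simplified single-block sketch. The
action format, the `("const", c)` / `("copy", local)` / `("other",)` value
tags, and the name `find_interferences` are assumptions made for this
example. Locals live at the start of the block are conservatively given
unique values.

```python
from itertools import count


def find_interferences(actions, live_at_end=()):
    """Value-based interference for one block (illustrative sketch).

    `actions` holds ("get", local) and ("set", local, value) tuples, where
    value is ("const", c), ("copy", other_local), or ("other",).
    """
    # Backward pass: record which locals are live right after each action.
    live = set(live_at_end)
    live_after = [None] * len(actions)
    for i in range(len(actions) - 1, -1, -1):
        live_after[i] = frozenset(live)
        action = actions[i]
        if action[0] == "get":
            live.add(action[1])
        else:
            _, local, value = action
            live.discard(local)
            if value[0] == "copy":
                live.add(value[1])  # the copy reads its source

    # Forward pass: number the values and compare them at each set.
    fresh = count()
    const_numbers = {}
    values = {local: next(fresh) for local in sorted(live)}
    interferences = set()
    for i, action in enumerate(actions):
        if action[0] != "set":
            continue
        _, local, value = action
        if value[0] == "const":
            if value[1] not in const_numbers:
                const_numbers[value[1]] = next(fresh)
            number = const_numbers[value[1]]
        elif value[0] == "copy":
            number = values[value[1]]  # a copy keeps the same value number
        else:
            number = next(fresh)  # unknown value: assume it is unique
        values[local] = number
        if local not in live_after[i]:
            continue  # the set is never read, so it cannot interfere
        for other in live_after[i]:
            if other != local and values[other] != number:
                interferences.add(frozenset((local, other)))
    return interferences


# x = 10; y = x; foo(x, y);
copied = [("set", "x", ("const", 10)), ("set", "y", ("copy", "x")),
          ("get", "x"), ("get", "y")]
# x = 10; y = 20; foo(x, y);
distinct = [("set", "x", ("const", 10)), ("set", "y", ("const", 20)),
            ("get", "x"), ("get", "y")]
```

On `copied` no interference is found, so x and y could share one local; on
`distinct` the set of y sees that the live x holds a different value, and the
pair is marked as interfering.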
This is of substantial benefit to j2wasm output, I believe because it is common
there to find local subexpression elimination opportunities after inlining, and
each time we find one we add a local. If we inline different functions into the
same target, we may end up with copied locals for each of them. (This was
not noticed in the past because it is very rare on LLVM output, which has
already had inlining and GVN etc. done.)
There is a small benefit to LLVM output as well, though just a few
percent at best. However, it is enough to be noticeable on some of
the code size tests.
This is also faster than the previous pass. It's normally not noticeable
as this pass is not one of the slowest anyhow, but I found some real-world
codebases where the pass becomes 50% faster. I have not found any
case where it is slower than the old algorithm.
Fuzzed over several days to be sure this is correct, and also verified
on the emscripten test suite.
Diffstat (limited to 'test/wasm2js/unary-ops.2asm.js.opt')
-rw-r--r-- | test/wasm2js/unary-ops.2asm.js.opt | 28 |
1 files changed, 12 insertions, 16 deletions
diff --git a/test/wasm2js/unary-ops.2asm.js.opt b/test/wasm2js/unary-ops.2asm.js.opt
index 86eee0e0a..7bb08bde4 100644
--- a/test/wasm2js/unary-ops.2asm.js.opt
+++ b/test/wasm2js/unary-ops.2asm.js.opt
@@ -16,9 +16,8 @@ function asmFunc(env) {
  var i64toi32_i32$HIGH_BITS = 0;
  function $1($0) {
   $0 = $0 | 0;
-  var $1_1 = 0, $2 = 0;
+  var $1_1 = 0;
   while (1) {
-   $2 = $1_1;
    if ($0) {
     $0 = $0 - 1 & $0;
     $1_1 = $1_1 + 1 | 0;
@@ -26,7 +25,7 @@ function asmFunc(env) {
    }
    break;
   };
-  return $2 | 0;
+  return $1_1 | 0;
  }
 
  function $6($0) {
@@ -45,24 +44,21 @@ function asmFunc(env) {
  }
 
  function legalstub$2($0, $1_1, $2, $3) {
-  var $4 = 0, $5 = 0, $6_1 = 0, $7_1 = 0;
-  $5 = $0;
-  $4 = $1_1;
+  var $4 = 0, $5 = 0, $6_1 = 0;
+  $4 = $0;
   while (1) {
-   $1_1 = $6_1;
-   $0 = $7_1;
-   if ($5 | $4) {
-    $0 = $5;
-    $5 = $0 - 1 & $0;
-    $4 = $4 - !$0 & $4;
-    $6_1 = $6_1 + 1 | 0;
-    $7_1 = $6_1 ? $7_1 : $7_1 + 1 | 0;
+   if ($1_1 | $4) {
+    $0 = $4;
+    $4 = $4 - 1 & $4;
+    $1_1 = $1_1 - !$0 & $1_1;
+    $5 = $5 + 1 | 0;
+    $6_1 = $5 ? $6_1 : $6_1 + 1 | 0;
     continue;
    }
    break;
   };
-  i64toi32_i32$HIGH_BITS = $0;
-  return ($1_1 | 0) == ($2 | 0) & ($3 | 0) == (i64toi32_i32$HIGH_BITS | 0);
+  i64toi32_i32$HIGH_BITS = $6_1;
+  return ($2 | 0) == ($5 | 0) & ($3 | 0) == (i64toi32_i32$HIGH_BITS | 0);
  }
 
  function legalstub$3($0, $1_1, $2) {