Field | Value | Date
---|---|---
author | Alon Zakai <azakai@google.com> | 2021-05-12 07:43:35 -0700
committer | GitHub <noreply@github.com> | 2021-05-12 07:43:35 -0700
commit | bfd01369a6dbb4629e88d227f085f959549e3dd5 |
tree | b8cc90e26721f0338646a31c9956e41cf2fed2d8 /src/wasm-interpreter.h |
parent | 4cfbb5d90bd253c066d92affa685dbab5d824699 |
Heap2Local: Use escape analysis to turn heap allocations into local data (#3866)
If we allocate some GC data, and do not let the reference escape, then we can
replace the allocation with locals, one local for each field in the allocation
basically. This avoids the allocation, and also allows us to optimize the locals
further.
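To make the "does not escape" condition concrete, here is a minimal C++ sketch of the idea. It is a toy model under stated assumptions, not the pass's actual implementation: the `Use` enum, `canReplaceWithLocals`, and `checkEscapeSketch` are hypothetical names invented for illustration, and the real analysis works on Binaryen IR rather than a flat list of uses.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified classification of how a freshly allocated
// reference is used inside one function. Not Binaryen's data structures.
enum class Use {
  StructGet,    // read a field through the reference
  StructSet,    // write a field through the reference
  PassedToCall, // handed to another function: the reference escapes
  Returned,     // returned to the caller: the reference escapes
  StoredToHeap, // written into another heap object: the reference escapes
};

// In this toy model, the allocation can become plain locals only if every
// use is a field read or write on the reference itself.
bool canReplaceWithLocals(const std::vector<Use>& uses) {
  for (Use use : uses) {
    if (use != Use::StructGet && use != Use::StructSet) {
      return false;
    }
  }
  return true;
}

// Small self-check exercising both outcomes.
void checkEscapeSketch() {
  assert(canReplaceWithLocals({Use::StructGet, Use::StructSet}));
  assert(!canReplaceWithLocals({Use::StructGet, Use::PassedToCall}));
}
```

When the check succeeds, each field of the allocation can be given its own local, and every field read or write becomes a read or write of that local.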
On the Dart DeltaBlue benchmark, this is a 24% speedup (making it faster than
the JS version, incidentally), and also a 6% reduction in code size.
The tests are not the best way to show what this does, as the pass assumes
other passes will clean up after it. Here is an example to clarify. First, in pseudocode:
ref = new Int(42)
do {
  ref.set(ref.get() + 1)
} while (import(ref.get()))
That is, we allocate an int on the heap and use it as a counter. Unnecessarily,
as it could be a normal int on the stack.
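As a rough analogy, the same before and after can be written in C++. This is only an illustrative sketch: `importStandIn` is a hypothetical stand-in for the imported function (here it keeps looping while the value is below 50), and `countBoxed`, `countLocal`, and `checkCounterSketch` are names made up for this example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the imported function in the pseudocode:
// keep looping while the counter is below 50.
bool importStandIn(int value) { return value < 50; }

// Before: the counter lives in a heap allocation, as in the pseudocode.
int countBoxed() {
  auto ref = std::make_unique<int>(42);
  do {
    *ref = *ref + 1;
  } while (importStandIn(*ref));
  return *ref;
}

// After: the reference never leaves the function, so its single field
// can be held in an ordinary local variable instead.
int countLocal() {
  int value = 42;
  do {
    value = value + 1;
  } while (importStandIn(value));
  return value;
}

// Both versions must compute the same result.
void checkCounterSketch() {
  assert(countBoxed() == countLocal());
}
```

Only the integer value is ever passed to the stand-in, never the reference, which is what makes the second form a valid replacement for the first.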
Wat:
(module
  ;; A boxed integer: an entire struct just to hold an int.
  (type $boxed-int (struct (field (mut i32))))
  (import "env" "import" (func $import (param i32) (result i32)))
  (func "example"
    (local $ref (ref null $boxed-int))
    ;; Allocate a boxed integer of 42 and save the reference to it.
    (local.set $ref
      (struct.new_with_rtt $boxed-int
        (i32.const 42)
        (rtt.canon $boxed-int)
      )
    )
    ;; Increment the integer in a loop, looking for some condition.
    (loop $loop
      (struct.set $boxed-int 0
        (local.get $ref)
        (i32.add
          (struct.get $boxed-int 0
            (local.get $ref)
          )
          (i32.const 1)
        )
      )
      (br_if $loop
        (call $import
          (struct.get $boxed-int 0
            (local.get $ref)
          )
        )
      )
    )
  )
)
Before this pass, the optimizer could do essentially nothing with this.
Even with this pass, running -O1 has no effect, as the pass is only
used in -O2+. However, running --heap2local -O1 leads to this:
(func $0
  (local $0 i32)
  (local.set $0
    (i32.const 42)
  )
  (loop $loop
    (br_if $loop
      (call $import
        (local.tee $0
          (i32.add
            (local.get $0)
            (i32.const 1)
          )
        )
      )
    )
  )
)
All the GC heap operations have been removed, and we just
have a plain int now, allowing a bunch of other opts to run. That
output is basically the optimal code, I think.
Diffstat (limited to 'src/wasm-interpreter.h')
-rw-r--r-- | src/wasm-interpreter.h | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/src/wasm-interpreter.h b/src/wasm-interpreter.h
index 4cb26d74a..b38272acd 100644
--- a/src/wasm-interpreter.h
+++ b/src/wasm-interpreter.h
@@ -1443,7 +1443,7 @@ public:
       // We must have a module in order to perform the cast, to get the type. If
       // we do not have one, or if the function is not present (which may happen
       // if we are optimizing a function before the entire module is built),
-      // then this is not something we cannot precompute.
+      // then this is something we cannot precompute.
       auto* func = module
                      ? module->getFunctionOrNull(cast.originalRef.getFunc())
                      : nullptr;