author | Alon Zakai <alonzakai@gmail.com> | 2019-03-01 10:28:07 -0800 |
---|---|---|
committer | GitHub <noreply@github.com> | 2019-03-01 10:28:07 -0800 |
commit | 689fe405a3417fbfd59456035add6f6f53149f35 (patch) | |
tree | d6f1dcaf0cbb85eb3ae830f68a46c9a6627d1562 /test/passes/safe-heap_low-memory-unused.wast | |
parent | f59c3033e678ced61bc8c78e8ac9fbee31ef0210 (diff) | |
Consistently optimize small added constants into load/store offsets (#1924)
See #1919 - we did not do this consistently before.
This adds a lowMemoryUnused option to PassOptions. It can be passed on the command line with --low-memory-unused. If enabled, we run the new optimize-added-constants pass, which does the real work here, replacing older code in post-emscripten.
Aside from running at the proper time (unlike the old pass, see #1919), this also has a -propagate mode, which can perform transformations like this:
y = x + 10
[..]
load(y)
[..]
load(y)
=>
y = x + 10
[..]
load(x, offset=10)
[..]
load(x, offset=10)
That is, it can propagate such offsets to the loads/stores. This pattern is common in big interpreter loops, where the pointers are offsets into a big struct of state.
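The rewrite above can be modeled with a toy rewriter. This is a minimal sketch, not Binaryen's implementation: the tuple-based instruction format, the name `propagate_offsets`, and the `LOW_MEMORY_LIMIT` of 1024 (inferred from the 0-1023 range this commit mentions for safe-heap) are assumptions made for illustration. It also assumes each local is assigned exactly once and that the base local is not reassigned in between.

```python
# Hypothetical bound: constants below this are treated as "small".
LOW_MEMORY_LIMIT = 1024


def propagate_offsets(instrs):
    """Fold `y = x + c` into the offsets of later loads of `y`.

    Toy instruction format (an assumption, not Binaryen IR):
      ("add", dest, src, const)   means dest = src + const
      ("load", ptr, offset)       means load(ptr, offset=offset)
    """
    defs = {}  # dest local -> (base local, added constant)
    out = []
    for ins in instrs:
        if ins[0] == "add":
            _, dest, src, const = ins
            defs[dest] = (src, const)
            out.append(ins)  # the add itself is kept, as in the example
        elif ins[0] == "load" and ins[1] in defs:
            src, const = defs[ins[1]]
            total = ins[2] + const
            if 0 <= const and total < LOW_MEMORY_LIMIT:
                out.append(("load", src, total))
            else:
                out.append(ins)  # constant too large: leave the load alone
        else:
            out.append(ins)
    return out
```

Running it on `[("add", "y", "x", 10), ("load", "y", 0), ("load", "y", 0)]` leaves the add in place and turns both loads into `("load", "x", 10)`, mirroring the before/after shown above.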
The pass does this propagation by using a new feature of LocalGraph, which can determine which locals are in SSA form. Binaryen IR is not SSA (intentionally, since it's a later IR), but if a local has only a single set for all of its gets, that local is effectively in SSA form and can be optimized. The tricky thing is that all locals are initialized to zero, so there are at minimum two sets. But if we verify that the real set dominates all the gets, then the zero initialization cannot reach them, and we are safe.
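The dominance condition can be illustrated with a small reachability check. This is a sketch under stated assumptions, not how LocalGraph is implemented: the control-flow graph encoding and the name `is_ssa_local` are hypothetical. It requires exactly one explicit set, then floods forward from the entry while the implicit zero is still the live value; if that flood reaches a get, the zero initialization can reach it and the local is rejected.

```python
def is_ssa_local(cfg, entry, local):
    """Return True if `local` has one explicit set that reaches every get.

    cfg (hypothetical encoding): block name -> (ops, successors), where ops
    is an ordered list of ("set" | "get", local_name) pairs.
    """
    explicit_sets = sum(
        1
        for ops, _ in cfg.values()
        for kind, name in ops
        if kind == "set" and name == local
    )
    if explicit_sets != 1:
        return False

    # Walk every path on which the implicit zero value is still live.
    seen = set()
    work = [entry]
    while work:
        block = work.pop()
        if block in seen:
            continue
        seen.add(block)
        ops, successors = cfg[block]
        zero_killed = False
        for kind, name in ops:
            if name != local:
                continue
            if kind == "get":
                return False  # the zero initialization reaches this get
            zero_killed = True  # the explicit set overwrites the zero
            break
        if not zero_killed:
            work.extend(successors)
    return True
```

A local set in the entry block and read in both branches passes the check. A local set in only one arm of a branch and read after the merge fails, because the zero value can flow through the other arm.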
This PR also makes safe-heap aware of lowMemoryUnused. When it is enabled, we check for an access not just at address 0, but anywhere in the range 0-1023.
This makes zlib 5% faster, with either the wasm backend or asm2wasm. It also makes it 0.5% smaller. It also helps sqlite (1.5% faster) and lua (1% faster).
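The widened low-address check can be sketched as a predicate. This toy models only the low-address part described here. It ignores any other checks safe-heap performs, and the names `is_low_address_fault` and `LOW_MEMORY_LIMIT` are hypothetical.

```python
# Hypothetical constant: one past the top of the 0-1023 range.
LOW_MEMORY_LIMIT = 1024


def is_low_address_fault(address, low_memory_unused):
    """Return True if an access at `address` should be reported as a fault."""
    if low_memory_unused:
        # Nothing valid lives in low memory, so any access there is a bug.
        return 0 <= address < LOW_MEMORY_LIMIT
    # Without the option, only a null access is flagged.
    return address == 0
```

Under this model the small constant addresses used in the test file (1 through 14) are flagged only when the option is enabled.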
Diffstat (limited to 'test/passes/safe-heap_low-memory-unused.wast')
-rw-r--r-- | test/passes/safe-heap_low-memory-unused.wast | 56 |
1 files changed, 56 insertions, 0 deletions
diff --git a/test/passes/safe-heap_low-memory-unused.wast b/test/passes/safe-heap_low-memory-unused.wast
new file mode 100644
index 000000000..a2754b469
--- /dev/null
+++ b/test/passes/safe-heap_low-memory-unused.wast
@@ -0,0 +1,56 @@
+(module
+  (memory (shared 100 100))
+  (func $loads
+    (drop (i32.load (i32.const 1)))
+    (drop (i32.atomic.load (i32.const 1)))
+    (drop (i32.load offset=31 (i32.const 2)))
+    (drop (i32.load align=2 (i32.const 3)))
+    (drop (i32.load align=1 (i32.const 4)))
+    (drop (i32.load8_s (i32.const 5)))
+    (drop (i32.load16_u (i32.const 6)))
+    (drop (i64.load8_s (i32.const 7)))
+    (drop (i64.load16_u (i32.const 8)))
+    (drop (i64.load32_s (i32.const 9)))
+    (drop (i64.load align=4 (i32.const 10)))
+    (drop (i64.load (i32.const 11)))
+    (drop (f32.load (i32.const 12)))
+    (drop (f64.load (i32.const 13)))
+    (drop (v128.load (i32.const 14)))
+  )
+  (func $stores
+    (i32.store (i32.const 1) (i32.const 100))
+    (i32.atomic.store (i32.const 1) (i32.const 100))
+    (i32.store offset=31 (i32.const 2) (i32.const 200))
+    (i32.store align=2 (i32.const 3) (i32.const 300))
+    (i32.store align=1 (i32.const 4) (i32.const 400))
+    (i32.store8 (i32.const 5) (i32.const 500))
+    (i32.store16 (i32.const 6) (i32.const 600))
+    (i64.store8 (i32.const 7) (i64.const 700))
+    (i64.store16 (i32.const 8) (i64.const 800))
+    (i64.store32 (i32.const 9) (i64.const 900))
+    (i64.store align=4 (i32.const 10) (i64.const 1000))
+    (i64.store (i32.const 11) (i64.const 1100))
+    (f32.store (i32.const 12) (f32.const 1200))
+    (f64.store (i32.const 13) (f64.const 1300))
+    (v128.store (i32.const 14) (v128.const i32 1 2 3 4))
+  )
+)
+;; not shared
+(module
+  (memory 100 100)
+  (func $loads
+    (drop (i32.load (i32.const 1)))
+  )
+)
+;; pre-existing
+(module
+  (type $FUNCSIG$v (func))
+  (import "env" "DYNAMICTOP_PTR" (global $DYNAMICTOP_PTR i32))
+  (import "env" "segfault" (func $segfault))
+  (import "env" "alignfault" (func $alignfault))
+  (memory $0 (shared 100 100))
+  (func $actions
+    (drop (i32.load (i32.const 1)))
+    (i32.store (i32.const 1) (i32.const 100))
+  )
+)