summaryrefslogtreecommitdiff
path: root/test
diff options
context:
space:
mode:
authorAlon Zakai <azakai@google.com>2024-11-11 14:41:56 -0800
committerGitHub <noreply@github.com>2024-11-11 14:41:56 -0800
commitf6cfc5647259c12036f58f0a3b6b3cfc83e4b914 (patch)
treefb13b2c529722fdda9910d33a307df80053fd01f /test
parent9e11c5fd12a7c6392ac62979cc16c62a368f6e33 (diff)
downloadbinaryen-f6cfc5647259c12036f58f0a3b6b3cfc83e4b914.tar.gz
binaryen-f6cfc5647259c12036f58f0a3b6b3cfc83e4b914.tar.bz2
binaryen-f6cfc5647259c12036f58f0a3b6b3cfc83e4b914.zip
Fix PickLoadSigns on SignExt feature instructions (#7069)
I believe the history here is that 1. We added a PickLoadSigns pass. It checks if a load from memory is stored in a local that is only every used in a signed or an unsigned manner. If it is, we can adjust the sign of the load (load8_u/s) to do the sign/unsign during the load. 2. The pass finds each LocalGet and looks either 2 or 3 parents above it. For a sign operation, we need to look up 3, since the operation is x << K >> K. For an unsigned, we need only 2, since we have x & M. We hardcoded those numbers 2 and 3. 3. We added the SignExt feature, which adds i32.extend8_s. This does a sign extend with a single instruction, not two nested ones, so now we can sign- extend at depth 2, unlike before. Properties::getSignExtValue was updated for this, but not the pass PickLoadSigns. The bug that is fixed here is that we looked at depth 3 for a sign-extend, and we blindly accepted it if we found one. So we ended up accepting (i32.extend8_s (ANYTHING (x))), which is a sign-extend of something, but not of x, which is bad. We were also missing an optimization opportunity, as we didn't look for depth 2 sign extends. This bug is quite old, from when Properties got SignExt support, in #3910. But the blame isn't there - to notice this then, we'd have had to check each caller of getSignExtValue throughout the codebase, which isn't reasonable. The fault is mine, from the first write-up of PickLoadSigns in 2017: the code should have been fully general, handling 2/3 and checking the output when it does so (adding == curr, that the sign/zero-extended value is the one we expect). That is what this PR does.
Diffstat (limited to 'test')
-rw-r--r--test/lit/passes/pick-load-signs_sign-ext.wast80
1 files changed, 80 insertions, 0 deletions
diff --git a/test/lit/passes/pick-load-signs_sign-ext.wast b/test/lit/passes/pick-load-signs_sign-ext.wast
new file mode 100644
index 000000000..c50af32c5
--- /dev/null
+++ b/test/lit/passes/pick-load-signs_sign-ext.wast
@@ -0,0 +1,80 @@
+;; NOTE: Assertions have been generated by update_lit_checks.py and should not be edited.
+;; RUN: wasm-opt %s --pick-load-signs -all -S -o - | filecheck %s
+
+(module
+ ;; CHECK: (memory $0 16 17 shared)
+ (memory $0 16 17 shared)
+
+ ;; CHECK: (func $load-other-use (type $0) (result i32)
+ ;; CHECK-NEXT: (local $temp i32)
+ ;; CHECK-NEXT: (block $label (result i32)
+ ;; CHECK-NEXT: (local.set $temp
+ ;; CHECK-NEXT: (i32.load8_u
+ ;; CHECK-NEXT: (i32.const 22)
+ ;; CHECK-NEXT: )
+ ;; CHECK-NEXT: )
+ ;; CHECK-NEXT: (drop
+ ;; CHECK-NEXT: (i32.extend8_s
+ ;; CHECK-NEXT: (br_if $label
+ ;; CHECK-NEXT: (local.get $temp)
+ ;; CHECK-NEXT: (i32.const 1)
+ ;; CHECK-NEXT: )
+ ;; CHECK-NEXT: )
+ ;; CHECK-NEXT: )
+ ;; CHECK-NEXT: (unreachable)
+ ;; CHECK-NEXT: )
+ ;; CHECK-NEXT: )
+ (func $load-other-use (result i32)
+ (local $temp i32)
+ ;; The load here is unsigned, while the value in the local is used in two
+ ;; ways: it is sign-extended, and it is sent on a branch by a br_if. We must
+ ;; not ignore the branch, as if we did then we'd think the only use is signed,
+ ;; and we'd optimize load8_u => load8_s.
+ (block $label (result i32)
+ (local.set $temp
+ (i32.load8_u
+ (i32.const 22)
+ )
+ )
+ (drop
+ (i32.extend8_s
+ (br_if $label
+ (local.get $temp)
+ (i32.const 1)
+ )
+ )
+ )
+ (unreachable)
+ )
+ )
+
+ ;; CHECK: (func $load-valid (type $1)
+ ;; CHECK-NEXT: (local $temp i32)
+ ;; CHECK-NEXT: (local.set $temp
+ ;; CHECK-NEXT: (i32.load8_s
+ ;; CHECK-NEXT: (i32.const 22)
+ ;; CHECK-NEXT: )
+ ;; CHECK-NEXT: )
+ ;; CHECK-NEXT: (drop
+ ;; CHECK-NEXT: (i32.extend8_s
+ ;; CHECK-NEXT: (local.get $temp)
+ ;; CHECK-NEXT: )
+ ;; CHECK-NEXT: )
+ ;; CHECK-NEXT: )
+ (func $load-valid
+ (local $temp i32)
+ ;; As above, but remove the br_if in the middle. Now this is a valid case to
+ ;; optimize, load8_u => load8_s.
+ (local.set $temp
+ (i32.load8_u
+ (i32.const 22)
+ )
+ )
+ (drop
+ (i32.extend8_s
+ (local.get $temp)
+ )
+ )
+ )
+)
+