summaryrefslogtreecommitdiff
path: root/src/passes/PickLoadSigns.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Fix PickLoadSigns on SignExt feature instructions (#7069)Alon Zakai2024-11-111-17/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I believe the history here is that 1. We added a PickLoadSigns pass. It checks if a load from memory is stored in a local that is only every used in a signed or an unsigned manner. If it is, we can adjust the sign of the load (load8_u/s) to do the sign/unsign during the load. 2. The pass finds each LocalGet and looks either 2 or 3 parents above it. For a sign operation, we need to look up 3, since the operation is x << K >> K. For an unsigned, we need only 2, since we have x & M. We hardcoded those numbers 2 and 3. 3. We added the SignExt feature, which adds i32.extend8_s. This does a sign extend with a single instruction, not two nested ones, so now we can sign- extend at depth 2, unlike before. Properties::getSignExtValue was updated for this, but not the pass PickLoadSigns. The bug that is fixed here is that we looked at depth 3 for a sign-extend, and we blindly accepted it if we found one. So we ended up accepting (i32.extend8_s (ANYTHING (x))), which is a sign-extend of something, but not of x, which is bad. We were also missing an optimization opportunity, as we didn't look for depth 2 sign extends. This bug is quite old, from when Properties got SignExt support, in #3910. But the blame isn't there - to notice this then, we'd have had to check each caller of getSignExtValue throughout the codebase, which isn't reasonable. The fault is mine, from the first write-up of PickLoadSigns in 2017: the code should have been fully general, handling 2/3 and checking the output when it does so (adding == curr, that the sign/zero-extended value is the one we expect). That is what this PR does.
* [NFC] Early-exit PickLoadSigns if there are no memories (#6971)Alon Zakai2024-09-261-0/+5
| | | | In WasmGC modules there is often no memory at all, and we can skip walking the code in this pass in such cases.
* Refactor interaction between Pass and PassRunner (#5093)Thomas Lively2022-09-301-1/+3
| | | | | | | | | | | | | | Previously only WalkerPasses had access to the `getPassRunner` and `getPassOptions` methods. Move those methods to `Pass` so all passes can use them. As a result, the `PassRunner` passed to `Pass::run` and `Pass::runOnFunction` is no longer necessary, so remove it. Also update `Pass::create` to return a unique_ptr, which is more efficient than having it return a raw pointer only to have the `PassRunner` wrap that raw pointer in a `unique_ptr`. Delete the unused template `PassRunner::getLast()`, which looks like it was intended to enable retrieving previous analyses and has been in the code base since 2015 but is not implemented anywhere.
* Modernize code to C++17 (#3104)Max Graey2021-11-221-3/+1
|
* PickLoadSigns fuzz fix: cannot make an atomic operation signed (#3235)Alon Zakai2020-10-131-0/+4
|
* Reflect instruction renaming in code (#2128)Heejin Ahn2019-05-211-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | - Reflected new renamed instruction names in code and tests: - `get_local` -> `local.get` - `set_local` -> `local.set` - `tee_local` -> `local.tee` - `get_global` -> `global.get` - `set_global` -> `global.set` - `current_memory` -> `memory.size` - `grow_memory` -> `memory.grow` - Removed APIs related to old instruction names in Binaryen.js and added APIs with new names if they are missing. - Renamed `typedef SortedVector LocalSet` to `SetsOfLocals` to prevent name clashes. - Resolved several TODO renaming items in wasm-binary.h: - `TableSwitch` -> `BrTable` - `I32ConvertI64` -> `I32WrapI64` - `I64STruncI32` -> `I64SExtendI32` - `I64UTruncI32` -> `I64UExtendI32` - `F32ConvertF64` -> `F32DemoteI64` - `F64ConvertF32` -> `F64PromoteF32` - Renamed `BinaryenGetFeatures` and `BinaryenSetFeatures` to `BinaryenModuleGetFeatures` and `BinaryenModuleSetFeatures` for consistency.
* Apply format changes from #2048 (#2059)Alon Zakai2019-04-261-11/+15
| | | Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
* Consistently optimize small added constants into load/store offsets (#1924)Alon Zakai2019-03-011-16/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | See #1919 - we did not do this consistently before. This adds a lowMemoryUnused option to PassOptions. It can be passed on the commandline with --low-memory-unused. If enabled, we run the new optimize-added-constants pass, which does the real work here, replacing older code in post-emscripten. Aside from running at the proper time (unlike the old pass, see #1919), this also has a -propagate mode, which can do stuff like this: y = x + 10 [..] load(y) [..] load(y) => y = x + 10 [..] load(x, offset=10) [..] load(x, offset=10) That is, it can propagate such offsets to the loads/stores. This pattern is common in big interpreter loops, where the pointers are offsets into a big struct of state. The pass does this propagation by using a new feature of LocalGraph, which can verify which locals are in SSA mode. Binaryen IR is not SSA (intentionally, since it's a later IR), but if a local only has a single set for all gets, that means that local is in such a state, and can be optimized. The tricky thing is that all locals are initialized to zero, so there are at minimum two sets. But if we verify that the real set dominates all the gets, then the zero initialization cannot reach them, and we are safe. This PR also makes safe-heap aware of lowMemoryUnused. If so, we check for not just an access of 0, but the range 0-1023. This makes zlib 5% faster, with either the wasm backend or asm2wasm. It also makes it 0.5% smaller. Also helps sqlite (1.5% faster) and lua (1% faster)
* notation change: AST => IR (#1245)Alon Zakai2017-10-241-1/+1
| | | The IR is indeed a tree, but not an "abstract syntax tree" since there is no language for which it is the syntax (except in the most trivial and meaningless sense).
* SSA pass (#1049)Alon Zakai2017-06-131-0/+4
| | | | | | | * Add SSA pass which ensures a single assign for each local, except for merged locals where we ensure exactly a single assign from one of the paths leading to that use * Also add InstrumentLocals pass, useful for debugging locals (similar to InstrumentMemory but for locals) * Fix a PickLoadSigns bug with tees not being ignored, which was not noticed until now because we ran it on flatter output by default, but the ssa pass uncovered the bug
* Default Walker subclasses to using Visitor<SubType> (#921)jgravelle-google2017-02-231-2/+2
| | | | Most module walkers use PostWalker<T, Visitor<T>>, let that pattern be expressed as simply PostWalker<T>
* finish PickLoadSigns passAlon Zakai2017-02-161-8/+7
|
* refactor sign/zero extension code into nice headers, and prepare ↵Alon Zakai2017-02-161-0/+108
PickLoadSigns pass