| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Reflected new renamed instruction names in code and tests:
- `get_local` -> `local.get`
- `set_local` -> `local.set`
- `tee_local` -> `local.tee`
- `get_global` -> `global.get`
- `set_global` -> `global.set`
- `current_memory` -> `memory.size`
- `grow_memory` -> `memory.grow`
- Removed APIs related to old instruction names in Binaryen.js and added
APIs with new names if they are missing.
- Renamed `typedef SortedVector LocalSet` to `SetsOfLocals` to prevent
name clashes.
- Resolved several TODO renaming items in wasm-binary.h:
- `TableSwitch` -> `BrTable`
- `I32ConvertI64` -> `I32WrapI64`
- `I64STruncI32` -> `I64SExtendI32`
- `I64UTruncI32` -> `I64UExtendI32`
- `F32ConvertF64` -> `F32DemoteI64`
- `F64ConvertF32` -> `F64PromoteF32`
- Renamed `BinaryenGetFeatures` and `BinaryenSetFeatures` to
`BinaryenModuleGetFeatures` and `BinaryenModuleSetFeatures` for
consistency.
|
|
|
| |
Applies the changes in #2065, and temprarily disables the hook since it's too slow to run on a change this large. We should re-enable it in a later commit.
|
|
|
| |
Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See #1919 - we did not do this consistently before.
This adds a lowMemoryUnused option to PassOptions. It can be passed on the commandline with --low-memory-unused. If enabled, we run the new optimize-added-constants pass, which does the real work here, replacing older code in post-emscripten.
Aside from running at the proper time (unlike the old pass, see #1919), this also has a -propagate mode, which can do stuff like this:
y = x + 10
[..]
load(y)
[..]
load(y)
=>
y = x + 10
[..]
load(x, offset=10)
[..]
load(x, offset=10)
That is, it can propagate such offsets to the loads/stores. This pattern is common in big interpreter loops, where the pointers are offsets into a big struct of state.
The pass does this propagation by using a new feature of LocalGraph, which can verify which locals are in SSA mode. Binaryen IR is not SSA (intentionally, since it's a later IR), but if a local only has a single set for all gets, that means that local is in such a state, and can be optimized. The tricky thing is that all locals are initialized to zero, so there are at minimum two sets. But if we verify that the real set dominates all the gets, then the zero initialization cannot reach them, and we are safe.
This PR also makes safe-heap aware of lowMemoryUnused. If so, we check for not just an access of 0, but the range 0-1023.
This makes zlib 5% faster, with either the wasm backend or asm2wasm. It also makes it 0.5% smaller. Also helps sqlite (1.5% faster) and lua (1% faster)
|
|
|
|
|
|
| |
Automated renaming according to
https://github.com/WebAssembly/spec/issues/884#issuecomment-426433329.
|
|
|
|
|
| |
* Remove the Action class - we just need a pointer to a get or set. This simplifies the code and saves a little memory, but doesn't seem to have any impact on speed.
* Miscellaneous code style and comment changes.
|
|
|
|
|
|
|
|
|
|
| |
* LocalGraph : Replace seen unordered_set by boolean check.
* LocalGraph : use unordered_map to store index -> last set_local instead of vector.
* LocalGraph :
- Use internal counter to avoid invalidation at each cycle.
- Move all blocks structs into a contiguous vector of smaller ones.
|
|
|
|
|
| |
This simplifies the logic there into a more standard flow operation. This is not always faster, but it is much faster on the worst cases we saw before like sqlite, and it is simpler.
The rewrite also fixes a fuzz bug.
|
|
|
|
|
|
|
|
|
| |
This optimizes the situation described in #1331. Namely, when x is copied into y, then on subsequent gets of x we could use y instead, and vice versa, as their value is equal. Specifically, this seems to get rid of the definite overlap in the live ranges of x and y, as removing it allows coalesce-locals to merge them. The pass therefore does nothing if the live range of y ends there anyhow.
The danger here is that we may extend the live range so that it causes more conflicts with other things, so this is a heuristic, but I've tested it on every codebase I can find and it always produces a net win, even on one I saw a 0.4% reduction of code size, which surprised me.
This is a fairly slow pass, because it uses LocalGraph which isn't much optimized. This PR includes a minor optimization for it, but we should rewrite it. Meanwhile this is just enabled in -O3 and -Oz.
This PR also includes some fuzzing improvements, to better test stuff like this.
|
|
The IR is indeed a tree, but not an "abstract syntax tree" since there is no language for which it is the syntax (except in the most trivial and meaningless sense).
|