| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
| |
Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See #1919 - we did not do this consistently before.
This adds a lowMemoryUnused option to PassOptions. It can be passed on the commandline with --low-memory-unused. If enabled, we run the new optimize-added-constants pass, which does the real work here, replacing older code in post-emscripten.
Aside from running at the proper time (unlike the old pass, see #1919), this also has a -propagate mode, which can do stuff like this:
y = x + 10
[..]
load(y)
[..]
load(y)
=>
y = x + 10
[..]
load(x, offset=10)
[..]
load(x, offset=10)
That is, it can propagate such offsets to the loads/stores. This pattern is common in big interpreter loops, where the pointers are offsets into a big struct of state.
The pass does this propagation by using a new feature of LocalGraph, which can verify which locals are in SSA mode. Binaryen IR is not SSA (intentionally, since it's a later IR), but if a local only has a single set for all gets, that means that local is in such a state, and can be optimized. The tricky thing is that all locals are initialized to zero, so there are at minimum two sets. But if we verify that the real set dominates all the gets, then the zero initialization cannot reach them, and we are safe.
This PR also makes safe-heap aware of lowMemoryUnused. If so, we check for not just an access of 0, but the range 0-1023.
This makes zlib 5% faster, with either the wasm backend or asm2wasm. It also makes it 0.5% smaller. Also helps sqlite (1.5% faster) and lua (1% faster)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes #1649
This moves us to a single object for functions, which can be imported or nor, and likewise for globals (as a result, GetGlobals do not need to check if the global is imported or not, etc.). All imported things now inherit from Importable, which has the module and base of the import, and if they are set then it is an import.
For convenient iteration, there are a few helpers like
ModuleUtils::iterDefinedGlobals(wasm, [&](Global* global) {
.. use global ..
});
as often iteration only cares about imported or defined (non-imported) things.
|
|
|
| |
The IR is indeed a tree, but not an "abstract syntax tree" since there is no language for which it is the syntax (except in the most trivial and meaningless sense).
|
|
|
|
|
|
| |
* optimize pow(x,2) => x*x
* optimize pow(x, 0.5) => sqrt(x)
|
|
|
|
| |
Most module walkers use PostWalker<T, Visitor<T>>, let that pattern be
expressed as simply PostWalker<T>
|
|
|
|
|
|
| |
* fix regression from #850 - it is not always safe to fold added offsets into load/store offsets, as the add wraps but offset does not
* small refactoring to simplify asm2wasm pass invocation
|
|
|
|
| |
due to linker dead code elimination. Fixes #577.
|
|
|
|
| |
efficient parallel execution (#564)
|
| |
|
|
|
|
| |
function, to avoid confusion with having both visit* and visitExpression in a single pass (#357)
|
|
|
|
| |
* allow traversals to mark themselves as function-parallel, in which case we run them using a thread pool. also mark some thread-safety risks (interned strings, arena allocators) with assertions they modify only on the main thread
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
* refactor core walking to not recurse
* add a simplify-locals test
* reuse parent's non-branchey scan logic in SimpleExecutionWalker, reduce code duplication
* update wasm.js
* rename things following comments
|
| |
|
|
|