| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
We were doing a debug logging for every LEB byte. It turns out that the
isDebugEnabled() calls are expensive when called so frequently: in a
release+assertion build, even with debug disabled, these checks are the
highest thing in the profile. This PR removes the checks, which makes
binary reading 12% faster.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each pass instance can now store an argument for it, which can be different.
This may be a breaking change for the corner case of running a pass multiple
times and setting the pass's argument multiple times as well (before, the last
pass argument affected them all; now, it affects the last instance only). This
only affects arguments with the name of a pass; others remain global, as
before (and multiple passes can read them, in fact). See the CHANGELOG for
details.
Fixes #6646
|
|
|
|
| |
Helps emscripten-core/emscripten#17380 by logging all the reasons why we
instrument a function, and not just the first as we did before.
|
|
|
|
| |
Specifically offsets larger than 2^32 which were being interpreted
misinterpreted here as very large int64_t values.
|
|
|
|
|
|
|
|
| |
argument (#6074)
In wasm64, function pointers are 64-bit like all pointers.
fixes #6073
|
| |
|
|
|
|
|
|
|
|
|
|
| |
All top-level Module elements are identified and referred to by Name, but for
historical reasons element and data segments were referred to by index instead.
Fix this inconsistency by using Names to refer to segments from expressions that
use them. Also parse and print segment names like we do for other elements.
The C API is partially converted to use names instead of indices, but there are
still many functions that refer to data segments by index. Finishing the
conversion can be done in the future once it becomes necessary.
|
|
|
|
|
|
|
| |
In this mode we don't remove the start/stop_em_asm symbols or data.
This is because with side modules we read this information directly
from the wasm binaryen at runtime.
See https://github.com/emscripten-core/emscripten/pull/18228
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the goal of supporting null characters (i.e. zero bytes) in strings.
Rewrite the underlying interned `IString` to store a `std::string_view` rather
than a `const char*`, reduce the number of map lookups necessary to intern a
string, and present a more immutable interface.
Most importantly, replace the `c_str()` method that returned a `const char*`
with a `toString()` method that returns a `std::string`. This new method can
correctly handle strings containing null characters. A `const char*` can still
be had by calling `data()` on the `std::string_view`, although this usage should
be discouraged.
This change is NFC in spirit, although not in practice. It does not intend to
support any particular new functionality, but it is probably now possible to use
strings containing null characters in at least some cases. At least one parser
bug is also incidentally fixed. Follow-on PRs will explicitly support and test
strings containing nulls for particular use cases.
The C API still uses `const char*` to represent strings. As strings containing
nulls become better supported by the rest of Binaryen, this will no longer be
sufficient. Updating the C and JS APIs to use pointer, length pairs is left as
future work.
|
|
|
|
| |
As an NFC preliminary change that will minimize the diff in #5122, which moves
IString to the wasm namespace.
|
|
|
| |
These are only needed for the metadata extraction in emcc.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously only WalkerPasses had access to the `getPassRunner` and
`getPassOptions` methods. Move those methods to `Pass` so all passes can use
them. As a result, the `PassRunner` passed to `Pass::run` and
`Pass::runOnFunction` is no longer necessary, so remove it.
Also update `Pass::create` to return a unique_ptr, which is more efficient than
having it return a raw pointer only to have the `PassRunner` wrap that raw
pointer in a `unique_ptr`.
Delete the unused template `PassRunner::getLast()`, which looks like it was
intended to enable retrieving previous analyses and has been in the code base
since 2015 but is not implemented anywhere.
|
|
|
|
| |
Rather than doing it as a side effect of dumping the metadata in
wasm-emscripten-finalize.
|
|
|
|
|
|
|
|
|
| |
PostEmscripten will turn an invoke of a constant function
pointer index into a direct call. However, due to UB it is possible to
have invalid function pointers, and we should not crash on that
(and do nothing to optimize, of course).
Mostly whitespace; to avoid deep nesting, I added more
early returns.
|
|
|
|
|
|
|
|
|
|
|
| |
Passive element segments do not belong to any table, so the link between
Table and elem needs to be weaker; i.e. an elem may have a table in case
of active segments, or simply be a collection of function references in
case of passive/declarative segments.
This PR takes Table::Segment out and turns it into a first class module
element just like tables and functions. It also implements early support
for parsing, printing, encoding and decoding passive/declarative elem
segments.
|
|
|
| |
Adds support for modules with multiple tables. Adds a field for the table name to `CallIndirect` and updates the C/JS APIs accordingly.
|
|
|
|
|
|
|
|
|
|
| |
CallRef (#3355)
This is in preparation for CallRef, which takes a reference to a function
and calls it.
CallGraphPropertyAnalysis needs to be aware of anything that is not a
direct call, and "NonDirect" is meant to cover both CallIndirect and
CallRef.
|
|
|
| |
We no longer build modules that import `global.Math`.
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we are renaming invoke wrappers and `emscripten_longjmp_jmpbuf`
in the wasm backend, this deletes all related renaming routines and
relevant tests. Depends on #3192.
Addresses: #3043 and #3081
Companions:
https://reviews.llvm.org/D88697
emscripten-core/emscripten#12399
|
|
|
|
|
| |
PostEmscripten (#3161)
These were removed completely from the emscripten side in #12057
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of finalize renaming emscripten_longjmp_jmpbuf to emscripten_longjmp,
do nothing in finalize. But in the optional --post-emscripten pass, rename it there if
both exist, so that we don't end up using two imports (other optimization passes
can then remove an unneeded import).
Depends on emscripten-core/emscripten#12157 to land first so that emscripten
can handle both names, and it is just an optimization to have one or the other.
See https://github.com/WebAssembly/binaryen/issues/3043
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This logs out the decisions made about instrumenting functions, which
can help figure out why a function is instrumented, or to get a list of
what might need to be.
As the test shows, it can print things like this:
[asyncify] import is an import that can change the state
[asyncify] calls-import can change the state due to import
[asyncify] calls-calls-import can change the state due to calls-import
[asyncify] calls-calls-calls-import can change the state due to calls-calls-import
(the test has calls-calls-calls-import => calls-calls-import => calls-import -> import).
|
| |
|
|
|
|
|
|
|
| |
This allows emscripten to statically set the initial value of the
stack pointer.
Should allow use to avoid doing it dynamically at startup:
https://github.com/emscripten-core/emscripten/pull/11031
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Depends on emscripten-core/emscripten#10741
which ensures that table indexes are unique. With that guarantee,
a main module can just add its function pointers into the table, and
use them based on that index. The loader will then see them in the
table and then give other modules the identical function pointer for
a function, ensuring function pointer equality.
This avoids calling fp$ functions during startup for the main
module's own functions (which are slow). We do still call fp$s
of things we import from outside, as we don't have anything to
put in the table for them, we depend on the loader for that.
I suspect this can also be done with SIDE_MODULES, but did not
want to try too much at once.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We ignored them, which is a bad default, as typically they imply
we can call anything in the table (and the table might change).
Instead, notice indirect calls during traversal, and force the user
to decide whether to ignore them or not.
This was only an issue in PostEmscripten because the other
user, Asyncify, already had indirect call analysis because it
needed it for other things.
Fixes a bug uncovered by #2619 and fixes the current binaryen
roll.
|
|
|
|
|
|
|
| |
We should be looking at the import name when determining if a function
is an invoke function.
This is a precursor to re-landing the fix for
https://github.com/emscripten-core/emscripten/issues/9950.
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we see invoke_ calls in emscripten-generated code, we know
they call into JS just to do a try-catch for exceptions. If the target being
called cannot throw, which we check in a whole-program manner, then
we can simply skip the invoke.
I confirmed that this fixes the regression in emscripten-core/emscripten#9817 (comment)
(that is, with this optimization, upstream is around as fast as fastcomp).
When we have native wasm exception handling, this can be
extended to optimize that as well.
|
|
|
|
|
| |
We've had an option to set the location of the sbrk ptr, but not the value.
Applying the value as well is necessary for standalone wasm, as otherwise we set
it in JS.
|
|
|
|
|
|
|
| |
(which is called DYNAMICTOP_PTR in emscripten) (#2325)
The assumption is that an import env.emscripten_get_sbrk_ptr exists, and we replace the value returned from there with a constant. (We can't do all this in wasm-emscripten-finalize, as it happens before the JS compiler runs, which can add more static allocations; we only know where the sbrk ptr is later in compilation.) This just replaces the import with a function returning the constant; inlining etc. can help more later.
By setting this at compile time we can reduce code size and avoid importing it at runtime, which makes us more compatible with wasi (less special imports).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Reflected new renamed instruction names in code and tests:
- `get_local` -> `local.get`
- `set_local` -> `local.set`
- `tee_local` -> `local.tee`
- `get_global` -> `global.get`
- `set_global` -> `global.set`
- `current_memory` -> `memory.size`
- `grow_memory` -> `memory.grow`
- Removed APIs related to old instruction names in Binaryen.js and added
APIs with new names if they are missing.
- Renamed `typedef SortedVector LocalSet` to `SetsOfLocals` to prevent
name clashes.
- Resolved several TODO renaming items in wasm-binary.h:
- `TableSwitch` -> `BrTable`
- `I32ConvertI64` -> `I32WrapI64`
- `I64STruncI32` -> `I64SExtendI32`
- `I64UTruncI32` -> `I64UExtendI32`
- `F32ConvertF64` -> `F32DemoteI64`
- `F64ConvertF32` -> `F64PromoteF32`
- Renamed `BinaryenGetFeatures` and `BinaryenSetFeatures` to
`BinaryenModuleGetFeatures` and `BinaryenModuleSetFeatures` for
consistency.
|
|
|
| |
Applies the changes in #2065, and temprarily disables the hook since it's too slow to run on a change this large. We should re-enable it in a later commit.
|
|
|
| |
Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See #1919 - we did not do this consistently before.
This adds a lowMemoryUnused option to PassOptions. It can be passed on the commandline with --low-memory-unused. If enabled, we run the new optimize-added-constants pass, which does the real work here, replacing older code in post-emscripten.
Aside from running at the proper time (unlike the old pass, see #1919), this also has a -propagate mode, which can do stuff like this:
y = x + 10
[..]
load(y)
[..]
load(y)
=>
y = x + 10
[..]
load(x, offset=10)
[..]
load(x, offset=10)
That is, it can propagate such offsets to the loads/stores. This pattern is common in big interpreter loops, where the pointers are offsets into a big struct of state.
The pass does this propagation by using a new feature of LocalGraph, which can verify which locals are in SSA mode. Binaryen IR is not SSA (intentionally, since it's a later IR), but if a local only has a single set for all gets, that means that local is in such a state, and can be optimized. The tricky thing is that all locals are initialized to zero, so there are at minimum two sets. But if we verify that the real set dominates all the gets, then the zero initialization cannot reach them, and we are safe.
This PR also makes safe-heap aware of lowMemoryUnused. If so, we check for not just an access of 0, but the range 0-1023.
This makes zlib 5% faster, with either the wasm backend or asm2wasm. It also makes it 0.5% smaller. Also helps sqlite (1.5% faster) and lua (1% faster)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes #1649
This moves us to a single object for functions, which can be imported or nor, and likewise for globals (as a result, GetGlobals do not need to check if the global is imported or not, etc.). All imported things now inherit from Importable, which has the module and base of the import, and if they are set then it is an import.
For convenient iteration, there are a few helpers like
ModuleUtils::iterDefinedGlobals(wasm, [&](Global* global) {
.. use global ..
});
as often iteration only cares about imported or defined (non-imported) things.
|
|
|
| |
The IR is indeed a tree, but not an "abstract syntax tree" since there is no language for which it is the syntax (except in the most trivial and meaningless sense).
|
|
|
|
|
|
| |
* optimize pow(x,2) => x*x
* optimize pow(x, 0.5) => sqrt(x)
|
|
|
|
| |
Most module walkers use PostWalker<T, Visitor<T>>, let that pattern be
expressed as simply PostWalker<T>
|
|
|
|
|
|
| |
* fix regression from #850 - it is not always safe to fold added offsets into load/store offsets, as the add wraps but offset does not
* small refactoring to simplify asm2wasm pass invocation
|
|
|
|
| |
due to linker dead code elimination. Fixes #577.
|
|
|
|
| |
efficient parallel execution (#564)
|
| |
|
|
|
|
| |
function, to avoid confusion with having both visit* and visitExpression in a single pass (#357)
|
|
|
|
| |
* allow traversals to mark themselves as function-parallel, in which case we run them using a thread pool. also mark some thread-safety risks (interned strings, arena allocators) with assertions they modify only on the main thread
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
* refactor core walking to not recurse
* add a simplify-locals test
* reuse parent's non-branchey scan logic in SimpleExecutionWalker, reduce code duplication
* update wasm.js
* rename things following comments
|
| |
|
|
|