| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
Each pass instance can now store an argument for it, which can be different.
This may be a breaking change for the corner case of running a pass multiple
times and setting the pass's argument multiple times as well (before, the last
pass argument affected them all; now, it affects the last instance only). This
only affects arguments with the name of a pass; others remain global, as
before (and multiple passes can read them, in fact). See the CHANGELOG for
details.
Fixes #6646
|
|
|
|
| |
#6457 added a test that exposed existing nondeterminism.
|
|
|
|
| |
Helps emscripten-core/emscripten#17380 by logging all the reasons why we
instrument a function, and not just the first as we did before.
|
|
|
|
|
| |
The new asyncify flag --pass-arg=asyncify-propagate-addlist changes the
behavior of --pass-arg=asyncify-addlist : with it, callers of functions in the
asyncify-addlist will be also instrumented.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Asyncify gained a way to wrap a pass so that it only runs on a given set of
functions, rather than on all functions, so the wrapper "filters" what the pass
operates on. That was useful in Asyncify as we wanted to only do work on
functions that Asyncify actually instrumented.
There is another place in the code that needs such functionality,
optimizeAfterInlining, which runs optimizations after we inline; again, we
only want to optimize on the functions we know are relevant because they
changed. To do that, move that logic out to a general place so it can be
reused. This makes the code there a lot less hackish.
While doing so make the logic only work on function-parallel passes. It
never did anyhow, but now it asserts on that. (It can't run on a general
pass because a general one does not provide an interface to affect which
functions it operates on; a general pass is entirely opaque in that way.)
|
|
|
|
|
|
|
| |
If there are newlines in the list, then we split using them in a simple manner
(that does not take into account nesting of any other delimiters).
Fixes #6047
Fixes #5271
|
|
|
|
|
| |
All logging/instrumentation passes need to do this, to avoid us using stale
global effects that are too low (too high is not optimal either, but at least it
cannot cause bugs).
|
|
|
|
|
|
|
|
| |
This fixes some outdated comments and typos in Asyncify and improves
some other comments. This tries to make code comments more readable by
making them more accurate and also by using the three state (normal,
unwinding, and rewinding) consistently.
Drive-by fix: Typo fixes in SimplifyGlobals and wasm-reduce option.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
```wast
(if (result i32)
(expr0)
(i32.const 1)
(expr1)
)
```
can be written as
```wast
(i32.or
(expr0)
(expr1)
)
```
Also this removes some unused variables and methods.
This also adds an optimization for
```wast
(i32.eqz
(global.get $__asyncify_state)
)
```
in `--mod-asyncify-always-and-only-unwind` to fix an unexpected
regression caused by this.
|
|
|
|
|
|
| |
We have `WasmBinaryBuilder` that read binary into Binaryen IR and
`WasmBinaryWriter` that writes Binaryen IR to binary. To me
`WasmBinaryBuilder` sounds similar to `WasmBinaryWriter`, which builds
binary. How about renaming it to `WasmBinaryReader`?
|
|
|
|
| |
This code predates our adoption of C++14 and can now be removed in favor of
`std::make_unique`, which should be more efficient.
|
|
|
|
|
|
|
|
|
|
|
| |
Followup to #5293, this fixes a small regression there regarding assertions. We do have
a need to visit non-instrumented functions if we want assertions, as we assert on some
things there, namely that such functions do not change the state (if they changed it,
we'd need to instrument them to handle that properly).
This moves that logic into a new pass. We run that pass when assertions are enabled.
Test diff basically undoes part the test diff from that earlier PR for that one file.
|
|
|
|
|
|
|
|
|
| |
Add a way to proxy passes and the addition of passes in pass runners. With
that we can make Asyncify only modify functions it actually needs to. On a
project that Asyncify only needs to modify a few functions on, this can save
a huge amount of time as it avoids flattening+optimizing the majority of
the module.
Fixes #4822
|
| |
|
| |
|
|
|
|
| |
This is more modern and (IMHO) easier to read than that old C typedef
syntax.
|
|
|
| |
Adds support for the Asyncify pass to use Multi-Memories. This is specified by passing flag --asyncify-in-secondary-memory. Another flag, --asyncify-secondary-memory-size, is used to specify the initial and max size of the secondary memory.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the goal of supporting null characters (i.e. zero bytes) in strings.
Rewrite the underlying interned `IString` to store a `std::string_view` rather
than a `const char*`, reduce the number of map lookups necessary to intern a
string, and present a more immutable interface.
Most importantly, replace the `c_str()` method that returned a `const char*`
with a `toString()` method that returns a `std::string`. This new method can
correctly handle strings containing null characters. A `const char*` can still
be had by calling `data()` on the `std::string_view`, although this usage should
be discouraged.
This change is NFC in spirit, although not in practice. It does not intend to
support any particular new functionality, but it is probably now possible to use
strings containing null characters in at least some cases. At least one parser
bug is also incidentally fixed. Follow-on PRs will explicitly support and test
strings containing nulls for particular use cases.
The C API still uses `const char*` to represent strings. As strings containing
nulls become better supported by the rest of Binaryen, this will no longer be
sufficient. Updating the C and JS APIs to use pointer, length pairs is left as
future work.
|
|
|
|
|
| |
The emscripten side is a little tricky but I've got some tests passing.
Currently blocked on:
https://github.com/emscripten-core/emscripten/issues/17969
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously only WalkerPasses had access to the `getPassRunner` and
`getPassOptions` methods. Move those methods to `Pass` so all passes can use
them. As a result, the `PassRunner` passed to `Pass::run` and
`Pass::runOnFunction` is no longer necessary, so remove it.
Also update `Pass::create` to return a unique_ptr, which is more efficient than
having it return a raw pointer only to have the `PassRunner` wrap that raw
pointer in a `unique_ptr`.
Delete the unused template `PassRunner::getLast()`, which looks like it was
intended to enable retrieving previous analyses and has been in the code base
since 2015 but is not implemented anywhere.
|
|
|
|
|
|
|
| |
This PR removes the single memory restriction in IR, adding support for a single module to reference multiple memories. To support this change, a new memory name field was added to 13 memory instructions in order to identify the memory for the instruction.
It is a goal of this PR to maintain backwards compatibility with existing text and binary wasm modules, so memory indexes remain optional for memory instructions. Similarly, the JS API makes assumptions about which memory is intended when only one memory is present in the module. Another goal of this PR is that existing tests behavior be unaffected. That said, tests must now explicitly define a memory before invoking memory instructions or exporting a memory, and memory names are now printed for each memory instruction in the text format.
There remain quite a few places where a hardcoded reference to the first memory persist (memory flattening, for example, will return early if more than one memory is present in the module). Many of these call-sites, particularly within passes, will require us to rethink how the optimization works in a multi-memories world. Other call-sites may necessitate more invasive code restructuring to fully convert away from relying on a globally available, single memory pointer.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in a function. (#4567)
* Lift the restriction in liveness-traversal.h that supported max 65535 locals in a function.
* Lint
* Fix typo
* Fix static
* Lint
* Lint
* Lint
* Add needed canRun function
* lint
* Use either a sparse or a dense matrix for tracking liveness copies, depending on the locals count.
* Lint
* Fix lint
* Lint
* Implement sparse_square_matrix class and use that as a backing.
* Lint
* Lint
* Lint #includes
* Lint
* Lint includes
* Remove unnecessary code
* Fix canonical accesses to copies matrix
* Lint
* Add missing variable update
* Remove canRun() function
* Address review
* Update expected test results
* Update test name
* Add asserts to sparse_square_matrix set and get functions that they are not out of bound.
* Lint includes
* Update test expectation
* Use .clear() + .resize() to reset totalCopies vector
|
|
|
|
|
|
|
| |
Related: emscripten-core/emscripten#15893 (comment)
--pass-arg=asyncify-side-module option will be used not only from
side modules, but also from main modules.
|
|
|
|
|
| |
Rewrite AsyncifyFlow.process to use stack instead of recursive call.
This patch resolves #4401
|
|
|
|
|
|
|
|
|
|
|
| |
This PR is part of the solution to emscripten-core/emscripten#15594.
emscripten Asyncify won't work properly in side modules, because the
globals, __asyncify_state and __asyncify_data, are not synchronized
between main-module and side-modules.
A new pass arg, asyncify-side-module, is added to make
__asyncify_state and __asyncify_data imported in the instrumented
wasm.
|
| |
|
|
|
| |
Helps #3739
|
|
|
|
| |
relevantLiveLocals (#4108)
|
|
|
|
|
|
|
|
|
| |
When using nominal types, func.ref of two functions with identical signatures
but different HeapTypes will yield different types. To preserve these semantics,
Functions need to track their HeapTypes, not just their Signatures.
This PR replaces the Signature field in Function with a HeapType field and adds
new utility methods to make it almost as simple to update and query the function
HeapType as it was to update and query the Function Signature.
|
| |
|
|
|
|
|
|
|
|
|
| |
As found in #3682, the current implementation of type ordering is not correct,
and although the immediate issue would be easy to fix, I don't think the current
intended comparison algorithm is correct in the first place. Rather than try to
switch to using a correct algorithm (which I am not sure I know how to
implement, although I have an idea) this PR removes Type ordering entirely. In
places that used Type ordering with std::set or std::map because they require
deterministic iteration order, this PR uses InsertOrdered{Set,Map} instead.
|
|
|
|
|
|
|
|
|
| |
When Functions, Globals, Events, and Exports are added to a module, if they are
not already in std::unique_ptrs, they are wrapped in a new std::unique_ptr owned
by the Module. This adds an extra layer of indirection when accessing those
elements that can be avoided by allocating those elements as std::unique_ptrs.
This PR updates wasm-builder to allocate module elements via std::make_unique
rather than `new`. In the future, we should remove the raw pointer versions of
Module::add* to encourage using std::unique_ptrs more broadly.
|
|
|
|
|
|
|
|
|
|
| |
CallRef (#3355)
This is in preparation for CallRef, which takes a reference to a function
and calls it.
CallGraphPropertyAnalysis needs to be aware of anything that is not a
direct call, and "NonDirect" is meant to cover both CallIndirect and
CallRef.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The DCE pass is one of the oldest in binaryen, and had quite a lot of
cruft from the changes in unreachability and other stuff in wasm and
binaryen's history. This PR rewrites it from scratch, making it about
1/3 the size.
I noticed this when looking for places to use code autogeneration.
The old version had annoying boilerplate, while the new one avoids
any need for it.
There may be noticeable differences, as the old pass did more than
it needed to. It overlapped with remove-unused-names for some
reason I don't remember. The new pass leaves that to the other
pass to do. I added another run of remove-unused-names to avoid
noticeable differences in optimized builds, but you can see
differences in the testcases that only run DCE by itself. (The test
differences in this PR are mostly whitespace.)
(The overlap is that if a block ended up not needed, that is, all
branches to it were removed, the old DCE would remove the block.)
This pass is about 15% faster than the old version. However, when
adding another run of remove-unused-names the difference
basically vanishes, so this isn't a speedup.
|
|
|
| |
Since they make the code clearer and more self-documenting.
|
|
|
| |
This leads to simpler code and is a prerequisite for #3012, which makes it so that not all `Type`s are backed by vectors that `expand` could return.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This logs out the decisions made about instrumenting functions, which
can help figure out why a function is instrumented, or to get a list of
what might need to be.
As the test shows, it can print things like this:
[asyncify] import is an import that can change the state
[asyncify] calls-import can change the state due to import
[asyncify] calls-calls-import can change the state due to calls-import
[asyncify] calls-calls-calls-import can change the state due to calls-calls-import
(the test has calls-calls-calls-import => calls-calls-import => calls-import -> import).
|
| |
|
|
|
|
|
|
|
|
|
| |
This finds out which locals are live at call sites that might pause/resume,
which is the set of locals we need to actually save/load. That is, if a local
is not alive at any call site in the function, then it's value doesn't need to
stay alive while sleeping.
This saves about 10% of locals that are saved/loaded, and about 1.5%
in final code size.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#2913)
When doing manual tuning of calls using asyncify lists, we want it to
be possible to write out all the functions that can be on the stack when
pausing, and for that to work. This did not quite work right with the
ignore-indirect option: that would ignore all indirect calls all the
time, so that if foo() calls bar() indirectly, that indirect call was
not instrumented (we didn't check for a pause around it), even if
both foo() and bar() were listed. There was no way to make that
work (except for not ignoring indirect calls at all).
This PR makes the add-list and only-lists fully instrument the functions
mentioned in them: both themselves, and indirect calls from them.
(Note that direct calls need no special handling - we can just add
the direct call target to the add-list or only-list.)
This may add some overhead to existing users, but only in a function
that is instrumented anyhow, and also indirect calls are slow anyhow,
so it's probably fine. And it is simpler to do it this way instead of
adding another list for indirect call handling.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Asyncify does a whole-program analysis to figure out the list of functions
to instrument. In
emscripten-core/emscripten#10746 (comment)
we realized that we need another type of list there, an "add list" which is a
list of functions to add to the instrumented functions list, that is, that we
should definitely instrument.
The use case in that link is that we disable indirect calls, but there is
one special indirect call that we do need to instrument. Being able to add
just that one can be much more efficient than assuming all indirect calls in
a big codebase need instrumentation. Similar issues can come up if we
add a profile-guided option to asyncify, which we've discussed.
The existing lists were not good enough to allow that, so a new option
is needed. I took the opportunity to rename the old ones to something
better and more consistent, so after this PR we have 3 lists as follows:
* The old "remove list" (previously "blacklist") which removes functions
from the list of functions to be instrumented.
* The new "add list" which adds to that list (note how add/remove are
clearly parallel).
* The old "only list" (previously "whitelist") which simply replaces the
entire list, and so only those functions are instrumented and no other.
This PR temporarily still supports the old names in the commandline
arguments, to avoid immediate breakage for our CI.
|
|
|
|
|
| |
Instead of adding globals for hardcoded basic types, traverse the
module to collect all call types that might need to be handled and
emit a global for each of them. Adapted from #2712.
|
|
|
|
|
|
|
|
| |
This reverts commit 5ddda8d2e6a3287ff6adcd69493e1e1c8b6c3872.
We decided it would be easier to allow tuple-typed globals than to
make calls work here after all. Reverts that change, but keeps small
improvements it made for clarity.
|
|
|
|
|
| |
This will be easier to extend for tuples.
Also add more clarifying comments.
|
|
|
| |
Iterate over tuple locals and separately load or store each component.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We assumed that the imports were already named (in their
internal name) properly. When processing a binary file without
names, or if the names don't match in general, that's not true.
To fix this, use ModuleUtils::renameFunctions to do a proper
renaming up front.
Also fix renameFunctions to not assert on the case of
renaming a function to the same name it already has.
Helps #2680
|
|
|
|
|
|
|
|
|
|
| |
Normally, a wrapper has to track state separately to know when to
unwind/rewind and when to actually call import functions.
Exposing Asyncify state can help avoid this duplication and avoid
subtle bugs when internal and wrapper state get out of sync.
Since this is a tiny function and it's useful for any Asyncify
embedder, I've decided to expose it by default rather than hide behind an option.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We ignored them, which is a bad default, as typically they imply
we can call anything in the table (and the table might change).
Instead, notice indirect calls during traversal, and force the user
to decide whether to ignore them or not.
This was only an issue in PostEmscripten because the other
user, Asyncify, already had indirect call analysis because it
needed it for other things.
Fixes a bug uncovered by #2619 and fixes the current binaryen
roll.
|