| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generally we try to order types by decreasing use count so that frequently used
types get smaller indices. For the equirecursive and nominal systems, there are
no contraints on the ordering of types, so we just have to sort them according
to their use counts. For the isorecursive type system, however, there are a
number of ordering constraints that have to be met for the type section to be
valid. First, types in the same recursion group must be adjacent so they can be
grouped together. Second, groups must be ordered topologically so that they only
refer to types in themselves or prior groups.
Update type ordering to produce a valid isorecursive output by performing a
topological sort on the recursion groups. While performing the sort, prefer to
visit and finish processing the most used groups first as a heuristic to improve
the final ordering.
Do not reorder types within groups, since doing so would change type identity
and could affect the external interface of the module. Leave that reordering to
an optimization pass (not yet implemented) that users can explicitly opt in to.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GlobalManager is another class that added complexity in the interpreter logic,
and did not help. In fact it hurts extensibility, as when one wants to extend the
interpreter one has another class to customize, and it is templated on the main
runner, so again as #4479 we end up with annoying template cycles.
This simply removes that class. That makes the interpreter code strictly
simpler. Applying that change to wasm-ctor-eval also ends up fixing a
pre-existing bug, so this PR gets testing through that.
The ctor-eval issue was that we did not extend the GlobalManager properly
in the past: we checked for accesses on imported globals there, but not in
the main class, i.e., not on global.get operations. Needing to do things in
two places is an example of the previous complexity. The fix is simply to
implement visitGlobalGet in one place, and remove all the GlobalManager
logic added in ctor-eval, which then gets a lot simpler as well.
The new imported-global-2.wast checks for that bug (a global.get of an
import should stop us from evalling). Existing tests cover the other cases,
like it being ok to read a non-imported global, etc. The existing test
indirect-call3.wast required a slight change: There was a global.get of
an imported global, which was ignored in the place it happened (an init
of an elem segment); the new code checks all global.gets, so it now
catches that.
|
|
|
|
|
| |
Update the HeapType constructors that take Signature, Structs, and Arrays to
work properly with isorecursive typing. This is particularly important for the
Signature constructor, which is used frequently throughout the code base.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Isorecursive canonicalization is similar to equirecursive canonicalization in
that it deduplicates types based on their structure, but unlike equirecursive
canonicalization, isorecursive canonicalization does not need to minimize the
type definition first. Another difference is that structures are deduplicated at
the level of recursion groups rather than individual types, so we cannot reuse
the existing `FiniteShapeHasher` and `FiniteShapeEquator` utilities. Instead,
introduce a new `RecGroupStructure` wrapper that provides equality comparison
and hashing very similar to the former utilities.
Another feature of the isorecursive type system is that its constraints on the
order of recursion groups means that they are already topologically sorted and
can be incrementally canonicalized in a bottom-up manner. This incremental
canonicalization means that the `RecGroupStructure` utility can assume that all
child `HeapTypes` have already been canonicalized and can avoid traversing
anything but the top-level type definitions in the wrapped recursion group. The
only exception is self-references into the wrapped recursion group itself, which
may not be canonical. That special case is detected and handled without any
nontrivial extra work.
Overall, the canonicalization algorithm traverses each type definition once to
canonicalize its `HeapType` use sites, then once to hash its recursion group
structure, then finally once to canonicalize its `Type` use sites in
`globallyCanonicalize`, giving only linear time complexity.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
class (#4479)
As recently discussed, the interpreter code is way too complex. Trying to add
ctor-eval stuff I need, I got stuck and ended up spending some time to get rid
of some of the complexity.
We had a ModuleInstanceBase class which was basically an instance of a
module, that is, an execution of it. And internally we have RuntimeExpressionRunner
which is a runner that integrates with the ModuleInstanceBase - basically, it uses
the runtime info to execute code. For example, the MIB has globals info, and the
RER would read it from there.
But these two classes are really just one functionality - an execution of a module.
We get rid of some complexity by removing the separation between them, ending
up with a class that can run a module.
One set of problems we avoid is that we can now extend the single class in a
simple way. Before, we would need to extend both - and inform each other of
those changes. That gets "fun" with CRTP which we use everywhere. In other
words, each of the two classes depended on the other / would need to be
templated on the other. Specifically, MIB.callFunction would need to be given
the RER to run with, and so that would need to be templated on it. This ends up
leading to a bunch more templating all around - all complexity that we just
don't need. See the simplification to the wasm-ctor-eval for some of that (and
even worse complexity would have been needed without this PR in the next
steps for that tool to eval GC stuff).
The final single class is now called ModuleRunner.
Also fixes a pre-existing issue uncovered by this PR. We had the delegate
target on the runner, but it should be tied to a function scope. This happened
to not be a problem if one always created a new runner for each scope, but
this PR makes the runner longer-lived, so the stale data ended up mattering.
The PR moves that data to the proper place.
Note: Diff without whitespace is far, far smaller.
|
|
|
|
|
| |
We emitted the right text to stdout to indicate a trap in one code path, but did
not return a Trap from the function. As a result, we'd continue and hit the
assert on the next line.
|
| |
|
|
|
|
|
| |
Storing the rec group index on the HeapTypeInfo avoids having to do a linear
scan through the rec group to find the index for a particular type. This will
be important for isorecursive canonicalization, which uses rec group indices.
|
|
|
|
|
|
|
| |
Since isorecursive canonicalization will happen incrementally in a bottom-up
manner, it will be more efficient if can more precisely control which types get
updated at each step. Refactor the `CanonicalizationState` interface to
separately expose shallow updating, which updates only the top-level state, and
deep updating of the use sites within HeapTypeInfos.
|
|
|
|
|
|
|
|
|
| |
Add a base class for it, that is templated and can be extended in general
ways, and make callFunction templated on the runner to use as well.
This allows the interpreter's behavior to be customized in a way that we
couldn't so far. wasm-ctor-eval wants to use a special Runner when it
evals a function, so that it can track certain operations, which this will
enable.
|
|
|
|
| |
After emscripten-core/emscripten#15905 lands Emscripten will no longer use it,
and nothing else needs it AFAIK.
|
|
|
|
|
|
|
|
| |
In asm2wasm we modelled debugInfo using special imports. And we tried
to not move them around much. Current debugInfo is tracked on
instructions and is not affected by removing this.
This may have some tiny effect beneficial effect on code size in debug
builds, perhaps.
|
|
|
|
|
|
|
|
|
|
|
| |
(#4399)
Final part of #4265
(i32(x) >= 0) & (i32(y) >= 0) ==> i32(x | y) >= 0
(i64(x) >= 0) & (i64(y) >= 0) ==> i64(x | y) >= 0
(i32(x) == -1) & (i32(y) == -1) ==> i32(x & y) == -1
(i64(x) == -1) & (i64(y) == -1) ==> i64(x & y) == -1
|
|
|
|
|
|
|
| |
When building isorecursive types, validate their relationships according to the
rules described in https://github.com/WebAssembly/gc/pull/243. Specifically,
supertypes must be declared before their subtypes to statically prevent cycles
and child types must be declared either before or in the same recursion group as
their parents.
|
|
|
|
|
| |
Rewrite AsyncifyFlow.process to use stack instead of recursive call.
This patch resolves #4401
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible for type building to fail, for example if the declared nominal
supertypes form a cycle or are structurally invalid. Previously we would report
a fatal error and kill the program from inside `TypeBuilder::build()` in these
situations, but this handles errors at the wrong layer of the code base and is
inconvenient for testing the error cases.
In preparation for testing the new error cases introduced by isorecursive
typing, make type building fallible and add new tests for existing error cases.
Also fix supertype cycle detection, which it turns out did not work correctly.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In `--hybrid` isorecursive mode, associate each defined type with a recursion
group, represented as a `(rec ...)` wrapping the type definitions in the text
format. Parse that text format, create the rec groups using a new TypeBuilder
method, and print the rec groups in the printer.
The only semantic difference rec groups currently make is that if one type in a
rec group will be included in the output, all the types in that rec group will
be included. This is because changing a rec group in any way (for example by
removing a type) changes the identity of the types in that group in the
isorecursive type system. Notably, rec groups do not yet participate in
validation, so `--hybrid` is largely equivalent to `--nominal` for now.
|
|
|
|
|
|
|
| |
This function call now takes the address (which by defintion is outside
of the stack range) that the program was attempting to set SP to.
This allows emscripten to provide a more useful error message on stack
over/under flow.
|
|
|
|
|
|
|
| |
Add a utility class for defining all the common operations like pre- and post-
increment and decrement, addition and subtraction, and assigning addition and
subtraction for iterators that are comprised of a parent object and an index
into that parent object. Use the new utility to reduce the boilerplate in
wasm-type.h. Add a new test of the iterator behavior.
|
|
|
|
|
| |
Add a `destroyAllTypes` function to clear the global state of the type system
and use it in a custom gtest test fixture to ensure that each test starts and
ends with a fresh state.
|
|
|
|
|
|
|
|
|
| |
(#4372)
(i32(x) >= 0) | (i32(y) >= 0) ==> i32(x & y) >= 0
(i64(x) >= 0) | (i64(y) >= 0) ==> i64(x & y) >= 0
(i32(x) != -1) | (i32(y) != -1) ==> i32(x & y) != -1
(i64(x) != -1) | (i64(y) != -1) ==> i64(x & y) != -1
|
|
|
|
|
|
|
|
| |
This field was originally added with the goal of allowing types from multiple
type systems to coexist by determining the type system on a per-type level
rather than globally. This goal was never fully achieved and the `isNominal`
field is not used outside of tests. Now that we are working on implementing the
hybrid isorecursive system, it does not look like having types from multiple
systems coexist will be useful in the near term, so clean up this tech debt.
|
|
|
|
|
|
|
|
|
|
|
| |
There is no reason the `__stack_pointer` global can't be exported from
the module, and in fact I'm experimenting with a non-relocatable main
module that requires this.
See https://github.com/emscripten-core/emscripten/issues/12682
This heuristic still kind of sucks but should always be good enough
for llvm output that always puts the stack pointer first.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add gtest as a git submodule in third_party and integrate it into the build the
same way WABT does. Adds a new executable, `binaryen-unittests`, to execute
`gtest_main`. As a nontrivial example test, port one of the `TypeBuilder` tests
from example/ to gtest/.
Using gtest has a number of advantages over the current example tests:
- Tests are compiled and linked at build time rather than runtime, surfacing
errors earlier and speeding up test execution.
- Tests are all built into a single binary, reducing overall link time and
further reducing test overhead.
- Tests are built from the same CMake project as the rest of Binaryen, so
compiler settings (e.g. sanitizers) are applied uniformly rather than having
to be separately set via the COMPILER_FLAGS environment variable.
- Using the industry-standard gtest rather than our own script reduces our
maintenance burden.
Using gtest will lower the barrier to writing C++ tests and will hopefully lead
to us having more proper unit tests.
|
|
|
|
|
|
| |
Since https://reviews.llvm.org/D117412 landed it has causes a bunch of
SAFE_HEAP tests in emscripten to start failing, because
`__wasm_apply_data_relocs` can now sometimes be called from with
`__wasm_init_memory` as opposed to directly from the start function.
|
|
|
|
|
| |
Eventually this will enable the isorecursive hybrid type system described in
https://github.com/WebAssembly/gc/pull/243, but for now it just throws a fatal
error if used.
|
|
|
|
|
|
| |
This is useful for the case where we might want to finalize
without extracting metadata.
See: https://github.com/emscripten-core/emscripten/pull/15918
|
|
|
|
|
|
|
|
|
|
|
| |
This PR is part of the solution to emscripten-core/emscripten#15594.
emscripten Asyncify won't work properly in side modules, because the
globals, __asyncify_state and __asyncify_data, are not synchronized
between main-module and side-modules.
A new pass arg, asyncify-side-module, is added to make
__asyncify_state and __asyncify_data imported in the instrumented
wasm.
|
|
|
|
|
| |
Update the API to make both the type indices and optimized sorting optional.
It will become more important to avoid unnecessary sorting once isorecursive
types have been implemented because they will make the sorting more complicated.
|
|
|
|
|
| |
without "ignoreImplicitTraps" (#4295)" (#4459)
This reverts commit 5cf3521708cfada341285414df2dc7366d7e5454.
|
|
|
|
|
| |
In the common case, avoid allocating a vector and calling malloc.
This makes us over 3x faster on the benchmark in #4452
|
|
|
|
|
|
|
| |
Handle the isBasic() case first - that inlined function is very fast to call,
and it is the common case. Also, do not do unnecessary work there: just
write out what we need, instead of always doing a memcpy of 16 bytes.
This makes us over 2x faster on the benchmark in #4452
|
|
|
|
|
| |
We call this very frequently in the interpreter.
This is a 25% speedup on the benchmark in #4452
|
|
|
| |
In preparation for the refactoring in #4455.
|
|
|
|
|
|
|
| |
LiteralList overlaps with Literals, but is less efficient as it is not a
SmallVector.
Add reserve/capacity methods to SmallVector which are now
necessary to compile.
|
|
|
|
| |
"ignoreImplicitTraps" (#4295)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When ignoring external input, assume params have a value of 0. This
makes it possible to eval main(argc, argv) if one is careful and does
not actually use those values.
This is basically a workaround for main always receiving argc/argv,
even if the C code has no args (in that case the compiler emits
__original_main for the user's main, and wraps it with a main
that adds the args, hence the problem).
This is similar to the existing support for handling wasi_args_get
when ignoring external input, although it just sets values of zeros for
the params. Perhaps it could check for main() specifically and return
1 for argc and a proper buffer for argv somehow, but I think if a program
wants to use --ignore-external-input it can avoid actually reading
argc/argv.
|
|
|
|
| |
(#4448)
|
|
|
| |
This is necessary for e.g. main() which returns an i32.
|
|
|
|
|
|
|
|
| |
This tool depends (atm) on flattening memory segments. That is not compatible
with memory.init which cares about segment identities.
This changes flatten() only by adding the check for MemoryInit. The rest is
unchanged, although I saw the other two params are not needed and I removed
them while I was there.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By default wasm-ctor-eval removes exports that it manages to completely
eval (if it just partially evals then the export remains, but points to a function
with partially-evalled contents). However, in some cases we do want to keep
the export around even so, for example during fuzzing (as the fuzzer wants
to call the same exports before and after wasm-ctor-eval runs) and also
if there is an ABI we need to preserve (like if we manage to eval all of
main()), or if the function returns a value (which we don't support yet, but
this is a PR to prepare for that).
Specifically, there is now a new option:
--kept-exports foo,bar
That is a list of exports to keep around.
Note that when we keep around an export after evalling the ctor we
make the export point to a new function. That new function just
contains a nop, so that nothing happens when it is called. But the
original function is kept around as it may have other callers, who we
do not want to modify.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This lets us eval part of a function but not all, which is necessary to handle
real-world things like __wasm_call_ctors in LLVM output, as that is the
single ctor that is exported and it has calls to the actual ctors.
To do so, we look for a toplevel block and execute its items one by one, in
a FunctionScope. If we stop in the middle, then we are performing a partial
eval. In that case, we only remove the parts of the function that we removed,
and we also serialize the locals whose values we read from the
FunctionScope.
For example, consider this:
function foo() {
return 10;
}
function __wasm_call_ctors() {
var x;
x = foo();
x++;
// We stop evalling here.
import1();
import2(x);
}
We can eval x = foo() and x++, but we must stop evalling when
we reach the first of those imports. The partially-evalled function
then looks like this:
function __wasm_call_ctors() {
var x;
x = 11;
import1();
import2(x);
}
That is, we evalled two lines of executing code and simply removed
them, and then we wrote out the value of the local at that point, and then
the rest of the code in the function is as it used to be.
|
| |
|
|
|
|
|
|
|
|
| |
As it happens, this doesn't (normally) break the resulting EM_ASM or
EM_JS strings because (IIUC) JS supports the tab literal inside of
strings as well as "\t".
However, it's better to preserve the original text so that it looks
the same in the JS file as it did in the original source.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since https://github.com/emscripten-core/emscripten/pull/15905 landed
emscripten now includes its own dummy atexit function when building with
EXIT_RUNTIME=0.
This dummy function conflicts with the emscripten-provided one:
```
wasm-ld: error: duplicate symbol: atexit
>>> defined in CMakeFiles/binaryen_wasm.dir/src/binaryen-c.cpp.o
>>> defined in ...wasm32-emscripten/lto/libnoexit.a(atexit_dummy.o)
```
Normally overriding symbols from libc does not causes issues but one
needs to be sure to override all the symbols in a given object file so
that the object in question (atexit_dummy.o) does not get linked in. In
this case some other symbol being defined in in atexit_dummy.o (e.g.
__cxa_atexit) is likely the cause of the conflict.
Overriding symbols from libc is likely to break in this way as the libc
evolves, and since emscripten is now providing a dummy, just as we want,
its better/safer to simply remove our dummy.
|
|
|
|
|
| |
This logging is central to what this tool does, and not optional, so stdout
makes more sense I think. Also, as I'm re-integrating this on the emscripten
side, this makes it simpler.
|
|
|
|
|
|
|
| |
Fixes the crash in #4418
Also replace the .at() there with better logic to handle imported functions.
See WebAssembly/wabt#1799 for details on why wabt sometimes emits this.
|
|
|
|
|
|
|
|
|
|
| |
This is necessary for being able to optimize real-world code, as it lets us
use the stack pointer for example. With this PR we allow changes to
globals, and we simply store the final state of the global in the global at
the end. Basically the same as we do for memory, but for globals.
Remove a test that now fails ("imported2"). Replace it with a nicer test
of saving the values of globals. Also add a test for an imported global,
which we do not allow (we never did, but I don't see a test for it).
|