| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
| |
See https://github.com/WebAssembly/extended-const
|
|
|
|
|
|
|
| |
* use [[noreturn]] available since C++11 instead of compiler-specific attributes
* replace deprecated std::is_pod with is_trivial&&is_standard_layout (also available since C++11/14)
* explicitly capture this in [=] lambdas
* extra const functions in FeatureSet, fix implicit cast warning by using the features field directly
* Use CMAKE_CXX_STANDARD to ensure the C++ standard parameter is set on all targets, remove manual compiler flag workaround.
|
| |
|
|
|
|
|
|
| |
Export an object with a `.value` property like the wasm JS API does
in browsers, and implement them with a getter and setter.
Fixes #4522
|
|
|
|
|
|
|
| |
When copying a MemorySize or MemoryGrow instruction (e.g. for inlining),
transfer the memory type also to the copy. Otherwise it will always be
i32, even if memory64 should be used.
This fixes issue #4530.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Merge similar functions that only differs constant values (like immediate
operand of const and call insts) by parameterization.
Performing this pass at post-link time can merge more functions across
objects. Inspired by Swift compiler's optimization which is derived from
LLVM's one:
https://github.com/apple/swift/blob/main/lib/LLVMPasses/LLVMMergeFunctions.cpp
https://github.com/llvm/llvm-project/blob/main/llvm/docs/MergeFunctions.rst
The basic ideas here are constant value parameterization and direct callee
parameterization by indirection.
Constant value parameterization is like below:
;; Before
(func $big-const-42 (result i32)
[[many instr 1]]
(i32.const 44)
[[many instr 2]]
)
(func $big-const-43 (result i32)
[[many instr 1]]
(i32.const 45)
[[many instr 2]]
)
;; After
(func $byn$mgfn-shared$big-const-42 (result i32)
[[many instr 1]]
(local.get $0) ;; parameterized!!
[[many instr 2]]
)
(func $big-const-42 (result i32)
(call $byn$mgfn-shared$big-const-42
(i32.const 42)
)
)
(func $big-const-43 (result i32)
(call $byn$mgfn-shared$big-const-42
(i32.const 43)
)
)
Direct callee parameterization is similar to the constant value parameterization,
but it parameterizes callee function i by ref.func instead. Therefore it is enabled
only when reference-types and typed-function-references features are enabled.
I saw 1 ~ 2 % reduction for SwiftWasm binary and Ruby's wasm port
using wasi-sdk, and 3 ~ 4.5% reduction for Unity WebGL binary when -Oz.
|
|
|
|
| |
We were missing this particular case, which we can in fact handle
when the cast is static.
|
|
|
|
| |
Instead of just reporting the type index that causes an error when building
types, report the name of the responsible type when parsing the text format.
|
|
|
|
|
| |
Introduce static consts with PassOptions Defaults.
Add assertion to verify that the default options are the Os options.
Also update the text in relevant tests.
|
| |
|
|
|
|
|
| |
Allow IndexedTypeNameGenerator to be configured with a custom prefix and also
allow it to be parameterized with an explicit fallback generator. This allows
multiple IndexedTypeNameGenerators to be composed together, for example.
|
|
|
|
|
|
|
| |
Add an option for running the asyncify transformation on the primary module
emitted by wasm-split. The idea is that the placeholder functions should be able
to unwind the stack while the secondary module is asynchronously loaded, then
once the placeholder functions have been patched out by the secondary module the
stack should be rewound and end up in the correct secondary function.
|
|
|
|
|
|
|
| |
Add a new fuzz checker to wasm-type-fuzzer that builds copies of the originally
built types, randomly selecting for each child type from all potential sources,
including both the originally built types and the not-yet-built duplicate types.
After building the new types, check that they are indeed identical to the old
types, which means that nothing has gone wrong with canonicalization.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
We were already eagerly canonicalizing basic HeapTypes when building types so
the more complicated canonicalization algorithms would not have to handle
noncanonical heap types, but we were not doing the same for Types. Equirecursive
canonicalization was properly handling noncanonical Types everywhere, but
isorecursive canonicalization was not. Rather than update isorecursive
canonicalization in multiple places, remove the special handling from
equirecursive canonicalization and canonicalize types in a single location.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous printing system in the Types API would print the full recursive
structure of a Type or HeapType with special markers using de Bruijn indices to
avoid infinite recursion and a separate special marker for when the size
exceeded an arbitrary upper limit. In practice, the types printed by that system
were not human readable, so all that complexity was not useful.
Replace that system with a new system that always emits a HeapType name rather
than recursing into the structure of inner HeapTypes. Add methods for printing
Types and HeapTypes with custom HeapType name generators. Also add a new
wasm-type-printing.h header with off-the-shelf type name generators that
implement simple naming schemes sufficient for tests and the type fuzzer.
Note that these new printing methods and the old printing methods they augment
are not used for emitting text modules. Printing types as part of expressions
and modules is handled by separate code in Print.cpp and the printing API
modified in this PR is mostly used for debugging. However, the new printing
methods are general enough that Print.cpp should be able to use them as well, so
update the format used to print types in the modified printing system to match
the text format in anticipation of making that change in a follow-up PR.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Validating nominally declared supertypes depends on being able to test type
equality, which in turn depends on the compared types having already been
canonicalized. Previously we validated supertypes before canonicalization, so
validation would fail in cases where it should have succeeded. Fix the bug by
canonicalizing first. This means that the global type store can now end up
holding invalid types, but those types will never be exposed to the user, so
that's not a huge problem.
Also fix an unrelated bug that was preventing the test from passing, which is
that supertypes were being hashed and compared without the special logic for
detecting self-references. This bug preventing the equivalent recursion groups
in the test from being deduplicated.
|
|
|
|
| |
This makes it easier to get an overview of what methods exist by looking at the
shorter struct definition.
|
|
|
|
|
|
|
|
|
|
| |
Add support for isorecursive types to wasm-fuzz-types by generating recursion
groups and ensuring that children types are only selected from candidates
through the end of the current group. For non-isorecursive systems, treat all
the types as belonging to a single group so that their behavior is unchanged.
Also fix two small bugs found by the fuzzer: LUB calculation was taking the
wrong path for isorecursive types and isorecursive validation was not handling
basic heap types properly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These might help reduction. Most newer passes, like say --type-refining, are not
going to actually help by themselves without other passes, so those are not added
(they get run in the -O2 etc. modes, which at least gives them a chance to help).
DeadArgumentElimination: Might help by itself, if just removing arguments reduces
code size. In some cases applying constants may increase code size, though, but
the -optimizing variant helps there.
GlobalTypeOptimization: This can remove type fields which can shrink the type
section by a lot. This is the reason I realized I should open this PR, when I
happened to notice that running that pass manually after reduction helped a lot more.
SimplifyGlobals: Can remove unused globals, merge identical immutable ones,
etc., all of which can help code size directly.
|
|
|
|
|
|
|
|
| |
Refactor the `TopologicalSortStack` into a `TopologicalSort` CRTP utility that
manipulates the DFS stack internally instead of exposing it to the user. The
only method users have to implement to use the utility is `pushPredecessors`,
which should call the provided `push` method on all the predecessors of a given
item. The public interface to `TopologicalSort` is an input iterator, so it can
easily be used in range-based for loops.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This ended up simpler than I thought. We can simply emit global and
local data as we go, creating globals as necessary to contain GC data,
and referring to them using global.get later. That will ensure that
data identity works (things referring to the same object in the interpreter
will refer to the same object when the wasm is loaded). In more detail,
each live GC item is created in a "defining global", a global that is
immutable and of the precise type of that data. Then we just read from
that location in any place that wants to refer to that data. That is,
something like
function foo() {
var x = Bar(10);
var y = Bar(20);
var z = x;
z.value++; // first object now contains 11
...
}
will be evalled into something like
var define$0 = Bar(11); // note the ++ has taken effect here
var define$1 = Bar(20);
function foo() {
var x = define$0;
var y = define$1;
var z = define$0;
...
}
This PR should handle everything but "cycles", that is, GC data that at
runtime ends up forming a loop. Leaving that for later work (not sure
how urgent it is to fix).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass ignores reads from structs - it only cares about writes (during a
create or a struct.set). That makes sense since we want to refine the type
of fields to more specific things based on what is actually written to them.
However, a corner case was missed: If we ignore reads, the pass may
"cleverly" optimize to something that is no longer valid to read from. How
that happens is if there is no info at all for a type - no sets or news, so all
we have is a read, which as mentioned before we ignore, so we think we
have nothing at all for that type, and can do arbitrary stuff with it. But then
the arbitrary replacement can be invalid to read from, say if it has fewer
fields.
To handle that, just emit an unreachable. If all we have is a get but no
new then there cannot be an instance here at all. (That's only true in a
closed world, of course, but this entire pass assumes that anyhow.)
|
|
|
|
|
|
|
| |
Using a vector and lazily skipping finished items in `pop` rather than using a
linked list and eagerly removing duplicated items makes the code simpler and
more than twice as fast on my test case. The algorithmic complexity is unchanged
because the work to skip duplicates remains linear in the number of duplicates
added, it's just not spread out over the linear duplicate pushes any more.
|
|
|
|
|
|
|
|
|
|
|
| |
Write and parse recursion groups in binary type sections. Unlike in the text
format, where we ignore recursion groups when not using isorecursive types, do
not allow parsing binary recursion group when using other type systems. Doing so
would produce incorrect results because recursions groups only count as single
entries in the type system vector so we dynamically grow the TypeBuilder when we
encounter them. That would change the mapping of later indices to types, and
would change the meaning of previous type definitions that use those later
indices. This is not a problem in the isorecursive system because in that system
type definitions are not allowed to use later indices.
|
|
|
|
| |
We can only call getHeapType if it is indeed a function type. Otherwise we should
show the error and move on.
|
|
|
|
| |
The IsorecursiveTest.CanonicalizeSelfReferences has been frequently failing on
Windows and MacOS CI. Disable it for now until I can investigate thoroughly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generally we try to order types by decreasing use count so that frequently used
types get smaller indices. For the equirecursive and nominal systems, there are
no contraints on the ordering of types, so we just have to sort them according
to their use counts. For the isorecursive type system, however, there are a
number of ordering constraints that have to be met for the type section to be
valid. First, types in the same recursion group must be adjacent so they can be
grouped together. Second, groups must be ordered topologically so that they only
refer to types in themselves or prior groups.
Update type ordering to produce a valid isorecursive output by performing a
topological sort on the recursion groups. While performing the sort, prefer to
visit and finish processing the most used groups first as a heuristic to improve
the final ordering.
Do not reorder types within groups, since doing so would change type identity
and could affect the external interface of the module. Leave that reordering to
an optimization pass (not yet implemented) that users can explicitly opt in to.
|
|
|
|
|
| |
Add the missing initialization instructions to the `Building` section.
Fixes #4488.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GlobalManager is another class that added complexity in the interpreter logic,
and did not help. In fact it hurts extensibility, as when one wants to extend the
interpreter one has another class to customize, and it is templated on the main
runner, so again as #4479 we end up with annoying template cycles.
This simply removes that class. That makes the interpreter code strictly
simpler. Applying that change to wasm-ctor-eval also ends up fixing a
pre-existing bug, so this PR gets testing through that.
The ctor-eval issue was that we did not extend the GlobalManager properly
in the past: we checked for accesses on imported globals there, but not in
the main class, i.e., not on global.get operations. Needing to do things in
two places is an example of the previous complexity. The fix is simply to
implement visitGlobalGet in one place, and remove all the GlobalManager
logic added in ctor-eval, which then gets a lot simpler as well.
The new imported-global-2.wast checks for that bug (a global.get of an
import should stop us from evalling). Existing tests cover the other cases,
like it being ok to read a non-imported global, etc. The existing test
indirect-call3.wast required a slight change: There was a global.get of
an imported global, which was ignored in the place it happened (an init
of an elem segment); the new code checks all global.gets, so it now
catches that.
|
|
|
|
|
| |
Update the HeapType constructors that take Signature, Structs, and Arrays to
work properly with isorecursive typing. This is particularly important for the
Signature constructor, which is used frequently throughout the code base.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Isorecursive canonicalization is similar to equirecursive canonicalization in
that it deduplicates types based on their structure, but unlike equirecursive
canonicalization, isorecursive canonicalization does not need to minimize the
type definition first. Another difference is that structures are deduplicated at
the level of recursion groups rather than individual types, so we cannot reuse
the existing `FiniteShapeHasher` and `FiniteShapeEquator` utilities. Instead,
introduce a new `RecGroupStructure` wrapper that provides equality comparison
and hashing very similar to the former utilities.
Another feature of the isorecursive type system is that its constraints on the
order of recursion groups means that they are already topologically sorted and
can be incrementally canonicalized in a bottom-up manner. This incremental
canonicalization means that the `RecGroupStructure` utility can assume that all
child `HeapTypes` have already been canonicalized and can avoid traversing
anything but the top-level type definitions in the wrapped recursion group. The
only exception is self-references into the wrapped recursion group itself, which
may not be canonical. That special case is detected and handled without any
nontrivial extra work.
Overall, the canonicalization algorithm traverses each type definition once to
canonicalize its `HeapType` use sites, then once to hash its recursion group
structure, then finally once to canonicalize its `Type` use sites in
`globallyCanonicalize`, giving only linear time complexity.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
class (#4479)
As recently discussed, the interpreter code is way too complex. Trying to add
ctor-eval stuff I need, I got stuck and ended up spending some time to get rid
of some of the complexity.
We had a ModuleInstanceBase class which was basically an instance of a
module, that is, an execution of it. And internally we have RuntimeExpressionRunner
which is a runner that integrates with the ModuleInstanceBase - basically, it uses
the runtime info to execute code. For example, the MIB has globals info, and the
RER would read it from there.
But these two classes are really just one functionality - an execution of a module.
We get rid of some complexity by removing the separation between them, ending
up with a class that can run a module.
One set of problems we avoid is that we can now extend the single class in a
simple way. Before, we would need to extend both - and inform each other of
those changes. That gets "fun" with CRTP which we use everywhere. In other
words, each of the two classes depended on the other / would need to be
templated on the other. Specifically, MIB.callFunction would need to be given
the RER to run with, and so that would need to be templated on it. This ends up
leading to a bunch more templating all around - all complexity that we just
don't need. See the simplification to the wasm-ctor-eval for some of that (and
even worse complexity would have been needed without this PR in the next
steps for that tool to eval GC stuff).
The final single class is now called ModuleRunner.
Also fixes a pre-existing issue uncovered by this PR. We had the delegate
target on the runner, but it should be tied to a function scope. This happened
to not be a problem if one always created a new runner for each scope, but
this PR makes the runner longer-lived, so the stale data ended up mattering.
The PR moves that data to the proper place.
Note: Diff without whitespace is far, far smaller.
|
|
|
|
|
| |
We emitted the right text to stdout to indicate a trap in one code path, but did
not return a Trap from the function. As a result, we'd continue and hit the
assert on the next line.
|
| |
|
|
|
|
|
| |
Storing the rec group index on the HeapTypeInfo avoids having to do a linear
scan through the rec group to find the index for a particular type. This will
be important for isorecursive canonicalization, which uses rec group indices.
|
|
|
|
|
|
|
| |
Since isorecursive canonicalization will happen incrementally in a bottom-up
manner, it will be more efficient if can more precisely control which types get
updated at each step. Refactor the `CanonicalizationState` interface to
separately expose shallow updating, which updates only the top-level state, and
deep updating of the use sites within HeapTypeInfos.
|
|
|
|
|
|
|
|
|
| |
Add a base class for it, that is templated and can be extended in general
ways, and make callFunction templated on the runner to use as well.
This allows the interpreter's behavior to be customized in a way that we
couldn't so far. wasm-ctor-eval wants to use a special Runner when it
evals a function, so that it can track certain operations, which this will
enable.
|
|
|
|
| |
After emscripten-core/emscripten#15905 lands Emscripten will no longer use it,
and nothing else needs it AFAIK.
|
|
|
|
|
|
|
|
| |
In asm2wasm we modelled debugInfo using special imports. And we tried
to not move them around much. Current debugInfo is tracked on
instructions and is not affected by removing this.
This may have some tiny effect beneficial effect on code size in debug
builds, perhaps.
|
|
|
|
|
|
|
|
|
|
|
| |
(#4399)
Final part of #4265
(i32(x) >= 0) & (i32(y) >= 0) ==> i32(x | y) >= 0
(i64(x) >= 0) & (i64(y) >= 0) ==> i64(x | y) >= 0
(i32(x) == -1) & (i32(y) == -1) ==> i32(x & y) == -1
(i64(x) == -1) & (i64(y) == -1) ==> i64(x & y) == -1
|
|
|
|
|
|
|
| |
When building isorecursive types, validate their relationships according to the
rules described in https://github.com/WebAssembly/gc/pull/243. Specifically,
supertypes must be declared before their subtypes to statically prevent cycles
and child types must be declared either before or in the same recursion group as
their parents.
|
|
|
|
|
| |
Rewrite AsyncifyFlow.process to use stack instead of recursive call.
This patch resolves #4401
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible for type building to fail, for example if the declared nominal
supertypes form a cycle or are structurally invalid. Previously we would report
a fatal error and kill the program from inside `TypeBuilder::build()` in these
situations, but this handles errors at the wrong layer of the code base and is
inconvenient for testing the error cases.
In preparation for testing the new error cases introduced by isorecursive
typing, make type building fallible and add new tests for existing error cases.
Also fix supertype cycle detection, which it turns out did not work correctly.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In `--hybrid` isorecursive mode, associate each defined type with a recursion
group, represented as a `(rec ...)` wrapping the type definitions in the text
format. Parse that text format, create the rec groups using a new TypeBuilder
method, and print the rec groups in the printer.
The only semantic difference rec groups currently make is that if one type in a
rec group will be included in the output, all the types in that rec group will
be included. This is because changing a rec group in any way (for example by
removing a type) changes the identity of the types in that group in the
isorecursive type system. Notably, rec groups do not yet participate in
validation, so `--hybrid` is largely equivalent to `--nominal` for now.
|
|
|
|
|
|
|
| |
This function call now takes the address (which by defintion is outside
of the stack range) that the program was attempting to set SP to.
This allows emscripten to provide a more useful error message on stack
over/under flow.
|