| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
We already ignore OOMs in the interpreter. This adds the syntax for V8, where
I just saw such an error (on an array.new of a massive size).
| |
We used to have a wasm-merge tool but removed it for a lack of use cases. Recently
use cases have been showing up in the wasm GC space and elsewhere, as people are
using more diverse toolchains together, for example a project might build some C++
code alongside some wasm GC code. Merging those wasm files together can allow
for nice optimizations like inlining and better DCE etc., so it makes sense to have a
tool for merging.
Background:
* Removal: #1969
* Requests:
* wasm-merge - why it has been deleted #2174
* Compiling and linking wat files #2276
* wasm-link? #2767
This PR is a complete rewrite of wasm-merge, not a restoration of the original
codebase. The original code was quite messy (my fault), and also, since then
we've added multi-memory and multi-table which makes things a lot simpler.
The linking semantics are as described in the "wasm-link" issue #2767 : all we do
is merge normal wasm files together and connect imports and exports. That is, we
have a graph of modules and their names, and each import to a module name can
be resolved to that module. Basically, like a JS bundler would do for JS, or, in other
words, we do the same operations as JS code would do to glue wasm modules
together at runtime, but at compile time. See the README update in this PR for a
concrete example.
There are no plans to do more than that simple bundling, so this should not
really overlap with wasm-ld's use cases.
This should be fairly fast as it works in linear time on the total input code. However,
it won't be as fast as wasm-ld, of course, as it does build Binaryen IR for each
module. An advantage to working on Binaryen IR is that we can easily do some
global DCE after merging, and further optimizations are possible later.
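The import resolution described above can be sketched as a toy model. This is only an illustration of the idea, not Binaryen's implementation: the `Module` class, the `merge` function, and the "module.local" naming scheme are all hypothetical.

```python
# Toy model of the bundling described above; not Binaryen's implementation.
# All names here (Module, merge, the "module.local" scheme) are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Module:
    name: str
    # Maps a local function name to the (module name, export name) it imports.
    imports: dict = field(default_factory=dict)
    # Maps an export name to the local function name that implements it.
    exports: dict = field(default_factory=dict)


def merge(modules):
    """Resolve each import whose module name matches one of the inputs.

    Returns (resolved, unresolved). resolved maps "module.local" to the
    "module.local" function that provides it. unresolved lists the imports
    that would have to remain imports of the merged module.
    """
    by_name = {m.name: m for m in modules}
    resolved, unresolved = {}, []
    for m in modules:
        for local, (src_module, src_export) in m.imports.items():
            provider = by_name.get(src_module)
            if provider is not None and src_export in provider.exports:
                target = provider.exports[src_export]
                resolved[f"{m.name}.{local}"] = f"{provider.name}.{target}"
            else:
                unresolved.append((m.name, local, src_module, src_export))
    return resolved, unresolved


first = Module("first", imports={"call_bar": ("second", "bar"),
                                 "log": ("env", "log")})
second = Module("second", exports={"bar": "bar_impl"})
resolved, unresolved = merge([first, second])
```

In this sketch the import of `second.bar` is connected to its implementation, while the import from `env` stays unresolved because no input module has that name.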
| |
This removes the trapping export and all others after it. This avoids a potential
infinite loop that can happen when fuzzing TNH, as if TNH is set and a trap
happens then the optimizer can cause an iloop, and while that is valid, it would hang the
fuzzer. We could check for a timeout, but it is faster and more robust to just
remove the code we can't compare anyhow.
This uses wasm-metadce to remove the failing export and all exports after it.
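As a rough sketch of the selection step only (the actual removal is done by wasm-metadce, which is not modeled here), choosing which exports survive might look like the following. The helper name and the "trap" marker string are assumptions for illustration.

```python
# Hypothetical helper: decides which exports to keep, nothing more.
def exports_to_keep(export_names, results):
    """Return the exports that ran before the first trap.

    results[i] is the outcome of calling export_names[i]. The string "trap"
    is an assumed marker meaning that the call trapped.
    """
    kept = []
    for name, result in zip(export_names, results):
        if result == "trap":
            break  # drop this export and every export after it
        kept.append(name)
    return kept


kept = exports_to_keep(["a", "b", "c"], ["1", "trap", "2"])
```

Here only `a` is kept: `b` trapped, and `c` is dropped as well because it runs after the trap.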
| |
Data/Elem (#5692)
ArrayNewSeg => ArrayNewSegData, ArrayNewSegElem
ArrayInit => ArrayInitData, ArrayInitElem
Basically we remove the opcode and use the class type to differentiate them.
This adds some code but it makes the representation simpler and more compact in
memory, and it will help with #5690
| |
DCE at the end avoids issues with non-nullable local operations in unreachable
code, which is still being discussed. This PR avoids fuzzer errors for now, but we
should revert it when we have a proper fix.
See
* #5599
* #5665
* https://github.com/WebAssembly/function-references/issues/98
| |
After this change, the only type system usable from the tools will be the
standard isorecursive type system. The nominal type system is still usable via
the API, but it will be removed entirely in a follow-on PR.
| |
These complement array.copy, which we already supported, as an initial complete
set of bulk array operations. Replace the WIP spec tests with the upstream spec
tests, lightly edited for compatibility with Binaryen.
| |
I ran CheckDeterminism at full throttle overnight (set to 1, and disabled
all other things) and it found a bug, so we should focus on that more.
Also ctor-eval as there is ongoing work there.
I reduced a few other priorities of things that haven't seen bugs in a
very long time and are not high priority.
| |
This is the default, and also used by J2Wasm.
| |
| |
I saw a testcase fail on the internal assertion of the buffer being too small.
Enlarge it to use as much of the memory we have anyhow to reduce that
risk (we can use 15 pages instead of 1, without changing anything else).
| |
| |
TypeMerging previously tried to merge types with their supertypes and siblings
in a single step, but this could cause a misoptimization in which a type was
merged with its parent's sibling without being merged with its parent, breaking
subtyping.
Fix the bug by merging with supertypes and siblings separately. Since we now
have multiple merging steps, also take the opportunity to run the sibling
merging step multiple times to exploit more merging opportunities.
Fixes #5556.
| |
For example, we might hit an allocation limit in the wasm, but the
optimized wasm might optimize that allocation out. So we need to
ignore comparisons in such cases, as we cannot expect the output
to be identical. We already do similar things for FuzzExec and
#5560 adds it for TrapsNeverHappen; this adds it to CompareVMs.
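A minimal sketch of that comparison rule, assuming the VM output is captured as text and that a host limitation shows up as a recognizable marker. The marker string and function name below are made up for illustration and are not taken from the fuzzer.

```python
HOST_LIMIT_MARKER = "[host limit"  # assumed marker text, for illustration only


def compare_outputs(before, after):
    """Return "ignored", "same" or "different" for two execution logs.

    If either run hit a host limit (for example an allocation limit), the
    runs are not expected to match, so the comparison is skipped.
    """
    if HOST_LIMIT_MARKER in before or HOST_LIMIT_MARKER in after:
        return "ignored"
    return "same" if before == after else "different"


verdict = compare_outputs("[host limit allocation failure]", "result: 42")
```

The check is symmetric on purpose: either the original or the optimized wasm may be the one that reaches the limit.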
| |
If the program tries to allocate an infinite number of objects, but is
prevented from doing that by a null pointer trap, then after we run
with trapsNeverHappen the trap may fail to occur, and we'll hit the
host limitation on allocations. As a result, we'd be comparing one
run with a trap and one run that is meant to be ignored (as we ignore
runs with host limitations), and before this PR we'd error as we would
expect to find the normal output and not the "ignore this host
limitation" marker.
| |
With this we generate random GC types that may be used in creating
instructions later.
We don't create many instructions yet, which will be the next step after
this.
Also add some trivial assertions in some places, that have helped
debugging in the past.
Stop fuzzing TypeMerging for now due to #5556 , which this PR
uncovers.
| |
| |
If this number ever gets high then we would need to look into
why we ignore so much. Right now we seem to end up ignoring
much less than 1% which seems ok.
| |
We can't just skip host limits (#5534) but must also ignore execution at that
point, as optimizations can change the results if they change whether we reach
a host limit.
| |
This is a (more) standard name for `array.init_static`. (The full upstream name
in the spec repo is `array.new_canon_fixed`, but I'm still hoping we can drop
`canon` from all the instruction names and it doesn't appear elsewhere in
Binaryen).
Update all the existing tests to use the new name and add a test specifically to
ensure the old name continues parsing.
| |
To match the standard instruction name, rename the expression class without
changing any parsing or printing behavior. A follow-on PR will take care of the
functional side of this change while keeping support for parsing the old name.
This change will allow `ArrayInit` to be used as the expression class for the
upcoming `array.init_data` and `array.init_elem` instructions.
| |
After the recent improvements and fixes this is now simple and the fuzzer found
no more issues overnight for me.
Also adjust some existing frequencies.
| |
| |
See WebAssembly/stringref#60
| |
If a type hierarchy has abstract classes in the middle, that is, types that
are never instantiated, then we can optimize casts and other operations
to them. Say in Java that we have `AbstractList`, and it only has one
subclass `IntList` that is ever created, then any place we have an `AbstractList`
we must actually have an `IntList`, or a null. (Or, if no subtype is instantiated,
then the value must definitely be a null.)
The actual implementation does a type mapping, that is, it finds all places
using an abstract type and makes them refer to the single instantiated
subtype (or null). After that change, no references to the abstract type
remain in the program, so this both refines types and also cleans up the
type section.
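The type mapping can be illustrated with a simplified sketch. It assumes a closed world and a single-inheritance hierarchy given as a parent map, and it only handles the cases named above (exactly one instantiated subtype, or none). The function and variable names are hypothetical, and the real pass may handle more cases.

```python
def refine_abstract_types(parent, instantiated):
    """Map each never-instantiated type to its single instantiated subtype.

    parent maps a type name to its supertype (None for a root).
    instantiated is the set of types that are ever created.
    A type maps to None when nothing at or below it is ever created, so only
    null is possible. Types with several instantiated subtypes are left
    unmapped in this simplified version.
    """
    # For every type, collect the instantiated types at or below it.
    created_below = {t: set() for t in parent}
    for t in instantiated:
        ancestor = t
        while ancestor is not None:
            created_below[ancestor].add(t)
            ancestor = parent[ancestor]

    mapping = {}
    for t, created in created_below.items():
        if t in instantiated:
            continue  # not abstract, keep as-is
        if len(created) == 0:
            mapping[t] = None
        elif len(created) == 1:
            mapping[t] = next(iter(created))
    return mapping


parent = {"Object": None, "AbstractList": "Object",
          "IntList": "AbstractList", "Unused": "Object"}
mapping = refine_abstract_types(parent, {"IntList"})
```

With only `IntList` ever created, both `AbstractList` and `Object` map to `IntList`, and `Unused` maps to `None` (null).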
| |
| |
string.from_code_point makes a string from an int code point.
string.new_utf8*_try makes a utf8 string and returns null on a UTF8 encoding
error rather than trap.
| |
See WebAssembly/stringref#58
| |
`struct` has replaced `data` in the upstream spec, so update Binaryen's types to
match. We had already supported `struct` as an alias for data, but now remove
support for `data` entirely. Also remove instructions like `ref.is_data` that
are deprecated and do not make sense without a `data` type.
| |
These operations are deprecated and directly representable as casts, so remove
their opcodes in the internal IR and parse them as casts instead. For now, add
logic to the printing and binary writing of RefCast to continue emitting the
legacy instructions to minimize test changes. The few test changes necessary are
because it is no longer valid to perform a ref.as_func on values outside the
func type hierarchy now that ref.as_func is subject to the ref.cast validation
rules.
RefAsExternInternalize, RefAsExternExternalize, and RefAsNonNull are left
unmodified. A future PR may remove RefAsNonNull as well, since it is also
expressible with casts.
| |
* Replace `RefIs` with `RefIsNull`
The other `ref.is*` instructions are deprecated and expressible in terms of
`ref.test`. Update binary and text parsing to parse those instructions as
`RefTest` expressions. Also update the printing and emitting of `RefTest`
expressions to emit the legacy instructions for now to minimize test changes and
make this a mostly non-functional change. Since `ref.is_null` is the only
`RefIs` instruction left, remove the `RefIsOp` field and rename the expression
class to `RefIsNull`.
The few test changes are due to the fact that `ref.is*` instructions are now
subject to `ref.test` validation, and in particular it is no longer valid to
perform a `ref.is_func` on a value outside of the `func` type hierarchy.
| |
The `br_on{_non}_{data,i31,func}` operations are deprecated and directly
representable in terms of the new `br_on_cast` and `br_on_cast_fail`
instructions, so remove their dedicated IR opcodes in favor of representing them
as casts. `br_on_null` and `br_on_non_null` cannot be consolidated the same way
because their behavior is not directly representable in terms of `br_on_cast`
and `br_on_cast_fail`; when the cast to null bottom type succeeds, the null
check instructions implicitly drop the null value whereas the cast instructions
would propagate it.
Add special logic to the binary writer and printer to continue emitting the
deprecated instructions for now. This will allow us to update the test suite in
a separate future PR with no additional functional changes.
Some tests are updated because the validator no longer allows passing non-func
data to `br_on_func`. Doing so has not made sense since we separated the three
reference type hierarchies.
| |
Parse both the folded and unfolded forms of blocks and structure the code to
make supporting additional block instructions like if-else and try-catch
relatively simple.
Parsing block types is extra fun because they may implicitly define new
signature heap types via a typeuse, but only if their types are not given by a
single result type. To figure out whether a new type may be introduced in all
the relevant parsing stages, always track at least the arity of parsed results.
The parser parses block labels, but more work will be required to support branch
instructions that use them.
| |
Do not fuzz some new testcases that have imported memories. The fuzzer doesn't seem
to have support for that (it errors when it tries to do operations on them, since the
import hasn't been created).
| |
Since #5347 public types are never updated by type optimizations, but the
optimization passes have not yet been updated to take that into account, so they
are all buggy under an open world assumption. In #5359 we worked around many
closed world validation errors in the fuzzer by treating --closed-world like a
feature flag and checking whether it was necessary for fuzzer input, but that
did not prevent the type optimization passes from running under an open world,
so it did not work around all the potential issues.
Work around the problem more thoroughly by not running any type optimization
passes in the fuzzer without --closed-world. Also add logic to those passes to
error out if they are run without --closed-world and update the tests
accordingly.
| |
An initial content testcase may only work in open world, so check for that
using the existing mechanism of checking if such testcases work without
feature flags.
| |
The `op` string_view was intentionally created to point into the `buf` buffer so
that reading past its end would still be safe, but some C++ standard library
implementations assert when reading past the end of a string_view. Change the
generated code to read out of `buf` instead to avoid those assertions.
Fixes #5322.
Fixes #5342.
| |
We previously supported only the non-standard cast instructions introduced when
we were experimenting with nominal types. Parse the names and opcodes of their
standard counterparts and switch to emitting the standard names and opcodes.
Port all of the tests to use the standard instructions, but add additional tests
showing that the non-standard versions are still parsed correctly.
| |
This finds types that can be merged into their super: types that add no
fields, and are not used in casts, etc. - so we might as well use the super.
This complements TypeSSA, in that it can merge back the new types that
TypeSSA created, if we never found a use for them. Without this, TypeSSA
can bloat binary size quite a lot (I see 10-20%).
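A simplified sketch of that criterion, under the assumption that "adds no fields" can be checked by comparing field lists and that cast usage is available as a set of type names. The names are hypothetical, and the real pass considers more conditions (the "etc." above).

```python
def merge_into_supers(parent, fields, cast_targets):
    """Return a map from each mergeable type to the supertype that replaces it.

    parent maps a type to its supertype (None for a root), fields maps a type
    to its list of field types, and cast_targets is the set of types that
    appear as the target of a cast.
    """
    direct = {}
    for t, sup in parent.items():
        if sup is None or t in cast_targets:
            continue
        if fields[t] == fields[sup]:  # the subtype adds nothing
            direct[t] = sup

    # Follow chains so each merged type maps to a supertype that survives.
    final = {}
    for t in direct:
        target = direct[t]
        while target in direct:
            target = direct[target]
        final[t] = target
    return final


parent = {"A": None, "B": "A", "C": "B", "D": "A"}
fields = {"A": ["i32"], "B": ["i32"], "C": ["i32"], "D": ["i32", "f64"]}
merges = merge_into_supers(parent, fields, cast_targets={"C"})
```

In this example `B` merges into `A`, `C` is kept because it is a cast target, and `D` is kept because it adds a field.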
| |
This creates new nominal types for each (interesting) struct.new. That then allows
type-based optimizations to be more precise, as those optimizations will track
separate info for each struct.new, in effect. That is kind of like SSA, however, we
do not handle merges. For example:
x = struct.new $A (5);
print(x.value);
y = struct.new $A (11);
print(y.value);
// => //
x = struct.new $A.x (5);
print(x.value);
y = struct.new $A.y (11);
print(y.value);
After the pass runs each of those struct.new creates a unique type, and type-based
analysis can see that 5 or 11 are the only values written in that type (if nothing else
writes there).
This bloats the type section with the new subtypes, so it is best used with a pass
to merge unneeded duplicate types, which a later PR will add. That later PR will
exactly merge back in the types created here, which are nominally different but
indistinguishable otherwise.
This pass is not enabled by default. It's not clear yet where the best place to do it is,
as it must be balanced by type merging, but it might be better to do multiple
rounds of optimization between the two. Needs more investigation.
| |
With this change we default to an open world, that is, we do the safe thing
by default: we no longer assume a closed world. Users that want a closed
world must pass --closed-world.
Atm we just do not run passes that assume a closed world. (We might later
refine them to find which types don't escape and only optimize those.) The
RemoveUnusedModuleElements is an exception in that the closed-world
flag influences one part of its operation, but not the rest.
Fixes #5292
| |
* Do not compare reference values across executions
Since we optimize assuming a closed world, optimizations can change the types
and structure of GC data even in externally-visible ways. Because differences
are expected, the fuzzer already did not compare reference-typed values from
before and after optimizations when running with nominal typing. Update it to
not compare these values under any type system.
* Unpin V8
Our WasmGC output is no longer compatible with the previously pinned version and
the issue that caused us to pin it in the first place has been resolved.
| |
(some.operation
  (ref.cast .. (local.get $ref))
  (local.get $ref)
)
=>
(some.operation
  (local.tee $temp
    (ref.cast .. (local.get $ref))
  )
  (local.get $temp)
)
This can help cases where we cast for some reason but happen to not use the
cast value in all places. This occurs in j2wasm in itable calls sometimes: The
this pointer is refined, but the itable may be done with an unrefined pointer,
which is less optimizable.
So far this is just inside basic blocks, but that is enough for the case of itable
calls and other common patterns I see.
| |
Monomorphization finds cases where we send more refined types to a function
than it declares. In such cases we can copy the function and refine the parameters:
// B is a subtype of A
foo(new B());
function foo(x : A) { ..}
=>
foo_B(new B()); // call redirected to refined copy
function foo(x : A) { ..} // unchanged
function foo_B(x : B) { ..} // refined copy
This increases code size so it may not be worth it in all cases. This initial PR is
hopefully enough to start experimenting with this on performance, and so it does
not enable the pass by default.
This adds two variations of monomorphization, one that always does it, and the
default which is "careful": it sees whether monomorphizing lets the refined function
actually be better than the original (say, by removing a cast). If there is no
improvement then we do not make any changes. This saves a significant amount
of code size - on j2wasm the careful version increases by 13% instead of 20% -
but it does run more slowly obviously.
| |
In order to test them, fix the binary and text parsers to accept passive data
segments even if a module has no memory. In addition to parsing and emitting the
new instructions, also implement their validation and interpretation. Test the
interpretation directly with wasm-shell tests adapted from the upstream spec
tests. Running the upstream spec tests directly would require fixing too many
bugs in the legacy text parser, so it will have to wait for the new text parser
to be ready.
| |
That PR renamed test/lit/optimize-instructions.wast to
test/lit/optimize-instructions-mvp.wast. However, the fuzzer was explicitly
adding the test to the list of important initial contents under the old name, so
it was failing an assertion that the initial contents existed. Update the fuzzer
to use the new test name.
| |
These operations emit a completely different type than their input, so they must be
marked as roots, and not as things that flow values through them (because then
we filter everything out as the types are not compatible).
Fixes #5219
| |
| |
The fuzzer started to fail on the recent externalize/internalize test
that was added in #5175 as we lack interpreter support. Move that to a separate
file and ignore it in the fuzzer for now.
| |
Add parsing functions for `memarg`s, the offset and align fields of load and
store instructions. These fields are interesting because they are lexically
reserved words that need to be further parsed to extract their actual values. On
top of that, add support for parsing all of the load and store instructions.
This required fixing a buffer overflow problem in the generated parser code and
adding more information to the signatures of the SIMD load and store
instructions. `SIMDLoadStoreLane` instructions are particularly interesting
because they may require backtracking to parse correctly.
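To illustrate why these fields need a second parsing step, here is a small sketch that takes already-lexed tokens such as `offset=16` and `align=4` and extracts their values. It is not the parser added in this change: the function names and the token-list representation are assumptions for the example.

```python
def _parse_unsigned(text):
    """Parse a decimal or 0x-prefixed hex integer, allowing `_` separators."""
    value = int(text, 16) if text.lower().startswith("0x") else int(text, 10)
    if value < 0:
        raise ValueError(f"expected an unsigned integer, got {text!r}")
    return value


def parse_memarg(tokens, natural_align):
    """Parse optional `offset=N` then `align=N` keywords from a token list.

    Returns (offset, align, remaining_tokens). Missing fields default to
    offset 0 and the instruction's natural alignment (passed in by the caller).
    """
    offset, align = 0, natural_align
    i = 0
    if i < len(tokens) and tokens[i].startswith("offset="):
        # The lexer saw one keyword; the number still has to be extracted.
        offset = _parse_unsigned(tokens[i][len("offset="):])
        i += 1
    if i < len(tokens) and tokens[i].startswith("align="):
        align = _parse_unsigned(tokens[i][len("align="):])
        if align == 0 or align & (align - 1):
            raise ValueError("alignment must be a power of two")
        i += 1
    return offset, align, tokens[i:]


memarg = parse_memarg(["offset=16", "align=4", "(local.get", "$x)"], 8)
```

The remaining tokens are returned so that a caller could continue parsing the instruction's operands after the memarg.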