| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Previously we limited printing in a single Literals. But we can have
infinitely recursive GC literals, or just huge graphs even without
infinite recursion where no single Literals is that big (but we still
get exponential blowup). This PR adds a general limit on how much
we print once we start to print a Literal or Literals.
|
|
|
|
|
|
|
|
|
| |
Without this, a massive GC array for example can end up being printed as a huge
amount of output, which is a problem in the fuzzer. I noticed this because a testcase
was taking minutes to run.
Perhaps we should add an option (BINARYEN_PRINT_FULL=1?) to disable this limit,
but let's see if we ever need that.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A cycle of data is something we can't just naively emit as wasm globals. If
at runtime we end up, for example, with an object A that refers to itself,
then we can't just emit
(global $A
(struct.new $A
(global.get $A)))
The struct.get is of this very global, and such a self-reference is invalid. So
we need to break such cycles as we emit them. The simple idea used here
is to find paths in the cycle that are nullable and mutable, and replace the
initial value with a null that is fixed up later in the start function:
(global $A
(struct.new $A
(ref.null $A)))
(func $start
(struct.set
(global.get $A)
(global.get $A)))
)
This is not optimal in terms of breaking cycles, but it is fast (linear time)
and simple, and does well in practice on j2wasm (where cycles in fact
occur).
|
|
|
|
|
|
|
| |
To allow the external and internal reference values to be differentiated yet
round-trippable, set the `Literal` type to externref on external references, but
keep the gcData the same for both. The only exception is for i31 references, for
which the externalized version gets a `gcData` that contains a copy of the
original i31 literal.
|
|
|
| |
This is enough to test RSE.
|
|
|
| |
This is enough for DAE and other opts to run on string consts.
|
|
|
|
|
|
|
|
|
|
| |
Store string data as GC data. Inefficient (one Const per char), but ok for now.
Implement string.new_wtf16 and string.const, enough for basic testing.
Create strings in makeConstantExpression, which enables ctor-eval support.
Print strings in fuzz-exec which makes testing easier.
|
|
|
|
|
|
| |
`struct` has replaced `data` in the upstream spec, so update Binaryen's types to
match. We had already supported `struct` as an alias for data, but now remove
support for `data` entirely. Also remove instructions like `ref.is_data` that
are deprecated and do not make sense without a `data` type.
|
|
|
|
|
|
|
|
|
| |
In order to test them, fix the binary and text parsers to accept passive data
segments even if a module has no memory. In addition to parsing and emitting the
new instructions, also implement their validation and interpretation. Test the
interpretation directly with wasm-shell tests adapted from the upstream spec
tests. Running the upstream spec tests directly would require fixing too many
bugs in the legacy text parser, so it will have to wait for the new text parser
to be ready.
|
|
|
|
|
|
|
|
|
| |
`array` is the supertype of all defined array types and for now is a subtype of
`data`. (Once `data` becomes `struct` this will no longer be true.) Update the
binary and text parsing of `array.len` to ignore the obsolete type annotation
and update the binary emitting to emit a zero in place of the old type
annotation and the text printing to print an arbitrary heap type for the
annotation. A follow-on PR will add support for the newer unannotated version of
`array.len`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These types, `none`, `nofunc`, and `noextern` are uninhabited, so references to
them can only possibly be null. To simplify the IR and increase type precision,
introduce new invariants that all `ref.null` instructions must be typed with one
of these new bottom types and that `Literals` have a bottom type iff they
represent null values. These new invariants requires several additional changes.
First, it is now possible that the `ref` or `target` child of a `StructGet`,
`StructSet`, `ArrayGet`, `ArraySet`, or `CallRef` instruction has a bottom
reference type, so it is not possible to determine what heap type annotation to
emit in the binary or text formats. (The bottom types are not valid type
annotations since they do not have indices in the type section.)
To fix that problem, update the printer and binary emitter to emit unreachables
instead of the instruction with undetermined type annotation. This is a valid
transformation because the only possible value that could flow into those
instructions in that case is null, and all of those instructions trap on nulls.
That fix uncovered a latent bug in the binary parser in which new unreachables
within unreachable code were handled incorrectly. This bug was not previously
found by the fuzzer because we generally stop emitting code once we encounter an
instruction with type `unreachable`. Now, however, it is possible to emit an
`unreachable` for instructions that do not have type `unreachable` (but are
known to trap at runtime), so we will continue emitting code. See the new
test/lit/parse-double-unreachable.wast for details.
Update other miscellaneous code that creates `RefNull` expressions and null
`Literals` to maintain the new invariants as well.
|
|
|
| |
Change `standardizeNaN` to take a `Literal` to reduce usage verbosity.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#4985)
x + nan -> nan'
x - nan -> nan'
x * nan -> nan'
x / nan -> nan'
min(x, nan) -> nan'
max(x, nan) -> nan'
where nan' is canonicalized nan of rhs
x != nan -> 1
x == nan -> 0
x >= nan -> 0
x <= nan -> 0
x > nan -> 0
x < nan -> 0
|
|
|
|
|
|
|
| |
The GC proposal has split `any` and `extern` back into two separate types, so
reintroduce `HeapType::ext` to represent `extern`. Before it was originally
removed in #4633, externref was a subtype of anyref, but now it is not. Now that
we have separate heaptype type hierarchies, make `HeapType::getLeastUpperBound`
fallible as well.
|
|
|
|
|
|
|
| |
RTTs were removed from the GC spec and if they are added back in in the future,
they will be heap types rather than value types as in our implementation.
Updating our implementation to have RTTs be heap types would have been more work
than deleting them for questionable benefit since we don't know how long it will
be before they are specced again.
|
|
|
|
|
|
| |
We already require non-null literals to have non-null types, but with this
change we can enforce that constraint by construction. Also remove the default
behavior of creating a function reference literal with heap type `func`, since
there is always a more specific function type to use.
|
|
|
|
|
| |
The encoding here is simple: we store i31 values in the literal.i32
field. The top bit says if a value exists, which means literal.i32 == 0 is the
same as null.
|
|
|
|
|
|
|
|
|
| |
Basic reference types like `Type::funcref`, `Type::anyref`, etc. made it easy to
accidentally forget to handle reference types with the same basic HeapTypes but
the opposite nullability. In principle there is nothing special about the types
with shorthands except in the binary and text formats. Removing these shorthands
from the internal type representation by removing all basic reference types
makes some code more complicated locally, but simplifies code globally and
encourages properly handling both nullable and non-nullable reference types.
|
|
|
|
|
|
|
|
| |
This starts to implement the Wasm Strings proposal
https://github.com/WebAssembly/stringref/blob/main/proposals/stringref/Overview.md
This just adds the types.
|
|
|
|
|
|
|
|
| |
Taking a Type is redundant as we only care about the heap type -
the nullability must be Nullable.
This avoids needing an assertion in the function, that is, it makes
the API more type-safe.
|
|
|
|
|
|
| |
Remove `Type::externref` and `HeapType::ext` and replace them with uses of
anyref and any, respectively, now that we have unified these types in the GC
proposal. For backwards compatibility, continue to parse `extern` and
`externref` and maintain their relevant C API functions.
|
|
|
| |
As proposed in https://github.com/WebAssembly/relaxed-simd/issues/52.
|
|
|
|
|
|
|
| |
Handle the isBasic() case first - that inlined function is very fast to call,
and it is the common case. Also, do not do unnecessary work there: just
write out what we need, instead of always doing a memcpy of 16 bytes.
This makes us over 2x faster on the benchmark in #4452
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds relaxed-simd instructions based on the current status of the
proposal
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md.
Binary opcodes are based on what is listed in
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md#binary-format.
Text names are not fixed yet, and some sort sort of names that maps to
the non-relaxed versions are chosen for this prototype.
Support for these instructions have been added to LLVM via builtins,
adding support here will allow Emscripten to successfully compile files
that use those builtins.
Interpreter support has also been added, and they delegate to the
non-relaxed versions of the instructions.
Most instructions are implemented in the interpreter the same way as the non-relaxed
simd128 instructions, except for fma/fms, which is always fused.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allocation and cast instructions without explicit RTTs should use the canonical
RTTs for the given types. Furthermore, the RTTs for nominal types should reflect
the static type hierarchy. Previously, however, we implemented allocations and
casts without RTTs using an alternative system that only used static types
rather than RTT values. This alternative system would work fine in a world
without first-class RTTs, but it did not properly allow mixing instructions that
use RTTs and instructions that do not use RTTs as intended by the M4 GC spec.
This PR fixes the issue by using canonical RTTs where appropriate and cleans up
the relevant casting code using std::variant.
|
|
|
|
| |
This helps prevent bugs where we assume that the GCData has either a HeapType or
Rtt without checking. Indeed, one such bug is found and fixed.
|
| |
|
|
|
|
| |
interpreter (#4023)
|
|
|
|
| |
(#4027)
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
Like a few other SIMD operations, this i64x2.bitmask had not been implemented in
the interpreter yet. Unlike the others, i64x2.bitmask has type i32 rather than
type v128, so Precompute was not skipping it, leading to a crash, as in
https://github.com/emscripten-core/emscripten/issues/14629. Fix the problem by
implementing i64x2.bitmask in the interpreter.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This is the same as rtt.sub, but creates a "new" rtt each time. See
https://docs.google.com/document/d/1DklC3qVuOdLHSXB5UXghM_syCh-4cMinQ50ICiXnK3Q/edit#
The old Literal implementation of rtts becomes a little more complex here,
as it was designed for the original spec where only structure matters. It may
be worth a complete redesign there, but for now as the spec is in flux I think
the approach here is good enough.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes precomputation on GC after #3803 was too optimistic.
The issue is subtle. Precompute will repeatedly evaluate expressions and
propagate their values, flowing them around, and it ignores side effects
when doing so. For example:
(block
..side effect..
(i32.const 1)
)
When we evaluate that we see there are side effects, but regardless of them
we know the value flowing out is 1. So we can propagate that value, if it is
assigned to a local and read elsewhere.
This is not valid for GC because struct.new and array.new have a "side
effect" that is noticeable in the result. Each time we call struct.new we get a
new struct with a new address, which ref.eq can distinguish. So when this
pass evaluates the same thing multiple times it will get a different result.
Also, we can't precompute a struct.get even if we know the struct, not unless
we know the reference has not escaped (where a call could modify it).
To avoid all that, do not precompute references, aside from the trivially safe ones
like nulls and function references (simple constants that are the same each time
we evaluate the expression emitting them).
precomputeExpression() had a minor bug which this fixes. It checked the type
of the expression to see if we can create a constant for it, but really it should
check the value - since (separate from this PR) we have no way to emit a
"constant" for a struct etc. Also that only matters if replaceExpression is true, that
is, if we are replacing with a constant; if we just want the value internally, we have
no limit on that.
Also add Literal support for comparing GC refs, which is used by ref.eq. Without
that tiny fix the tests here crash.
This adds a bunch of tests, many for corner cases that we don't handle (since
the PR makes us not propagate GC references). But they should be helpful
if/when we do, to avoid the mistakes in #3803
|
|
|
|
| |
Also removes experimental SIMD instructions that were not included in the final
spec proposal.
|
|
|
|
|
|
|
|
|
| |
We added isGCData() before we had dataref. But now there is a clear
parallel of Function vs Data. This PR makes us more consistent there,
renaming isGCData to isData and using that throughout.
This also fixes a bug where the old isGCData just checked if the input
was an Array or a Struct, and ignored the data heap type itself. It is not
possible to test that, however, due to other bugs, so that is deferred.
|
|
|
|
|
|
|
|
|
|
| |
The binary spec
(https://docs.google.com/document/d/1yAWU3dbs8kUa_wcnnirDxUu9nEBsNfq0Xo90OWx6yuo/edit#)
lists `dataref` after `i31ref`, and `dataref` also comes after `i31ref`
in its binary code in the value-increasing order. This reorders these
two in wasm-type.h and other places, although in most of those places
the order is irrelevant.
This also adds C and JS API for `dataref`.
|
|
|
| |
This removes `exnref` type and `br_on_exn` instruction.
|
|
|
|
|
| |
This is not 100% of everything, but is enough to get tests passing, which
includes full binary and text format support, getting all switches to compile
without error, and some additions to InstrumentLocals.
|
|
|
|
|
|
|
|
| |
To handle both nullable and non-nullable i31s and other heap types, we cannot
just look at the isBasic case (which is just one of the two).
This may fix this issue on the release builder:
https://github.com/WebAssembly/binaryen/runs/1669944081?check_suite_focus=true
but the issue does not reproduce locally, so I worry it is something else...
|
|
|
|
|
| |
builder for some inexplicable reason (#3477)
See #3459
|
|
|
|
|
|
| |
As proposed in https://github.com/WebAssembly/simd/pull/380, using the opcodes
used in LLVM and V8. Since these opcodes overlap with the opcodes of
i64x2.all_true and i64x2.any_true, which have long since been removed from the
SIMD proposal, this PR also removes those instructions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds info to RTT literals so that they can represent the chain of
rtt.canon/sub commands that generated them, and it adds an internal
RTT for each GC allocation (array or struct).
The approach taken is to simply store the full chain of rtt.sub types
that led to each literal. This is not efficient, but it is simple and seems
sufficient for the semantics described in the GC MVP doc - specifically,
only the types matter, in that repeated executions of rtt.canon/sub
on the same inputs yield equal outputs.
This PR fixes a bunch of minor issues regarding that, enough to allow testing
of the optimization and execution of ref.test/cast.
|
|
|
|
|
|
| |
- i64x2.eq (https://github.com/WebAssembly/simd/pull/381)
- i64x2 widens (https://github.com/WebAssembly/simd/pull/290)
- i64x2.bitmask (https://github.com/WebAssembly/simd/pull/368)
- signselect ops (https://github.com/WebAssembly/simd/pull/124)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds rtt.canon and rtt.sub together with RTT type support
that is necessary for them. Together this lets us test roundtripping the
instructions and types.
Also fixes a missing traversal over globals in collectHeapTypes,
which the example from the GC docs requires, as the RTTs are in
globals there.
This does not yet add full interpreter support and other things. It
disables initial contents on GC in the fuzzer, to avoid the fuzzer
breaking.
Renames the binary ID for exnref, which is being removed from
the spec, and which overlaps with the binary ID for rtt.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the first instruction that uses a GC Struct or Array, so it's where
we start to actually need support in the interpreter for those values, which
is added here.
GC data is modeled as a gcData field on a Literal, which is just a
Literals. That is, both a struct and an array are represented as an
array of values. The type which is alongside would indicate if it's a
struct or an array. Note that the data is referred to using a shared_ptr
so it should "just work", but we'll only be able to really test that once we
add struct.new and so can verify that references are by reference and
not value, etc.
As the first instruction to care about i8/16 types (which are only possible
in a Struct or Array) this adds support for parsing and emitting them.
This PR includes fuzz fixes for some minor things the fuzzer found, including
some bad printing of not having ResultTypeName in necessary places
(found by the text format roundtripping fuzzer).
|