| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
| |
See https://github.com/emscripten-core/emscripten/pull/15855
|
|
|
|
|
|
|
| |
If that type is not valid then we cannot even create and finalize the node,
which means we'd hit an assertion inside finalize(), before we reach the
validator.
Fixes #4383
|
|
|
|
|
| |
Update the LUB calculation code to use std::optional rather than out params and
validate LUBs in the fuzzer to ensure that the change is NFC as intended. Also
add HeapType::getLeastUpperBound to the public API as a convenience.
|
|
|
|
|
|
|
|
|
|
| |
Hashing and comparison of nominal HeapTypeInfos previously observed their child
Types, so the Types had to be canonicalized before the HeapTypes. Unfortunately
equirecursive canonicalization requires that the HeapTypes be canonicalized
before the Types, so this was a point of divergence between the two systems.
However, #4394 updated hashing and comparison of nominal types to not depend on
child Types, so now we can harmonize the two systems by having them use the same
`globallyCanonicalize` function to canonicalize their HeapTypes followed by
their Types.
|
|
|
|
|
|
|
| |
Now that caching of "canonical" nominal signatures is handled at a separate
layer, we can remove the separate code paths for hashing and comparing
HeapTypeInfos based on their structure even in nominal mode. Now hashing and
comparing of HeapTypeInfos is uniformly handled by FiniteShapeHasher and
FiniteShapeEquator.
|
|
|
| |
Fixes #4384
|
|
|
|
|
|
|
| |
We have always cached nominal signature types keyed on their signatures to avoid
creating extra nominal types through the `HeapType::HeapType(Signature)`
constructor. However, that logic was previously built into the HeapTypeInfo
canonicalization system. To allow that system to be simplified in future PRs,
separate the caching into its own explicit layer.
|
|
|
|
|
|
|
| |
Types and HeapTypes are inserted into their respective stores either by copying
a reference to a `TypeInfo` or `HeapTypeInfo` or by moving a
`std::unique_ptr<TypeInfo>` or `std::unique_ptr<HeapTypeInfo>`. Previously these
two code paths had separate, similar logic. To reduce deduplication, combine
both code paths into a single method.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We do some postprocessing after parsing `Try` to make sure `delegate`
only targets `try`s and not `block`s:
https://github.com/WebAssembly/binaryen/blob/9659f9b07c1196447edee68fe04c8d7dd2480652/src/wasm/wasm-binary.cpp#L6404-L6426
But in case the outer `try` has neither of `catch` nor `delegate`, the
previous code just return prematurely, skipping the postprocessing part,
resulting in a binary parsing error. This PR removes that early-exiting
code.
Some test outputs have changed because `try`s are assigned labels after
the early exit. But those labels can be removed by other optimization
passes when there is no inner `rethrow` or `delegate` that targets them.
(On a side note, the restriction that `delegate` cannot target a `block`
has been removed a few months ago in the spec, so if a `delegate`
targets a `block`, it means it is just rethrown from that block. But I
still think this is a convenient invariant to hold at least within the
binaryen IR. I'm planning to allow parsing of `delegate` targeting
`block`s later, but I will make them point to `try` when read in the
IR. At the moment the LLVM toolchain does not generate such code.)
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
With nominal function types, this change makes it so that we preserve the
identity of the function type used with call_indirect instructions rather than
recreating a function heap type, which may or may not be the same as the
originally parsed heap type, from the function signature during module writing.
This will simplify the type system implementation by removing the need to store
a "canonical" nominal heap type for each unique signature. We previously
depended on those canonical types to avoid creating multiple duplicate function
types during module writing, but now we aren't creating any new function types
at all.
|
|
|
|
|
| |
Check that types that were meant to have a subtype relationship actually do. To
expose the intended subtyping to the fuzzer, expose `subtypeIndices` in the
return value of the type generation function.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As we work toward allowing nominal and structural types to coexist, any
difference in how they can be built or used will be an inconvenient footgun that
we will have to work around. In the spirit of reducing the differences between
the type systems, allow TypeBuilder to construct basic HeapTypes in nominal mode
just as it can in equirecursive mode.
Although this change is a net increase in code complexity for not much
benefit (wasm-opt never needs to build basic HeapTypes), it is also an
incremental step toward getting rid of separate type system modes, so I expect
it to simplify other PRs in the near future.
This change also uncovered a bug in how the type fuzzer generated subtypes of
basic HeapTypes. The generated subtypes did not necessarily have the intended
`Kind`, which caused failures in nominal subtype validation in the fuzzer.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new fuzzer binary that repeatedly generates random types to find bugs in
the type system implementation. Each iteration creates some number of root types
followed by some number of subtypes thereof. Each built type can contain
arbitrary references to other built types, regardless of their order of
construction.
Right now the fuzzer only finds fatal errors in type building (and in its own
implementation), but it is meant to be extended to check other properties in the
future, such as that LUB calculations work as expected.
The logic for creating types is also intended to be integrated into the main
fuzzer in a follow-on PR so that the main fuzzer can fuzz with arbitrarily more
interesting GC types.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds relaxed-simd instructions based on the current status of the
proposal
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md.
Binary opcodes are based on what is listed in
https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md#binary-format.
Text names are not fixed yet, and some sort sort of names that maps to
the non-relaxed versions are chosen for this prototype.
Support for these instructions have been added to LLVM via builtins,
adding support here will allow Emscripten to successfully compile files
that use those builtins.
Interpreter support has also been added, and they delegate to the
non-relaxed versions of the instructions.
Most instructions are implemented in the interpreter the same way as the non-relaxed
simd128 instructions, except for fma/fms, which is always fused.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This improves validation of `catch` bodies mostly by checking the
validity of `pop`s.
For every `catch` body:
- Checks if its tag exists
- If the tag's type is none:
- Ensures there shouldn't be any `pop`s
- If the tag's type is not none:
- Checks if there's a single `pop` within the catch body
- Checks if the tag type matches the `pop`'s type
- Checks if the `pop`'s location is valid
For every `catch_all` body:
- Ensures there shuldn't be any `pop`s
This uncovers several bugs related to `pop`s in existing tests, which
this PR also fixes.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allocation and cast instructions without explicit RTTs should use the canonical
RTTs for the given types. Furthermore, the RTTs for nominal types should reflect
the static type hierarchy. Previously, however, we implemented allocations and
casts without RTTs using an alternative system that only used static types
rather than RTT values. This alternative system would work fine in a world
without first-class RTTs, but it did not properly allow mixing instructions that
use RTTs and instructions that do not use RTTs as intended by the M4 GC spec.
This PR fixes the issue by using canonical RTTs where appropriate and cleans up
the relevant casting code using std::variant.
|
|
|
|
| |
This helps prevent bugs where we assume that the GCData has either a HeapType or
Rtt without checking. Indeed, one such bug is found and fixed.
|
|
|
| |
This sets the C++ standard variable in the build to C++17, and makes use of std::optional (a C++17 library feature) in one place, to test that it's working.
|
| |
|
|
|
|
|
|
|
|
| |
Switch from "extends" to M4 nominal syntax
Change all test inputs from using the old (extends $super) syntax to using the
new *_subtype syntax for their inputs and also update the printer to emit the
new syntax. Add a new test explicitly testing the old notation to make sure it
keeps working until we remove support for it.
|
|
|
|
|
|
|
|
|
|
|
| |
Add an assert on not emitting a null name (which would cause
a crash a few lines down on trying to read its bytes). I hit that
when writing a buggy pass that updated field names.
Also fix the case of a type not having a name but some of its
fields having names. We can't test that atm since our text
format requires types to have names anyhow, so this is a
fix for a possible future where we do allow parsing non-named
types.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
Implement parsing the new {func,struct,array}_subtype format for nominal types.
For now, the new format is parsed the same way the old-style (extends X) format
is parsed, i.e. in --nominal mode types are parsed as nominal but otherwise they
are parsed as equirecursive. Intentionally do not parse the new types
unconditionally as nominal for now to allow frontends to update their nominal
text format while continuing to use the workflow of running wasm-opt without
--nominal to lower nominal types to structural types.
|
|
|
|
|
|
|
|
| |
See #4220 - this lets us handle the common case for now of simply having
an identical heap type to the table when the signature is identical.
With this PR, #4207's optimization of call_ref + table.get into
call_indirect now leads to a binary that works in V8 in nominal mode.
|
| |
|
|
|
|
|
|
|
|
| |
These new nominal types do not depend on the global type sytem being changed
with the --nominal flag. Instead, they can coexist with the existing
equirecursive structural types, as required in the new milestone 4 spec. This PR
implements subtyping, upper bounding, canonicalizing, and other type operations
but using the new types in the parsers and elsewhere in Binaryen is left to a
follow-on PR.
|
|
|
|
|
|
|
|
| |
Update the binary format used in --nominal mode to match the format of nominal
types in milestone 4. In particular, types without declared supertypes are now
emitted using the nominal type codes with either `func` or `data` as their
supertypes. This change is hopefully enough to get --nominal mode code running
on V8's milestone 4 implementation until the rest of the type system changes can
be implemented for use without --nominal.
|
|
|
|
|
| |
Before this fix, the first table (index 0) is counted as its element segment
having "no table index" even when its type is not funcref, which could break
things if that table had a more specialized type.
|
|
|
| |
It's deprecated in C++17
|
|
|
|
| |
Adds the part of the spec test suite that this passes (without table.set we
can't do it all).
|
| |
|
|
|
|
|
|
|
|
|
| |
See #4149
This modifies the test added in #4163 which used static casts on
dynamically-created structs and arrays. That was technically not
valid (as we won't want users to "mix" the two forms). This makes that
test 100% static, which both fixes the test and gives test coverage
to the new instructions added here.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
These variants take a HeapType that is the type we intend to cast to,
and do not take an RTT.
These are intended to be more statically optimizable. For now though
this PR just implements the minimum to get them parsing and to get
through the optimizer without crashing.
Spec: https://docs.google.com/document/d/1afthjsL_B9UaMqCA5ekgVmOm75BVFu6duHNsN9-gnXw/edit#
See #4149
|
|
|
|
|
|
|
| |
See also:
spec change: https://github.com/WebAssembly/tool-conventions/pull/170
llvm change: https://reviews.llvm.org/D109595
wabt change: https://github.com/WebAssembly/wabt/pull/1707
emscripten change: https://github.com/emscripten-core/emscripten/pull/15019
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An "intrinsic" is modeled as a call to an import. We could also add new
IR things for them, but that would take more work and lead to less
clear errors in other tools if they try to read a binary using such a
nonstandard extension.
A first intrinsic is added here, call.without.effects This is basically the same
as call_ref except that the optimizer is free to assume the call has no
side effects. Consequently, if the result is not used then it can be optimized
out (as even if it is not used then side effects could have kept it around).
Likewise, the lack of side effects allows more reordering and other
things.
A lowering pass for intrinsics is provided. Rather than automatically
lower them to normal wasm at the end of optimizations, the user must
call that pass explicitly. A typical workflow might be
-O --intrinsic-lowering -O
That optimizes with the intrinsic present - perhaps removing calls
thanks to it - then lowers it into normal wasm - it turns into a call_ref -
and then optimizes further, which would turns the call_ref into a
direct call, potentially inline, etc.
|
|
|
|
|
|
|
| |
array.init is like array.new_with_rtt except that it takes
as arguments the values to initialize the array with (as opposed to
a size and an optional initial value).
Spec: https://docs.google.com/document/d/1afthjsL_B9UaMqCA5ekgVmOm75BVFu6duHNsN9-gnXw/edit#
|
|
|
| |
Helps #3739
|
|
|
|
|
|
| |
Before this, the element segments would be printed as having type
funcref, and then if their table had a specialized type, the element
type would not be a subtype of the table and validation would fail.
|
|
|
|
|
| |
If extra data is found in this section simply propagate it.
Also, remove some dead code from wasm-binary.cpp.
|
|
|
|
|
|
|
|
|
| |
After #4106 we already treat the input file "-" as a shorthand for reading from
stdin at the file.cpp level. However, the "-" input was still treated as a
normal file name at the wasm-io.cpp file and as a result was always treated as
text input. This commit updates wasm-io.cpp to use the stdin code path
supporting both binary and text input for "-".
Fixes #4105 (again).
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This was being set in the creation of Loads in the binary reader, but
forgotten in the SIMD logic - which ends up creating a Load with
type v128, and signed_ was uninitialized.
Very hard to test this, but I saw it "break" hash value computation
which is how I noticed this.
Also initialize the I31 sign field. Now all of them in wasm.h are
properly initialized.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A field in a struct is constant if we can see that in the entire program we
only ever write the same constant value to it. For example, imagine a
vtable type that we construct with the same funcrefs each time, then (if
we have no struct.sets, or if we did, and they had the same value), we
could replace a get with that constant value, since it cannot be anything
else:
(struct.new $T (i32.const 10) (rtt))
..no other conflicting values..
(struct.get $T 0) => (i32.const 10)
If the value is a function reference, then this may allow other passes
to turn what was a call_ref into a direct call and perhaps also get
inlined, effectively a form of devirtualization.
This only works in nominal typing, as we need to know the supertype
of each type. (It could work in theory in structural, but we'd need to do
hard work to find all the possible supertypes, and it would also
become far less effective.)
This deletes a trivial test for running -O on GC content. We have
many more tests for GC at this point, so that test is not needed, and
this PR also optimizes the code into something trivial and
uninteresting anyhow.
|
|
|
| |
The wrong variable was null checked, leading to segfaults.
|
|
|
|
| |
interpreter (#4023)
|