| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Remove old, experimental instructions and type encodings that will not be
shipped as part of WasmGC. Updating the encodings and text format to match the
final spec is left as future work.
|
|
|
|
|
|
| |
Before we always created if-elses. Now we also create an If with one arm some of
the time, when we can.
Also, sometimes make one if arm unreachable, if we have two arms.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement support in the type system for final types, which are not allowed to
have any subtypes. Final types are syntactically different from similar
non-final types, so type canonicalization is made aware of finality. Similarly,
TypeMerging and TypeSSA are updated to work correctly in the presence of final
types as well.
Implement binary and text parsing and emitting of final types. Use the standard
text format to represent final types and interpret the non-standard
"struct_subtype" and friends as non-final. This allows a graceful upgrade path
for users currently using the non-standard text format, where they can update
their code to use final types correctly at the point when they update to use the
standard format. Once users have migrated to using the fully expanded standard
text format, we can update update Binaryen's parsers to interpret the MVP
shorthands as final types to match the spec without breaking those users.
To make it safe for V8 to independently start interpreting types declared
without `sub` as final, also reserve that shorthand encoding only for types that
have no strict subtypes.
|
|
|
|
|
|
|
|
| |
Previously we incorrectly used "strict" to mean the immediate subtypes of a
type, when in fact a strict subtype of a type is any subtype excluding the type
itself. Rename the incorrect `getStrictSubTypes` to `getImmediateSubTypes`,
rename the redundant `getAllStrictSubTypes` to `getStrictSubTypes`, and rename
the redundant `getAllSubTypes` to `getSubTypes`. Fixing the capitalization of
"SubType" to "Subtype" is left as future work.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This capability was originally introduced to support calculating LUBs in the
equirecursive type system, but has not been needed for anything except tests
since the equirecursive type system was removed. Since building basic heap types
is no longer useful and was a source of significant complexity, remove the APIs
that allowed it and the tests that used those APIs.
Also remove test/example/type-builder.cpp, since a significant portion of it
tested the removed APIs and the rest is already better tested in
test/gtest/type-builder.cpp.
|
| |
|
| |
|
|
|
|
|
| |
And since the only type system left is the standard isorecursive type system,
remove `TypeSystem` and its associated APIs entirely. Delete a few tests that
only made sense under the isorecursive type system.
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we emit e.g. a struct.get's reference, this PR makes us prefer a non-nullable
value, and even to reuse an existing local if possible. By doing that we reduce
the risk of a trap, and also by using locals we end up testing operations on the
same data, like this:
x = new A();
x.a = ..
foo(x.a)
In contrast, without this PR each of those x. uses might be new A().
|
|
|
|
|
| |
After this change, the only type system usable from the tools will be the
standard isorecursive type system. The nominal type system is still usable via
the API, but it will be removed entirely in a follow-on PR.
|
|
|
|
|
|
|
|
|
|
|
| |
Casting (ref nofunc) to (ref func) seems like it can succeed based on the rule
of "if it's a subtype, it can cast ok." But the fuzzer found a corner case where that
leads to a validation error (see testcase).
Refactor the cast evaluation logic to handle uninhabitable refs directly, and
return Unreachable for them (since the cast cannot even be reached).
Also reorder the rule checks there to always check for a non-nullable cast
of a bottom type (which always fails).
|
|
|
|
| |
A return value was unused, and we have BranchUtils::operateOnScopeNameDefs now
which can replace old manual code.
|
| |
|
|
|
|
|
|
|
|
| |
Without this, in certain complex operations we could end up calling a nested
make() operation that included nontrivial things, which could cause problems.
The specific problem I encountered was in fixAfterChanges() we tried to fix up
a duplicate label, but calling makeTrivial() emitted something very large that
happened to include a new block with a new label nested under a struct.get,
and that block's label conflicted with a label we'd already processed.
|
| |
|
| |
|
| |
|
|
|
| |
Don't use a fixed 10% chance to mutate, but pick a mutation rate in each function.
|
|
|
|
|
|
|
| |
We already did this for the first memory, and just needed to loop to handle initial
content in the test suite that has multiple memories.
Also clean up that code while I'm around, to avoid repeating
wasm.memories[0] all the time.
|
|
|
|
|
|
|
|
|
| |
Repurpose makeBasicRef, makeCompoundRef to generate not just "constant"
refs but any reference, and use those to create StructNew/ArrayNew.
The key changes are to add makeCompoundRef to make(), and to make
the function call make() for children, where possible, instead of just
makeTrivial(). We also replace the i31-specific path with a call to
makeBasicRef which handles i31 among other things.
|
|
|
|
|
|
|
|
|
|
| |
All top-level Module elements are identified and referred to by Name, but for
historical reasons element and data segments were referred to by index instead.
Fix this inconsistency by using Names to refer to segments from expressions that
use them. Also parse and print segment names like we do for other elements.
The C API is partially converted to use names instead of indices, but there are
still many functions that refer to data segments by index. Finishing the
conversion can be done in the future once it becomes necessary.
|
| |
|
|
|
|
|
|
|
|
|
| |
Before this PR we iterated over an unordered set. Replace that with an
iteration on a vector. (Also, the value in the set was not even used, so
this should even be faster.)
Add random names in the fuzzer to types, the lack of which is I believe
the reason this was not detected before.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The fuzzer had code to avoid emitting `global.get` of locally defined (i.e.
non-imported) globals in global initializers and data segment offsets, but that
code only handled top-level `global.get` because it predated the extended-const
proposal. Unfortunately this bug went undetected until #5557, which fixed the
validator to make it reject invalid uses of `global.get` in constant
expressions.
Fix the bug so the validator no longer produces invalid modules.
|
|
|
|
|
|
|
|
|
|
|
| |
The nesting limit of around 20 was enough to cause exponential blowup. A 20K
input file lead to a 2GB wasm in one case I saw (!) which takes many seconds to
fuzz.
Instead, reduce the limit, and also check if random tells us that the random
input is done; when that's done we should stop, which limits us to O(input size).
Also do this for non-nullable types, and handle that in globals (we cannot emit a
RefAsNulNull there, so switch the global type if necessary).
|
|
|
|
|
|
| |
Even with a 1% chance of a huge array, there is a second problem aside from
hitting an allocation failure, which is DoS - building such a huge array of
Literals takes noticeable time in the fuzzer. Instead, just limit array max sizes,
which is consistent with what we do for struct sizes etc.
|
|
|
|
| |
Only rarely return an uninhabitable subtype of an inhabitable one. This
avoids a major source of uninhabitability and immediate traps.
|
|
|
|
| |
In particular, the removed code path here that did a RefAsNonNull of a null
was causing a lot of code to just trap.
|
|
|
|
|
|
|
|
|
|
| |
Previously we emitted it early, and would then modify it in random ways
like other initial content. But this function is called frequently during
execution, so if we were unlucky and modded that function to trap then
basically all other functions would trap as well.
After fixing this, some places assert on not having any functions or types
to pick a random one from, so fix those places too.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this we generate random GC types that may be used in creating
instructions later.
We don't create many instructions yet, which will be the next step after
this.
Also add some trivial assertions in some places, that have helped
debugging in the past.
Stop fuzzing TypeMerging for now due to #5556 , which this PR
uncovers.
|
| |
|
|
|
|
|
| |
The main fuzzer needs to be able to filter out uninhabitable types and the type
fuzzer has code for finding uninhabitable types. Move and refactor the code to
expose a `getInhabitable` function that can be used for both purposes.
|
|
|
|
|
|
| |
Function references are always inhabitable because functions can be created with
any function type, even types that refer to uninhabitable types. Take advantage
of this by skipping function references when finding non-nullable reference
cycles that cause uninhabitability.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some valid GC types, such as non-nullable references to bottom heap types and
types that contain non-nullable references to themselves, are uninhabitable,
meaning it is not possible to construct values of those types. This can cause
problems for the fuzzer, which generally needs to be able to construct values of
arbitrary types.
To simplify things for the fuzzer, introduce a utility for transforming type
graphs such that all their types are inhabitable. The utility performs a DFS to
find cycles of non-nullable references and breaks those cycles by introducing
nullability.
The new utility is itself fuzzed in the type fuzzer.
|
|
|
|
| |
Only very rarely ask to create a huge array, as that can easily hit a host
size limit and cause a run to be ignored.
|
|
|
| |
It is not a constant instruction and cannot be used in globals.
|
|
|
|
| |
(#5529)
|
| |
|
|
|
|
|
|
|
|
| |
To match the standard instruction name, rename the expression class without
changing any parsing or printing behavior. A follow-on PR will take care of the
functional side of this change while keeping support for parsing the old name.
This change will allow `ArrayInit` to be used as the expression class for the
upcoming `array.init_data` and `array.init_elem` instructions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the idea was that we started with HANG_LIMIT = 10 or so, and we'd decrement
it by one in each potentially-recursive call and loop entry. When we reached 0 we'd start
to unwind the stack. Then, after we unwound it all the way, we'd reset HANG_LIMIT before
calling the next export.
That approach adds complexity that each "execution wrapper", like for JS or for --fuzz-exec,
had to manually reset HANG_LIMIT. That was done by calling an export. Calls to those
exports had to appear in various places, which is sort of a hack.
The new approach here does the following when the hang limit reaches zero: It resets
HANG_LIMIT, and it traps. The trap unwinds the call stack all the way out. When the next
export is called, it will have a fresh hang limit since we reset it before the trap.
This does have downsides. Before, we did not always trap when we hit the hang limit but
rather we'd emit something unreachable, like a return. The idea was that we'd leave the
current function scope at least, so we don't hang forever. That let us still execute a small
amount of code "on the way out" as we unwind the stack. I'm not sure it's worth the
complexity for that.
The advantages of this PR are to simplify the code, and also it makes more fuzzing
approaches easy to implement. I'd like to add a wasm-ctor-eval fuzzer, and having to add
hacks to call the hang limit init export in it would be tricky. With this PR, the execution
model is simple in the fuzzer: The exports are called one by one, in order, and that's it -
no extra magic execution needs to be done.
Also bump the hang limit from 10 to 100, just to give some more chance for code to run.
|
|
|
|
|
| |
This is necessary to avoid fuzzer breakage after #5497 as it added a test with
strings. We could also ignore that file, like we do for other string files, but this
was not much work so just implement it.
|
|
|
|
| |
Half the time, never add any unreachable code. This ensures we run the
most code we possibly can half the time, at least.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the fuzzer replace things with an unreachable instruction in
rare situations. The hope was to find bugs like #5487, but instead it's
mostly found bugs in the inliner actually (#5492, #5493).
This also fixes an uncovered bug in the fuzzer, where we refinalized in
more than one place. It is unsafe to do so before labels are fixed up
(as duplicate labels can confuse us as to which types are needed; this
is actually the same issue as in #5492). To fix that, remove the extra
refinalize that was too early, and also rename the fixup function since
it does a general fixup for all the things.
|
|
|
|
|
|
|
|
|
|
| |
Support function subtyping with contravariant parameters and covariant results.
The actual change is a single line in wasm-type.cpp, so most of the patch is
updating the type fuzzer to generate interesting function subtypes. Since
function parameters are covariant, generating a function subtype requires
generating supertypes of its parameter types, which required new functionality
in the fuzzer. Also update the fuzzer to choose to reuse types at a finer grain,
so for example individual function parameters or results might be reused
unmodified while other parameters or results are still modified.
|
|
|
|
|
|
| |
`struct` has replaced `data` in the upstream spec, so update Binaryen's types to
match. We had already supported `struct` as an alias for data, but now remove
support for `data` entirely. Also remove instructions like `ref.is_data` that
are deprecated and do not make sense without a `data` type.
|