| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Conservatively avoid introducing synchronization bugs by not optimizing
atomic struct.gets at all in GUFA. It is possible that we could be more
precise in the future.
Also remove obsolete logic dealing with the types of null values as a
drive-by. All null values now have bottom types, so the type mismatch
this code checked for is impossible.
|
|
|
|
|
|
|
| |
Implement `ref.i31_shared` the new instruction for creating references
to shared i31s. Implement binary and text parsing and emitting as well
as interpretation. Copy the upstream spec test for i31 and modify it so
that all the heap types are shared. Comment out some parts that we do
not yet support.
|
|
|
|
|
|
|
|
|
| |
Rename instructions `extern.internalize` into `any.convert_extern` and
`extern.externalize` into `extern.convert_any` to follow more closely
the spec. This was changed in
https://github.com/WebAssembly/gc/issues/432.
The legacy name is still accepted in text inputs and in the C and JS
APIs.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we checked late, and as a result might end up failing to optimize when
a sub-pattern could have worked. E.g.
(call
(A)
)
(call
(A)
)
The call cannot be optimized, but the A pattern repeats. Before this PR we'd
greedily focus on the entire call and then fail. After this PR we skip the call
before we commit to which patterns to try to optimize, so we succeed.
Add a isShallowlyGenerative helper here as we compute this step by step as
we go. Also remove a parameter to the generativity code (it did not use the
features it was passed).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds basic support for the new instructions in the new EH proposal
passed at the Oct CG hybrid CG meeting:
https://github.com/WebAssembly/meetings/blob/main/main/2023/CG-10.md
https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md
This mainly adds two instructions: `try_table` and `throw_ref`. This is
the bare minimum required to read and write text and binary format, and
does not include analyses or optimizations. (It includes some analysis
required for validation of existing instructions.) Validation for
the new instructions is not yet included.
`try_table` faces the same problem with the `resume` instruction in
#6083 that without the module-level tag info, we are unable to know the
'sent types' of `try_table`. This solves it with a similar approach
taken in #6083: this adds `Module*` parameter to `finalize` methods,
which defaults to `nullptr` when not given. The `Module*` parameter is
given when called from the binary and text parser, and we cache those
tag types in `sentTypes` array within `TryTable` class. In later
optimization passes, as long as they don't touch tags, it is fine to
call `finalize` without the `Module*`. Refer to
https://github.com/WebAssembly/binaryen/pull/6083#issuecomment-1854634679
and #6096 for related discussions when `resume` was added.
|
|
|
|
|
|
|
| |
`_VECTOR` or `_ARRAY` defines in `wasm-delegations-fields.def` are
supposed to be defined in terms of their non-vector/array counterparts
when undefined. This removes empty `_VECTOR`/`_ARRAY` defines when
including `wasm-delegations-fields.def`, while adding definitions for
`DELEGATE_GET_FIELD` in case it is missing.
|
|
|
|
|
|
|
|
| |
Globally replace the source string "I31New" with "RefI31" in preparation for
renaming the instruction from "i31.new" to "ref.i31", as implemented in the spec
in https://github.com/WebAssembly/gc/pull/422. This would be NFC, except that it
also changes the string in the external-facing C APIs.
A follow-up PR will make the corresponding behavioral change.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Simplify the optimization of ref.cast and ref.test in OptimizeInstructions by
moving the loop that examines fallthrough values one at a time out to a shared
function in properties.h. Also simplify ref.cast optimization by analyzing the
cast result in just one place.
In addition to simplifying the code, also make the cast optimizations more
powerful by analyzing the nullability and heap type of the cast value
independently, resulting in a potentially more precise analysis of the cast
behavior. Also improve optimization power by considering fallthrough values when
optimizing the SuccessOnlyIfNonNull case.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both isValidInConstantExpression and isSingleConstantExpression must look
recursively at the internals of a RefAs that externalizes and internalizes, or else
we might do something like externalize a local.get, which is not constant.
getLiteral must handle externalize/internalize as well, and return a properly-
modified literal.
Without these fixes the testcase hits different internal assertions, and we either
fail to recognize something is constant or not, or think that it is but fail to
produce a literal for it.
|
|
|
|
|
|
|
|
|
|
| |
This fixes wasm-ctor-eval on evalling a GC data structure that contains a
field initialized with an externalized value.
Per the spec this is a constant instruction and I verified that V8 allows this.
Also add missing validation in wasm-ctor-eval of the output (which makes
debugging this kind of thing a little easier).
|
|
|
|
|
|
|
|
|
|
| |
Previously we treated global.get as a constant expression and only
additionally verified that the target globals were immutable in some cases. But
global.get of a mutable global is never a constant expression, and further,
only imported globals are available in constant expressions unless GC is
enabled.
Fix constant expression validation to only allow global.get of immutable,
imported globals, and fix all the invalid tests.
|
|
|
|
|
|
|
|
| |
To match the standard instruction name, rename the expression class without
changing any parsing or printing behavior. A follow-on PR will take care of the
functional side of this change while keeping support for parsing the old name.
This change will allow `ArrayInit` to be used as the expression class for the
upcoming `array.init_data` and `array.init_elem` instructions.
|
|
|
| |
This is enough for DAE and other opts to run on string consts.
|
|
|
|
|
|
|
|
|
|
| |
Store string data as GC data. Inefficient (one Const per char), but ok for now.
Implement string.new_wtf16 and string.const, enough for basic testing.
Create strings in makeConstantExpression, which enables ctor-eval support.
Print strings in fuzz-exec which makes testing easier.
|
|
|
|
|
|
| |
Optimize ref.cast instructions that must succeed by simply replacing them with
their child in the case where the child has a more refined type or by
propagating a further removed fallthrough value with a more refined type using a
tee.
|
|
|
|
|
|
|
|
| |
In particular, do not treat the converted value as "falling through" the
conversion. Since the conversions cross type hierarchies, treating the converted
values as fallthrough values would make subsequent casts look like they must
fail, when in fact they may not.
Fixes #5407.
|
| |
|
|
|
|
|
|
|
| |
The fallthrough there is trickier because the value is evaluated before the condition.
Unlike other fallthroughs, the value is not last, so we need to check if the condition
(which is after it) interferes with it.
|
|
|
|
|
|
|
|
| |
x - C -> x + (-C)
min(C, x) -> min(x, C)
max(C, x) -> max(x, C)
And remove redundant rules
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A rather tricky corner case: we normally look at fallthrough values for copies of
fields, so when we try to refine a field, we ignore stuff like this:
a.x = b.x;
That copies the same field on the same type to itself, so refining is not limited by
it. But if we have something else in the middle, and that thing cannot change
type, then it is a problem, like this:
(struct.set
(..ref..)
(local.tee $temp
(struct.get)))
tee has the type of the local, which does not change in this pass. So we can't
look at just the fallthrough here and skip the tee: after refining the field, the
tee's old type might not fit in the field's new type.
We could perhaps add casts to fix things up, but those may have too big a
cost. For now, just ignore the fallthrough.
|
|
|
|
|
|
|
| |
RTTs were removed from the GC spec and if they are added back in in the future,
they will be heap types rather than value types as in our implementation.
Updating our implementation to have RTTs be heap types would have been more work
than deleting them for questionable benefit since we don't know how long it will
be before they are specced again.
|
|
|
|
|
|
| |
We already require non-null literals to have non-null types, but with this
change we can enforce that constraint by construction. Also remove the default
behavior of creating a function reference literal with heap type `func`, since
there is always a more specific function type to use.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This marks all reference operations that return 0/1 as doing so. This
allows various bitwise operations to be optimized on them.
This also marks StringEq as a boolean, though we can't test that fully yet
as Strings support is wip (no interpreter or other stuff yet).
As a driveby this moves emitsBoolean to its own file, and uses it
in getMaxBits to avoid redundancy (the redundant code paths now have
a WASM_UNREACHABLE).
|
|
|
|
|
| |
This is more work than a typical instruction because it also adds a new section:
all the (string.const "foo") strings are put in a new "strings" section in the binary, and
the instructions refer to them by index.
|
|
|
| |
See https://github.com/WebAssembly/extended-const
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for try-delegate in `EffectAnalyzer`. Without this
support, the expresion below has been incorrectly classified as "cannot
throw", because the previous code considered everything inside
`try`-`catch_all` as "cannot throw". This is not the case when there is
a `delegate` that can bypass the `catch_all`.
```wasm
try $l0
try
try
throw $e
delegate $l0
catch_all
end
end
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
With nominal function types, this change makes it so that we preserve the
identity of the function type used with call_indirect instructions rather than
recreating a function heap type, which may or may not be the same as the
originally parsed heap type, from the function signature during module writing.
This will simplify the type system implementation by removing the need to store
a "canonical" nominal heap type for each unique signature. We previously
depended on those canonical types to avoid creating multiple duplicate function
types during module writing, but now we aren't creating any new function types
at all.
|
|
|
| |
Clearer this way.
|
|
|
|
|
|
|
|
|
| |
See #4149
This modifies the test added in #4163 which used static casts on
dynamically-created structs and arrays. That was technically not
valid (as we won't want users to "mix" the two forms). This makes that
test 100% static, which both fixes the test and gives test coverage
to the new instructions added here.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This finishes the refactoring started in #4115 by doing the
same change to pass a Module into EffectAnalyzer instead of
features. To do so this refactors the fallthrough API and a few
other small things. After those changes, this PR removes the
old feature constructor of EffectAnalyzer entirely.
This requires a small breaking change in the C API, changing
BinaryenExpressionGetSideEffects's feature param to a
module. That makes this change not NFC, but otherwise it is.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Knowing the module will allow us to do more analysis in the
effect analyzer. For now, this just refactors the code to allow
providing a module instead of features, and to infer the
features from the module. This actually shortens the code in
most places which is nice (just pass module instead of
module->features).
This modifies basically all callers to use the new module
form, except for the fallthrough logic. That would require
some more refactoring, so to keep this PR reasonably small
that is not yet done.
|
|
|
|
|
|
|
|
| |
Before, we'd compute the hash of a child, then store that in a map, then
the parent would find the child's hash in the map using the pointer to the
child. But as we do a simple postorder walk, we can use a stack, and
avoid hashing the child pointers.
This makes it 10% faster or so.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Technically this is not a new pass, but it is a rewrite almost from scratch.
Local Common Subexpression Elimination looks for repeated patterns,
stuff like this:
x = (a + b) + c
y = a + b
=>
temp = a + b
x = temp + c
y = temp
The old pass worked on flat IR, which is inefficient, and was overly
complicated because of that. The new pass uses a new algorithm that
I think is pretty simple, see the detailed comment at the top.
This keeps the pass enabled only in -O4, like before - right after
flattening the IR. That is to make this as minimal a change as possible.
Followups will enable the pass in the main pipeline, that is, we will
finally be able to run it by default. (Note that to make the pass work
well after flatten, an extra simplify-locals is added - the old pass used
to do part of simplify-locals internally, which was one source of
complexity. Even so, some of the -O4 tests have changes, due to
minor factors - they are just minor orderings etc., which can be
seen by inspecting the outputs before and after using e.g.
--metrics)
This plus some followup work leads to large wins on wasm GC output.
On j2cl there is a common pattern of repeated struct.gets, so common
that this pass removes 85% of all struct.gets, which makes the total
binary 15% smaller. However, on LLVM-emitted code the benefit is
minor, less than 1%.
|
|
|
|
|
|
| |
Define a "single constant expression" consistently, as a thing that is a
compile-time constant. Move I31New out of there, and also clean up
the function it is moved into, canInitializeGlobal(), to be consistent
in validating children.
|
|
|
| |
Similar to #4005 but on OptimizeInstructions instead of RemoveUnusedBrs.
|
|
|
| |
fixes part of #3906
|
|
|
|
|
|
| |
A new getImmediateFallthrough is called in the loop.
Aside from this being more efficient than recursion, the new method will
be used in escape analysis.
|
|
|
|
|
| |
* Note that ref.cast has a fallthrough value.
* Optimize ref.eq on identical inputs.
|
|
|
|
|
|
|
|
|
|
|
|
| |
The precompute pass ignored all reference types, but that was overly
pessimistic: we can precompute some of them, namely a null and a
reference to a function are fully precomputable, etc.
To allow that to work, add missing integration in getFallthrough as well.
With this, we can precompute quite a lot of field accesses in the existing
-Oz testcase, as can be seen from the output. That testcase runs
--fuzz-exec so it prints out all those logged values, proving they have
not changed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for reading/writing of the new `delegate` instruction
in the folded wast format, the stack IR format, the poppy IR format, and
the binary format in Binaryen. We don't have a formal spec written down
yet, but please refer to WebAssembly/exception-handling#137 and
WebAssembly/exception-handling#146 for the informal semantics. In the
current version of spec `delegate` is basically a rethrow, but with
branch-like immediate argument so that it can bypass other
catches/delegates in between.
`delegate` is not represented as a new `Expression`, but it is rather
an option within a `Try` class, like `catch`/`catch_all`.
One special thing about `delegate` is, even though it is written
_within_ a `try` in the folded wat format, like
```wasm
(try
(do
...
)
(delegate $l)
)
```
In the unfolded wat format or in the binary format, `delegate` serves as
a scope end instruction so there is no separate `end`:
```wasm
try
...
delegate $l
```
`delegate` semantically targets an outer `catch` or `delegate`, but we
write `delegate` target as a `try` label because we only give labels to
block-like scoping expressions. So far we have not given `Try` a label
and used inner blocks or a wrapping block in case a branch targets the
`try`. But in case of `delegate`, it can syntactically only target `try`
and if it targets blocks or loops it is a validation failure.
So after discussions in #3497, we give `Try` a label but this label can
only be targeted by `delegate`s. Unfortunately this makes parsing and
writing of `Try` expression somewhat complicated. Also there is one
special case; if the immediate argument of `try` is the same as the
depth of control flow stack, this means the 'delegate' delegates to the
caller. To handle this case this adds a fake label
`DELEGATE_CALLER_TARGET`, and when writing it back to the wast format
writes it as an immediate value, unlike other cases in which we write
labels.
This uses `DELEGATE_FIELD_SCOPE_NAME_DEF/USE` to represent `try`'s label
and `delegate`'s target. There are many cases that `try` and
`delegate`'s labels need to be treated in the same way as block and
branch labels, such as for hashing or comparing. But there are routines
in which we automatically assume all label uses are branches. I thought
about adding a new kind of defines such as
`DELEGATE_FIELD_TRY_NAME_DEF/USE`, but I think it will also involve some
duplication of existing routines or classes. So at the moment this PR
chooses to use the existing `DELEGATE_FIELD_SCOPE_NAME_DEF/USE` for
`try` and `delegate` labels and makes only necessary amount of changes
in branch-utils. We can revisit this decision later if necessary.
Many of changes to the existing test cases are because now all `try`s
are automatically assigned a label. They will be removed in
`RemoveUnusedNames` pass in the same way as block labels if not targeted
by any delegates.
This only supports reading and writing and has not been tested against
any optimization passes yet.
---
Original unfolded wat file to generate test/try-delegate.wasm:
```wasm
(module
(event $e)
(func
try
try
delegate 0
catch $e
end)
(func
try
try
catch $e
i32.const 0
drop
try
delegate 1
end
catch $e
end
)
)
```
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds info to RTT literals so that they can represent the chain of
rtt.canon/sub commands that generated them, and it adds an internal
RTT for each GC allocation (array or struct).
The approach taken is to simply store the full chain of rtt.sub types
that led to each literal. This is not efficient, but it is simple and seems
sufficient for the semantics described in the GC MVP doc - specifically,
only the types matter, in that repeated executions of rtt.canon/sub
on the same inputs yield equal outputs.
This PR fixes a bunch of minor issues regarding that, enough to allow testing
of the optimization and execution of ref.test/cast.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds rtt.canon and rtt.sub together with RTT type support
that is necessary for them. Together this lets us test roundtripping the
instructions and types.
Also fixes a missing traversal over globals in collectHeapTypes,
which the example from the GC docs requires, as the RTTs are in
globals there.
This does not yet add full interpreter support and other things. It
disables initial contents on GC in the fuzzer, to avoid the fuzzer
breaking.
Renames the binary ID for exnref, which is being removed from
the spec, and which overlaps with the binary ID for rtt.
|
|
|
|
| |
unreachable (#3413)
|
|
|
|
| |
Use matchers and more descriptive variable names to clarify the intent of the
functions for finding and inspecting sign extension patterns.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bugs (#3401)
* Count signatures in tuple locals.
* Count nested signature types (confirming @aheejin was right, that was missing).
* Inlining was using the wrong type.
* OptimizeInstructions should return -1 for unhandled types, not error.
* The fuzzer should check for ref types as well, not just typed function references,
similar to what GC does.
* The fuzzer now creates a function if it has no other option for creating a constant
expression of a function type, then does a ref.func of that.
* Handle unreachability in call_ref binary reading.
* S-expression parsing fixes in more places, and add a tiny fuzzer for it.
* Switch fuzzer test to just have the metrics, and not print all the fuzz output which
changes a lot. Also fix noprint handling which only worked on binaries before.
* Fix Properties::getLiteral() to use the specific function type properly, and make
Literal's function constructor require that, to prevent future bugs.
* Turn all input types into nullable types, for now.
|
|
|
| |
When there are two versions of a function, one handling tuples and the other handling non-tuple values, the previous naming convention was to have "Single" in the name of the non-tuple handling function. This PR simplifies the convention and shortens function names by making the names plural for the tuple-handling version and singular for the non-tuple-handling version.
|
|
|
| |
Integrates `i31ref` types and instructions into the fuzzer, by assuming that `(i31.new (i32.const N))` is constant and hence suitable to be used in global initializers.
|
|
|
| |
Add floating point Eq and Ne operators to Properties::isSymmetric. Also treat additional float ops as symmetric specifically in OptimizeInstructions when their operands are known to be non-NaN.
|