| Commit message (Collapse) | Author | Age | Files | Lines |
| |
E.g.
x + C1 > C2 ==> x > (C2-C1)
We do need to be careful of overflows in either the add on the left or
the proposed subtract on the right. In the latter case, we can at least do
x + C1 > C2 ==> x + (C1-C2) > 0
Helps #5008 (but more patterns remain).
Found by the superoptimizer #4994. This was the top suggestion for Java and Dart.
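The overflow concern can be checked by brute force. The sketch below assumes a hypothetical 4-bit signed width so that every combination can be enumerated cheaply; the helper names (`wrap`, `fits`, `find_mismatch`) and the "no overflow on either side" condition are illustrative assumptions, not the check the optimizer actually performs.

```python
BITS = 4  # hypothetical tiny width so we can enumerate every case
LO, HI = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1


def wrap(v):
    """Wrap v into the signed BITS-wide range, as a fixed-width add/sub would."""
    span = 1 << BITS
    return (v - LO) % span + LO


def fits(v):
    """True if v is representable without wrapping."""
    return LO <= v <= HI


def find_mismatch(require_no_overflow):
    """Return (x, c1, c2) where `x + c1 > c2` and `x > c2 - c1` disagree, or None."""
    values = range(LO, HI + 1)
    for x in values:
        for c1 in values:
            for c2 in values:
                if require_no_overflow and not (fits(x + c1) and fits(c2 - c1)):
                    continue
                if (wrap(x + c1) > c2) != (x > wrap(c2 - c1)):
                    return (x, c1, c2)
    return None
```

With `require_no_overflow=True` no counterexample exists; without that restriction the search finds one, which is why the rewrite needs a guard.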
| |
We explicitly wrote out memory, table, and globals, but did not add structs. This switches
us to use readsMutableGlobalState which has the full list of all relevant global state,
including the memory, table, and globals as well as structs.
| |
E.g. if we just do addition etc., then any higher bits will be wrapped out anyhow:
int32_t(int64_t(x) + int64_t(10))
=>
x + int32_t(10)
Found by the superoptimizer #4994. This is by far the most promising suggestion it
had. Interestingly, it mainly helps Go, where it removes 20% of all Unary operations
(the extends and wraps), and Rust, where it removes 3%.
Helps #5004. This handles the common cases I see in the superoptimizer output, but
there are more that could be handled.
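A small sketch of why the wide add can be narrowed, modeling integers as unsigned bit patterns. The function names are hypothetical; `extend_s` and `wrap` are meant to mirror a signed 32-to-64 extension and a 64-to-32 wrap.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def extend_s(x32):
    """Sign-extend a 32-bit pattern to a 64-bit pattern."""
    x32 &= MASK32
    if x32 & 0x80000000:
        return x32 | 0xFFFFFFFF00000000
    return x32


def wrap(x64):
    """Keep only the low 32 bits of a 64-bit pattern."""
    return x64 & MASK32


def add_wide_then_wrap(x32, c32):
    """int32_t(int64_t(x) + int64_t(c)), computed on bit patterns."""
    return wrap((extend_s(x32) + extend_s(c32)) & MASK64)


def add_narrow(x32, c32):
    """x + int32_t(c), computed directly in 32 bits."""
    return (x32 + c32) & MASK32
```

The two functions agree because addition is modular: the upper 32 bits of the 64-bit sum never influence the low 32 bits that survive the wrap.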
| |
struct.set (#5021)
We replaced an unreachable struct.set with something reachable, which can
break validation in corner cases.
| |
shifts and same constant (#4996)
(x >> C) << C -> x & -(1 << C)
(x >>> C) << C -> x & -(1 << C)
(x << C) >>> C -> x & (-1 >>> C)
// (x << C) >> C is not supported
Found by the superoptimizer #4994
Fixes #5012
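The three rules can be verified exhaustively at a small width. This is a sketch assuming a hypothetical 8-bit integer (wasm uses 32 or 64 bits); the helper names are illustrative only.

```python
BITS = 8  # hypothetical narrow width, small enough to enumerate
MASK = (1 << BITS) - 1


def shl(x, c):
    """Shift left, discarding bits that leave the BITS-wide value."""
    return (x << c) & MASK


def shr_u(x, c):
    """Logical (unsigned) shift right."""
    return (x & MASK) >> c


def shr_s(x, c):
    """Arithmetic (signed) shift right on an unsigned bit pattern."""
    x &= MASK
    signed = x - (1 << BITS) if x & (1 << (BITS - 1)) else x
    return (signed >> c) & MASK


def rules_hold():
    """Check the three shift-pair rules for every x and every shift amount C."""
    for c in range(BITS):
        low_cleared = -(1 << c) & MASK  # -(1 << C)
        high_cleared = MASK >> c        # -1 >>> C
        for x in range(1 << BITS):
            if shl(shr_s(x, c), c) != x & low_cleared:
                return False
            if shl(shr_u(x, c), c) != x & low_cleared:
                return False
            if shr_u(shl(x, c), c) != x & high_cleared:
                return False
    return True
```

The signed pair `(x << C) >> C` sign-extends from a lower bit, so it can set bits that were clear in `x` and cannot be written as a single mask; for example `shr_s(shl(0x08, 4), 4)` yields `0xF8`.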
| |
Add a pass that wraps all imports and exports with functions that handle
storing and passing along the suspender externref needed for JSPI.
https://github.com/WebAssembly/js-promise-integration/blob/main/proposals/js-promise-integration/Overview.md
| |
x ? 0 : y ==> z & y where z = !x
x ? y : 1 ==> z | y where z = !x
Only do this when we have z = !x, that is, we can invert x without adding
an actual eqz (which would add work).
To do this, canonicalize selects to prefer to flip the arms, when
possible, if it would move a constant to a location that the existing
optimizations already turn into an and/or. That is,
x >= 5 ? 0 : y != 42
would be canonicalized into
x < 5 ? y != 42 : 0
and existing opts turn that into
(x < 5) & (y != 42)
The canonicalization does not always help this optimization, as we need
the values to be boolean to do this, but canonicalizing is still nice to get
more regular code which might compress slightly better.
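A minimal model of the two rewrites, showing why the other arm must be boolean. `select` and `eqz` here are hypothetical Python stand-ins for the ternary and for `!x`.

```python
def select(cond, if_true, if_false):
    """Model `cond ? if_true : if_false`, where any nonzero cond counts as true."""
    return if_true if cond != 0 else if_false


def eqz(v):
    """Model `!v`: 1 if v is zero, else 0."""
    return 1 if v == 0 else 0


def rewrites_agree(x, y):
    """True if both `x ? 0 : y == z & y` and `x ? y : 1 == z | y` hold for z = !x."""
    z = eqz(x)
    and_ok = select(x, 0, y) == (z & y)
    or_ok = select(x, y, 1) == (z | y)
    return and_ok and or_ok
```

For boolean `y` the rewrites agree for any `x`; with `y = 2` and `x = 0` they do not, since `1 & 2` is `0` rather than `2`.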
| |
(#5000)
Also add testing that they pass through the full optimizer.
Fixes #4999
Drive-by fixes to the text in the assertions, which was copy-pasted.
| |
Adds C-API bindings for the following expression classes:
RefTest
RefCast
BrOn with operations BrOnNull, BrOnNonNull, BrOnCast, BrOnCastFail, BrOnFunc, BrOnNonFunc, BrOnData, BrOnNonData, BrOnI31, BrOnNonI31
StructNew
StructGet
StructSet
ArrayNew
ArrayInit
ArrayGet
ArraySet
ArrayLen
ArrayCopy
StringNew with operations StringNewUTF8, StringNewWTF8, StringNewReplace, StringNewWTF16, StringNewUTF8Array, StringNewWTF8Array, StringNewReplaceArray, StringNewWTF16Array
StringConst
StringMeasure with operations StringMeasureUTF8, StringMeasureWTF8, StringMeasureWTF16, StringMeasureIsUSV, StringMeasureWTF16View
StringEncode with operations StringEncodeUTF8, StringEncodeWTF8, StringEncodeWTF16, StringEncodeUTF8Array, StringEncodeWTF8Array, StringEncodeWTF16Array
StringConcat
StringEq
StringAs with operations StringAsWTF8, StringAsWTF16, StringAsIter
StringWTF8Advance
StringWTF16Get
StringIterNext
StringIterMove with operations StringIterMoveAdvance, StringIterMoveRewind
StringSliceWTF with operations StringSliceWTF8, StringSliceWTF16
StringSliceIter
| |
Do not export functions that have types not allowed in the rules for
JS interop. Only very few GC types can be on the JS boundary atm.
| |
An overview of this is in the README in the diff here (conveniently, it is near the
top of the diff). Basically, we fix up nn locals after each pass, by default. This keeps
things easy to reason about - what validates is what is valid wasm - but there are
some minor nuances as mentioned there, in particular, we ignore nameless blocks
(which are commonly added by various passes; ignoring them means we can keep
more locals non-nullable).
The key addition here is LocalStructuralDominance which checks which local
indexes have the "structural dominance" property of 1a, that is, that each get has
a set in its block or an outer block that precedes it. I optimized that function quite
a lot to reduce the overhead of running that logic after each pass. The overhead
is something like 2% on J2Wasm and 0% on Dart (0%, because in this mode we
shrink code size, so there is less work actually, and it balances out).
Since we run fixups after each pass, this PR removes logic to manually call the
fixup code from various places we used to call it (like eh-utils and various passes).
Various passes are now marked as requiresNonNullableLocalFixups => false.
That lets us skip running the fixups after them, which we normally do automatically.
This helps avoid overhead. Most passes still need the fixups, though - any pass
that adds a local, or a named block, or moves code around, likely does.
This removes a hack in SimplifyLocals that is no longer needed. Before we
worked to avoid moving a set into a try, as it might not validate. Now, we just do it
and let fixups happen automatically if they need to: in the common case they
probably don't, so the extra complexity seems not worth it.
Also removes a hack from StackIR. That hack tried to avoid roundtrip adding a
nondefaultable local. But we have the logic to fix that up now, and opts will
likely keep it non-nullable as well.
Various tests end up updated here because now a local can be non-nullable -
previous fixups are no longer needed.
Note that this doesn't remove the gc-nn-locals feature. That has been useful for
testing, and may still be useful in the future - it basically just allows nn locals in
all positions (that can't read the null default value at the entry). We can consider
removing it separately.
Fixes #4824
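The "structural dominance" property can be illustrated on a toy IR. This sketch is not LocalStructuralDominance itself: the tuple-based IR, the function name, and the treatment of every block as a scope are assumptions made for illustration, and it ignores the nameless-block nuance and parameters mentioned above.

```python
def locals_lacking_dominance(body):
    """Return local indexes that have a get with no preceding set in scope.

    `body` is a list of ("set", index), ("get", index) or ("block", [items]).
    """
    bad = set()

    def walk(items, outer_sets):
        # Sets seen inside this block stop being visible once the block ends,
        # so work on a copy of what the enclosing blocks have set so far.
        visible = set(outer_sets)
        for kind, payload in items:
            if kind == "set":
                visible.add(payload)
            elif kind == "get":
                if payload not in visible:
                    bad.add(payload)
            elif kind == "block":
                walk(payload, visible)

    walk(body, set())
    return bad
```

In this model a set in an outer block covers gets in inner blocks, but a set inside an inner block does not cover a get that follows the block.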
| |
Followup to #4282
| |
These new GC instructions infallibly convert between `extern` and `any`
references now that those types are not in the same hierarchy.
| |
| |
A resize from a large amount to a small amount would sometimes not clear
the flexible storage, if we used it before but not after.
| |
To unblock some optimizations. For example this:
```wat
(select
(i32.eqz (local.get $x))
(i32.const 0)
(i32.eqz (local.get $y))
)
```
Was previously optimized as:
```wat
(i32.eqz
(select
(i32.const 1)
(local.get $x)
(local.get $y)
)
)
```
Because `optimizeSelect` applied `!x ? !y : 0 -> x ? 0 : !y` then `!(x ? 1 : y)`, blocking the next rules which could have folded this to `or` or `and`.
After this PR the same example optimizes better:
```wat
(i32.eqz
(i32.or
(local.get $x)
(local.get $y)
)
)
```
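As a sanity check, the three forms above can be modeled and compared. This is a sketch with hypothetical Python helpers; `select` takes its operands in wat order (if-true, if-false, condition).

```python
def eqz(v):
    """Model i32.eqz: 1 if v is zero, else 0."""
    return 1 if v == 0 else 0


def select(if_true, if_false, cond):
    """Model wat `select`, which takes the condition as its last operand."""
    return if_true if cond != 0 else if_false


def original(x, y):
    return select(eqz(x), 0, eqz(y))


def previous_output(x, y):
    return eqz(select(1, x, y))


def new_output(x, y):
    return eqz(x | y)
```

All three compute the same value for every input, so the new output is just a cheaper spelling of the same result.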
| |
A continuation of #4272.
```
(signed)x < s_min + 1 ==> x == s_min
(signed)x >= s_min + 1 ==> x != s_min
(signed)x > s_max - 1 ==> x == s_max
(signed)x <= s_max - 1 ==> x != s_max
(unsigned)x <= u_max - 1 ==> x != u_max
(unsigned)x > u_max - 1 ==> x == u_max
```
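These equivalences can be confirmed exhaustively at a small width. The sketch assumes a hypothetical 8-bit integer; `S_MIN`, `S_MAX`, and `U_MAX` stand in for `s_min`, `s_max`, and `u_max`.

```python
BITS = 8  # hypothetical narrow width, small enough to enumerate
S_MIN, S_MAX = -(1 << (BITS - 1)), (1 << (BITS - 1)) - 1
U_MAX = (1 << BITS) - 1


def all_rules_hold():
    """Check each comparison rewrite for every representable x."""
    for x in range(S_MIN, S_MAX + 1):
        if (x < S_MIN + 1) != (x == S_MIN):
            return False
        if (x >= S_MIN + 1) != (x != S_MIN):
            return False
        if (x > S_MAX - 1) != (x == S_MAX):
            return False
        if (x <= S_MAX - 1) != (x != S_MAX):
            return False
    for x in range(U_MAX + 1):
        if (x <= U_MAX - 1) != (x != U_MAX):
            return False
        if (x > U_MAX - 1) != (x == U_MAX):
            return False
    return True
```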
| |
Match the latest version of the GC spec. This change does not depend on V8
changing its interpretation of the shorthands because we are still temporarily
not emitting the binary shorthands, but all Binaryen users will have to update
their interpretations along with this change if they use the text or binary
shorthands.
| |
Implement function parsing, including parsing of locals and type uses. Also add
a new phase of parsing that iterates through type uses that do not have explicit
types to create any implicitly defined types and append them to the type index
space. This is important because the implicitly defined types may be referred to
by index elsewhere and because the legacy parser does not handle these
implicitly defined types correctly. Finally, maintain a map of implicit type use
locations to corresponding types for use in later phases of parsing.
| |
| |
Adding multi-memories to the list of wasm-features.
| |
I don't see an existing test for this, and it's useful behavior since such inlining will
propagate the trap to the caller, possibly helping DCE and other things there, so it's
good to have a test to guarantee we never break it.
| |
Fuzzing with TrapsNeverHappen found a bug, and then reading similar code
I found another, where we check structural equality but ignored effects. Some
things look equal but may have different values at runtime:
(foo
(call $x)
(call $y)
)
The arms are identical structurally but due to side effects may not be identical
in value.
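A toy illustration of the hazard, using a hypothetical stateful function in place of a wasm call with side effects. Two syntactically identical calls can still produce different values, so treating them as equal is unsafe.

```python
def make_counter():
    """Return a function whose result depends on how often it was called."""
    state = {"n": 0}

    def next_value():
        state["n"] += 1  # the side effect
        return state["n"]

    return next_value


def arms_equal_at_runtime():
    """Evaluate two textually identical calls and compare their values."""
    call = make_counter()
    first_arm = call()
    second_arm = call()
    return first_arm == second_arm
```

Here `arms_equal_at_runtime()` is false even though both arms are the same expression, which is why the equality check must also consider effects.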
| |
The "ignore trap" logic there is not close to enough for what we'd need to
actually fuzz in a way that ignores traps, so this removes it. Atm that logic
just allows a trap to happen without causing an error (that is, when comparing
two results, one might trap and the other not, but they'd still be considered
"equal"). But due to how we optimize traps in TrapsNeverHappen mode, the
optimizer is free to assume the trap never occurs, which might remove side
effects that are noticeable later. To actually handle that, we'd need to refactor
the code to retain results per function (including the Loggings) and then to
ignore everything from the very first trapping function. That is somewhat
complicated to do, and a simpler thing is done in #4936, so we won't need
it here.
| |
Followup to #4910.
| |
Those instructions need to know if the memory is 64-bit or not. We looked that
up on the module globally, which is convenient, but in the C API this was actually
a breaking change, it turns out. To keep things working, provide that information
when creating a MemoryGrow or MemorySize, as another parameter in the C
API. In the C++ API (wasm-builder), support both modes, and default to the
automatic lookup.
We already require a bunch of other explicit info when creating expressions, like
making a Call requires the return type (we don't look it up globally), and even a
LocalGet requires the local type (we don't look it up on the function), so this is
consistent with those.
Fixes #4946
| |
After landing #4944, also need to delete these unneeded test files.
| |
Just like `extern` is no longer a subtype of `any` in the new GC type system,
`func` is no longer a subtype of `any`, either. Make that change in our type
system implementation and update tests and fuzzers accordingly.
| |
This fixes what looks like it might be a regression in #4943. It's not actually
an issue since it just affects wat files, but it did uncover an existing
inefficiency. The situation is this:
(block
..
(br $somewhere)
(nop)
)
Removing such a nop is always helpful, as the pass might see that that
br goes to where control flow is going anyhow, and the nop would
confuse it. We used to remove such nops only when the block had a name,
which is why wat testcases look optimal, but we were actually doing the
less-efficient thing on real-world code. It was a minor inefficiency, though, as
the nop is quickly removed by later passes anyhow. Still, the fix is trivial (to
always remove such nops, regardless of a name on the block or not).
| |
Previously the wat parser would turn this input:
(block
(nop)
)
into something like this:
(block $block17
(nop)
)
It just added a name all the time, in case the block is referred to by an index
later even though it doesn't have a name.
This PR makes us roundtrip more precisely by not adding such names: if there
was no name before, and there is no break by index, then do not add a name.
In addition, this will be useful for non-nullable locals since whether a block has
a name or not matters there. Like #4912, this makes us more regular in our
usage of block names.
| |
Removing the .wasm multi-memories tests as they can be easily represented in .wast format, which is easier to read/handle.
| |
Such globals can be written to from the outside, so we cannot infer anything about
their contents.
Fixes #4932
| |
In LLVM output and probably others, the initial table contents are never
changed. We may append later, but we don't trample the initial table
entries. As a result, with this new flag we can turn indirect calls on those
offsets into direct ones:
--directize-initial-tables-immutable
| |
A memory must be explicitly defined before being exported.
Fix CI for 6b3f3af.
| |
Follow up to #4771 to test new --print-profile options for wasm-split.
| |
casts (#4720)
i32 -> f64 -> i32 roundtripping optimizations:
```rust
i32(f64(i32(x))) -> x // i32.trunc(_sat)_f64_s(f64.convert_i32_s(x)) -> x
u32(f64(u32(x))) -> x // i32.trunc(_sat)_f64_u(f64.convert_i32_u(x)) -> x
// note: asymmetric signed / unsigned or unsigned / signed cases can't be simplified in a similar way
```
and rounding after integer to float casts:
```rust
ceil(float(int(x))) -> float(int(x))
floor(float(int(x))) -> float(int(x))
trunc(float(int(x))) -> float(int(x))
nearest(float(int(x))) -> float(int(x))
```
where `float = f32 | f64`, `int = i32 | i64 | u32 | u64`
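A short sketch of why both families of rewrites are sound, using Python floats as f64 and a hypothetical `to_f32` helper that rounds through a 4-byte float. The function names are illustrative assumptions.

```python
import math
import struct


def to_f32(v):
    """Round a Python float (an f64) to the nearest f32 value."""
    return struct.unpack("<f", struct.pack("<f", v))[0]


def roundtrip_via_f64(x):
    """int -> f64 -> int; exact for 32-bit inputs, which fit f64's 53-bit significand."""
    return int(float(x))


def rounding_is_noop(x, via_f32=False):
    """True if ceil/floor/trunc/nearest leave float(int(x)) unchanged."""
    f = to_f32(float(x)) if via_f32 else float(x)
    return (
        math.ceil(f) == f
        and math.floor(f) == f
        and math.trunc(f) == f
        and round(f) == f  # round() uses ties-to-even
    )
```

The conversion may round a large integer, but the result is always an integral float, so a later rounding operation has nothing left to do.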
| |
Due to missing test coverage, we missed in #4811 that some memory operations
needed to get make64() called on them.
| |
A rather tricky corner case: we normally look at fallthrough values for copies of
fields, so when we try to refine a field, we ignore stuff like this:
a.x = b.x;
That copies the same field on the same type to itself, so refining is not limited by
it. But if we have something else in the middle, and that thing cannot change
type, then it is a problem, like this:
(struct.set
(..ref..)
(local.tee $temp
(struct.get)))
tee has the type of the local, which does not change in this pass. So we can't
look at just the fallthrough here and skip the tee: after refining the field, the
tee's old type might not fit in the field's new type.
We could perhaps add casts to fix things up, but those may have too big a
cost. For now, just ignore the fallthrough.
| |
DAE will normally not remove an unreachable parameter, because it checks for
effects there. But in TrapsNeverHappen mode, we assume that an unreachable
is an effect we can remove, so we are willing to remove it:
(func $foo (param $unused i32)
;; never use $unused
)
(func $bar
(call $foo
(unreachable)))
;;=> dae+tnh
(func $foo
)
(func $bar
(call $foo))
But that transformation is invalid: the call's type was unreachable before but
no longer is. What went wrong here is that, yes, it is valid to remove an
unreachable, but we may need to update types while doing so, which we
were not doing.
This wasn't noticed before due to a combination of unfortunate factors:
The main reason is that this only happens in TrapsNeverHappen mode. We
don't fuzz that, because it's difficult: that mode can assume a trap never happens,
so a trap is undefined behavior really. On real-world code this is great, but in the
fuzzer it means that the output can seem to change after optimizations.
The validator happened to be missing an error for a call that has type unreachable
but shouldn't: Validator: Validate unreachable calls more carefully #4909. Without
that, we'd only get an error if the bad type influenced a subsequent pass in a confusing
way - which is possible, but difficult to achieve (what ended up happening in practice is
that SignatureRefining on J2Wasm relied on the unreachable and refined a type too much).
Even with that fix, for the problem to be detected we'd need for the validation error to
happen in the final output, after running all the passes. In practice, though, that's not
likely, since other passes tend to remove unreachables etc. Pass-debug mode is very
useful for finding stuff like this, as it validates after every individual pass. Sadly it turns
out that global validation was off there: Validator: Validate globally by default #4906
(so it was catching the 99% of validation errors that are local, but this particular error
was in the remaining 1%...).
As a fix, simply ignore this case. It's not really worth the effort to optimize it, since DCE
will just remove unreachables like that anyhow. So if we run again after a DCE we'd get
a chance to optimize.
This updates some existing tests to avoid (unreachable). That was used as an example
of something with effects, but after this change it is treated more carefully. Replace those
things with something else that has effects (a call).
| |
We already did this if the block was a child of a control flow structure, which is
the common case (see the new added comment around that code, which clarifies
why). This does the same for all other blocks. This is simple to do and a minor
optimization, but the main benefit from this is just to make our handling of blocks
uniform: after this, we never emit a block with no name. This will make 1a non-
nullable locals easier to handle (since they will be able to assume that property;
and not emitting such blocks avoids some work to handle non-nullable locals
in them).
| |
The GC proposal has split `any` and `extern` back into two separate types, so
reintroduce `HeapType::ext` to represent `extern`. Before it was originally
removed in #4633, externref was a subtype of anyref, but now it is not. Now that
we have separate heaptype type hierarchies, make `HeapType::getLeastUpperBound`
fallible as well.
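To see why a least upper bound can fail once there is more than one hierarchy, consider this sketch. The `PARENT` table is a hypothetical, heavily simplified set of heap types and is not the real type system; the function returns `None` where a C++ implementation might return an empty optional.

```python
# Hypothetical, simplified parent links; None marks the top of a hierarchy.
PARENT = {
    "any": None,
    "eq": "any",
    "i31": "eq",
    "data": "eq",
    "extern": None,  # separate hierarchy, no longer below "any"
}


def ancestors(t):
    """Return t followed by each of its supertypes, nearest first."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain


def least_upper_bound(a, b):
    """Return the closest common supertype, or None if a and b share no hierarchy."""
    b_chain = set(ancestors(b))
    for t in ancestors(a):
        if t in b_chain:
            return t
    return None
```

Within one hierarchy a result always exists, but `least_upper_bound("extern", "any")` has no answer, so callers have to handle the failure case.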
| |
This PR removes the single memory restriction in IR, adding support for a single module to reference multiple memories. To support this change, a new memory name field was added to 13 memory instructions in order to identify the memory for the instruction.
It is a goal of this PR to maintain backwards compatibility with existing text and binary wasm modules, so memory indexes remain optional for memory instructions. Similarly, the JS API makes assumptions about which memory is intended when only one memory is present in the module. Another goal of this PR is that existing test behavior be unaffected. That said, tests must now explicitly define a memory before invoking memory instructions or exporting a memory, and memory names are now printed for each memory instruction in the text format.
There remain quite a few places where hardcoded references to the first memory persist (memory flattening, for example, will return early if more than one memory is present in the module). Many of these call-sites, particularly within passes, will require us to rethink how the optimization works in a multi-memories world. Other call-sites may necessitate more invasive code restructuring to fully convert away from relying on a globally available, single memory pointer.
| |
| |
call.without.effects has a specific form, where the last parameter is a
function reference, and that function reference must have the right type
for the other parameters if called with them:
(call $call.without.effects
(..i32..)
(..f64..)
(..function reference, which takes params i32 and f64..)
)
| |
This allows emscripten to move these helper functions from JS library
imports to native wasm exports.
See https://github.com/emscripten-core/emscripten/issues/7273
| |
I was reading these tests and failing to find the names script.
| |
Also, add support for the `--binaryen-bin` flag to
`scripts/port_passes_tests_to_lit.py`. This is needed for folks who
don't do in-tree builds.
| |
`pop`'s type should be a supertype, not a subtype, of the tag's type
within `catch`.
| |
Like the 8-bit array variants, it takes 3 parameters.
| |
For now this index is always 0, but we must emit it.
Also clean up the wat test a little - we don't have validation yet, but we should
not validate without a memory in that file.