| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
|
|
|
| |
This lets that pass optimize 64-bit offsets on memory64 loads and stores.
|
|
|
|
| |
This should make the CI green again. Also fix one of the errors. I haven't fixed
the other errors because I don't know how.
|
|
|
|
|
|
|
|
|
| |
floating points (#5034)
```
(-x) + y -> y - x
x + (-y) -> x - y
x - (-y) -> x + y
```
|
|
|
| |
This finalizes the multi memories feature introduced in #4968.
|
| |
|
|
|
| |
Also fix some formatting issue in the file.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Recently we added logic to ignore effects that don't "escape" past the function call.
That is, e.g. local.set only affects the current function scope, and once the call stack
is unwound it no longer matters as an effect. This moves that logic to a shared place,
and uses it in the core Vacuum logic.
The new constructor in EffectAnalyzer receives a function and then scans it as
a whole. This works just like e.g. scanning a Block as a whole (if we see a break in
the block, that has an effect only inside it, and the Block + children doesn't have a
branch effect).
Various tests are updated so they don't optimize away trivially, by adding new
return values for them.
|
|
|
|
| |
(#5038)
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
due to timeout (#5039)
I think this simplifies the logic behind what we consider to trap. Before we had kind of
a hack in visitLoop that now has a more clear reasoning behind it: we consider as
trapping things that trap in all VMs all the time, or will eventually. So a single allocation
doesn't trap, but an unbounded amount can, and an infinite loop is considered to
trap as well (a timeout in a VM will be hit eventually, somehow).
This means we cannot optimize way a trivial infinite loop with no effects in it,
while (1) {}
But we can optimize it out in trapsNeverHappen mode. In any event, such a loop
is not a realistic situation; an infinite loop with some other effect in it, like a call to
an import, will not be optimized out, of course.
Also clarify some other things regarding traps and trapsNeverHappen following
recent discussions in https://github.com/emscripten-core/emscripten/issues/17732
Specifically, TNH will never be allowed to remove calls to imports.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes: https://github.com/emscripten-core/emscripten/issues/17846
More detailed explanation of the issue from Thibaud:
- A promising export is entered, generating a suspender s1, which is stored in the global
- The wasm code calls a wrapped import, passing it the value in the global (s1) and suspends
- Another export is entered, generating suspender s2, which is stored in the global
- We call another wrapped import, which suspends s2 (so far so good)
- We return to the event loop and s1 is resumed
And now we are in an inconsistent state: the active suspender is "s1", but the object in the global is "s2".
So the next time we call a wrapped import, there is a mismatch, which is what this runtime error reports.
|
|
|
|
|
| |
This allows a three-step upgrade process where binaryen is updated with this
change, then users remove their use of these flags, then binaryen can remove the
flags permanently.
|
|
|
|
|
|
| |
This import was being injected and then used to implement trapping.
Rather than injecting an import that doesn't exist in the original
module we instead use the existing mechanism to implement this as
an internal helper.
|
|
|
|
| |
A try whose body throws, and does nothing else, and the try catches that
exception, can be removed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a map of function name => the effects of that function to the
PassOptions structure. That lets us compute those effects once and then
use them in multiple passes afterwards. For example, that lets us optimize
away a call to a function that has no effects:
(drop (call $nothing))
[..]
(func $nothing
;; .. lots of stuff but no effects, only a returned value ..
)
Vacuum will remove that dropped call if we tell it that the called function has
no effects. Note that a nice result of adding this to the PassOptions struct
is that all passes will use the extra info automatically.
This is not enabled by default as the benefits seem rather minor, though it
does help in a small but noticeable way on J2Wasm code, where we use
call.without.effects and have situations like this:
(func $foo
(call $bar)
)
(func $bar
(call.without.effects ..)
)
The call to bar looks like it has effects, normally, but with global effect info
we know it actually doesn't.
To use this, one would do
--generate-global-effects [.. some passes that use the effects ..] --discard-global-effects
Discarding is not necessary, but if there is a pass later that adds effects, then not
discarding could lead to bugs, since we'd think there are fewer effects than there are.
(However, normal optimization passes never add effects, only remove them.)
It's also possible to call this multiple times:
--generate-global-effects -O3 --generate-global-effects -O3
That computes affects after the first -O3, and may find fewer effects than earlier.
This doesn't compute the full transitive closure of the effects across functions. That is,
when computing a function's effects, we don't look into its own calls. The simple case
so far is enough to handle the call.without.effects example from before (though it
may take multiple optimization cycles).
|
|
|
| |
Adds an --in-secondary-memory switch to the wasm-split tool that allows profile data to be stored in a separate memory from module main memory. With this option, users do not need to reserve the initial memory region for profile data and the data can be shared between multiple threads.
|
|
|
|
|
|
|
|
| |
x - C -> x + (-C)
min(C, x) -> min(x, C)
max(C, x) -> max(x, C)
And remove redundant rules
|
|
|
| |
Covers CallRef, RefTest, RefCast, BrOn, StructNew, StructGet, StructSet, ArrayNew, ArrayInit, ArrayGet, ArraySet, ArrayLen, ArrayCopy, StringNew, StringConst, StringMeasure, StringEncode, StringConcat, StringEq, StringAs, StringWTF8Advance, StringWTF16Get, StringIterNext, StringIterMove, StringSliceWTF, StringSliceIter.
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we were assuming asmLibraryArg which is what emscripten
passes as the `env` import object but using this method is more
flexible and should allow wasm2js to work with import that are
not all form a single object.
The slight size increase here is just temporary until emscripten
gets updated.
See https://github.com/emscripten-core/emscripten/pull/17737
|
|
|
|
|
|
|
|
|
|
| |
We had some concerns about this not working in the past, but thinking about it
now, I believe it is safe to do. Specifically, a throw is either like a break or a return -
either it jumps out to an outer scope (like a break) or it jumps out of the function
(like a return), and both breaks and returns have already been handled here.
This change has some nice effects on J2Wasm output, where there are quite a
lot of throws, which we can now optimize around.
|
|
|
|
|
| |
* Improve ExtractFunction pass error printing.
* Update lint
|
|
|
|
|
|
| |
This just moves the code from #5025 to the right function, which I did not
realize existed. optimizeRelational is where we optimize binary operations
that do comparisons, and it's nice to put all that code together. Avoids repeated
checks of isRelational() in separate places.
|
|
|
|
| |
Emscripten minifies the import name to a shorter string "a"). Adjust LogExecution pass to discover the import name that is used. (#4746)
|
|
|
|
|
|
|
| |
When we see e.g. x < y and x has fewer bits set, we can infer a result.
Helps #5010. As mentioned there, this is one of the top superoptimizer findings.
On j2wasm it ends up removing a few hundred binary operations for example.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#4985)
x + nan -> nan'
x - nan -> nan'
x * nan -> nan'
x / nan -> nan'
min(x, nan) -> nan'
max(x, nan) -> nan'
where nan' is canonicalized nan of rhs
x != nan -> 1
x == nan -> 0
x >= nan -> 0
x <= nan -> 0
x > nan -> 0
x < nan -> 0
|
|
|
|
| |
BinaryenSetMemory (#4963)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In practice typed function references will not ship before GC and is not
independently useful, so it's not necessary to have a separate feature for it.
Roll the functionality previously enabled by --enable-typed-function-references
into --enable-gc instead.
This also avoids a problem with the ongoing implementation of the new GC bottom
heap types. That change will make all ref.null instructions in Binaryen IR refer
to one of the bottom heap types. But since those bottom types are introduced in
GC, it's not valid to emit them in binaries unless unless GC is enabled. The fix
if only reference types is enabled is to emit (ref.null func) instead
of (ref.null nofunc), but that doesn't always work if typed function references
are enabled because a function type more specific than func may be required.
Getting rid of typed function references as a separate feature makes this a
nonissue.
|
|
|
| |
Replacing Fatal() call sites in src/shell-interface.h & src/tools/wasm-ctor-eval.cpp that were added in the Multi-Memories PR with assert()
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
E.g.
x + C1 > C2 ==> x > (C2-C1)
We do need to be careful of overflows in either the add on the left or
the proposed subtract on the right. In the latter case, we can at least do
x + C1 > C2 ==> x + (C1-C2) > 0
Helps #5008 (but more patterns remain).
Found by the superoptimizer #4994. This was the top suggestion for Java and Dart.
|
|
|
|
|
| |
We explicitly wrote out memory, table, and globals, but did not add structs. This switches
us to use readsMutableGlobalState which has the full list of all relevant global state,
including the memory, table, and globals as well as structs.
|
|
|
|
|
| |
The only call to `generateSubBasic` was removed as part of a bug fix in #4346,
but the function itself was not removed. Remove it and other unused functions it
depends on now.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
E.g. if we just do addition etc., then any higher bits will be wrapped out anyhow:
int32_t(int64_t(x) + int64_t(10))
=>
x + int32_t(10)
Found by the superoptimizer #4994 . This is by far the most promising suggestion it
had. Interestingly, it mainly helps Go, where it removes 20% of all Unary operations
(the extends and wraps), and Rust, where it removes 3%.
Helps #5004. This handles the common cases I see in the superoptimizer output, but
there are more that could be handled.
|
|
|
|
|
|
| |
struct.set (#5021)
We replaced an unreachable struct.set with something reachable, which can
break validation in corner cases.
|
|
|
|
|
|
|
|
|
|
|
|
| |
shifts and same constant (#4996)
(x >> C) << C -> x & -(1 << C)
(x >>> C) << C -> x & -(1 << C)
(x << C) >>> C -> x & (-1 >>> C)
// (x << C) >> C doesn't support
Found by the superoptimizer #4994
Fixes #5012
|
|
|
|
|
|
|
| |
Add a pass that wraps all imports and exports with functions that handle
storing and passing along the suspender externref needed for JSPI.
https://github.com/WebAssembly/js-promise-integration/blob/main/proposals/js-promise-integration/Overview.md
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
x ? 0 : y ==> z & y where z = !x
x ? y : 1 ==> z | y where z = !x
Only do this when we have z = !x, that is, we can invert x without adding
an actual eqz (which would add work).
To do this, canonicalize selects to prefer to flip the arms, when
possible, if it would move a constant to a location that the existing
optimizations already turn into an and/or. That is,
x >= 5 ? 0 : y != 42
would be canonicalized into
x < 5 ? y != 42 : 0
and existing opts turn that into
(x < 5) & (y != 42)
The canonicalization does not always help this optimization, as we need
the values to be boolean to do this, but canonicalizing is still nice to get
more regular code which might compress slightly better.
|
|
|
|
|
|
|
|
|
|
| |
(#5000)
Also add testing that they pass through the full optimizer.
Fixes #4999
Drive-by fixes to the text in the assertions, which was copy-pasted.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds C-API bindings for the following expression classes:
RefTest
RefCast
BrOn with operations BrOnNull, BrOnNonNull, BrOnCast, BrOnCastFail, BrOnFunc, BrOnNonFunc, BrOnData, BrOnNonData, BrOnI31, BrOnNonI31
StructNew with operations StringNewUTF8, StringNewWTF8, StringNewReplace, StringNewWTF16, StringNewUTF8Array, StringNewWTF8Array, StringNewReplaceArray, StringNewWTF16Array
StructGet
StructSet
ArrayNew
ArrayInit
ArrayGet
ArraySet
ArrayLen
ArrayCopy
StringNew
StringConst
StringMeasure with operations StringMeasureUTF8, StringMeasureWTF8, StringMeasureWTF16, StringMeasureIsUSV, StringMeasureWTF16View
StringEncode with operations StringEncodeUTF8, StringEncodeWTF8, StringEncodeWTF16, StringEncodeUTF8Array, StringEncodeWTF8Array, StringEncodeWTF16Array
StringConcat
StringEq
StringAs with operations StringAsWTF8, StringAsWTF16, StringAsIter
StringWTF8Advance
StringWTF16Get
StringIterNext
StringIterMove with operations StringIterMoveAdvance, StringIterMoveRewind
StringSliceWTF with operations StringSliceWTF8, StringSliceWTF16
StringSliceIter
|
|
|
|
| |
Do not export functions that have types not allowed in the rules for
JS interop. Only very few GC types can be on the JS boundary atm.
|
|
|
| |
Similar to #4969 but for data segments.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An overview of this is in the README in the diff here (conveniently, it is near the
top of the diff). Basically, we fix up nn locals after each pass, by default. This keeps
things easy to reason about - what validates is what is valid wasm - but there are
some minor nuances as mentioned there, in particular, we ignore nameless blocks
(which are commonly added by various passes; ignoring them means we can keep
more locals non-nullable).
The key addition here is LocalStructuralDominance which checks which local
indexes have the "structural dominance" property of 1a, that is, that each get has
a set in its block or an outer block that precedes it. I optimized that function quite
a lot to reduce the overhead of running that logic after each pass. The overhead
is something like 2% on J2Wasm and 0% on Dart (0%, because in this mode we
shrink code size, so there is less work actually, and it balances out).
Since we run fixups after each pass, this PR removes logic to manually call the
fixup code from various places we used to call it (like eh-utils and various passes).
Various passes are now marked as requiresNonNullableLocalFixups => false.
That lets us skip running the fixups after them, which we normally do automatically.
This helps avoid overhead. Most passes still need the fixups, though - any pass
that adds a local, or a named block, or moves code around, likely does.
This removes a hack in SimplifyLocals that is no longer needed. Before we
worked to avoid moving a set into a try, as it might not validate. Now, we just do it
and let fixups happen automatically if they need to: in the common code they
probably don't, so the extra complexity seems not worth it.
Also removes a hack from StackIR. That hack tried to avoid roundtrip adding a
nondefaultable local. But we have the logic to fix that up now, and opts will
likely keep it non-nullable as well.
Various tests end up updated here because now a local can be non-nullable -
previous fixups are no longer needed.
Note that this doesn't remove the gc-nn-locals feature. That has been useful for
testing, and may still be useful in the future - it basically just allows nn locals in
all positions (that can't read the null default value at the entry). We can consider
removing it separately.
Fixes #4824
|
|
|
|
|
|
| |
exists (#4991)
If it exists, just turn it into an import. If not, then as before we create it + turn it into
an import.
|
|
|
|
|
| |
It had hardcoded handling of the table, but was missing RefFunc (and maybe more?)
Also some cleanups around that code.
|
|
|
| |
Similar to #4969 but for element segments.
|
|
|
|
| |
These new GC instructions infallibly convert between `extern` and `any`
references now that those types are not in the same hierarchy.
|
|
|
| |
Similar to #4969 but for memories.
|
| |
|
|
|
|
| |
A resize from a large amount to a small amount would sometimes not clear
the flexible storage, if we used it before but not after.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To unblock some optimizations. For example this:
```wat
(select
(i32.eqz (local.get $x))
(i32.const 0)
(i32.eqz (local.get $y))
)
```
Was previously optimized as:
```wat
(i32.eqz
(select
(i32.const 1)
(local.get $x)
(local.get $y)
)
)
```
Because `optimizeSelect` applied `!x ? !y : 0 -> x ? 0 : !y` then `!(x ? 1 : y)`, blocking the next rules which could have folded this to `or` or `and`.
After this PR the same example optimizes better:
```wat
(i32.eqz
(i32.or
(local.get $x)
(local.get $y)
)
)
```
|