| Commit message (Collapse) | Author | Age | Files | Lines |
| |
A rather tricky corner case: we normally look at fallthrough values for copies of
fields, so when we try to refine a field, we ignore stuff like this:
a.x = b.x;
That copies the same field on the same type to itself, so refining is not limited by
it. But if we have something else in the middle, and that thing cannot change
type, then it is a problem, like this:
(struct.set
(..ref..)
(local.tee $temp
(struct.get)))
tee has the type of the local, which does not change in this pass. So we can't
look at just the fallthrough here and skip the tee: after refining the field, the
tee's old type might not fit in the field's new type.
We could perhaps add casts to fix things up, but those may have too big a
cost. For now, just ignore the fallthrough.
| |
DAE will normally not remove an unreachable parameter, because it checks for
effects there. But in TrapsNeverHappen mode, we assume that an unreachable
is an effect we can remove, so we are willing to remove it:
(func $foo (param $unused i32)
;; never use $unused
)
(func $bar
(call $foo
(unreachable)))
;;=> dae+tnh
(func $foo
)
(func $bar
(call $foo))
But that transformation is invalid: the call's type was unreachable before but
no longer is. What went wrong here is that, yes, it is valid to remove an
unreachable, but we may need to update types while doing so, which we
were not doing.
This wasn't noticed before due to a combination of unfortunate factors:
The main reason is that this only happens in TrapsNeverHappen mode. We
don't fuzz that, because it's difficult: that mode can assume a trap never happens,
so a trap is undefined behavior really. On real-world code this is great, but in the
fuzzer it means that the output can seem to change after optimizations.
The validator happened to be missing an error for a call that has type unreachable
but shouldn't: Validator: Validate unreachable calls more carefully #4909 . Without
that, we'd only get an error if the bad type influenced a subsequent pass in a confusing
way - which is possible, but difficult to achieve (what ended up happening in practice is
that SignatureRefining on J2Wasm relied on the unreachable and refined a type too much).
Even with that fix, for the problem to be detected we'd need for the validation error to
happen in the final output, after running all the passes. In practice, though, that's not
likely, since other passes tend to remove unreachables etc. Pass-debug mode is very
useful for finding stuff like this, as it validates after every individual pass. Sadly it turns
out that global validation was off there: Validator: Validate globally by default #4906
(so it was catching the 99% of validation errors that are local, but this particular error
was in the remaining 1%...).
As a fix, simply ignore this case. It's not really worth the effort to optimize it, since DCE
will just remove unreachables like that anyhow. So if we run again after a DCE we'd get
a chance to optimize.
This updates some existing tests to avoid (unreachable). That was used as an example
of something with effects, but after this change it is treated more carefully. Replace those
things with something else that has effects (a call).
|
|
|
|
|
|
|
| |
The GC proposal has split `any` and `extern` back into two separate types, so
reintroduce `HeapType::ext` to represent `extern`. Before it was originally
removed in #4633, externref was a subtype of anyref, but now it is not. Now that
we have separate heap type hierarchies, make `HeapType::getLeastUpperBound`
fallible as well.
|
|
|
|
|
|
|
| |
This PR removes the single memory restriction in IR, adding support for a single module to reference multiple memories. To support this change, a new memory name field was added to 13 memory instructions in order to identify the memory for the instruction.
It is a goal of this PR to maintain backwards compatibility with existing text and binary wasm modules, so memory indexes remain optional for memory instructions. Similarly, the JS API makes assumptions about which memory is intended when only one memory is present in the module. Another goal of this PR is that the behavior of existing tests be unaffected. That said, tests must now explicitly define a memory before invoking memory instructions or exporting a memory, and memory names are now printed for each memory instruction in the text format.
There remain quite a few places where hardcoded references to the first memory persist (memory flattening, for example, will return early if more than one memory is present in the module). Many of these call-sites, particularly within passes, will require us to rethink how the optimization works in a multi-memories world. Other call-sites may necessitate more invasive code restructuring to fully convert away from relying on a globally available, single memory pointer.
|
|
|
|
|
|
| |
This allows emscripten to move these helper functions from JS library
imports to native wasm exports.
See https://github.com/emscripten-core/emscripten/issues/7273
|
|
|
| |
I was reading these tests and failing to find the names script.
|
|
|
|
|
| |
Also, add support for the `--binaryen-bin` flag to
`scripts/port_passes_tests_to_lit.py`. This is needed for folks who
don't do in-tree builds.
| |
call.without.effects will turn into a normal call of the last parameter later:
(call $call.without.effects
A
B
(ref.func $foo)
)
;; => intrinsic lowering
(call $foo
A
B
)
SignaturePruning needs to be aware of that: we can't remove a parameter from $foo without
also updating relevant calls to $call.without.effects. Rather than handle that, just skip such
cases, and leave them to be optimized after intrinsics are lowered away.
|
|
|
|
|
|
|
| |
A function literal (ref.func) should never reach a struct or array get, but
if there is a cast then it can look like they might arrive. We filter in ref.cast
which avoids that (since casting a function to a data type will trap), but
there is also br_on_cast which is not yet optimized. This PR adds code
to avoid an assert in readFromData in that case.
|
|
|
|
|
|
| |
eqz(eqz(i32(x))) -> i32(x) != 0
eqz(eqz(i64(x))) -> i64(x) != 0
Only when shrinkLevel == 0 (prefer speed over binary size).
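The identity behind this rewrite can be checked with a small model. This is an illustrative Python sketch, not Binaryen code; `eqz`, `double_eqz` and `ne_zero` are hypothetical helpers that model the 0/1 results of the wasm operations.

```python
def eqz(x: int) -> int:
    """Model of wasm eqz: 1 if the operand is zero, otherwise 0."""
    return 1 if x == 0 else 0


def double_eqz(x: int) -> int:
    """The original pattern: eqz(eqz(x))."""
    return eqz(eqz(x))


def ne_zero(x: int) -> int:
    """The replacement pattern: x != 0, producing 0 or 1."""
    return 1 if x != 0 else 0


# A few i32 and i64 bit patterns, written as unsigned integers.
SAMPLES = [0, 1, 2, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF, 0xFFFFFFFFFFFFFFFF]


def identity_holds() -> bool:
    """True if both patterns agree on every sample."""
    return all(double_eqz(v) == ne_zero(v) for v in SAMPLES)
```

Both forms normalize any nonzero input to 1, which is why the single comparison can stand in for the two nested `eqz` operations.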
|
|
|
|
|
|
|
| |
RTTs were removed from the GC spec and if they are added back in the future,
they will be heap types rather than value types as in our implementation.
Updating our implementation to have RTTs be heap types would have been more work
than deleting them for questionable benefit since we don't know how long it will
be before they are specced again.
|
|
|
|
| |
Rather than doing it as a side effect of dumping the metadata in
wasm-emscripten-finalize.
|
|
|
|
|
|
|
|
|
|
|
|
| |
+ Move these rules to separate function;
+ Refactor them to use matches;
+ Add comments;
+ Handle rotational shifts as well;
+ Handle overflows for `<<`, `>>`, `>>>` shifts;
+ Add mixed rotate rules:
```rust
rotl(rotr(x, C1), C2) => rotr(x, C1 - C2)
rotr(rotl(x, C1), C2) => rotl(x, C1 - C2)
```
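The mixed rotate rules can be sanity-checked with a small model of 32-bit rotates. This is an illustrative Python sketch, not the pass's code; it assumes rotate amounts wrap modulo the bit width, so a negative `C1 - C2` is read modulo 32. The helper names are hypothetical.

```python
BITS = 32
MASK = (1 << BITS) - 1


def rotl(x: int, c: int) -> int:
    """Rotate a 32-bit value left; the amount wraps modulo 32."""
    c %= BITS
    x &= MASK
    return ((x << c) | (x >> (BITS - c))) & MASK if c else x


def rotr(x: int, c: int) -> int:
    """Rotate a 32-bit value right; the amount wraps modulo 32."""
    c %= BITS
    x &= MASK
    return ((x >> c) | (x << (BITS - c))) & MASK if c else x


def mixed_rules_hold() -> bool:
    """Check both mixed rules over sample values and all amounts."""
    for x in (0, 1, 0x80000001, 0xDEADBEEF):
        for c1 in range(BITS):
            for c2 in range(BITS):
                if rotl(rotr(x, c1), c2) != rotr(x, c1 - c2):
                    return False
                if rotr(rotl(x, c1), c2) != rotl(x, c1 - c2):
                    return False
    return True
```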
|
|
|
|
|
|
|
|
| |
Like RemoveUnusedModuleElements, places that build graphs of function
reachability must special-case the call-without-effects intrinsic. Without that,
it looks like a call to an import. Normally a call to an import is fine - it makes us
be super-pessimistic, as we think things escape all the way out - but in GC
for now we are assuming a closed world, and so we end up broken. To fix that,
properly handle the intrinsic case.
| |
There are two new potential problems that `GlobalTypeRewriter` can run into when
working with isorecursive types instead of nominal types. First, the refined
types may have replaced generic references with references to specific other
types, potentially creating new recursions and making the existing recursion
groups insufficient. Second, distinct types may be refined to structurally
identical types and those distinct input types may map to the same output type,
potentially changing cast behavior.
Both of these problems are solved by putting all the new types in a single large
recursion group.
We do not currently account for the fact that types may be used in the external
interface of the module, but when we do, externalized types will be excluded
from optimizations and will not be affected by the creation of this single large
rec group.
Fixes #4816.
|
|
|
|
|
|
| |
constants on RHS (#4808)
(x * C1) << C2 -> x * (C1 << C2)
(x << C1) * C2 -> x * (C2 << C1)
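Both rules follow from the fact that a left shift by a constant is a multiplication by a power of two under wrapping arithmetic. The Python sketch below models i32 arithmetic to check that; it is illustrative only, and `i32_mul`, `i32_shl` and `rules_hold` are hypothetical helpers. It assumes the shift count is taken modulo 32, as wasm does for `i32.shl`.

```python
MASK32 = 0xFFFFFFFF


def i32_mul(a: int, b: int) -> int:
    """Wrapping 32-bit multiplication."""
    return (a * b) & MASK32


def i32_shl(a: int, c: int) -> int:
    """32-bit left shift; the count is taken modulo 32."""
    return (a << (c & 31)) & MASK32


def rules_hold(x: int, c1: int, c2: int) -> bool:
    """Check both rewrites for one choice of x, C1 and C2."""
    # (x * C1) << C2  ->  x * (C1 << C2)
    first = i32_shl(i32_mul(x, c1), c2) == i32_mul(x, i32_shl(c1, c2))
    # (x << C1) * C2  ->  x * (C2 << C1)
    second = i32_mul(i32_shl(x, c1), c2) == i32_mul(x, i32_shl(c2, c1))
    return first and second
```

The right-hand sides fold the two constants into one at compile time, leaving a single multiplication at run time.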
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This tracks the possible contents in the entire program all at once using a single IR.
That is in contrast to say DeadArgumentElimination or LocalRefining etc., all of which
look at one particular aspect of the program (function params and returns in DAE,
locals in LocalRefining). The cost is to build up an entire new IR, which takes a lot
of new code (mostly in the already-landed PossibleContents). Another cost
is this new IR is very big and requires a lot of time and memory to process.
The benefit is that this can find opportunities that are only obvious when looking
at the entire program, and also it can track information that is more specialized
than the normal type system in the IR - in particular, this can track an ExactType,
which is the case where we know the value is of a particular type exactly and not
a subtype.
|
|
|
|
|
|
|
|
|
| |
Basic reference types like `Type::funcref`, `Type::anyref`, etc. made it easy to
accidentally forget to handle reference types with the same basic HeapTypes but
the opposite nullability. In principle there is nothing special about the types
with shorthands except in the binary and text formats. Removing these shorthands
from the internal type representation by removing all basic reference types
makes some code more complicated locally, but simplifies code globally and
encourages properly handling both nullable and non-nullable reference types.
|
|
|
|
|
| |
For them to be the same we must have a value that can appear on both
sides. If the heap types disallow that, then only null is possible, and if
that is impossible as well then the result must be 0.
|
|
|
|
|
| |
Minor fuzz bug. When we replace a struct.set with its children we also
add a ref.as_non_null on the reference, but that must not occur before
effects in the other child.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This marks all reference operations that return 0/1 as doing so. This
allows various bitwise operations to be optimized on them.
This also marks StringEq as a boolean, though we can't test that fully yet
as Strings support is wip (no interpreter or other stuff yet).
As a driveby this moves emitsBoolean to its own file, and uses it
in getMaxBits to avoid redundancy (the redundant code paths now have
a WASM_UNREACHABLE).
|
|
|
|
|
|
|
|
|
|
|
| |
(#4749)
(ref.eq
(local.tee $x (..))
(local.get $x)
)
That will definitely return 1. Before this PR the side effects of tee stopped us
from optimizing.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
#4748 regressed us in some cases, because it removed casts first:
(ref.is_func
(ref.as_func
(local.get $anyref)))
If the cast is removed first, and the local has no useful type info, then
we'd have removed the cast but could not remove the ref.is. But
the ref.is could be optimized to 1, as it must be a func - the type
info proves it thanks to the cast. To avoid this, remove casts after
everything else.
|
|
|
|
|
|
| |
(#4748)
Comparing references does not depend on the cast, so if we are ignoring
traps in traps-never-happen mode then we can remove them.
| |
* Updating wasm.h/cpp for DataSegments
* Updating wasm-binary.h/cpp for DataSegments
* Removed link from Memory to DataSegments and updated module-utils, Metrics and wasm-traversal
* checking isPassive when copying data segments to know whether to construct the data segment with an offset or not
* Removing memory member var from DataSegment class as there is only one memory rn. Updated wasm-validator.cpp
* Updated wasm-interpreter
* First look at updating Passes
* Updated wasm-s-parser
* Updated files in src/ir
* Updating tools files
* Last pass on src files before building
* added visitDataSegment
* Fixing build errors
* Data segments need a name
* fixing var name
* ran clang-format
* Ensuring a name on DataSegment
* Ensuring more datasegments have names
* Adding explicit name support
* Fix fuzzing name
* Outputting data name in wasm binary only if explicit
* Checking temp dataSegments vector to validateBinary because it's the one with the segments before we processNames
* Pass on when data segment names are explicitly set
* Ran auto_update_tests.py and check.py, success all around
* Removed an errant semi-colon and corrected a counter. Everything still passes
* Linting
* Fixing processing memory names after parsed from binary
* Updating the test from the last fix
* Correcting error comment
* Impl kripken@ comments
* Impl tlively@ comments
* Updated tests that remove data print when == 0
* Ran clang format
* Impl tlively@ comments
* Ran clang-format
|
|
|
|
| |
Spec and VM support for that is not yet stable (atm VMs do not allow complex user-
defined types to be passed around).
|
|
|
|
| |
Otherwise when a type is only used on a global, it will be incorrectly omitted
from the output.
| |
In GSI we look for a read of a global in a situation like this:
$global1: value1
$global2: value2
(struct.get $Type (ref))
If global inference shows this get must be of either $global1 or $global2, then we
can optimize to this:
(ref) == $global1 ? value1 : value2
We focus on the case of two values because 1 is handled by other passes, and >2
makes the tradeoffs less clear.
However, a simple extension is the case where there are more than 2 globals, but
there are only two values, and one value is unique to one global:
$global1: valueA
$global2: valueB
$global3: valueA
=>
(ref) == $global2 ? valueB : valueA
We can still use a single comparison here, on the global that has the
unique value. Then the else will handle all the other globals.
This increases the cases that GSI can optimize J2Wasm output by over 50%.
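The selection of the global to compare against can be sketched as follows. This is a hypothetical Python model of the decision described above, not the pass's implementation; `plan_single_comparison` and its return shape are assumptions made for illustration.

```python
from collections import defaultdict


def plan_single_comparison(global_values):
    """Pick one global to compare against, if a single comparison suffices.

    global_values maps a global's name to the constant in the field being
    read. Returns (global_to_compare, value_if_equal, value_otherwise), or
    None when there are not exactly two values or neither value is unique
    to one global.
    """
    by_value = defaultdict(list)
    for name, value in global_values.items():
        by_value[value].append(name)
    if len(by_value) != 2:
        return None
    (v1, g1), (v2, g2) = by_value.items()
    if len(g1) == 1:
        return (g1[0], v1, v2)
    if len(g2) == 1:
        return (g2[0], v2, v1)
    return None


plan = plan_single_comparison(
    {"$global1": "valueA", "$global2": "valueB", "$global3": "valueA"}
)
```

Here `plan` names `$global2` as the global to compare with, matching the `(ref) == $global2 ? valueB : valueA` form above.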
|
|
|
|
|
|
| |
Similar to #4004 but for 32-bit integers
i32(x) << 24 >> 24 ==> i32.extend8_s(x)
i32(x) << 16 >> 16 ==> i32.extend16_s(x)
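A small model of the i32 operations shows why a shift-left followed by an arithmetic shift-right matches the sign-extension instructions. This is an illustrative Python sketch; the helper names are hypothetical, and values are represented as unsigned 32-bit bit patterns.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(v: int) -> int:
    """Interpret a 32-bit pattern as a signed integer."""
    v &= MASK32
    return v - (1 << 32) if v & 0x80000000 else v


def i32_shl(x: int, c: int) -> int:
    """32-bit left shift; the count is taken modulo 32."""
    return (x << (c & 31)) & MASK32


def i32_shr_s(x: int, c: int) -> int:
    """32-bit arithmetic (sign-preserving) right shift."""
    return (to_signed32(x) >> (c & 31)) & MASK32


def i32_extend8_s(x: int) -> int:
    """Sign-extend the low 8 bits to 32 bits."""
    b = x & 0xFF
    return (b - 0x100) & MASK32 if b & 0x80 else b


def i32_extend16_s(x: int) -> int:
    """Sign-extend the low 16 bits to 32 bits."""
    h = x & 0xFFFF
    return (h - 0x10000) & MASK32 if h & 0x8000 else h


def shifts_match_extends(x: int) -> bool:
    """Check both rewrites for one input value."""
    return (
        i32_shr_s(i32_shl(x, 24), 24) == i32_extend8_s(x)
        and i32_shr_s(i32_shl(x, 16), 16) == i32_extend16_s(x)
    )
```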
| |
This optimizes constants in the megamorphic case of two: when we
know two function references are possible, we could in theory emit this:
(select
(ref.func A)
(ref.func B)
(ref.eq
(..ref value..) ;; globally, only 2 things are possible here, and one has
;; ref.func A as its value, and the other ref.func B
(ref.func A)))
That is, compare to one of the values, and emit the two possible values there.
Other optimizations can then turn a call_ref on this select into an if over
two direct calls, leading to devirtualization.
We cannot compare a ref.func directly (since function references are not
comparable), and so instead we look at immutable global structs. If we
find a struct type that has only two possible values in some field, and
the structs are in immutable globals (which happens in the vtable case
in j2wasm for example), then we can compare the references of the struct
to decide between the two values in the field.
|
|
|
|
|
|
|
|
|
| |
SimplifyLocals (#4705)
Followup to #4703, this also handles the case where there is a non-
nullable local.set in the value of a nullable one, which we also cannot
optimize.
Fixes #4702
|
|
|
|
|
|
|
| |
Binaryen will not change dominance in SimplifyLocals, however, the current spec's
notion of dominance is simpler than ours, and we must not optimize certain cases in
order to still validate. See details in the comment and test.
Helps #4702
|
|
|
|
|
|
| |
calls (#4660)
This extends the existing call_indirect code to do the same for call_ref,
basically. The shared code is added to a new helper utility.
|
|
|
|
|
|
|
|
|
|
| |
Optionally avoid updating types in TypeUpdating::updateParamTypes(). That update
is incomplete if the function signature is also changing, which is the case in
SignatureRefining (but not DeadArgumentElimination). "Incomplete" means that
we updated the local.get type, but the function signature does not match yet. That
incomplete state can hit an internal error in GlobalTypeRewriter::updateSignatures
where it updates types. To avoid that, do the entire full update only there (in
GlobalTypeRewriter::updateSignatures).
| |
Previously we could return different results depending on the order we
noted things:
note(anyref.null);
note(funcref.null);
get() => anyref.null
note(funcref.null);
note(anyref.null);
get() => funcref.null
This is correct, as nulls are equal anyhow, and any could be used in
the location we are optimizing. However, it can lead to nondeterminism
if the caller's order of notes is nondeterministic. That is the case in
DeadArgumentElimination, where we scan functions in parallel, then
merge them without special ordering.
To fix this, make the note operation symmetric. That seems simplest and
least likely to be confusing. We can use the LUB to do that.
To avoid duplicating the null logic, refactor note() to use combine().
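The effect of making the note operation symmetric can be sketched with a toy model. This is a hypothetical Python illustration, not the actual helper: the `NullNoter` class, the `lub` function, and the tiny hierarchy in `PARENT` (with `func` placed below `any`, as the example above implies) are all assumptions made for the sketch.

```python
# Toy heap type hierarchy, assumed for illustration: child -> parent.
PARENT = {"any": None, "func": "any", "eq": "any"}


def ancestors(t):
    """Return t followed by all of its supertypes, nearest first."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain


def lub(a, b):
    """Least upper bound of two types in the toy hierarchy."""
    b_chain = set(ancestors(b))
    for t in ancestors(a):
        if t in b_chain:
            return t
    raise ValueError("no common supertype")


class NullNoter:
    """Tracks the type of the null values noted so far."""

    def __init__(self):
        self.type = None

    def note(self, t):
        # Combining with the LUB is symmetric, so order does not matter.
        self.type = t if self.type is None else lub(self.type, t)

    def get(self):
        return self.type


def result_for(order):
    noter = NullNoter()
    for t in order:
        noter.note(t)
    return noter.get()
```

With this model, `result_for(["any", "func"])` and `result_for(["func", "any"])` give the same answer, which is the property that removes the nondeterminism.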
|
|
|
|
|
| |
Casts involve branches in the VM, so adding a cast in return for removing a branch
(like If=>Select) is not beneficial. We don't want to ever do any more casts than we
already are.
|
|
|
|
|
|
|
| |
Do not prune parameters if there is a supertype that is a signature.
Without this we crash on an assertion in TypeBuilder when we try to
recreate the types (as we try to make a subtype with fewer fields
than the super).
|
| |
|
|
|
|
|
|
| |
Remove `Type::externref` and `HeapType::ext` and replace them with uses of
anyref and any, respectively, now that we have unified these types in the GC
proposal. For backwards compatibility, continue to parse `extern` and
`externref` and maintain their relevant C API functions.
|
|
|
|
|
|
| |
V8 requires that supertypes come before subtypes when it parses
isorecursive (i.e. standards-track) type definitions. Since 2268f2a we are
emitting nominal types using the standard isorecursive format, so respect the
ordering requirement.
| |
We assume a closed world atm in the GC space, but the call.without.effects
intrinsic sort of breaks that: that intrinsic looks like an import, but we really
need to care about what is sent to it even in a closed world:
(call $call-without-effects
(ref.func $target-keep)
)
That reference cannot be ignored, as logically it is called just as if there
were a call_ref there. This adds support for that, fixing the combination of
#4621 and using call.without.effects.
Also flip the vector of ref.func names to a set. I realized that in a very
large program we might see the same name many times.
| |
If we see (ref.func $foo) that does not mean that $foo is reachable - we
must also see a (call_ref ..) of the proper type. Only after seeing both should
we mark the function as reachable, which this PR does.
This adds some complexity as we need to track intermediate state as we go,
since we could see the RefFunc before the CallRef or vice versa. We also
need to handle the case of a RefFunc without a CallRef properly: We cannot
remove the function, as the RefFunc must refer to it, but at least we can
empty out the body since we know it is never reached.
This removes an old wasm-opt test which is now superseded by a new lit
test.
On J2Wasm output this removes 3% of all functions, which account for
2.5% of total code size.
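The intermediate state described above can be sketched as follows. This is a hypothetical Python model, not the pass's code: the `Reachability` class and its method names are assumptions, and it matches types by exact equality, ignoring subtyping for simplicity.

```python
from collections import defaultdict


class Reachability:
    """Marks a function reachable only after seeing both a ref.func of it
    and a call_ref of its type, in either order."""

    def __init__(self, func_types):
        self.func_types = func_types        # function name -> type name
        self.pending = defaultdict(set)     # type -> funcs with a ref.func only
        self.called_types = set()           # types seen in a call_ref
        self.reachable = set()

    def note_ref_func(self, name):
        t = self.func_types[name]
        if t in self.called_types:
            self.reachable.add(name)
        else:
            # Remember it until a call_ref of this type shows up.
            self.pending[t].add(name)

    def note_call_ref(self, t):
        if t in self.called_types:
            return
        self.called_types.add(t)
        # Everything referenced earlier with this type is now reachable.
        self.reachable |= self.pending.pop(t, set())

    def can_empty_body(self):
        """Functions that are referenced but never called: they must be
        kept, but their bodies are never executed."""
        return set().union(*self.pending.values()) if self.pending else set()


r = Reachability({"$foo": "$sig1", "$bar": "$sig2"})
r.note_ref_func("$foo")   # RefFunc seen before the CallRef
r.note_ref_func("$bar")
r.note_call_ref("$sig1")  # only $sig1 is ever called by reference
```

After these notes, `$foo` is reachable, while `$bar` is only referenced, so its body could be emptied.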
|
|
|
|
|
|
|
|
|
| |
Casts can replace a type with a subtype, which normally has no downsides, but
in a corner case of struct types it can lead to us needing to refinalize higher up
too, see details in the comment.
We have avoided any Refinalize calls in OptimizeInstructions, but the case
handled here requires it sadly. I considered moving it to another pass, but this
is a peephole optimization so there isn't really a better place.
|
|
|
| |
This hits the fuzzer when it tries to call reference exports with a null.
|
|
|
|
|
|
|
|
|
|
|
| |
The cast instruction may be unreachable but the intended type for the cast
still needs to be collected. Otherwise we end up with problems both during
optimizations that look at heap types and in printing (which will use the heap
type in code but not declare it).
Diff without whitespace is much smaller: this just moves code around so
that we can use a template to avoid code duplication. The actual change
is just to scan ->intendedType unconditionally, and not ignore it if the
cast is unreachable.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a field has no reads, we remove all its writes, but we did this:
(struct.set $foo A B)
=>
(drop A) (drop B)
We also need to trap if A, the reference, is null, which this PR
fixes:
(struct.set $foo A B)
=>
(drop (ref.as_non_null A)) (drop B)
|
|
|
|
|
|
| |
This fixes two bugs: First, we need to compare the nominal types of function
constants when looking for constants to "merge", not just their structure.
Second, when creating the new function we must use the proper type of
those constants, and not just another type.
|