| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Remove old, experimental instructions and type encodings that will not be
shipped as part of WasmGC. Updating the encodings and text format to match the
final spec is left as future work.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Br and BrOn can consider the code before and after them connected if it might
be reached (which is the case if the Br has a condition, which BrOn always has).
The wasm2js changes may look a little odd as some of them have this:
i64toi32_i32$1 = i64toi32_i32$2;
i64toi32_i32$1 = i64toi32_i32$2;
I looked into that and the reason is that those outputs are not optimized, and
also even in unoptimized wasm2js we do run simplify-locals once (to try to
reduce the downsides of flatten). As a result, this PR makes a difference there,
and that difference can lead to such odd duplicated code after other operations.
However, there are no changes to optimized wasm2js outputs, so there is no
actual problem.
Followup to #5860.
|
|
|
|
|
|
|
|
|
|
|
| |
SimplifyLocals (#5860)
This addresses most of the minor regression from the correctness fix in #5857.
That PR makes us consider calls as branching, but in some cases it is ok to
ignore that branching (see the comment in the code here), which this PR allows as
an option.
This undoes one test change from that PR, showing it undoes the regression for
SimplifyLocals. More tests are added to cover this specifically as well.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Calls were simply not handled there, so we could think we were still in the same
basic block when we were not, affecting various passes (but somehow this went
unnoticed until the TNHOracle #5850 ran on some particular Java code).
One existing test was affected, and two new tests are added: one for TNHOracle
where I detected this, and one in OptimizeCasts which is perhaps a simpler way
to see the problem.
All the cases but the TNH one, however, do not need this fix for correctness
since they actually don't care if a call would throw. As a TODO, we should find a
way to undo this minor regression. The regression only affects builds with EH
enabled, though, so most users should be unaffected even in the interm.
|
|
|
|
|
|
|
|
| |
None of that code is speed-sensitive, or at least doesn't need to be inlined to be
fast. Move it to cpp for faster compile times.
This caused a cascade of necessary header fixes (i.e. after removing unneeded
header inclusions in module-utils.h, files that improperly depended on that
stopped working and needed an added include).
|
|
|
|
|
|
|
|
| |
In order of appearance in this diff:
* Use heapType rather than otherHeapType when either will do, for brevity.
* We checked isNull after checking for isLiteral (line 173), but every null is a literal.
* Calling setNoneOrNull simplifies some repetitive code.
* We checked isLiteral yet again lower down.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TypeMapper is a utility used to globally rewrite types, mapping some eliminated
source types into destination types they should be replaced with. This was
previously done by first rewriting all the types in the IR according to the
given mapping, then rewriting the type definitions and updating all the types in
the IR again. Not only was doing the rewriting twice inefficient, it also
introduced a subtle bug where the set of private types eligible to be rewritten
could be inconsistent because updating types in the IR could change the types of
control flow structures. The fuzzer found a case where this inconsistency caused
the type rebuilding to fail.
Fix the bug by first building the new types with the mapping applied and only
then rewriting the IR a single time.
Also add a `TypeBuilder::dump` utility for use in debugging.
Fixes #5845.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a TrapsNeverHappen oracle that is used inside the main PossibleContents
oracle of GUFA. The idea is that when traps never happen we can reason "backwards"
from information to things that must be true before it:
temp = x.field;
x.cast_to<Y>(); // Y is a subtype of x's type X
Here we cast x to a subtype. If we assume traps never happen then the cast must
succeed, and that means we can assume we had a Y on the previous line, where
perhaps that information lets us infer the value of x.field.
This PR focuses on calls, which are the more interesting situation to optimize
because other passes do some work already inside functions. Specifically, we look
for things that will trap in the called function or the caller, such as if the called
function always casts a param to some type, we can assume the caller passes
such a type in. And if we have a call_ref then any target that would trap cannot be
called (at least in a closed world).
This has some benefits, in particular when combined with --gufa-cast-all since
that casts more things, which lets us apply the inferences made here. I see 3.3%
fewer call_ref instructions on a Kotlin testcase, for example. This helps more
on -Os when we inline less.
|
| |
|
|
|
|
|
|
|
|
|
| |
A pass that just operates on locals, for example, does not care about branches outside of
the function. That means that when we see a call, then even if EH is enabled we don't need
to create a new basic block right after it (unless the call is inside a try-catch - then it might
branch to the catch, of course).
This makes CFG-using passes 9% faster compared to before this PR. This plus #5827 offset
the slowdown from #5823 and overall give an improvement compared to before.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this PR, if a call had no paths to a catch in the same function then we skipped
creating a new basic block right after it. As a result, we could have a call in the middle
of a basic block. If EH is enabled that means we might transfer control flow out of
the function from the middle of a block. But it is better to have the property that
any transfer of control flow - to another basic block, or outside of the function - can
only happen at the end of a basic block.
This causes some overhead, but a subsequent PR (#5838) will remove that as a
followup, and this PR adds a little code to pass the module and check if EH is enabled,
and avoid the overhead if not, which at least avoids regressing the non-EH case
until that followup lands.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement support in the type system for final types, which are not allowed to
have any subtypes. Final types are syntactically different from similar
non-final types, so type canonicalization is made aware of finality. Similarly,
TypeMerging and TypeSSA are updated to work correctly in the presence of final
types as well.
Implement binary and text parsing and emitting of final types. Use the standard
text format to represent final types and interpret the non-standard
"struct_subtype" and friends as non-final. This allows a graceful upgrade path
for users currently using the non-standard text format, where they can update
their code to use final types correctly at the point when they update to use the
standard format. Once users have migrated to using the fully expanded standard
text format, we can update update Binaryen's parsers to interpret the MVP
shorthands as final types to match the spec without breaking those users.
To make it safe for V8 to independently start interpreting types declared
without `sub` as final, also reserve that shorthand encoding only for types that
have no strict subtypes.
|
|
|
|
|
|
|
|
| |
Previously we incorrectly used "strict" to mean the immediate subtypes of a
type, when in fact a strict subtype of a type is any subtype excluding the type
itself. Rename the incorrect `getStrictSubTypes` to `getImmediateSubTypes`,
rename the redundant `getAllStrictSubTypes` to `getStrictSubTypes`, and rename
the redundant `getAllSubTypes` to `getSubTypes`. Fixing the capitalization of
"SubType" to "Subtype" is left as future work.
|
|
|
|
|
|
|
| |
Rather than wrap a `TypeList`, make `Tuple` an alias of `TypeList`. This means
removing `Tuple::toString`, but that had no callers and was of limited use for
debugging anyway. In return, the use of tuples becomes much less verbose.
In the future, it may make sense to remove one of `Tuple` and `TypeList`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#5764)
EffectAnalyzer::canReorder/invalidate now assume that the things from whom we
generated the effects both execute (or, rather, that if the first of them doesn't
transfer control flow then they execute). If they both execute then we can do more work
in TrapsNeverHappen mode, since we can then reorder this for example:
(global.set ..)
(i32.load ..)
The load may trap, but in TNH mode we assume it won't. So we can reorder those
two. However, if they did not both execute then we could be in this situation:
(global.set ..)
(br_if ..)
(i32.load)
Reordering the load and the set here would be invalid, because we could make the
load execute when it didn't execute before, and it could now start to actually
trap at runtime.
This new assumption seems obvious, since we don't compare the effects of
things unless they are adjacent and with no control flow between them - otherwise,
why compare them? To be sure, I manually reviewed every single use of
EffectAnalyzer::canReorder/invalidate in the entire codebase.
I've also been fuzzing this for several days (hundreds of thousands of iterations),
and have not seen any problem.
This was motivated by seeing that #5744 should be able to do more work in TNH
mode, but it wasn't. New tests show the benefits there in OptimizeCasts as well
as in SimplifyLocals.
|
|
|
|
| |
More generally, the LUB computation that code relies on did not handle
bottom types properly.
|
|
|
|
|
|
|
|
|
|
|
|
| |
The final versions of the br_on_cast and br_on_cast_fail instructions have two
reference type annotations: one for the input type and one for the cast target
type. In the binary format, this is represented as a flags byte followed by two
encoded heap types. Upgrade all of the tests at once to use the new versions of
the instructions and drop support for the old instructions from the text parser.
Keep support in the binary parser to avoid breaking users, though. Drop some
binary tests of deprecated instruction encodings that would be more effort to
update than they're worth.
Re-land with fixes of #5734
|
|
|
|
| |
This method doesn't seem to be used anymore. It was added in #1771 and
its uses were removed in #2492.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This PR removes a check for
transfersControlFlow() && other.trap
which is already checked higher up in the code, here:
transfersControlFlow() && other.hasSideEffects()
In
binaryen/src/ir/effects.h , lines 223 to 224
That last code handles the first code because trapping is part
of hasSideEffects().
|
|
|
|
| |
The import information of Tags and Memories was not preserved.
|
|
|
|
|
|
|
| |
This reverts commit b7b1d0df29df14634d2c680d1d2c351b624b4fbb.
See comment at the end of #5734: It turns out that dropping the old opcodes causes
problems for current users, so let's revert this for now, and later we can figure out
how best to do the update.
|
|
|
|
|
|
|
|
|
|
| |
The final versions of the br_on_cast and br_on_cast_fail instructions have two
reference type annotations: one for the input type and one for the cast target
type. In the binary format, this is represented as a flags byte followed by two
encoded heap types. Since these instructions have been in flux for a while, do
not attempt to maintain backward compatibility with older versions of the
instructions. Instead, upgrade all of the tests at once to use the new versions
of the instructions. Drop some binary tests of deprecated instruction encodings
that would be more effort to update than they're worth.
|
|
|
|
|
|
| |
We depend on repeated calls to walk/visit accumulating effects, so this
was a bug; if we want to clear stuff then we create a new EffectAnalyzer.
Removing that fixes the attached testcase. Also added a unit test.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We used to have a wasm-merge tool but removed it for a lack of use cases. Recently
use cases have been showing up in the wasm GC space and elsewhere, as people are
using more diverse toolchains together, for example a project might build some C++
code alongside some wasm GC code. Merging those wasm files together can allow
for nice optimizations like inlining and better DCE etc., so it makes sense to have a
tool for merging.
Background:
* Removal: #1969
* Requests:
* wasm-merge - why it has been deleted #2174
* Compiling and linking wat files #2276
* wasm-link? #2767
This PR is a compete rewrite of wasm-merge, not a restoration of the original
codebase. The original code was quite messy (my fault), and also, since then
we've added multi-memory and multi-table which makes things a lot simpler.
The linking semantics are as described in the "wasm-link" issue #2767 : all we do
is merge normal wasm files together and connect imports and export. That is, we
have a graph of modules and their names, and each import to a module name can
be resolved to that module. Basically, like a JS bundler would do for JS, or, in other
words, we do the same operations as JS code would do to glue wasm modules
together at runtime, but at compile time. See the README update in this PR for a
concrete example.
There are no plans to do more than that simple bundling, so this should not
really overlap with wasm-ld's use cases.
This should be fairly fast as it works in linear time on the total input code. However,
it won't be as fast as wasm-ld, of course, as it does build Binaryen IR for each
module. An advantage to working on Binaryen IR is that we can easily do some
global DCE after merging, and further optimizations are possible later.
|
|
|
|
|
|
|
|
|
|
|
| |
See WebAssembly/stringref#46.
This format is already adopted by V8: https://chromium-review.googlesource.com/c/v8/v8/+/3892695.
The text format is left unchanged (see #5607 for a discussion on the subject).
I have also added support for string.encode_lossy_utf8 and
string.encode_lossy_utf8 array (by allowing the replace policy for
Binaryen's string.encode_wtf8 instruction).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds an option to ignore effects in the parent in
getDroppedChildrenAndAppend. With that, this becomes usable in more
places, like Directize, basically in situations where we know we can ignore
effects in the parent (since we've inferred they are not needed). This lets
us get rid of some boilerplate code in Directize.
Diff without whitespace is a lot smaller. A large other part of the diff is a
rename of curr => parent which I think it makes it more readable as then
parent/children is a clear contrast, and then the new parameter "ignore/
notice parent effects" is obviously connected to "parent".
The top comment in drop.cpp is removed as it just duplicated the top
comment in the header drop.h.
This is basically NFC but using drop.h does bring the advantage of
emitting less code, see the test changes, so it is noticeable in the IR.
This is a refactoring PR in preparation for a larger improvement to
Directize that will also benefit from this new drop capability.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This changes loops from having the effect "may trap (timeout)" to having
"may not return." The only noticeable difference is in TrapsNeverHappen mode,
which ignores the former but not the latter. So after this PR, in TNH mode we do
not optimize away an infinite loop that seems to have no other side effects. We
may also use this for other things in the future, like continuations/stack switching.
There are upsides and downsides to allowing the optimizer to remove infinite
loops (the C and C++ communities have had interesting discussions on that topic
over the years...) but it seems safer to not optimize them out for now, to let the
most code work properly. If a need comes up to optimize such code, we can look
at various options then (like a flag to ignore infinite loops).
See discussion in #5228
|
|
|
|
|
|
|
|
|
|
|
|
| |
wasm-delegations-fields (#5690)
This makes delegations-fields track Kinds. That is, rather than say a field is
just a Name, we can say it is a name of kind Function. This allows users to
track references to functions, tables, memories, etc., in a simple and generic
way, avoiding duplicated code which we have atm. (In particular this will help
wasm-merge in the future.)
This also uses that functionality in two small places to show the benefits
(see memory-utils.cpp and MemoryPacking.cpp).
|
|
|
|
|
|
|
|
|
|
|
| |
Data/Elem (#5692)
ArrayNewSeg => ArrayNewSegData, ArrayNewSegElem
ArrayInit => ArrayInitData, ArrayInitElem
Basically we remove the opcode and use the class type to differentiate them.
This adds some code but it makes the representation simpler and more compact in
memory, and it will help with #5690
|
| |
|
|
|
|
|
| |
And since the only type system left is the standard isorecursive type system,
remove `TypeSystem` and its associated APIs entirely. Delete a few tests that
only made sense under the isorecursive type system.
|
|
|
|
|
|
|
| |
Before this PR we hit the assert on the type not being basic.
We could also look into fixing the caller to skip bottom types, but as
bottom types trivially have no subtypes, it is more future-facing to
simply handle it.
|
|
|
|
|
|
|
|
|
|
|
| |
Casting (ref nofunc) to (ref func) seems like it can succeed based on the rule
of "if it's a subtype, it can cast ok." But the fuzzer found a corner case where that
leads to a validation error (see testcase).
Refactor the cast evaluation logic to handle uninhabitable refs directly, and
return Unreachable for them (since the cast cannot even be reached).
Also reorder the rule checks there to always check for a non-nullable cast
of a bottom type (which always fails).
|
|
|
|
|
|
|
| |
Without the hint, we always look for a valid name using name$0, $1, $2, etc.,
starting from 0, and in some cases that can lead to quadratic behavior.
Noticed on a testcase in the fuzzer that runs for over 24 seconds (I gave up at
that point) but takes only 2 seconds with this.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Technically we need to filter both before and after combining, that is,
if a location's contents will be filtered by F() then if the new contents
are x and old contents y then we need to end up with
F(F(x) U F(y)). That is, filtering before is necessary to ensure the union
of content does not end up unnecessarily large, and the filtering
after is necessary to ensure the final result is properly filtered to fit.
(If our representation were perfect then this would not be needed, but
it is not, as the union of two exact types can end up as a very large
cone, for example.)
For efficiency we have been filtering afterwards. But that is not enough
for packed fields, it turns out, where we must filter before. If we don't,
then if inputs 0 and 0x100 arrive to an i8 field then combining them
we get "unknown integer" (which is then filtered by 0xff, but it's too
late). By filtering before, the actual values are both 0 and we end up
with that as the only possible value.
It turns out that filtering before is enough for such fields, so do only
that.
|
|
|
| |
Like data.drop etc., it notices data segment identity.
|
|
|
|
|
| |
The same bug was present in both: We ignored packing, so writing a larger
value than fits in the field would lead to us propagating that original value.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously (ref.as_non_null (global.get ..)) would return the global with no changes,
and if the global was nullable then the type didn't match the output, which hit an
assertion (where GUFA checks that the contents match the declared type in the wasm).
To fix this, refine global types, that is, the type we track on GlobalInfo may be more
refined than the global itself. In the above example, if the global is nullable then
the GlobalInfo would point to that global but have a non-nullable type.
In fact the code was already prepared for this, and few changes were needed.
|
|
|
|
| |
(#5641)
|
| |
|
|
|
|
|
| |
These complement array.copy, which we already supported, as an initial complete
set of bulk array operations. Replace the WIP spec tests with the upstream spec
tests, lightly edited for compatibility with Binaryen.
|
|
|
|
|
|
|
|
|
|
| |
All top-level Module elements are identified and referred to by Name, but for
historical reasons element and data segments were referred to by index instead.
Fix this inconsistency by using Names to refer to segments from expressions that
use them. Also parse and print segment names like we do for other elements.
The C API is partially converted to use names instead of indices, but there are
still many functions that refer to data segments by index. Finishing the
conversion can be done in the future once it becomes necessary.
|
|
|
|
|
| |
That code should only propagate unreachability, and not refine. If it refines when
we call finalize() then other code around it might end up invalid (as it could be
partially refined; see testcase).
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both isValidInConstantExpression and isSingleConstantExpression must look
recursively at the internals of a RefAs that externalizes and internalizes, or else
we might do something like externalize a local.get, which is not constant.
getLiteral must handle externalize/internalize as well, and return a properly-
modified literal.
Without these fixes the testcase hits different internal assertions, and we either
fail to recognize something is constant or not, or think that it is but fail to
produce a literal for it.
|
|
|
|
|
|
|
|
|
|
| |
This fixes wasm-ctor-eval on evalling a GC data structure that contains a
field initialized with an externalized value.
Per the spec this is a constant instruction and I verified that V8 allows this.
Also add missing validation in wasm-ctor-eval of the output (which makes
debugging this kind of thing a little easier).
|
|
|
|
|
|
|
|
|
|
| |
Previously we treated global.get as a constant expression and only
additionally verified that the target globals were immutable in some cases. But
global.get of a mutable global is never a constant expression, and further,
only imported globals are available in constant expressions unless GC is
enabled.
Fix constant expression validation to only allow global.get of immutable,
imported globals, and fix all the invalid tests.
|
|
|
|
|
| |
The missing associated types will become necessary if we ever use these
iterators in a nontrivial manner. Make the parent reference into a pointer so
that the copy constructor and assignment operator are not implicitly deleted.
|
|
|
|
| |
Without this we hit an assertion on trying to write the binary, on a missing
heap type.
|