| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Before we always created if-elses. Now we also create an If with one arm some of
the time, when we can.
Also, sometimes make one if arm unreachable, if we have two arms.
|
|
|
|
|
|
|
| |
If we see (ref.cast $A) but we have inferred that a more refined type will
be present there at runtime $B then we can refine the cast to (ref.cast $B).
We could do the same even when a cast is not present, but that would
increase code size. This optimization keeps code size constant.
|
|
|
|
|
|
|
|
|
|
| |
This change creates a lattice, FinitePowersetLattice, to represent finite
powerset lattices constructed from sets containing members of arbitrary type
The bitvector finite powerset lattice class is renamed FiniteIntPowersetLattice.
The templated FinitePowersetLattice class contains a FiniteIntPowersetLattice
object, and over that provides functionality to map lattice element members
to bitvector indices. Methods are also provided by the lattice to add or
remove members of the given type from lattice members as an abstraction
over flipping bits in the bitvector.
|
|
|
|
|
|
|
| |
Analyzer (#5794)
This change makes the transfer function an object separate from the monotone analyzer. The transfer function class type is a generic template of the monotone analyzer, and an instance of the transfer function is instantiated and then passed into the monotone analyzer via its constructor. The API of the transfer function is enforced by a static assertion.
This change also introduces LivenessTransferFunction as a transfer function for liveness analysis as an example.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement support in the type system for final types, which are not allowed to
have any subtypes. Final types are syntactically different from similar
non-final types, so type canonicalization is made aware of finality. Similarly,
TypeMerging and TypeSSA are updated to work correctly in the presence of final
types as well.
Implement binary and text parsing and emitting of final types. Use the standard
text format to represent final types and interpret the non-standard
"struct_subtype" and friends as non-final. This allows a graceful upgrade path
for users currently using the non-standard text format, where they can update
their code to use final types correctly at the point when they update to use the
standard format. Once users have migrated to using the fully expanded standard
text format, we can update update Binaryen's parsers to interpret the MVP
shorthands as final types to match the spec without breaking those users.
To make it safe for V8 to independently start interpreting types declared
without `sub` as final, also reserve that shorthand encoding only for types that
have no strict subtypes.
|
|
|
|
|
|
|
|
| |
Previously we incorrectly used "strict" to mean the immediate subtypes of a
type, when in fact a strict subtype of a type is any subtype excluding the type
itself. Rename the incorrect `getStrictSubTypes` to `getImmediateSubTypes`,
rename the redundant `getAllStrictSubTypes` to `getStrictSubTypes`, and rename
the redundant `getAllSubTypes` to `getSubTypes`. Fixing the capitalization of
"SubType" to "Subtype" is left as future work.
|
|
|
|
|
|
| |
(#5799)
It can be useful to optimize a function body after its parameters are refined,
like we do for other parameter changes.
|
|
|
|
| |
Delete old, unused "NOMINAL" check lines and replace the sole remaining check
prefix, "HYBRID", with the standard "CHECK".
|
|
|
|
|
|
| |
Use the standard "(sub $super ...)" format instead of the non-standard
"XXX_supertype ... $super" format. In a follow-on PR implementing final types,
this will allow us to print and parse the standard text format for final types
right away with a smaller diff.
|
|
|
|
| |
This already worked (thanks to LocalGraph integration), but add an explicit test to
verify that just to be sure.
|
|
|
|
|
| |
This parallels the code in RefCast. Previously we only looked at the type reaching us, but
intermediate fallthrough values can let us optimize too. In particular, we were not
optimizing (ref.test (local.tee ..)) if the tee was to a less-refined type.
|
|
|
| |
StringifyWalker is a new Walker with UnifiedExpressionVisitor. This walker performs a shallow visit of control-flow (try, if, block, loop) expressions and their simple expression siblings before then visiting the children of each control-flow expression in postorder. As a result, this walker un-nests nested control flow structures, so the expression visit order does not correspond to a normal postorder traversal of the function.
|
|
|
|
| |
(#5791)
|
|
|
| |
This change introduces a finite-height powerset lattice FinitePowersetLattice where each element's height is determined when constructed in runtime. This is implemented using a vector of `bools. This change also modifies BlockState and MonotoneCFGAnalyzer to be templated on a generic lattice type instead of using a hardcoded lattice. It additionally fixes an iterator bug in MonotoneCFGAnalyzer::fromCFG which assigned a temporary iterator object as predecessor and successor pointers to BlockStates instead of pointers to actual BlockState objects.
|
|
|
|
|
|
|
|
| |
Previously we limited printing in a single Literals. But we can have
infinitely recursive GC literals, or just huge graphs even without
infinite recursion where no single Literals is that big (but we still
get exponential blowup). This PR adds a general limit on how much
we print once we start to print a Literal or Literals.
|
|
|
|
|
|
|
|
|
|
|
| |
This is a followup to #5333 . That fixed the selection of which passes to run, but
forgot to also fix the global state of the current optimize/shrink levels. This PR
fixes that. As a result, running -O3 -Oz will now work as expected: the first -O3
will run the right passes (as #5333 fixed) and while running them, the global
optimize/shrinkLevels will be -O3 (and not -Oz), which this PR fixes.
A specific result of this is that -O3 -Oz used to inline less, since the invocation
of inlining during -O3 thought we were optimizing for size. The new test verifies
that we do fully inline in the first -O3 now.
|
|
|
| |
This introduces a limited monotone flow-sensitive liveness analysis on local indices as an initial proof of concept for the creation of a monotone flow-sensitive static analysis framework. Tests are included in test/gtest/cfg.cpp.
|
| |
|
|
|
|
|
| |
Such tests all have to handle disabling and re-enabling color output for printed
wast and have to handle parsing input wast, so add a test fixture that makes
these tasks easier.
|
| |
|
|
|
| |
Subtypes are allowed as well, not just exact matches, in the pop value's type.
|
|
|
|
|
|
|
| |
Rather than wrap a `TypeList`, make `Tuple` an alias of `TypeList`. This means
removing `Tuple::toString`, but that had no callers and was of limited use for
debugging anyway. In return, the use of tuples becomes much less verbose.
In the future, it may make sense to remove one of `Tuple` and `TypeList`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass strips all EH stuff, including EH instructions and tags, from
the input module and disables the EH feature from the features section.
1. This removes `catch` and `catch_all` blocks from the code. So
```wast
(try
(do
(some code)
)
(catch
...
)
)
```
becomes just `(some code)`. Note that all `rethrow`s will be removed
with `catch`es. Note that all `rethrow`s will be removed with `catch`es.
2. This converts 'throw (...)` into `unreachable`. Note that `rethrows
3. This removes all tags from the module, which are unused anyway after
1 and 2.
4. This removes exception handling feature from the features section.
You can use the pass with
```console
$ wasm-opt --enable-exception-handling --strip-eh INPUT -o OUTPUT
```
This is not an optimization pass, so it is not run unless you specify
the pass explicitly.
This is in effect similar to Clang's `-fignore-exceptions`, in which you
can throw but it will result in a crash and we compile away all landing
pads. This can be used for people who don't (or can't) use
`-fignore-exceptions` in their build settings or who want to compile
away `catch` blocks later.
Closes emscripten-core/emscripten#19585.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#5764)
EffectAnalyzer::canReorder/invalidate now assume that the things from whom we
generated the effects both execute (or, rather, that if the first of them doesn't
transfer control flow then they execute). If they both execute then we can do more work
in TrapsNeverHappen mode, since we can then reorder this for example:
(global.set ..)
(i32.load ..)
The load may trap, but in TNH mode we assume it won't. So we can reorder those
two. However, if they did not both execute then we could be in this situation:
(global.set ..)
(br_if ..)
(i32.load)
Reordering the load and the set here would be invalid, because we could make the
load execute when it didn't execute before, and it could now start to actually
trap at runtime.
This new assumption seems obvious, since we don't compare the effects of
things unless they are adjacent and with no control flow between them - otherwise,
why compare them? To be sure, I manually reviewed every single use of
EffectAnalyzer::canReorder/invalidate in the entire codebase.
I've also been fuzzing this for several days (hundreds of thousands of iterations),
and have not seen any problem.
This was motivated by seeing that #5744 should be able to do more work in TNH
mode, but it wasn't. New tests show the benefits there in OptimizeCasts as well
as in SimplifyLocals.
|
|
|
|
| |
More generally, the LUB computation that code relies on did not handle
bottom types properly.
|
|
|
|
| |
The logic ignored copied values, which was fine for struct.get operations but
not for struct.new.
|
|
|
|
|
|
|
|
|
|
|
|
| |
The final versions of the br_on_cast and br_on_cast_fail instructions have two
reference type annotations: one for the input type and one for the cast target
type. In the binary format, this is represented as a flags byte followed by two
encoded heap types. Upgrade all of the tests at once to use the new versions of
the instructions and drop support for the old instructions from the text parser.
Keep support in the binary parser to avoid breaking users, though. Drop some
binary tests of deprecated instruction encodings that would be more effort to
update than they're worth.
Re-land with fixes of #5734
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we have
(struct.get $A
(struct.get $B
then if both types end up refined we may have a problem. If the inner one is
refined to emit nullref then the outer one no longer knows what type it is,
since it depends on the type of the ref child for that in our IR. We can't just
skip updating it, as the outside may depend on its new refined type to
validate. To avoid errors here, just make this code that is effectively
unreachable also actually unreachable.
|
|
|
| |
Related to #5737 which did something similar for other types.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#5744)
In the OptimizeCasts pass, it is useful to move more refined casts as early as
possible without causing side-effects. This will allow such casts to potentially
trap earlier, and will allow the OptimizeCasts pass to use more refined casts
earlier. This change allows a more refined cast to be duplicated at an earlier
local.get expression. The later instance of the cast will then be eliminated in
a later optimization pass.
For example, if we have the following instructions:
(drop
(local.get $x)
)
(drop
(ref.cast $A
(local.get $x)
)
(drop
(ref.cast $B
(local.get $x)
)
)
Where $B is a sublcass of $A, we can convert this to:
(drop
(ref.cast $B
(local.get $x)
)
)
(drop
(ref.cast $A
(local.get $x)
)
(drop
(ref.cast $B
(local.get $x)
)
)
Concretely we will save the first cast to a local and use it in the other
local.gets.
|
|
|
|
|
|
|
|
|
| |
We previously had logic to emit GC types used in the IR as their corresponding
top types when GC was not enabled (so e.g. nullfuncref would be emitted as
funcref), but the logic was not robust enough and non-null function references
were not properly emitted as funcref.
Refactor the relevant code to be more robust and future-proof, and add a test
demonstrating that the lowering works as intended.
|
|
|
|
|
|
|
| |
No nop instruction is necessary in wasm, so in StackIR we can simply
remove them all.
Fixes #5745
|
|
|
|
| |
The import information of Tags and Memories was not preserved.
|
|
|
|
|
|
|
| |
This reverts commit b7b1d0df29df14634d2c680d1d2c351b624b4fbb.
See comment at the end of #5734: It turns out that dropping the old opcodes causes
problems for current users, so let's revert this for now, and later we can figure out
how best to do the update.
|
| |
|
|
|
| |
Fixes #5720
|
|
|
|
|
|
|
|
|
|
| |
The final versions of the br_on_cast and br_on_cast_fail instructions have two
reference type annotations: one for the input type and one for the cast target
type. In the binary format, this is represented as a flags byte followed by two
encoded heap types. Since these instructions have been in flux for a while, do
not attempt to maintain backward compatibility with older versions of the
instructions. Instead, upgrade all of the tests at once to use the new versions
of the instructions. Drop some binary tests of deprecated instruction encodings
that would be more effort to update than they're worth.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds two rules to vacuum in TNH mode:
if (..) trap() => if (..) {}
{ stuff, trap() } => {}
That is, we assume traps never happen so an if will not branch to one, and
code right before a trap can be assumed to not execute. Together, we should
be removing practically all possible code in TNH mode (though we could also
add support for br_if etc.).
|
|
|
|
| |
The function type should be printed there just like for non-imported
functions.
|
|
|
|
|
|
| |
We depend on repeated calls to walk/visit accumulating effects, so this
was a bug; if we want to clear stuff then we create a new EffectAnalyzer.
Removing that fixes the attached testcase. Also added a unit test.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We used to have a wasm-merge tool but removed it for a lack of use cases. Recently
use cases have been showing up in the wasm GC space and elsewhere, as people are
using more diverse toolchains together, for example a project might build some C++
code alongside some wasm GC code. Merging those wasm files together can allow
for nice optimizations like inlining and better DCE etc., so it makes sense to have a
tool for merging.
Background:
* Removal: #1969
* Requests:
* wasm-merge - why it has been deleted #2174
* Compiling and linking wat files #2276
* wasm-link? #2767
This PR is a compete rewrite of wasm-merge, not a restoration of the original
codebase. The original code was quite messy (my fault), and also, since then
we've added multi-memory and multi-table which makes things a lot simpler.
The linking semantics are as described in the "wasm-link" issue #2767 : all we do
is merge normal wasm files together and connect imports and export. That is, we
have a graph of modules and their names, and each import to a module name can
be resolved to that module. Basically, like a JS bundler would do for JS, or, in other
words, we do the same operations as JS code would do to glue wasm modules
together at runtime, but at compile time. See the README update in this PR for a
concrete example.
There are no plans to do more than that simple bundling, so this should not
really overlap with wasm-ld's use cases.
This should be fairly fast as it works in linear time on the total input code. However,
it won't be as fast as wasm-ld, of course, as it does build Binaryen IR for each
module. An advantage to working on Binaryen IR is that we can easily do some
global DCE after merging, and further optimizations are possible later.
|
|
|
|
|
|
|
|
|
|
|
| |
See WebAssembly/stringref#46.
This format is already adopted by V8: https://chromium-review.googlesource.com/c/v8/v8/+/3892695.
The text format is left unchanged (see #5607 for a discussion on the subject).
I have also added support for string.encode_lossy_utf8 and
string.encode_lossy_utf8 array (by allowing the replace policy for
Binaryen's string.encode_wtf8 instruction).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new "analysis" source directory that will contain the source for a new
static program analysis framework. To start the framework, add a CFG utility
that provides convenient iterators for iterating through the basic blocks of the
CFG as well as the predecessors, successors, and contents of each block.
The new CFGs are constructed using the existing CFGWalker, but they are
different in that the new utility is meant to provide a usable representation of
a CFG whereas CFGWalker is meant to allow collecting arbitrary information about
each basic block in a CFG.
For testing and debugging purposes, add `print` methods to CFGs and basic
blocks. This requires exposing the ability to print expression contents
excluding children, which was something we previously did only for StackIR.
Also add a new gtest file with a test for constructing and printing a CFG. The
test reveals some strange properties of the current CFG construction, including
empty blocks and strange placement of `loop` instructions, but fixing these
problems is left as future work.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds an option to ignore effects in the parent in
getDroppedChildrenAndAppend. With that, this becomes usable in more
places, like Directize, basically in situations where we know we can ignore
effects in the parent (since we've inferred they are not needed). This lets
us get rid of some boilerplate code in Directize.
Diff without whitespace is a lot smaller. A large other part of the diff is a
rename of curr => parent which I think it makes it more readable as then
parent/children is a clear contrast, and then the new parameter "ignore/
notice parent effects" is obviously connected to "parent".
The top comment in drop.cpp is removed as it just duplicated the top
comment in the header drop.h.
This is basically NFC but using drop.h does bring the advantage of
emitting less code, see the test changes, so it is noticeable in the IR.
This is a refactoring PR in preparation for a larger improvement to
Directize that will also benefit from this new drop capability.
|
|
|
|
|
|
|
|
|
| |
#4191 meant to do that, I think, but only did so for "pattern B". This does it
for all patterns, and adds assertions.
In theory this could regress code that benefits from partial inlining of
"pattern A" (since this PR stops doing it by default), but I did not see a significant
difference on any benchmarks, and it is easy to re-enable that behavior by
doing --partial-inlining-ifs=1.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This changes loops from having the effect "may trap (timeout)" to having
"may not return." The only noticeable difference is in TrapsNeverHappen mode,
which ignores the former but not the latter. So after this PR, in TNH mode we do
not optimize away an infinite loop that seems to have no other side effects. We
may also use this for other things in the future, like continuations/stack switching.
There are upsides and downsides to allowing the optimizer to remove infinite
loops (the C and C++ communities have had interesting discussions on that topic
over the years...) but it seems safer to not optimize them out for now, to let the
most code work properly. If a need comes up to optimize such code, we can look
at various options then (like a flag to ignore infinite loops).
See discussion in #5228
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TypeUpdater::remove must be called after removing a thing from the
tree. If not, then we can get confused by something like this:
(block $b
(br $b)
)
If we first call TypeUpdater::remove then we see that the block's only
br is going away, so it becomes unreachable. But when we then remove
the br then the block should have type none. Removing the br first
from the IR, and then calling TypeUpdater::remove, is the safe way to do it.
However, changing that order in Vacuum is not trivial. After looking into
this, I realized that it is just simpler to remove TypeUpdater entirely.
Instead, we can ReFinalize at the end unconditionally. This has the downside
that we do not propagate type updates "as we go", but that should be very
rare.
Another downside is that TypeUpdater tracks the number of brs, which
can help remove code like in the one test that regresses here (see comment
there). But I'm not sure that removal was valid - Vacuum should not really be
doing it, and it looks like its related to this bug actually. Instead, we have a
dedicated pass for removing unused brs - RemoveUnusedBrs - so leave
things for it.
This PR's benefit, aside from now handling the fuzz testcase, is that it makes
the code simpler and faster. I see a 10-25% speedup on the Vacuum pass on
various binaries I tested on. (Vacuum is one of our faster passes anyhow,
though, so the runtime of -O1 is not much improved.)
Another minor benefit might be that running ReFinalize more often can
propagate type info more quickly, thanks to #5704 etc. But that is probably
very minor.
|
|
|
| |
Do the port automatically using the port_passes_tests_to_lit.py script.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A cycle of data is something we can't just naively emit as wasm globals. If
at runtime we end up, for example, with an object A that refers to itself,
then we can't just emit
(global $A
(struct.new $A
(global.get $A)))
The struct.get is of this very global, and such a self-reference is invalid. So
we need to break such cycles as we emit them. The simple idea used here
is to find paths in the cycle that are nullable and mutable, and replace the
initial value with a null that is fixed up later in the start function:
(global $A
(struct.new $A
(ref.null $A)))
(func $start
(struct.set
(global.get $A)
(global.get $A)))
)
This is not optimal in terms of breaking cycles, but it is fast (linear time)
and simple, and does well in practice on j2wasm (where cycles in fact
occur).
|