| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
| |
A pass that just operates on locals, for example, does not care about branches outside of
the function. That means that when we see a call, then even if EH is enabled we don't need
to create a new basic block right after it (unless the call is inside a try-catch - then it might
branch to the catch, of course).
This makes CFG-using passes 9% faster compared to before this PR. This plus #5827 offset
the slowdown from #5823 and overall give an improvement compared to before.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change adds a fuzzer which checks the following properties in
abstract interpretation static analyses.
- Transfer Function Monotonicity
- Lattice Element Reflexivity
- Lattice Element Transitivity
- Lattice Element Anti-Symmetry
This is done by randomly generating a module and using its functions as
transfer function inputs, along with randomly generated lattice elements
(states). Lattice element properties are fuzzed from the randomly
generated states also.
|
|
|
|
|
|
|
|
|
| |
See the example in the code and test for a situation that requires this for validation.
To fix validation we add a cast. That should practically always be removed by later
optimizations, and the fact it took the fuzzer this long to even find such a situation
also adds confidence that this won't be adding overhead (and in this situation, the
optimizer will definitely remove the cast).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this PR, if a call had no paths to a catch in the same function then we skipped
creating a new basic block right after it. As a result, we could have a call in the middle
of a basic block. If EH is enabled that means we might transfer control flow out of
the function from the middle of a block. But it is better to have the property that
any transfer of control flow - to another basic block, or outside of the function - can
only happen at the end of a basic block.
This causes some overhead, but a subsequent PR (#5838) will remove that as a
followup, and this PR adds a little code to pass the module and check if EH is enabled,
and avoid the overhead if not, which at least avoids regressing the non-EH case
until that followup lands.
|
|
|
| |
This change applies to the Reaching Definitions Analysis.
|
|
|
|
|
|
|
|
|
| |
Start functions can have locals, which we previously ignored as we just
concatenated the bodies together. This makes us copy the second start
and call that, keeping them separate (the optimizer can then inline, if that
makes sense).
Fixes #5835
|
|
|
|
|
| |
The LLVM suffix tree expects to be provided with a vector of 32-bit unsigned integers. This PR makes it easier to integrate our wasm program string with the suffix tree.
Because the range of possible values is reduced from 2^64 to 2^32, a signed integer was added to manage the next separator value and ensure we're using every possible negative number.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This is necessary for WasmGC producers using the C API, so that they can set the
heap type of functions. Otherwise the heap type is set structurally using params
and results in the old API.
The old API is kept for backwards compatibility and convenience (for the structural
case, which is all code before WasmGC basically).
Fixes #5826
|
| |
|
|
|
|
|
|
| |
ControlFlowWalker builds a stack of control flow items and allows finding
the target of a branch using it. But the only use CFGWalker made of that was
to map a block or a loop to its branches. We can just use the name of the block
or loop in the map, and avoid the extra overhead.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This change implements a reaching definitions analysis which is intended
to be equivalent to the information provided by LocalGraph, specifically
the Flower class of LocalGraph.
It also introduces a CRTP utility in visitor-transfer-function.h which
implements most commonly found visitor-type transfer function
functionalities. The MonotoneCFGAnalyzer is also modified to add a phase
to collect results after the analysis is solved from the final CFG
states.
|
| |
|
|
|
|
|
|
| |
Fixes emscripten-core/emscripten#17485
This allows emscripten to complie code with MEMORY64 + PTHREADS by
fixing using the proper pointer type in the MemoryPacking pass.
|
|
|
|
|
| |
This PR adds LLVM's suffix tree data structure to Binaryen. This suffix tree is implemented using Ukkonen's algorithm for linear-time suffix tree construction, and is intended for fast substring queries.
Note: All of the .h and .cpp files included are from LLVM. These files were copied directly instead of imported into our existing LLVM integration (in third_party/) to avoid bumping the commit hash and avoid the potential for complications with upstream changes.
|
| |
|
|
|
|
|
|
|
| |
When a module item is imported and directly reexported by an
intermediate module, we need to perform several name lookups and use its
name in the initial module rather than the intermediate name when fusing
imports and exports.
|
| |
|
|
|
|
|
| |
This PR is part of the effort to bring Outlining to Binaryen.
HashStringifyWalker is a subclass of StringifyWalker #5772, and used to encode a wasm module as a "string". Our "string" is a vector and each symbol is a uint64_t, providing us with a capacity of 2^64 symbols.
|
|
|
| |
Updates comments. Moves wasm-traversal.h to transfer function classes.
|
| |
|
|
|
|
|
|
| |
Before we always created if-elses. Now we also create an If with one arm some of
the time, when we can.
Also, sometimes make one if arm unreachable, if we have two arms.
|
|
|
|
|
|
|
| |
If we see (ref.cast $A) but we have inferred that a more refined type will
be present there at runtime $B then we can refine the cast to (ref.cast $B).
We could do the same even when a cast is not present, but that would
increase code size. This optimization keeps code size constant.
|
|
|
|
|
|
|
|
|
|
| |
This change creates a lattice, FinitePowersetLattice, to represent finite
powerset lattices constructed from sets containing members of arbitrary type
The bitvector finite powerset lattice class is renamed FiniteIntPowersetLattice.
The templated FinitePowersetLattice class contains a FiniteIntPowersetLattice
object, and over that provides functionality to map lattice element members
to bitvector indices. Methods are also provided by the lattice to add or
remove members of the given type from lattice members as an abstraction
over flipping bits in the bitvector.
|
|
|
|
|
|
|
| |
Analyzer (#5794)
This change makes the transfer function an object separate from the monotone analyzer. The transfer function class type is a generic template of the monotone analyzer, and an instance of the transfer function is instantiated and then passed into the monotone analyzer via its constructor. The API of the transfer function is enforced by a static assertion.
This change also introduces LivenessTransferFunction as a transfer function for liveness analysis as an example.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement support in the type system for final types, which are not allowed to
have any subtypes. Final types are syntactically different from similar
non-final types, so type canonicalization is made aware of finality. Similarly,
TypeMerging and TypeSSA are updated to work correctly in the presence of final
types as well.
Implement binary and text parsing and emitting of final types. Use the standard
text format to represent final types and interpret the non-standard
"struct_subtype" and friends as non-final. This allows a graceful upgrade path
for users currently using the non-standard text format, where they can update
their code to use final types correctly at the point when they update to use the
standard format. Once users have migrated to using the fully expanded standard
text format, we can update update Binaryen's parsers to interpret the MVP
shorthands as final types to match the spec without breaking those users.
To make it safe for V8 to independently start interpreting types declared
without `sub` as final, also reserve that shorthand encoding only for types that
have no strict subtypes.
|
|
|
|
|
|
|
|
| |
Previously we incorrectly used "strict" to mean the immediate subtypes of a
type, when in fact a strict subtype of a type is any subtype excluding the type
itself. Rename the incorrect `getStrictSubTypes` to `getImmediateSubTypes`,
rename the redundant `getAllStrictSubTypes` to `getStrictSubTypes`, and rename
the redundant `getAllSubTypes` to `getSubTypes`. Fixing the capitalization of
"SubType" to "Subtype" is left as future work.
|
|
|
|
|
|
| |
(#5799)
It can be useful to optimize a function body after its parameters are refined,
like we do for other parameter changes.
|
|
|
|
|
|
| |
Use the standard "(sub $super ...)" format instead of the non-standard
"XXX_supertype ... $super" format. In a follow-on PR implementing final types,
this will allow us to print and parse the standard text format for final types
right away with a smaller diff.
|
|
|
|
|
| |
This parallels the code in RefCast. Previously we only looked at the type reaching us, but
intermediate fallthrough values can let us optimize too. In particular, we were not
optimizing (ref.test (local.tee ..)) if the tee was to a less-refined type.
|
|
|
| |
StringifyWalker is a new Walker with UnifiedExpressionVisitor. This walker performs a shallow visit of control-flow (try, if, block, loop) expressions and their simple expression siblings before then visiting the children of each control-flow expression in postorder. As a result, this walker un-nests nested control flow structures, so the expression visit order does not correspond to a normal postorder traversal of the function.
|
| |
|
|
|
|
| |
(#5791)
|
|
|
| |
This change introduces a finite-height powerset lattice FinitePowersetLattice where each element's height is determined when constructed in runtime. This is implemented using a vector of `bools. This change also modifies BlockState and MonotoneCFGAnalyzer to be templated on a generic lattice type instead of using a hardcoded lattice. It additionally fixes an iterator bug in MonotoneCFGAnalyzer::fromCFG which assigned a temporary iterator object as predecessor and successor pointers to BlockStates instead of pointers to actual BlockState objects.
|
|
|
|
|
|
|
|
| |
Previously we limited printing in a single Literals. But we can have
infinitely recursive GC literals, or just huge graphs even without
infinite recursion where no single Literals is that big (but we still
get exponential blowup). This PR adds a general limit on how much
we print once we start to print a Literal or Literals.
|
|
|
|
|
|
|
|
|
|
|
| |
This is a followup to #5333 . That fixed the selection of which passes to run, but
forgot to also fix the global state of the current optimize/shrink levels. This PR
fixes that. As a result, running -O3 -Oz will now work as expected: the first -O3
will run the right passes (as #5333 fixed) and while running them, the global
optimize/shrinkLevels will be -O3 (and not -Oz), which this PR fixes.
A specific result of this is that -O3 -Oz used to inline less, since the invocation
of inlining during -O3 thought we were optimizing for size. The new test verifies
that we do fully inline in the first -O3 now.
|
|
|
| |
This introduces a limited monotone flow-sensitive liveness analysis on local indices as an initial proof of concept for the creation of a monotone flow-sensitive static analysis framework. Tests are included in test/gtest/cfg.cpp.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
Without this, a massive GC array for example can end up being printed as a huge
amount of output, which is a problem in the fuzzer. I noticed this because a testcase
was taking minutes to run.
Perhaps we should add an option (BINARYEN_PRINT_FULL=1?) to disable this limit,
but let's see if we ever need that.
|
|
|
| |
Subtypes are allowed as well, not just exact matches, in the pop value's type.
|
|
|
|
|
|
|
| |
Rather than wrap a `TypeList`, make `Tuple` an alias of `TypeList`. This means
removing `Tuple::toString`, but that had no callers and was of limited use for
debugging anyway. In return, the use of tuples becomes much less verbose.
In the future, it may make sense to remove one of `Tuple` and `TypeList`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass strips all EH stuff, including EH instructions and tags, from
the input module and disables the EH feature from the features section.
1. This removes `catch` and `catch_all` blocks from the code. So
```wast
(try
(do
(some code)
)
(catch
...
)
)
```
becomes just `(some code)`. Note that all `rethrow`s will be removed
with `catch`es. Note that all `rethrow`s will be removed with `catch`es.
2. This converts 'throw (...)` into `unreachable`. Note that `rethrows
3. This removes all tags from the module, which are unused anyway after
1 and 2.
4. This removes exception handling feature from the features section.
You can use the pass with
```console
$ wasm-opt --enable-exception-handling --strip-eh INPUT -o OUTPUT
```
This is not an optimization pass, so it is not run unless you specify
the pass explicitly.
This is in effect similar to Clang's `-fignore-exceptions`, in which you
can throw but it will result in a crash and we compile away all landing
pads. This can be used for people who don't (or can't) use
`-fignore-exceptions` in their build settings or who want to compile
away `catch` blocks later.
Closes emscripten-core/emscripten#19585.
|
|
|
|
|
|
|
| |
Rewrite the type canonicalization algorithm to fully canonicalize a single rec
group at a time rather than canonicalizing multiple rec groups at once in
multiple stages. The previous code was useful when it had to be shared with
equirecursive and nominal canonicalization, but was much more complicated than
necessary for just isorecursive canonicalization, which is all we support today.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#5764)
EffectAnalyzer::canReorder/invalidate now assume that the things from whom we
generated the effects both execute (or, rather, that if the first of them doesn't
transfer control flow then they execute). If they both execute then we can do more work
in TrapsNeverHappen mode, since we can then reorder this for example:
(global.set ..)
(i32.load ..)
The load may trap, but in TNH mode we assume it won't. So we can reorder those
two. However, if they did not both execute then we could be in this situation:
(global.set ..)
(br_if ..)
(i32.load)
Reordering the load and the set here would be invalid, because we could make the
load execute when it didn't execute before, and it could now start to actually
trap at runtime.
This new assumption seems obvious, since we don't compare the effects of
things unless they are adjacent and with no control flow between them - otherwise,
why compare them? To be sure, I manually reviewed every single use of
EffectAnalyzer::canReorder/invalidate in the entire codebase.
I've also been fuzzing this for several days (hundreds of thousands of iterations),
and have not seen any problem.
This was motivated by seeing that #5744 should be able to do more work in TNH
mode, but it wasn't. New tests show the benefits there in OptimizeCasts as well
as in SimplifyLocals.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we support only isorecursive typing, which canonicalizes heap types by
the rec group rather than individually, we no longer need a canonicalizing store
for heap types. Simplify the `Store` implementation, which was previously
generic over both `HeapType`s and `Type`s, to instead work only for `Type`s.
Replace `globalHeapTypeStore` with a simple vector to keep canonicalized
`HeapTypeInfo`s alive. Also simplify global canonicalization to not replace heap
type uses, since that is already done separately as part of isorecursive rec
group canonicalization.
This simplification does require adding new information to
`CanonicalizationState`, but further work will simplify that away as well.
|
|
|
|
| |
This is not used. We use the parent's `visit` method instead:
https://github.com/WebAssembly/binaryen/blob/585af93ec6a22feb8954bc118f0bff997d1fc165/src/wasm-stack.h#L233-L262
|
|
|
|
|
|
| |
We have `WasmBinaryBuilder` that read binary into Binaryen IR and
`WasmBinaryWriter` that writes Binaryen IR to binary. To me
`WasmBinaryBuilder` sounds similar to `WasmBinaryWriter`, which builds
binary. How about renaming it to `WasmBinaryReader`?
|