| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
In that mode a walk on an entire module will reuse the same CFGWalker
instance, so we must manually clear some fields, and we forgot some
before.
|
|
|
|
|
|
|
|
| |
This adds support `CFGWalker` for the new EH instructions (`try_table`
and `throw_ref`). `CFGWalker` is used by many different passes, but in
the same vein as #3494, this adds tests for `RedundantSetElimination`
pass. `rse-eh.wast` file is created from translated and simplified
version of `rse-eh-old.wast`, but many tests were removed because we
don't have special `catch` block or `delegate` anymore.
|
|
|
|
|
|
|
| |
This renames `***Stack` variables in `CFGWalker` to be consistent, as a
preparation for adding another stack for the new EH. Currently `ifStack`
and `loopStack` contains `BasicBlock*`s but `tryStack` contains
`Expression*`, and `Try` expressions are rather contained in
`unwindExprStack`, which to me is confusing.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously CFGWalker designated a particular block as the "exit" block, but it
was just the block that happened to appear at the end of the function that
returned values by implicitly flowing them out. That exit block was not tied in
any way to other blocks that might end in returns, so analyses that needed to
perform some action at the end of the function would have had to perform that
action at the end of the designated exit block but also separately at any return
instruction.
Update CFGWalker to make the exit block a synthetic empty block that is a
successor of all other blocks tthat implicitly or explicitly return from the
function in case there are multiple such blocks, or to make the exit block the
single returning block if there is only one. This means that analyses will only
perform their end-of-function actions at the end of the exit block rather than
additionally at every return instruction.
|
|
|
|
|
|
|
|
|
| |
A pass that just operates on locals, for example, does not care about branches outside of
the function. That means that when we see a call, then even if EH is enabled we don't need
to create a new basic block right after it (unless the call is inside a try-catch - then it might
branch to the catch, of course).
This makes CFG-using passes 9% faster compared to before this PR. This plus #5827 offset
the slowdown from #5823 and overall give an improvement compared to before.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this PR, if a call had no paths to a catch in the same function then we skipped
creating a new basic block right after it. As a result, we could have a call in the middle
of a basic block. If EH is enabled that means we might transfer control flow out of
the function from the middle of a block. But it is better to have the property that
any transfer of control flow - to another basic block, or outside of the function - can
only happen at the end of a basic block.
This causes some overhead, but a subsequent PR (#5838) will remove that as a
followup, and this PR adds a little code to pass the module and check if EH is enabled,
and avoid the overhead if not, which at least avoids regressing the non-EH case
until that followup lands.
|
|
|
|
|
|
| |
ControlFlowWalker builds a stack of control flow items and allows finding
the target of a branch using it. But the only use CFGWalker made of that was
to map a block or a loop to its branches. We can just use the name of the block
or loop in the map, and avoid the extra overhead.
|
|
|
| |
When we remove a tee there, we must not change the type at all.
|
|
|
|
| |
This code predates our adoption of C++14 and can now be removed in favor of
`std::make_unique`, which should be more efficient.
|
| |
|
|
|
|
| |
This is more modern and (IMHO) easier to read than that old C typedef
syntax.
|
| |
|
|
|
|
|
|
|
|
| |
We were missing CallRef in the CFG traversal code in a place where we
note possible exceptions. As a result we thought CallRef cannot throw, and
were missing some control flow edges.
To actually detect the problem, we need to validate non-nullable locals
properly, which we were not doing. This adds that as well.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in a function. (#4567)
* Lift the restriction in liveness-traversal.h that supported max 65535 locals in a function.
* Lint
* Fix typo
* Fix static
* Lint
* Lint
* Lint
* Add needed canRun function
* lint
* Use either a sparse or a dense matrix for tracking liveness copies, depending on the locals count.
* Lint
* Fix lint
* Lint
* Implement sparse_square_matrix class and use that as a backing.
* Lint
* Lint
* Lint #includes
* Lint
* Lint includes
* Remove unnecessary code
* Fix canonical accesses to copies matrix
* Lint
* Add missing variable update
* Remove canRun() function
* Address review
* Update expected test results
* Update test name
* Add asserts to sparse_square_matrix set and get functions that they are not out of bound.
* Lint includes
* Update test expectation
* Use .clear() + .resize() to reset totalCopies vector
|
|
|
|
|
|
|
|
| |
CoalesceLocals (#4574)
Normally we just replace unreachable local.gets with a constant (0, or null), but if
the local is non-nullable we can't do that.
Fixes #4573
|
| |
|
|
|
|
| |
This adds support for `try`-`delegate` to `CFGWalker`. This also adds a
single test for `catch`-less `try`.
|
|
|
|
|
|
|
| |
The current code the innermost (`i`th) case specially first and handles
`i-1`th `try` in each loop iteration. This puts the `i`th case in the
loop and each iteration handles `i`th `try`, which is simpler. Then we
don't need to check `throwingInstsStack.empty()` in the beginning
because the `for` loop wouldn't be entered if it's empty anyway.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some functions run only once with this pattern:
function foo() {
if (foo$ran) return;
foo$ran = 1;
...
}
If that global is not ever set to 0, then the function's payload (after the
initial if and return) will never execute more than once. That means we
can optimize away dominated calls:
foo();
foo(); // we can remove this
To do this, we find which globals are "once", which means they can
fit in that pattern, as they are never set to 0. If a function looks like the
above pattern, and it's global is "once", then the function is "once" as
well, and we can perform this optimization.
This removes over 8% of static calls in j2cl.
|
|
|
|
|
|
|
|
| |
Add a class to compute the dominator tree for a CFG consisting
of a list of basic blocks assumed to be in reverse postorder.
This will be useful once cfg-walker emits blocks in reverse-postorder
(which it almost does, another PR will handle that). Then we can write
optimization passes that use block dominance.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#4101)
Reverse postorder basically just means that a block's immediate dominator
must precede it in the list. That is useful because then algorithms that look at
dominance can simply process the list in order, and the immediate dominator
will have already been seen before each block.
Another way to put it is that in reverse postorder a block's dominators
appear before it in the list, as do all non-loop predecessors. At least in
reducible graphs that is the case, and our IR, like wasm, is reducible.
It is pretty natural to emit reverse postorder on wasm given the
reducibility: simply process the wasm in postorder, and make sure to
create new basic blocks only when reaching their code - that is, do not
create them "ahead of time". We were doing that in a single place, for
try-catch, so this PR refactors that. Specifically it makes us create the
basic blocks for catches right when we reach them, and not earlier.
So the data structure that used to store them becomes a list of things
to connect to them.
This is useful for #4100 , see more details there.
|
|
|
|
|
|
|
|
|
|
|
| |
We recently decided to change 'event' to 'tag', and to 'event section'
to 'tag section', out of the rationale that the section contains a
generalized tag that references a type, which may be used for something
other than exceptions, and the name 'event' can be confusing in the web
context.
See
- https://github.com/WebAssembly/exception-handling/issues/159#issuecomment-857910130
- https://github.com/WebAssembly/exception-handling/pull/161
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Without adding logic there, it simply ignored the branch, which could
lead to bad optimizations (thinking code is unreachable when it was).
There isn't a trivial way to add a static error to force us to add new
classes to CFGWalker. But this PR generalizes the code there to
handle all branches and all unreachable instructions in a generic
way. The only thing we'll need to remember to do in the future is to
add control flow structures. (And normally the fuzzer should quickly
find such bugs, but we don't have full fuzzing enabled for GC yet.)
Fixes #3907
|
|
|
|
|
|
|
|
|
|
| |
`Walker::TaskFunc` has changed from a function pointer to
`std::function` in #3494, mainly to make the EH support for `CFGWalker`
easier. We didn't notice much performance difference then, but it was
recently reported that it creased binaryen.js code size and performance.
This changes `Walker::TaskFunc` back to a function pointer and does a
little more work to manage catch index in `CFGWalker` side.
Hopefully fixes #3857.
|
| |
|
|
|
|
|
|
| |
Move the InsertOrderedSet and InsertOrderedMap implementations out of
Relooper.h and into a new insert_ordered.h so they can be used more widely. Only
changes the implementation code to use unordered_maps and
`WASM_UNREACHABLE` instead of `abort`.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#3594)
This was an unfortunate case of the order of execution of call
arguments. link(self->currBasicBlock, self->startBasicBlock()) would
run the call first, which sets currBasicBlock, so we'd end up with
the same value for both parameters.
Without this fix, the testcase would drop the result of the call,
as it thought it had no uses.
Also improve debug logging here a tiny bit.
Found by emscripten-core/emscripten#13485
|
|
|
| |
This removes `exnref` type and `br_on_exn` instruction.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This updates CFG traversal to match the new spec. Previously there was
only a single `catch` block that caught all exceptions, so all throwing
instructions needed to have a link to its innermost catch BB. But now we
can have multiple catches per try, requiring all throwing instrutions to
have an edge to all of those innermost catch BBs. Furthermore, if there
are only `catch`es and not a `catch_all` in a try, throwing instructions
can further unwind to outer catches until they find a `catch_all`.
`unwindCatchStack` and `unwindExprStack` are necessary to track and make
correct links between throwing instructions and their unwind destination
BBs.
`processCatchStack` is used to remember the catch BBs currently being
processed, so that after processing all of them, we can make a link from
each of those catch's last block to the continuation block after the
try-catch.
RSE test cases are updated because they use the CFG traversal. The tests
there mainly test that if all possible CFG edge to a `local.set` sets
the same value to a local, the `local.set` is redundant and thus can be
removed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This updates `try`-`catch`-`catch_all` and `rethrow` instructions to
match the new spec. `delegate` is not included. Now `Try` contains not a
single `catchBody` expression but a vector of catch
bodies and events.
This updates most existing routines, optimizations, and tests modulo the
interpreter and the CFG traversal. Because the interpreter has not been
updated yet, the EH spec test is temporarily disabled in check.py. Also,
because the CFG traversal for EH is not yet updated, several EH tests in
`rse_all-features.wast`, which uses CFG traversal, are temporarily
commented out.
Also added a few more tests in existing EH test functions in
test/passes. In the previous spec, `catch` was catching all exceptions
so it was assumed that anything `try` body throws is caught by its
`catch`, but now we can assume the same only if there is a `catch_all`.
Newly added tests test cases when there is a `catch_all` and cases there
are only `catch`es separately.
|
|
|
| |
Fixes the `Relooper` leaking `Branch`es in `Optimizer::SkipEmptyBlocks`, by refactoring the API so a `std::unique_ptr` is ensured for each `Block`, `Branch` and `Shape` upon adding to the relooper.
|
|
|
|
|
|
|
|
|
| |
* Unifies internal hashing helpers to naturally integrate with std::hash
* Removes the previous custom implementation
* Computed hashes are now always size_t
* Introduces a hash_combine helper
* Fixes an overwritten partial hash in Relooper.cpp
|
|
|
| |
This is needed for headers to show up in IDE projects, and has no other effect on the build.
|
|
|
|
|
|
|
|
| |
This adds EH instruction support for `CFGWalker`. This also implements
`call` instruction handling within a try-catch; every call can possibly
throw and unwind to the innermost catch block.
This adds tests for RedundantSetElimination pass, which uses
`CFGWalker`.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The analysis currently uses a dense matrix. If there are >65535
locals then the indexes don't fit in a 32-bit type like a wasm32
index, which led to overflows and incorrect behavior. To avoid
that, don't run passes with liveness analysis for now if they have
that many locals.
Note that skipping coalesce-locals (the main liveness-using
pass) is not that bad, as we run it more than once, and it's
likely that even if the first must be skipped, we can still run
the second (which is after simplify- and reorder-locals, which
can greatly reduce the local count).
Fixes #2559
|
|
|
|
|
| |
This works more like llvm's unreachable handler in that is preserves
information even in release builds.
|
|
|
|
|
|
|
|
| |
We always enable assertions by default, but this options allows for
a build without them.
Fix all errors in the ASSERTIONS=OFF build, even though we don't
normally build this its good to keep it building.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
That was needed for super-old wasm type system, where we allowed
(block $x
(br_if $x
(unreachable)
(nop)
)
)
That is, we differentiated "taken" branches from "named" ones (just
referred to by name, but not actually taken as it's in unreachable code).
We don't need to differentiate those any more. Remove the ReFinalize
code that considered it, and also remove the named/taken distinction in
other places.
|
|
|
| |
This is line with modern cmake conventions is much less SHOUTY!
|
|
|
|
|
|
|
|
|
| |
using the `$<TARGET_OBJECTS:objlib>` syntax. Use this variable when
adding `libbinaryen` as static or shared library. Additionally, use the
variable with the object files to simplify the `TARGET_LINK_LIBRARIES`
commands: add the object libraries to the sources of executables and
drop the use of our libraries in `TARGET_LINK_LIBRARIES`. (Object
libraries cannot be linked but must be used as sources. See
https://cmake.org/pipermail/cmake/2018-June/067721.html)
|
|
|
|
|
| |
(#2474)
This reverts commit bf8f36c31c0b8e6213bce840be66937dd6d0f6af.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Transform libraries created in subdirectories from statically linked
libraries to CMake object libraries.
* Link object libraries as `PRIVATE` to `libbinaryen`.
According to CMake documentation: "Libraries and targets following
PRIVATE are linked to, but are not made part of the link interface."
This is exactly what we want, as we only want the C API to be part of
the interface.
|
|
|
|
|
| |
Adds the AssemblyScript-specific passes post-assemblyscript
and post-assemblyscript-finalize, eliminating redundant ARC-style
retain/release patterns conservatively emitted by the compiler.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes in Relooper merge consecutive blocks:
Entry block getting removed when it is part of a loop:
bb1->AddBranchTo(bb2, nullptr);
bb1->AddBranchTo(bb3, ...);
bb2->AddBranchTo(bb1, nullptr);
bb3->AddBranchTo(bb4, nullptr);
relooper.AddBlock(bb1);
relooper.AddBlock(bb2);
relooper.AddBlock(bb3);
relooper.AddBlock(bb4);
relooper.Calculate(bb1);
Branches memory leak
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Reflected new renamed instruction names in code and tests:
- `get_local` -> `local.get`
- `set_local` -> `local.set`
- `tee_local` -> `local.tee`
- `get_global` -> `global.get`
- `set_global` -> `global.set`
- `current_memory` -> `memory.size`
- `grow_memory` -> `memory.grow`
- Removed APIs related to old instruction names in Binaryen.js and added
APIs with new names if they are missing.
- Renamed `typedef SortedVector LocalSet` to `SetsOfLocals` to prevent
name clashes.
- Resolved several TODO renaming items in wasm-binary.h:
- `TableSwitch` -> `BrTable`
- `I32ConvertI64` -> `I32WrapI64`
- `I64STruncI32` -> `I64SExtendI32`
- `I64UTruncI32` -> `I64UExtendI32`
- `F32ConvertF64` -> `F32DemoteI64`
- `F64ConvertF32` -> `F64PromoteF32`
- Renamed `BinaryenGetFeatures` and `BinaryenSetFeatures` to
`BinaryenModuleGetFeatures` and `BinaryenModuleSetFeatures` for
consistency.
|
|
|
| |
Applies the changes in #2065, and temprarily disables the hook since it's too slow to run on a change this large. We should re-enable it in a later commit.
|
|
|
| |
Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
|
|
|
|
| |
* Use modern T p = v; notation to initialize class fields
* Use modern X() = default; notation for empty class constructors
|