Allow outlining to be excluded from the command line on non-Emscripten builds.
This is not needed in GUFA as it tracks local values precisely (each set is connected to
the gets that actually read from it), but in a future PR it will be useful to track local
values per index (each set is connected to all gets for that index, i.e., each local index
is a single "location").
We had an assert there that was wrong. In fact the assert was in just one of two code paths,
and an optional one: the end situation is that we have an expression and a constant to add
to it, and the assert was in the case that the expression is a Const, so we can do the add at
compile time (the other code path does the add at runtime). This code path is optional, as
Precompute would do such compile-time addition anyhow, but it is nice to fix and keep that
path so that this pass emits fully optimal code.
Adds an outlining pass that performs outlining on a module end to end, and two tests.
Class template argument deduction (CTAD) is a C++17 feature that allows
variables to be declared with class template types without specifying the
template parameters. Deduction guides are a mechanism by which template authors
can control how the template parameters are inferred when CTAD is used. The
Google style guide prohibits the use of CTAD except where template authors opt
in to supporting it by providing explicit deduction guides. For compatibility
with users adhering to Google style, set the compiler flag to check this
condition and add the necessary deduction guides to make the compiler happy
again.
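For illustration, a minimal sketch of the pattern (the class and guide here are hypothetical,
not from this change):

#include <cstddef>

template<typename T>
struct Span {
  T* data;
  std::size_t size;
  Span(T* data, std::size_t size) : data(data), size(size) {}
};

// The explicit deduction guide opts the template in to CTAD under Google
// style by spelling out how T is inferred from the constructor arguments.
template<typename T>
Span(T*, std::size_t) -> Span<T>;

// Now `int arr[4]; Span s(arr, 4);` deduces Span<int>, and warnings such as
// -Wctad-maybe-unsupported no longer fire for this use of CTAD.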
The new wat parser parses block, if, loop, then, and else keywords directly
rather than depending on code generated from gen-s-parser.py. Filter these
keywords out in gen-s-parser.py when generating the new wat parser and delete
the stub functions that the removed generated code used to depend on.
Also add test cases to be comprehensive and to notice changes if we ever decide to
modify that behavior.
interactions with a parent (#6089)
We had a simple rule that if we reach an expression twice then we give up, which makes
sense for, say, a block: if one allocation flows out of it, then another can't - it would get
mixed in with the other one, which is a case we don't optimize. However, there are
cases where a parent has multiple children and different interactions with them, like
a struct.set: the reference child does not escape, but the value child does. Before this
PR, if we reached the value child first we'd mark the parent as seen, and then the
reference child would see it wasn't the first to get there, and not optimize.
To fix this, reorder the code to handle this case: the manner of interaction between the
child and the parent now decides whether we mark the parent as seen and to be avoided
later.
Noticed by the determinism fuzzer, since the order of analysis mattered here.
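To illustrate (a hypothetical sketch, not from the PR's tests), a struct.set is a parent whose
two children interact with it differently:

(struct.set $T 0
  (local.get $ref)   ;; reference child: an allocation here does not escape
  (local.get $value) ;; value child: an allocation here escapes into the struct
)

Reaching the value child first must not prevent optimizing an allocation that reaches the
reference child.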
This new optimization will eventually weaken casts by generalizing (i.e.
un-refining) their output types. If a cast is weakened enough that its output
type is a supertype of its input type, the cast will be able to be removed by
OptimizeInstructions.
Unlike refining cast inputs, generalizing cast outputs can break module
validation. For example, if the result of a cast is stored to a local and the
cast is weakened enough that its output type is no longer a subtype of that
local's type, then the local.set after the cast will no longer validate. To
avoid this validation failure, this optimization would have to generalize the
type of the local as well. In general, the more we can generalize the types of
program locations, the more we can weaken casts of values that flow into those
locations.
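For example (a hypothetical sketch, assuming a struct type $T):

(func $f (param $p (ref null struct)) (local $l (ref null $T))
  (local.set $l
    (ref.cast (ref null $T)
      (local.get $p)
    )
  )
)

If the cast is weakened so that its output type becomes (ref null struct), that is no longer
a subtype of the local's type (ref null $T), so the local.set only validates if $l's type is
generalized to (ref null struct) as well.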
This initial implementation only generalizes the types of locals and does not
actually weaken casts yet. It serves as a proof of concept for the analysis
required to perform the full optimization, though. The analysis uses the new
analysis framework to perform a reverse analysis tracking type requirements for
each local and reference-typed stack value in a function.
Planned and potential future work includes:
- Implementing the transfer function for all kinds of expressions.
- Tracking requirements on the dynamic types of each location to generalize
allocations as well.
- Making the analysis interprocedural and generalizing the types of more
program locations.
- Optimizing tuple-typed locations.
- Generalizing only those locations necessary to eliminate at least one cast
(although this would make the analysis bidirectional, so it is probably better
left to separate passes).
Because we currently strip some data segments (i.e. EM_JS strings) during
`--post-emscripten`, this is too late, as `--separate-data-segments` always runs
in `wasm-emscripten-finalize`.
Once emscripten switches over to using the pass directly we can remove the
support from `wasm-emscripten-finalize`.
LocalCSE is nice for large expressions, but for small things it has always been of
unclear benefit, since VMs also do GVN/CSE anyhow. So we are likely not speeding
anything up, but hopefully we are reducing code size at least. Doing LocalCSE on
something small like a global.get may well increase code size, however (since we add
a tee, and since local gets are of similar size to global gets - it depends on LEB sizes).
On real-world Java code that overhead is noticeable, so this PR makes us more careful:
we now skip things of size 1 (no children).
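For example (a sketch of the overhead, not from the PR's tests), applying LocalCSE to a
repeated global.get turns

(global.get $g)
(global.get $g)

into

(local.tee $t
  (global.get $g)
)
(local.get $t)

which adds a tee, while the local.get it introduces encodes to roughly the same size as the
global.get it replaces, so nothing is saved.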
To support parsing calls, add support for parsing function indices and building
calls with IRBuilder.
Update the C++20 builder to use Ubuntu 20.04 to catch problems building with its
system compiler. Also fix such a problem in wasm-fuzz-lattices.cpp.
Previously CFGWalker designated a particular block as the "exit" block, but it
was just the block that happened to appear at the end of the function that
returned values by implicitly flowing them out. That exit block was not tied in
any way to other blocks that might end in returns, so analyses that needed to
perform some action at the end of the function would have had to perform that
action at the end of the designated exit block but also separately at any return
instruction.
Update CFGWalker to make the exit block a synthetic empty block that is a
successor of all other blocks that implicitly or explicitly return from the
function in case there are multiple such blocks, or to make the exit block the
single returning block if there is only one. This means that analyses will only
perform their end-of-function actions at the end of the exit block rather than
additionally at every return instruction.
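A sketch of the shape of the change (hypothetical code, not Binaryen's actual CFGWalker
API):

#include <memory>
#include <vector>

struct BasicBlock {
  std::vector<BasicBlock*> succs;
};

// Wire every returning block to a single exit block so that end-of-function
// actions run in exactly one place.
BasicBlock* makeExit(std::vector<BasicBlock*>& returningBlocks,
                     std::vector<std::unique_ptr<BasicBlock>>& allBlocks) {
  if (returningBlocks.size() == 1) {
    // A single returning block can serve as the exit block directly.
    return returningBlocks[0];
  }
  // Otherwise add a synthetic empty block as the common successor.
  allBlocks.push_back(std::make_unique<BasicBlock>());
  BasicBlock* exit = allBlocks.back().get();
  for (auto* block : returningBlocks) {
    block->succs.push_back(exit);
  }
  return exit;
}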
Helps #5951
Combine the `transfer` and `getDependents` methods of a transfer function so
that a transfer function only has to implement `transfer`, which now returns a
range of basic blocks that may need to be re-analyzed.
To make it easier to implement the returned basic block range, change the
requirement so that it provides iterators to `const BasicBlock*` rather than
`BasicBlock`. This allows us to entirely remove cfg-impl.h.
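A sketch of the combined interface (illustrative stand-in types, not Binaryen's exact code):

#include <vector>

struct BasicBlock {};
using State = int; // stands in for the analysis's lattice element

struct MyTransferFunction {
  // Apply the transfer function for `block` to `state`, returning the blocks
  // that may need re-analysis (replacing the old getDependents() query).
  std::vector<const BasicBlock*> transfer(const BasicBlock* block,
                                          State& state) {
    (void)block;
    (void)state; // a real analysis would update `state` here
    return {};   // typically the block's successors or predecessors
  }
};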
Previously, modifying a single vector element of a `Shared<Vector>` required
materializing a full vector to do the join. When there is just a single element to update,
materializing all the other elements with the bottom value is useless work. Add a
`Vector<L>::SingletonElement` utility that represents, but does not materialize, a vector
with a single non-bottom element, and allow it to be passed to `Vector<L>::join`. Also
update `Shared` and `Inverted` so that `SingletonElement` joins still work on vectors
wrapped in those other lattices.
Remove the ability to represent the top element of the stack lattice since it
isn't necessary. Also simplify the element type to be a simple vector, update
the lattice method implementations to be more consistent with implementations in
other lattices, and make the tests more consistent with the tests for other
lattices.
The analysis framework stores a separate lattice element for each basic block
being analyzed to represent the program state at the beginning of the block.
However, in many analyses a significant portion of program state is not
flow-sensitive, so does not benefit from having a separate copy per block. For
example, an analysis might track constraints on the types of locals that do not
vary across blocks, so it really only needs a single copy of the constraints for
each local. It would be correct to simply duplicate the state across blocks
anyway, but it would not be efficient.
To make it possible to share a single copy of a lattice element across basic
blocks, introduce a `Shared<L>` lattice. Mathematically, this lattice represents
a single ascending chain in the underlying lattice and its elements are ordered
according to sequence numbers corresponding to positions in that chain.
Concretely, though, the `Shared<L>` lattice only ever materializes a single,
monotonically increasing element of `L` and all of its elements provide access
to that shared underlying element.
`Shared<L>` will let us get the benefits of having mutable shared state in the
concrete implementation of analyses without losing the benefits of keeping those
analyses expressible purely in terms of the monotone framework.
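A minimal sketch of the idea (not the exact Binaryen code); elements are sequence
numbers into the ascending chain described above:

#include <cstdint>

template<typename L>
struct Shared {
  L lattice;
  typename L::Element state;
  uint64_t seq = 0; // bumped whenever `state` strictly increases

  Shared(L lattice) : lattice(lattice), state(this->lattice.getBottom()) {}

  using Element = uint64_t; // a position in the single ascending chain

  Element getBottom() { return 0; }

  // Join `toJoin` into the shared state, then fast-forward `joinee` to the
  // current position in the chain. Returns true if `joinee` changed.
  bool join(Element& joinee, const typename L::Element& toJoin) {
    if (lattice.join(state, toJoin)) {
      ++seq;
    }
    if (joinee < seq) {
      joinee = seq;
      return true;
    }
    return false;
  }
};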
Previously the fuzzer never added gets or sets of globals from initial content. That was
an oversight, I'm pretty sure - it's just that the code that sets up the lists from which we
pick globals for gets and sets was in another place. That is, any globals in the initial
content file were never used in new random code the fuzzer generates (only new
globals the fuzzer generated were used there).
This PR allows us to use those globals, but also ignores them with some probability,
to avoid breaking patterns like "once" globals (that we want to only be used from
initial content, at least much of the time).
Also simplify the code here: we don't need isInvalidGlobal just to handle the hang
limit global, which is already handled by not being added to the lists we pick names
from anyhow.
argument (#6074)
In wasm64, function pointers are 64-bit like all pointers.
Fixes #6073
We handled references but not tuples in one place.
Many of the lattice tests were essentially copy-pasted from one lattice to the
next because they all tested isomorphic subsets of the various lattices,
specifically in the shape of a diamond. Refactor the code so that all lattices
that have tests of this shape use the same utility test functions.
Add a lattice that is a thin wrapper around `wasm::Type` giving it the interface
of a lattice. As usual, `Type::unreachable` is the bottom element, but unlike in
the underlying API, we uniformly treat `Type::none` as the top type so that we
have a proper lattice.
Followup to #6061
In particular, if the body just calls another "once" function, then we can
skip the early-exit logic.
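For example (a hypothetical sketch, not from the PR's tests), $once.A's early exit is
unneeded because $once.B performs an equivalent check of its own:

(func $once.A
  ;; This early-exit logic can be skipped...
  (if (global.get $once.A.done)
    (then (return))
  )
  (global.set $once.A.done (i32.const 1))
  ;; ...because the body just calls another "once" function, which is
  ;; guarded the same way, so running $once.A's body twice is harmless.
  (call $once.B)
)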
This lattice combines any number of other lattices into a single lattice whose
elements are tuples of elements of the other lattices. This will be one of the
most important lattices in the analysis framework because it will be used to
combine information about different parts of the program, e.g. locals and the
value stack, into a single lattice.
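A sketch of the shape of such a lattice (simplified, not the exact Binaryen code):

#include <cstddef>
#include <tuple>
#include <utility>

template<typename... Ls>
struct TupleLattice {
  std::tuple<Ls...> lattices;

  // An element is a tuple of the component lattices' elements.
  using Element = std::tuple<typename Ls::Element...>;

  Element getBottom() {
    return std::apply(
      [](auto&... ls) { return Element{ls.getBottom()...}; }, lattices);
  }

  // Elementwise join; true if any component changed.
  bool join(Element& joinee, const Element& other) {
    return joinImpl(joinee, other, std::index_sequence_for<Ls...>{});
  }

private:
  template<std::size_t... Is>
  bool joinImpl(Element& joinee, const Element& other,
                std::index_sequence<Is...>) {
    bool changed = false;
    ((changed |= std::get<Is>(lattices).join(std::get<Is>(joinee),
                                             std::get<Is>(other))),
     ...);
    return changed;
  }
};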
The vector lattice is nearly identical to the array lattice, except that the
size of the elements is specified at runtime when the lattice object is created
rather than at compile time. The code and tests are largely copy-pasted and
fixed up from the array implementation, but there are a couple differences.
First, initializing vector elements does not need the template magic used to
initialize array elements. Second, the obvious implementations of join and meet
do not work for vectors of bools because they might be specialized to be bit
vectors, so we need workarounds for that particular case.
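The bool issue, sketched (not the exact Binaryen code): joining elementwise through a
reference does not work for std::vector<bool>, whose operator[] returns a proxy object
rather than a bool&, so the workaround copies each element out, joins, and writes back:

#include <cstddef>
#include <vector>

bool joinBoolVectors(std::vector<bool>& joinee,
                     const std::vector<bool>& other) {
  bool changed = false;
  for (std::size_t i = 0; i < joinee.size(); ++i) {
    bool elem = joinee[i];          // copy out of the proxy
    bool joined = elem || other[i]; // join in the bool lattice
    if (joined != elem) {
      joinee[i] = joined;           // write back through the proxy
      changed = true;
    }
  }
  return changed;
}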
The elements of the `Array<L, N>` lattice are arrays of length `N` of elements of
`L`, compared pairwise with each other. This lattice is a concrete
implementation of what would be written L^N with pen and paper.
Since these methods, which operate on lattice elements, moved to the lattice
types, it no longer makes much sense for their parameters to be called `self`
and `other`. Rename them to `joinee` and `joiner` for joins and `meetee` and
`meeter` for meets.
If there are newlines in the list, then we split using them in a simple manner
(that does not take into account nesting of any other delimiters).
Fixes #6047
Fixes #5271
Implement new `RandomLattice` and `RandomFullLattice` utilities that are
lattices randomly created from other lattices. By recursively using themselves
as the parameter lattices for lattices like `Inverted` and `Lift`, these random
lattices can become arbitrarily nested.
Decouple the checking of lattice properties from the checking of transfer
function properties by creating a new, standalone `checkLatticeProperties`
function.
Closed-world mode allows function types to escape if they are on exported functions,
because that has been possible since the wasm MVP and cannot be avoided. But we need
to allow all types in those types' rec groups as well. Consider this case:
(module
  (rec
    (type $0 (func))
    (type $1 (func))
  )
  (func "0" (type $0)
    (nop)
  )
  (func "1" (type $1)
    (nop)
  )
)
The two exported functions make the two types public, so this module validates in
closed world mode. Now imagine that metadce removes one export:
(module
  (rec
    (type $0 (func))
    (type $1 (func))
  )
  (func "0" (type $0)
    (nop)
  )
  ;; The export "1" is gone.
)
Before this PR that no longer validates, because we only mark the type $0 as public.
But when a type is public, that makes its entire rec group public, so $1 is errored on.
To fix that, this PR allows all types in the rec group of an exported function's type,
which makes that last module validate.
This lattice "lifts" another lattice by inserting a new bottom element
underneath it.
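A minimal sketch of the idea (not the exact Binaryen code), using std::nullopt as the new
bottom:

#include <optional>

template<typename L>
struct Lift {
  L lattice;

  // nullopt is the new bottom, below every element of L.
  using Element = std::optional<typename L::Element>;

  Element getBottom() { return std::nullopt; }

  bool join(Element& joinee, const Element& other) {
    if (!other) {
      return false; // joining with the lifted bottom changes nothing
    }
    if (!joinee) {
      joinee = other; // climb out of the lifted bottom
      return true;
    }
    return lattice.join(*joinee, *other);
  }
};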
Followup to #6048, we did not handle nondefaultable tuples because of this.
Given a type `T`, `Flat<T>` is the lattice where none of the values of `T` are
comparable except with themselves, but they are all greater than a common bottom
element not in `T` and less than a common top element also not in `T`.
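A sketch of that order (not the exact Binaryen code), with the two synthetic elements held
in a std::variant; it assumes `T` is equality-comparable:

#include <variant>

template<typename T>
struct Flat {
  struct Bot {};
  struct Top {};
  using Element = std::variant<Bot, T, Top>;

  Element getBottom() { return Bot{}; }
  Element getTop() { return Top{}; }

  bool join(Element& joinee, const Element& other) {
    if (std::holds_alternative<Bot>(other) ||
        std::holds_alternative<Top>(joinee)) {
      return false; // joining up from bottom or into top: no change
    }
    if (std::holds_alternative<Bot>(joinee)) {
      joinee = other; // bottom joins up to anything
      return true;
    }
    if (std::holds_alternative<Top>(other) ||
        std::get<T>(joinee) != std::get<T>(other)) {
      joinee = Top{}; // distinct values of T are incomparable: join to top
      return true;
    }
    return false; // equal values of T
  }
};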
The FullLattice concept extends the base Lattice with `getTop` and `meet`
operations. The `Inverted` lattice uses these operations to reverse the order of
an arbitrary full lattice, for example to create a lattice of integers ordered
by `>` rather than by `<`.
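A sketch (not the exact Binaryen code): Inverted delegates to the underlying full lattice
with the operations swapped, which is why it needs `getTop` and `meet` to exist:

template<typename L>
struct Inverted {
  L lattice;
  using Element = typename L::Element;

  // Swap top/bottom and join/meet to reverse the order.
  Element getBottom() { return lattice.getTop(); }
  Element getTop() { return lattice.getBottom(); }
  bool join(Element& joinee, const Element& other) {
    return lattice.meet(joinee, other);
  }
  bool meet(Element& meetee, const Element& other) {
    return lattice.join(meetee, other);
  }
};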
Implement a generic lattice template for integral types ordered by `<`.
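A sketch of such a lattice (not the exact Binaryen code), where join is max and meet is
min:

#include <limits>

template<typename T>
struct Integer {
  using Element = T;

  Element getBottom() { return std::numeric_limits<T>::min(); }
  Element getTop() { return std::numeric_limits<T>::max(); }

  bool join(Element& joinee, Element other) { // least upper bound: max
    if (other > joinee) {
      joinee = other;
      return true;
    }
    return false;
  }

  bool meet(Element& meetee, Element other) { // greatest lower bound: min
    if (other < meetee) {
      meetee = other;
      return true;
    }
    return false;
  }
};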
This is a lattice with two elements: `false` is bottom and `true` is top.
Add a new gtest file for testing lattices.
lattice (#6035)
In general, the operation of taking the least upper bound of two lattice
elements may depend on some state stored in the lattice object rather than in
the elements themselves. To avoid forcing the elements to be larger and more
complicated in that case (by storing a parent pointer back to the lattice), move
the least upper bound operation to make it a method of the lattice rather than
the lattice element. This is also more consistent with where we put e.g. the
`compare` method.
While we are at it, rename `makeLeastUpperBound` to `join`, which is much
shorter and nicer. Usually we avoid using esoteric mathematical jargon like
this, but "join" as a normal verb actually describes the operation nicely, so I
think it is ok in this case.
Simplify the monotone analyzer by replacing all the state it used to store in
`BlockState` with a simple vector of lattice elements. Use simple indices to
refer to both blocks and their associated states in the vector. Remove the
ability for transfer functions to control the initial enqueued order of basic
blocks since that was a leaky abstraction. Replace the worklist with a
UniqueDeferredQueue since that has generally proven to be more efficient in
similar contexts, and, more importantly, it has a nicer API. Make miscellaneous
simplifications to other code as well.
Delete a few unit tests that exposed the order in which blocks were analyzed
because they printed intermediate results. These tests should be replaced with
tests of analyses' public APIs in the future.
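A sketch of the resulting fixpoint loop (illustrative only; it assumes framework types like
CFG, Index, BasicBlock, and UniqueDeferredQueue with the obvious operations):

// One lattice element per block, indexed the same way as the blocks,
// replacing the old per-block BlockState objects.
std::vector<typename Lattice::Element> states(cfg.size(),
                                              lattice.getBottom());
UniqueDeferredQueue<Index> worklist;
for (Index i = 0; i < cfg.size(); ++i) {
  worklist.push(i);
}
while (!worklist.empty()) {
  Index i = worklist.pop();
  // transfer() mutates the state and reports blocks to re-analyze; the
  // queue deduplicates, deferring repeated items to their latest push.
  for (const BasicBlock* dep : txfn.transfer(cfg[i], states[i])) {
    worklist.push(dep->getIndex());
  }
}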
These missing includes were not a problem in our standard build configuration,
but were breaking other build configurations.
This PR is part of a series that adds basic support for the [typed continuations proposal](https://github.com/wasmfx/specfx).
This PR adds continuation types, of the form `(cont $foo)` for some function type `$foo`.
The only notable changes affecting existing code are the following:
- This is the first `HeapType` which has another `HeapType` (rather than, say, a `Type`) as its immediate child. This required fixes to certain traversals that have a flag for being at the toplevel of a type.
- Some shared logic for parsing `HeapType`s has been factored out.
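For example, a continuation type wraps a function type:

(type $task (func (param i32) (result i32)))
(type $ct (cont $task))

Here the immediate child of the heap type $ct is the heap type $task, the new situation
mentioned above.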
Followup to #6046 - the fuzzer found we missed handling the case of the entry itself
being unreachable, or of an unreachable loop later. Properly identifying unreachable
code requires a flow analysis, unfortunately, so this PR gives up on that and instead
allows LocalGraph to be imprecise in unreachable code. That avoids adding any
overhead, but does mean the IR may be slightly confusing when debugging. It does
not have any optimization downsides, however, as it only affects unreachable code.
Also add a dump() impl in that file which helps debugging.