| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Have a single implementation for lexing each of unsigned, signed, and
uninterpreted integers, each generic over the bit width of the integer. This
reduces duplication in the existing code and it will make it much easier to
support lexing more 8- and 16-bit integers.
|
| |
|
|
|
|
| |
Instead of e.g. `(i32 i32)`, use `(tuple i32 i32)`. Having a keyword to
introduce the s-expression is more consistent with the rest of the language.
|
|
|
|
|
|
|
|
|
|
|
|
| |
We previously supported (and primarily used) a non-standard text format for
conditionals in which the condition, if-true expression, and if-false expression
were all simply s-expression children of the `if` expression. The standard text
format, however, requires the use of `then` and `else` forms to introduce the
if-true and if-false arms of the conditional. Update the legacy text parser to
require the standard format and update all tests to match. Update the printer to
print the standard format as well.
The .wast and .wat test inputs were mechanically updated with this script:
https://gist.github.com/tlively/85ae7f01f92f772241ec994c840ccbb1
|
|
|
|
|
|
| |
Checking a couple of testing TODOs off and adding more tests of the outlining pass for outlining:
- a sequence at the beginning of an existing function
- a sequence that is outlined into a function that takes no arguments
- multiple sequences from the same source function into different outlined functions
|
|
|
| |
Adds an outlining pass that performs outlining on a module end to end, and two tests.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously CFGWalker designated a particular block as the "exit" block, but it
was just the block that happened to appear at the end of the function that
returned values by implicitly flowing them out. That exit block was not tied in
any way to other blocks that might end in returns, so analyses that needed to
perform some action at the end of the function would have had to perform that
action at the end of the designated exit block but also separately at any return
instruction.
Update CFGWalker to make the exit block a synthetic empty block that is a
successor of all other blocks tthat implicitly or explicitly return from the
function in case there are multiple such blocks, or to make the exit block the
single returning block if there is only one. This means that analyses will only
perform their end-of-function actions at the end of the exit block rather than
additionally at every return instruction.
|
|
|
|
|
|
|
|
|
| |
Previously, modifying a single vector element of a `Shared<Vector>` element
required materializing a full vector to do the join. When there is just a single
element to update, materializing all the other elements with bottom value is
useless work. Add a `Vector<L>::SingletonElement` utility that represents but
does not materialize a vector with a single non-bottom element and allow it to
be passed to `Vector<L>::join`. Also update `Shared` and `Inverted` so that
`SingletonElement` joins still work on vectors wrapped in those other lattices.
|
|
|
|
|
|
|
| |
Remove the ability to represent the top element of the stack lattice since it
isn't necessary. Also simplify the element type to be a simple vector, update
the lattice method implementations to be more consistent with implementations in
other lattices, and make the tests more consistent with the tests for other
lattices.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The analysis framework stores a separate lattice element for each basic block
being analyzed to represent the program state at the beginning of the block.
However, in many analyses a significant portion of program state is not
flow-sensitive, so does not benefit from having a separate copy per block. For
example, an analysis might track constraints on the types of locals that do not
vary across blocks, so it really only needs a single copy of the constrains for
each local. It would be correct to simply duplicate the state across blocks
anyway, but it would not be efficient.
To make it possible to share a single copy of a lattice element across basic
blocks, introduce a `Shared<L>` lattice. Mathematically, this lattice represents
a single ascending chain in the underlying lattice and its elements are ordered
according to sequence numbers corresponding to positions in that chain.
Concretely, though, the `Shared<L>` lattice only ever materializes a single,
monotonically increasing element of `L` and all of its elements provide access
to that shared underlying element.
`Shared<L>` will let us get the benefits of having mutable shared state in the
concrete implementation of analyses without losing the benefits of keeping those
analyses expressible purely in terms of the monotone framework.
|
|
|
|
|
|
| |
Many of the lattice tests were essentially copy-pasted from one lattice to the
next because they all tested isomorphic subsets of the various lattices,
specifically in the shape of a diamond. Refactor the code so that all lattices
that have tests of this shape use the same utility test functions.
|
|
|
|
|
|
| |
Add a lattice that is a thin wrapper around `wasm::Type` giving it the interface
of a lattice. As usual, `Type::unreachable` is the bottom element, but unlike in
the underlying API, we uniformly treat `Type::none` as the top type so that we
have a proper lattice.
|
|
|
|
|
|
|
| |
This lattice combines any number of other lattices into a single lattice whose
elements are tuples of elements of the other lattices. This will be one of the
most important lattices in the analysis framework because it will be used to
combine information about different parts of the program, e.g. locals and the
value stack, into a single lattice.
|
|
|
|
|
|
|
|
|
|
| |
The vector lattice is nearly identical to the array lattice, except that the
size of the elements is specified at runtime when the lattice object is created
rather than at compile time. The code and tests are largely copy-pasted and
fixed up from the array implementation, but there are a couple differences.
First, initializing vector elements does not need the template magic used to
initialize array elements. Second, the obvious implementations of join and meet
do not work for vectors of bools because they might be specialized to be bit
vectors, so we need workarounds for that particular case.
|
|
|
|
|
| |
The elements of `Array<L, N>` lattice are arrays of length `N` of elements of
`L`, compared pairwise with each other. This lattice is a concrete
implementation of what would be written L^N with pen and paper.
|
|
|
|
| |
This lattice "lifts" another lattice by inserting a new bottom element
underneath it.
|
|
|
|
|
| |
Given a type `T`, `Flat<T>` is the lattice where none of the values of `T` are
comparable except with themselves, but they are all greater than a common bottom
element not in `T` and less than a common top element also not in `T`.
|
|
|
|
|
|
| |
The FullLattice concept extends the base Lattice with `getTop` and `meet`
operations. The `Inverted` lattice uses these operations to reverse the order of
an arbitrary full lattice, for example to create a lattice of integers ordered
by `>` rather than by `<`.
|
|
|
| |
Implement a generic lattice template for integral types ordered by `<`.
|
|
|
|
|
| |
This is a lattice with two elements: `false` is bottom and `true` is top.
Add a new gtest file for testing lattices.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lattice (#6035)
In general, the operation of taking the least upper bound of two lattice
elements may depend on some state stored in the lattice object rather than in
the elements themselves. To avoid forcing the elements to be larger and more
complicated in that case (by storing a parent pointer back to the lattice), move
the least upper bound operation to make it a method of the lattice rather than
the lattice element. This is also more consistent with where we put e.g. the
`compare` method.
While we are at it, rename `makeLeastUpperBound` to `join`, which is much
shorter and nicer. Usually we avoid using esoteric mathematical jargon like
this, but "join" as a normal verb actually describes the operation nicely, so I
think it is ok in this case.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Simplify the monotone analyzer by replacing all the state it used to store in
`BlockState` with a simple vector of lattice elements. Use simple indices to
refer to both blocks and their associated states in the vector. Remove the
ability for transfer functions to control the initial enqueued order of basic
blocks since that was a leaky abstraction. Replace the worklist with a
UniqueDeferredQueue since that has generally proven to be more efficient in
smiilarly contexts, and more importantly, it has a nicer API. Make miscellaneous
simplifications to other code as well.
Delete a few unit tests that exposed the order in which blocks were analyzed
because they printed intermediate results. These tests should be replaced with
tests of analyses' public APIs in the future.
|
|
|
|
|
| |
The problem was if you construct a try expression which references a nonexistent tag in
one of its catch blocks, the validation code successfully identified the null pointer but
then proceeded to try to read from it.
|
|
|
| |
Also shorten various names in the implementation to improve readability.
|
|
|
| |
Adds a function to filter branch and return instructions from being included in potential outlining candidates.
|
|
|
|
|
| |
Adds a general purpose walker named FilterStringifyWalker, intended to walk control flow and take note of whether any of the expressions satisfy the condition.
Also includes an << overload for SuffixTree::RepeatedSubstring to make debugging easier.
|
|
|
| |
Fixes #6003
|
|
|
|
| |
With this, the fuzzer can replace e.g. an eq expression with a specific struct type,
because now it is away that struct types have eq as their ancestor.
|
|
|
|
| |
A later PR will add getSuperType which will mean "get the general super type -
either declared, or not".
|
|
|
| |
Adds a std::variant to represent the context of why a unique symbol was inserted in the stringified module. This allows us to pass necessary contextual data to subclasses of StringifyWalker in a structured manner.
|
|
|
| |
This PR adds a StringProcessor struct intended to hold functions that filter vectors of SuffixTree::RepeatedSubstring, and a test of its first functionality, removing overlapping repeated substrings.
|
|
|
| |
Abstracts repeat code
|
|
|
|
|
|
| |
And put the new files in a new source directory, "parser". This is a rough split
and is not yet expected to dramatically improve compile times. The exact
organization of the new files is subject to change, but this splitting should be
enough to make further parser development more pleasant.
|
|
|
|
|
|
|
| |
Remove support for the "struct_subtype", "array_subtype", "func_subtype", and
"extends" notations we used at various times to declare WasmGC types, leaving
only support for the standard text fromat for declaring types. Update all the
tests using the old formats and delete tests that existed solely to test the old
formats.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Match the spec and parse the shorthand binary and text formats as final and emit
final types without supertypes using the shorthands as well. This is a
potentially-breaking change, since the text and binary shorthands can no longer
be used to define types that have subtypes.
Also make TypeBuilder entries final by default to better match the spec and
update the internal APIs to use the "open" terminology rather than "final"
terminology. Future changes will update the text format to use the standard "sub
open" rather than the current "sub final" keywords. The exception is the new wat
parser, which supporst "sub open" as of this change, since it didn't support
final types at all previously.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change introduces StackLattice, a lattice to model stack-related
behavior. It is templated on a separate lattice whose elements model
some property of individual values on the stack. The StackLattice
allows users to access the top of the stack, push abstract values, and
pop them. Comparisons and least upper bound operations are done on a
value by value basis starting from the top of the stack and moving
toward the bottom. This is because it allows stacks from different
scopes to be joined easily.
An application of StackLattice is to model the wasm value stack. The goal
is to organize lattice elements representing individual stack values in a
natural way which mirrors the wasm value stack. Transfer functions operate
on each stack value individually. The stack lattice is an intermediate
structure which is not intended to be directly operated on. Rather, it
simulates the push and pop behavior of instructions.
|
|
|
| |
Adds an integration test that identifies the substrings of a stringified wasm module using the suffix_tree.
|
|
|
|
|
|
|
|
| |
None of that code is speed-sensitive, or at least doesn't need to be inlined to be
fast. Move it to cpp for faster compile times.
This caused a cascade of necessary header fixes (i.e. after removing unneeded
header inclusions in module-utils.h, files that improperly depended on that
stopped working and needed an added include).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a TrapsNeverHappen oracle that is used inside the main PossibleContents
oracle of GUFA. The idea is that when traps never happen we can reason "backwards"
from information to things that must be true before it:
temp = x.field;
x.cast_to<Y>(); // Y is a subtype of x's type X
Here we cast x to a subtype. If we assume traps never happen then the cast must
succeed, and that means we can assume we had a Y on the previous line, where
perhaps that information lets us infer the value of x.field.
This PR focuses on calls, which are the more interesting situation to optimize
because other passes do some work already inside functions. Specifically, we look
for things that will trap in the called function or the caller, such as if the called
function always casts a param to some type, we can assume the caller passes
such a type in. And if we have a call_ref then any target that would trap cannot be
called (at least in a closed world).
This has some benefits, in particular when combined with --gufa-cast-all since
that casts more things, which lets us apply the inferences made here. I see 3.3%
fewer call_ref instructions on a Kotlin testcase, for example. This helps more
on -Os when we inline less.
|
| |
|
|
|
|
|
| |
Changes the static asserts checking a lattice type to require a non-static
compare function instead of a static one. Also changes the compare
functions of existing lattices to be non-static.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this PR, if a call had no paths to a catch in the same function then we skipped
creating a new basic block right after it. As a result, we could have a call in the middle
of a basic block. If EH is enabled that means we might transfer control flow out of
the function from the middle of a block. But it is better to have the property that
any transfer of control flow - to another basic block, or outside of the function - can
only happen at the end of a basic block.
This causes some overhead, but a subsequent PR (#5838) will remove that as a
followup, and this PR adds a little code to pass the module and check if EH is enabled,
and avoid the overhead if not, which at least avoids regressing the non-EH case
until that followup lands.
|
|
|
|
|
| |
The LLVM suffix tree expects to be provided with a vector of 32-bit unsigned integers. This PR makes it easier to integrate our wasm program string with the suffix tree.
Because the range of possible values is reduced from 2^64 to 2^32, a signed integer was added to manage the next separator value and ensure we're using every possible negative number.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This change implements a reaching definitions analysis which is intended
to be equivalent to the information provided by LocalGraph, specifically
the Flower class of LocalGraph.
It also introduces a CRTP utility in visitor-transfer-function.h which
implements most commonly found visitor-type transfer function
functionalities. The MonotoneCFGAnalyzer is also modified to add a phase
to collect results after the analysis is solved from the final CFG
states.
|
|
|
|
|
| |
This PR adds LLVM's suffix tree data structure to Binaryen. This suffix tree is implemented using Ukkonen's algorithm for linear-time suffix tree construction, and is intended for fast substring queries.
Note: All of the .h and .cpp files included are from LLVM. These files were copied directly instead of imported into our existing LLVM integration (in third_party/) to avoid bumping the commit hash and avoid the potential for complications with upstream changes.
|
|
|
|
|
| |
This PR is part of the effort to bring Outlining to Binaryen.
HashStringifyWalker is a subclass of StringifyWalker #5772, and used to encode a wasm module as a "string". Our "string" is a vector and each symbol is a uint64_t, providing us with a capacity of 2^64 symbols.
|
|
|
|
|
|
|
|
|
|
| |
This change creates a lattice, FinitePowersetLattice, to represent finite
powerset lattices constructed from sets containing members of arbitrary type
The bitvector finite powerset lattice class is renamed FiniteIntPowersetLattice.
The templated FinitePowersetLattice class contains a FiniteIntPowersetLattice
object, and over that provides functionality to map lattice element members
to bitvector indices. Methods are also provided by the lattice to add or
remove members of the given type from lattice members as an abstraction
over flipping bits in the bitvector.
|
|
|
|
|
|
|
| |
Analyzer (#5794)
This change makes the transfer function an object separate from the monotone analyzer. The transfer function class type is a generic template of the monotone analyzer, and an instance of the transfer function is instantiated and then passed into the monotone analyzer via its constructor. The API of the transfer function is enforced by a static assertion.
This change also introduces LivenessTransferFunction as a transfer function for liveness analysis as an example.
|