| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
Extend the splitting logic to handle splitting modules with a single table
segment with a non-const offset. In this situation the placeholder function
names are interpreted as offsets from the table base global rather than absolute
indices into the table. Since addition is not allowed in segment offset
expressions, the secondary module's segment must start at the same place as the
first table's segment. That means that some primary functions must be duplicated
in the secondary segment to fill any gaps. They are exported and imported as
necessary.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
types (#3388)
This adds the new feature and starts to use the new types where relevant. We
use them even without the feature being enabled, as we don't know the features
during wasm loading - but the hope is that given the type is a subtype, it should
all work out. In practice, if you print out the internal type you may see a typed
function reference-specific type for a ref.func for example, instead of a generic
funcref, but it should not affect anything else.
This PR does not support non-nullable types, that is, everything is nullable
for now. As suggested by @tlively this is simpler for now and leaves nullability
for later work (which will apparently require let or something else, and many
passes may need to be changed).
To allow this PR to work, we need to provide a type on creating a RefFunc. The
wasm-builder.h internal API is updated for this, as are the C and JS APIs,
which are breaking changes. cc @dcodeIO
We must also write and read function types properly. This PR improves
collectSignatures to find all the types, and also to sort them by the
dependencies between them (as we can't emit X in the binary if it depends
on Y, and Y has not been emitted - we need to give Y's index). This sorting
ends up changing a few test outputs.
InstrumentLocals support for printing function types that are not funcref
is disabled for now, until we figure out how to make that work and/or
decide if it's important enough to work on.
The fuzzer has various fixes to emit valid types for things (mostly
whitespace there). Also two drive-by fixes to call makeTrivial where it
should be (when we fail to create a specific node, we can't just try to make
another node, in theory it could infinitely recurse).
Binary writing changes here to replace calls to a standalone function to
write out a type with one that is called on the binary writer object itself,
which maintains a mapping of type indexes (getFunctionSignatureByIndex).
|
|
|
|
|
|
|
|
|
|
|
| |
We did not really model the effects of unreachable properly before. It
always traps, so it's not an implicit trap, but we didn't do anything but mark
it as "branches out", which is not really enough, as while yes it does
branch inside the current function, it also traps which is noticeable outside.
To fix that, add a trap effect to track this. implicitTrap will set trap as well,
automatically, if we do not ignore implicit traps, so it is enough to check just
that (unless one cares about the difference between implicit and explicit
ones).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- atomic.notify -> memory.atomic.notify
- i32.atomic.wait -> memory.atomic.wait32
- i64.atomic.wait -> memory.atomic.wait64
See WebAssembly/threads#149.
This renames instruction name printing but not the internal data
structure names, such as `AtomicNotify`, which are not always the same
as printed instruction names anyway. This also does not modify C API.
But this fixes interface functions in binaryen.js because it seems
binaryen.js's interface functions all follow the corresponding
instruction names.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds the capability to programatically split a module into a primary and
secondary module such that the primary module can be compiled and run before the
secondary module has been instantiated. All calls to secondary functions (i.e.
functions that have been split out into the secondary module) in the primary
module are rewritten to be indirect calls through the table. Initially, the
table slots of all secondary functions contain references to imported
placeholder functions. When the secondary module is instantiated, it will
automatically patch the table to insert references to the original functions.
The process of module splitting involves these steps:
1. Create the new secondary module.
2. Export globals, events, tables, and memories from the primary module and
import them in the secondary module.
3. Move the deferred functions from the primary to the secondary module.
4. For any secondary function exported from the primary module, export in
its place a trampoline function that makes an indirect call to its
placeholder function (and eventually to the original secondary function),
allocating a new table slot for the placeholder if necessary.
5. Rewrite direct calls from primary functions to secondary functions to be
indirect calls to their placeholder functions (and eventually to their
original secondary functions), allocating new table slots for the
placeholders if necessary.
6. For each primary function directly called from a secondary function, export
the primary function if it is not already exported and import it into the
secondary module.
7. Replace all references to secondary functions in the primary module's table
segments with references to imported placeholder functions.
8. Create new active table segments in the secondary module that will replace
all the placeholder function references in the table with references to
their corresponding secondary functions upon instantiation.
Functions can be used or referenced three ways in a WebAssembly module: they can
be exported, called, or placed in a table. The above procedure introduces a
layer of indirection to each of those mechanisms that removes all references to
secondary functions from the primary module but restores the original program's
semantics once the secondary module is instantiated. As more mechanisms that
reference functions are added in the future, such as ref.func instructions, they
will have to be modified to use a similar layer of indirection.
The code as currently written makes a few assumptions about the module that is
being split:
1. It assumes that mutable-globals is allowed. This could be worked around by
introducing wrapper functions for globals and rewriting secondary code that
accesses them, but now that mutable-globals is shipped on all browsers,
hopefully that extra complexity won't be necessary.
2. It assumes that all table segment offsets are constants. This simplifies the
generation of segments to actively patch in the secondary functions without
overwriting any other table slots. This assumption could be relaxed by 1)
having secondary segments re-write primary function slots as well, 2)
allowing addition in segment offsets, or 3) synthesizing a start function to
modify the table instead of using segments.
3. It assumes that each function appears in the table at most once. This isn't
necessarily true in general or even for LLVM output after function
deduplication. Relaxing this assumption would just require slightly more
complex code, so it is a good candidate for a follow up PR.
Future Binaryen work for this feature includes providing a command line tool
exposing this functionality as well as C API, JS API, and fuzzer support. We
will also want to provide a simple instrumentation pass for finding dead or
late-executing functions that would be good candidates for splitting out. It
would also be good to integrate that instrumentation with future function
outlining work so that dead or exceptional basic blocks could be split out into
a separate module.
|
|
|
|
|
|
| |
We mistakenly tried to run all passes there, but should run only
the function ones.
Fixes #3333
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Expands on #3294:
* Scope names must be distinguished as either defs or uses.
* Error when a core #define is missing, which is less error-prone, as
suggested by @tlively
* Add DELEGATE_GET_FIELD which lets one define "get the field"
once and then all the loops can use it. This helps avoid boilerplate for
loops at least in some cases (when there is a single object on which
to get the field).
With those, it is possible to replace boilerplate in comparisons and
hashing logic. This also fixes a bug where BrOnExn::sent was not
scanned there.
Add some unit tests for hashing. We didn't have any, and hashing can be
subtly wrong without observable external effects (just more collisions).
|
|
|
|
|
|
| |
As proposed in https://github.com/WebAssembly/simd/pull/379. Since this
instruction is still being evaluated for inclusion in the SIMD proposal, this PR
does not add support for it to the C/JS APIs or to the fuzzer. This PR also
performs a drive-by fix for unrelated instructions in c-api-kitchen-sink.c
|
|
|
|
| |
Fixes: #3226
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I believe originally wasm did not allow overlapping segments, that is, where
one memory segment tramples the data from a previous one. But then the
spec changed its mind and we allowed it. Binaryen seems to have assumed
the original case, and not checked for trampling.
If there is a chance of trampling, we cannot optimize out zeros - the zero
may have an effect if it tramples data from a previous segment. This does
not occur in practice in LLVM output, which is why this wasn't a problem
so far, I think.
An existing testcase hit this issue, so I split it up.
|
|
|
| |
Implements the parts of the Extended Name Section Proposal that are trivially applicable to Binaryen, in particular table, memory and global names. Does not yet implement label, type, elem and data names.
|
|
|
|
|
| |
Adds new matchers that allow for matching any unary or binary operation and
optionally extracting it. The previous matchers only allowed matching specific
unary and binary operations. This should help simplify #3132.
|
|
|
| |
Adds the `i31.new` and `i31.get_s/u` instructions for creating and working with `i31ref` typed values. Does not include fuzzer integration just yet because the fuzzer expects that trivial values it creates are suitable in global initializers, which is not the case for trivial `i31ref` expressions.
|
| |
|
|
|
| |
With `eqref` now integrated, the `ref.eq` instruction can be implemented. The only valid LHS and RHS value is `(ref.null eq)` for now, but implementation and fuzzer integration is otherwise complete.
|
|
|
| |
Adds the `eqref` and `i31ref` types to their respective code locations. Implements what can be implemented trivially and otherwise traps with a TODO for now. Integration of `eqref` is mostly complete due to it being nullable, just like `anyref`, but `i31ref` needs to remain disabled in the fuzzer because we are lacking the functionality to create trivial `i31ref` values, i.e. `(i31.new (i32.const 0))`, which is left for follow-ups to implement.
|
|
|
|
|
|
|
|
|
|
|
| |
Provides an easily extensible layered API for matching expression patterns and
extracting their components. The low-level API provides modular building blocks
for creating matchers for any data type and the high-level API provides a
succinct and flexible interface for matching expressions and extracting useful
information from them.
Matchers are currently provided for Const, Unary, Binary, and Select
instructions. Adding a matcher for a new type of expression is straightforward
enough that I expect to add them as they become useful as part of other changes.
|
|
|
| |
Also includes a lot of new spec tests that eventually need to go into the spec repo
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Complete 64-bit cases in range `AddInt64` ... `ShrSInt64`
- `ExtendSInt32` and `ExtendUInt32` for unary cases
- For binary cases
- `AddInt32` / `AddInt64`
- `MulInt32` / `MulInt64`
- `RemUInt32` / `RemUInt64`
- `RemSInt32` / `RemSInt64`
- `DivUInt32` / `DivUInt64`
- `DivSInt32` / `DivSInt64`
- and more
Also more fast paths for some getMaxBits calculations
|
|
|
| |
Aligns the internal representations of `memory.size` and `memory.grow` with other more recent memory instructions by removing the legacy `Host` expression class and adding separate expression classes for `MemorySize` and `MemoryGrow`. Simplifies related APIs, but is also a breaking API change.
|
|
|
| |
Adds the `--enable-gc` feature flag, so far enabling the `anyref` type incl. subtyping, and removes the temporary `--enable-anyref` feature flag that it replaces.
|
|
|
|
|
|
|
|
|
|
|
| |
Previously Pops were printed as ({type}.pop), and if the popped type was a
tuple, something like ((i32, i64).pop) would get printed. However, the parser
didn't support pops of anything besides single basic types.
This PR changes the text format to be (pop <type>*) and adds support for parsing
pops of tuples of basic types. The text format change is designed to make
parsing simpler. This change is necessary for writing Poppy IR tests (see #3059)
that contain break or return instructions that consume multiple values, since in
Poppy IR that requires tuple-typed pops.
|
|
|
| |
Adds `anyref` type, which is enabled by a new feature `--enable-anyref`. This type is primarily used for testing that passes correctly handle subtype relationships so that the codebase will continue to be prepared for future subtyping. Since `--enable-anyref` is meaningless without also using `--enable-reference-types`, this PR also makes it a validation error to pass only the former (and similarly makes it a validation error to enable exception handling without enabling reference types).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BinaryenIRWriter was previously inconsistent about whether or not it
emitted an instruction if that instruction was not reachable.
Instructions that produced values were not emitted if they were
unreachable, but instructions that did not produce values were always
emitted. Additionally, blocks continued to emit their children even
after emitting an unreachable child.
Since it was not possible to tell whether an unreachable instruction's
parent would be emitted, BinaryenIRWriter had to be very defensive and
emit many extra `unreachable` instructions around unreachable code to
avoid type errors.
This PR unifies the logic for emitting all non-control flow
instructions and changes the behavior of BinaryenIRWriter so that it
never emits instructions that cannot be reached due to having
unreachable children. This means that extra `unreachable` instructions
now only need to be emitted after unreachable control flow
constructs. BinaryenIRWriter now also stops emitting instructions
inside blocks after the first unreachable instruction as an extra
optimization.
This change will also simplify Poppy IR stackification (see #3059) by
guaranteeing that instructions with unreachable children will not be
emitted into the stackifier. This makes satisfying the Poppy IR rule
against unreachable Pops trivial, whereas previously satisfying this
rule would have required about about 700 additional lines of code to
recompute the types of all unreachable children for any instruction.
|
|
|
|
|
|
|
| |
Align with the current state of the reference types proposal:
* Remove `nullref`
* Remove `externref` and `funcref` subtyping
* A `Literal` of a nullable reference type can now represent `null` (previously was type `nullref`)
* Update the tests and temporarily comment out those tests relying on subtyping
|
|
|
|
|
|
| |
Implement and test utilities for manipulating and analyzing a new
stacky form of Binaryen IR that is able to express arbitrary stack
machine code. This new Poppy IR will eventually replace Stack IR, and
new optimization passes will be built with these utilities. See #3059.
|
|
|
| |
Extends compound types introduced in #3012 with a representation of `Rtt`s as described in the GC proposal, by also introducing the concept of a `HeapType` shared between `TypeInfo` and `Rtt`. Again, this should be a non-functional change since `Rtt`s are not used anywhere yet. Subtyping rules and updating the `xref` aliases is left for future work.
|
|
|
|
|
| |
Extends the `Type` hash-consing infrastructure to handle type-parameterized and constructed types introduced in the typed function references and GC proposals. This should be a non-functional change since the new types are not used anywhere yet. Recursive type construction and canonicalization is also left as future work.
Co-authored-by: Thomas Lively <tlively@google.com>
|
|
|
|
|
|
|
|
|
| |
testing for it (#3019)
getMaxBits just moves around, no logic is changed.
Aside from adding getMaxBits, the change in bits.h is 99% whitespace.
helps #2879
|
|
|
|
|
|
|
| |
anyref future semantics were changed to only represent opaque host values, and thus renamed to externref.
[Chromium](https://bugs.chromium.org/p/v8/issues/detail?id=7748#c360) was just updated to today (not yet released). I couldn't find a Mozilla bugzilla ticket mentioning externref so I don't immediately know if they've updated yet.
https://github.com/WebAssembly/reference-types/pull/87
|
|
|
| |
As specified in https://github.com/WebAssembly/simd/pull/232.
|
|
|
|
|
|
| |
Push and Pop have been superseded by tuples for their original
intended purpose of supporting multivalue. Pop is still used to
represent block arguments for exception handling, but there are no
plans to use Push for anything now or in the future.
|
|
|
|
| |
This is the only instruction in the current spec proposal that had not
yet been implemnented in the tools.
|
|
|
| |
As specified in https://github.com/WebAssembly/simd/pull/122.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In WebAssembly/exception-handling#52, We decided to put `try` bodies in
a `do` clause to be more consistent with `catch`.
- Before
```wast
(try
...
(catch
...
)
)
```
- After
```wast
(try
(do
...
)
(catch
...
)
)
```
Another upside of this change is when there are multiple instructions
within a `try` body, we no longer need to wrap them in a `block`.
|
|
|
|
|
|
| |
This feature was very useful in the early days of the C API,
but has not shown usefuless for quite a while, and has a
significant maintenance burden, so it it's makes sense to
remove it now.
|
|
|
|
|
|
|
| |
Refactors most of the precompute pass's expression runner into its
base class so it can also be used via the C and JS APIs. Also adds
the option to populate the runner with known constant local and global
values upfront, and remembers assigned intermediate values as well
as traversing into functions if requested.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of using indices into the global interned type table. This
means that a lock is *never* needed to access an expanded Type. The
Type lock is now only acquired when a complex Type is created. On a
real-world wasm2js workload this improves wall clock time by 23% on my
machine with 72 cores and makes traffic on the Type lock entirely
insignificant.
**Before**
72 cores
real 0m6.914s
user 184.014s
sys 0m3.995s
1 core
real 0m25.903s
user 0m25.658s
sys 0m0.253s
**After**
72 cores
real 5.349s
user 70.309s
sys 9.691s
1 core
real 25.859s
user 25.615s
sys 0.253s
|
|
|
|
|
|
|
|
| |
`BinaryIndexes` was only used in two places (Print.cpp and
wasm-binary.h), so it didn't seem to be a great fit for
module-utils.h. This change moves it to wasm-binary.h and removes its
usage in Print.cpp. This means that function indexes are no longer
printed, but those were of limited utility and were the source of
annoying noise when updating tests, anyway.
|
|
|
|
| |
Adds functions for creating and inspecting tuple.make and
tuple.extract expressions in the C and JS APIs.
|
|
|
|
|
|
| |
Adds full support for the {i8x16,i16x8,i32x4}.abs instructions merged
to the SIMD proposal in https://github.com/WebAssembly/simd/pull/128
as well as the {i8x16,i16x8,i32x4}.bitmask instructions proposed in
https://github.com/WebAssembly/simd/pull/201.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for the reference type proposal. This includes support
for all reference types (`anyref`, `funcref`(=`anyfunc`), and `nullref`)
and four new instructions: `ref.null`, `ref.is_null`, `ref.func`, and
new typed `select`. This also adds subtype relationship support between
reference types.
This does not include table instructions yet. This also does not include
wasm2js support.
Fixes #2444 and fixes #2447.
|
|
|
| |
As specified in https://github.com/WebAssembly/simd/pull/126.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the current spec, `local.tee`'s return type should be the
same as its local's type. (Discussions on whether we should change this
rule is going on in WebAssembly/reference-types#55, but here I will
assume this spec does not change. If this changes, we should change many
parts of Binaryen transformation anyway...)
But currently in Binaryen `local.tee`'s type is computed from its
value's type. This didn't make any difference in the MVP, but after we
have subtype relationship in #2451, this can become a problem. For
example:
```
(func $test (result funcref) (local $0 anyref)
(local.tee $0
(ref.func $test)
)
)
```
This shouldn't validate in the spec, but this will pass Binaryen
validation with the current `local.tee` implementation.
This makes `local.tee`'s type computed from the local's type, and makes
`LocalSet::makeTee` get a type parameter, to which we should pass the
its corresponding local's type. We don't embed the local type in the
class `LocalSet` because it may increase memory size.
This also fixes the type of `local.get` to be the local type where
`local.get` and `local.set` pair is created from `local.tee`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Function signatures were previously redundantly stored on Function
objects as well as on FunctionType objects. These two signature
representations had to always be kept in sync, which was error-prone
and needlessly complex. This PR takes advantage of the new ability of
Type to represent multiple value types by consolidating function
signatures as a pair of Types (params and results) stored on the
Function object.
Since there are no longer module-global named function types,
significant changes had to be made to the printing and emitting of
function types, as well as their parsing and manipulation in various
passes.
The C and JS APIs and their tests also had to be updated to remove
named function types.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently `none` and `unreachable` types are stored as the same empty
`{}` in src/wasm/wasm-type.cpp. This makes `Type::operator<` incorrectly
when given `none` and `unreachable`, because it expands both given types
and lexicographically compare them, when both of the expanded vector
will be empty.
This was found by the fuzzer. This line in `Modder::visitExpression`
tries to retrieve candidates of the same type. Because we can't really
compare these two types, if you give `unreachable` as the key,
candidates of `none` type can be returned. This generates incorrect code
that ends up failing in validation in a very weird way.
It was hard to generate a small testcase to trigger this part because it
was found by generating fuzzed code from a random data file. But I guess
this fix is pretty straightforward.
Fixes #2512.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current `<<` operator on `Literal` prints `[type].const` with it. But
`[type].const` is rather an instruction than a literal itself, and
printing it with the literals makes less sense when we later have
literals whose type don't have `const` instructions (such as reference
types).
This patch
- Makes `<<` operator on `Literal` print only its value
- Makes wasm-shell's shell interface comply with the spec interpreter's
printing format (`value : type`).
- Prints wasm-shell's `[trap]` message to stderr
These make all `fix_` routines for spec tests in check.py unnecessary.
|
|
|
|
|
| |
This makes sure example tests validate by adding missing `assert` on
`BinaryenModuleValidate` calls and fixes existing errors in the example
tests.
|