Monomorphization finds cases where we send more refined types to a function
than it declares. In such cases we can copy the function and refine the parameters:

    // B is a subtype of A
    foo(new B());
    function foo(x : A) { .. }

    =>

    foo_B(new B());              // call redirected to refined copy
    function foo(x : A) { .. }   // unchanged
    function foo_B(x : B) { .. } // refined copy

This increases code size, so it may not be worth it in all cases. This initial PR is
hopefully enough to start experimenting with the performance impact, and so it does
not enable the pass by default.

This adds two variations of monomorphization: one that always performs it, and the
default, which is "careful": it checks whether monomorphizing actually lets the
refined function be optimized better than the original (say, by removing a cast).
If there is no improvement then we do not make any changes. This saves a significant
amount of code size - on j2wasm the careful version increases size by 13% instead of
20% - but it obviously runs more slowly.
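
A minimal sketch of the "careful" policy, with hypothetical helpers standing in for
the pass's real machinery (these names are illustrations, not the actual internals):

    #include <cstddef>

    struct Function;
    struct CallInfo; // hypothetical: the refined types seen at a call site

    Function* copyAndRefineParams(Function* original, const CallInfo& refined);
    void optimize(Function* func);      // run the usual function-level opts
    size_t measureCost(Function* func); // hypothetical cost metric

    // Keep a specialized copy only if optimizing it under the refined
    // parameter types actually improves on the original (say, a cast became
    // removable); otherwise discard it, avoiding the code size increase.
    bool shouldMonomorphize(Function* original, const CallInfo& refined) {
      Function* copy = copyAndRefineParams(original, refined);
      optimize(copy);
      return measureCost(copy) < measureCost(original);
    }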

Per the wasm spec, a memory.grow instruction should return -1 when it fails to
allocate enough memory. This PR adds support for returning this error code.
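
For guest code this failure value is observable directly. A small sketch, assuming
clang's __builtin_wasm_memory_grow intrinsic (which lowers to a memory.grow
instruction):

    #include <cstddef>

    // Try to grow memory 0 by `pages` wasm pages (64 KiB each).
    // memory.grow returns the previous size in pages, or -1 on failure.
    bool growByPages(size_t pages) {
      return __builtin_wasm_memory_grow(0, pages) != (size_t)-1;
    }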

OptimizeInstructions can, in rare cases, add unreachability. We propagate it out all
at once at the end. The fuzzer was smart enough to find a very special combination of
code + passes that hits an issue; see the testcase.

As mentioned in the TODO, we should perhaps avoid adding unreachability in
OptimizeInstructions at all. If this happens again, that might be worth the effort.
But checking the type of the child, as this PR does, adds little complexity to the
code.

* Update MemoryPacking for array.new_data

The MemoryPacking pass looks at all instructions that reference memory segments
to determine how they can be optimized. #5214 introduced a new instruction that
references memory segments, array.new_data, but did not update MemoryPacking
accordingly. This omission meant that MemoryPacking could produce invalid or
misoptimized modules in the presence of array.new_data.

Fix the problem by making MemoryPacking aware of array.new_data: consider
array.new_data when determining whether a segment is used, and update
array.new_data to reflect the new, optimized segment numberings afterward. To
keep things simple, do not try to split any segment that is referred to by an
array.new_data instruction.

* fix

* Add test explanations

* Fix possible-contents.h for `array.new_{data,elem}`

This code was not properly updated in #5214, so GUFA would incorrectly optimize
out `array.new_data` and `array.new_elem` instructions. Fix the problem by
making these instructions data flow roots.

* fix

* move tests

Instead of automatically determining which exports will be async, they will be
explicitly set by the user. We'll rely on the runtime trapping if they are set
incorrectly.

Two new arguments that behave similarly to asyncify-imports:
- jspi-imports
- jspi-exports

In order to test the new instructions, fix the binary and text parsers to accept
passive data segments even if a module has no memory. In addition to parsing and
emitting the new instructions, also implement their validation and interpretation.
Test the interpretation directly with wasm-shell tests adapted from the upstream
spec tests. Running the upstream spec tests directly would require fixing too many
bugs in the legacy text parser, so that will have to wait for the new text parser
to be ready.

Adds support for the Asyncify pass to use multi-memories. This is enabled by passing
the flag --asyncify-in-secondary-memory. Another flag,
--asyncify-secondary-memory-size, specifies the initial and maximum size of the
secondary memory.

Similar to #5194 but for RedundantSetElimination. This has similar benefits in terms
of using a more refined local in hopes of avoiding casts in followup opts, but unlike
SimplifyLocals this will operate across basic blocks.

To do this, we need to track not just local.set but also local.get in that pass. Then
in each basic block we can track the equivalent locals and pick from them.

I see a few dozen casts removed in the J2Wasm binary. Often stuff like this happens:

    y = cast(x);
    if (..) {
      foo(x); // this could use y
    }

We did not preserve the ordering of the fixed-size storage there.

These operations emit a completely different type from their input, so they must be
marked as roots, and not as things that flow values through them (because then we
would filter everything out, as the types are not compatible).

Fixes #5219

See: https://reviews.llvm.org/D125728

The binary parser was eagerly getting the name of memories to set the `memory`
field of data segments, but that meant that when the memory names were updated
later while parsing the names section, the data segment memory fields would
become out of date. Fix the issue by deferring the setting of the `memory` fields,
as we do for other parts of the IR that reference memories.

Also fix a segfault in the validator that was triggered by the reproducer for
this bug before the bug was fixed.

Fixes #5204.

This can help in rare cases in MVP wasm, say for the return value of a block. But for
wasm GC it is very important due to casts.

Similar logic was added as part of #5194 for SimplifyLocals (it should probably have
been a separate PR then); this does the right thing for RedundantSetElimination as a
separate PR. Full tests will appear in that later PR (it is not really possible to
test the GC side yet - we need the logic in the later PR that actually switches to a
more refined local index when available).

Adds C APIs to inspect compound struct, array and signature heap types.

Obtain field types, field packed types and field mutabilities of struct types:
- BinaryenStructTypeGetNumFields (to iterate)
- BinaryenStructTypeGetFieldType
- BinaryenStructTypeGetFieldPackedType
- BinaryenStructTypeIsFieldMutable

Obtain element type, element packed type and element mutability of array types:
- BinaryenArrayTypeGetElementType
- BinaryenArrayTypeGetElementPackedType
- BinaryenArrayTypeIsElementMutable

Obtain parameter and result types of signature types:
- BinaryenSignatureTypeGetParams
- BinaryenSignatureTypeGetResults
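
A sketch of how the struct accessors fit together, assuming the per-field accessors
take the heap type plus a field index:

    #include <stddef.h>
    #include <stdio.h>
    #include "binaryen-c.h"

    // Print the shape of a struct heap type using the new accessors.
    void printStructShape(BinaryenHeapType type) {
      BinaryenIndex num = BinaryenStructTypeGetNumFields(type);
      for (BinaryenIndex i = 0; i < num; i++) {
        BinaryenType field = BinaryenStructTypeGetFieldType(type, i);
        bool mut = BinaryenStructTypeIsFieldMutable(type, i);
        printf("field %u: type id %zu, mutable=%d\n",
               i, (size_t)field, (int)mut);
      }
    }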

We only checked whether the new type we prefer (when switching a local to a more
refined one in #5194) is different from the old type. But that check at the end
must also verify that it is a subtype.

Diff without whitespace is smaller.

This sorts globals by their usage (while respecting dependencies). If the module
has very many globals then using smaller LEBs can matter.

If there are fewer than 128 globals then we cannot reduce size, and the pass exits
early (so this pass will not slow down MVP builds, which usually have just one
global, the stack pointer). But with wasm GC it is common to use globals for
vtables etc., and often there is a very large number of them.
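
To see why the cutoff is 128, here is a quick sketch of unsigned LEB128 sizing -
each byte carries 7 payload bits, so indices 0..127 already take the minimal single
byte and reordering cannot help:

    #include <cstdint>

    // Number of bytes an unsigned LEB128-encoded value occupies.
    int lebSize(uint32_t value) {
      int bytes = 0;
      do {
        value >>= 7; // 7 payload bits per byte
        bytes++;
      } while (value != 0);
      return bytes;
    }
    // lebSize(127) == 1, lebSize(128) == 2, lebSize(16384) == 3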

Use a more refined local when possible (#5194):

    (local.set $refined (cast (local.get $plain)))
    ..
    .. (local.get $plain) .. ;; we can change this to read from $refined

By using the more refined type we may be able to eliminate casts later.

To do this, look at the fallthrough value (so we can look through a cast or a block
value - this is the reason for the small wasm2js improvements in tests), and also
extend the code that picks which local index to read to consider types (previously
we just ignored any pairs of locals with different types).

E.g.:

    Atomic operation (atomics are disabled)
    =>
    Atomic operations require threads [--enable-threads]

Adds a multi-memories lowering pass that creates a single combined memory from the
memories added to the module. This pass assumes that each memory is configured the
same way (type, shared).

This pass also:
- replaces existing memory.size instructions with a custom function that returns the
size of each memory as if it existed independently
- replaces existing memory.grow instructions with a custom function, using global
offsets to track the page size of each memory so data doesn't overlap in the single
combined memory
- adjusts the offsets of active data segments
- adjusts the offsets of loads and stores
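
As a conceptual sketch of the memory.size replacement (hypothetical names - the
pass emits wasm, not C++): if a global records each original memory's byte offset
within the combined memory, the independent size of a memory falls out of the
neighboring offsets:

    #include <cstdint>

    // Hypothetical globals maintained by the pass: byte offsets of each
    // original memory within the single combined memory.
    extern uint32_t memory_1_offset;
    extern uint32_t memory_2_offset;

    // What the custom memory.size function for memory 1 would compute:
    // its own size in 64 KiB wasm pages, independent of the other memories.
    uint32_t memory_1_size() {
      return (memory_2_offset - memory_1_offset) / 65536;
    }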

Previously the pass only pushed past an if or a br_if. This does the same but into an
if arm. With wasm GC, for example, this can perform allocation sinking:

    function foo() {
      x = new A();
      if (..) {
        use(x);
      }
    }

    =>

    function foo() {
      if (..) {
        x = new A(); // this moved
        use(x);
      }
    }

The allocation won't happen if we never enter the if. This helps wasm MVP too,
and in fact some existing tests benefit.

If a heap type only ever appears as the result of a read, we must include it in
the analysis in ModuleUtils, even though it isn't written in the binary format.
Otherwise, analyses using ModuleUtils can error when they do not find all types
in the list of types.

Fixes #5180

The fallthrough there is trickier because the value is evaluated before the condition.
Unlike other fallthroughs, the value is not last, so we need to check if the condition
(which is after it) interferes with it.

See #5188

This makes the logic symmetric and easier to read.

Measuring speed, this seems identical to before, so performance is not a concern.

Adds heap type utilities to the C API:
- BinaryenHeapTypeIsBasic
- BinaryenHeapTypeIsSignature
- BinaryenHeapTypeIsStruct
- BinaryenHeapTypeIsArray
- BinaryenHeapTypeIsSubType
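
A small usage sketch, assuming each predicate takes a BinaryenHeapType (and
BinaryenHeapTypeIsSubType takes two) and returns a bool:

    #include "binaryen-c.h"

    // Classify a heap type with the new predicates.
    const char* classify(BinaryenHeapType type) {
      if (BinaryenHeapTypeIsBasic(type)) return "basic";
      if (BinaryenHeapTypeIsSignature(type)) return "signature";
      if (BinaryenHeapTypeIsStruct(type)) return "struct";
      if (BinaryenHeapTypeIsArray(type)) return "array";
      return "other";
    }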

This is safe since we "partially remove" it: we don't move it to a place where it
might execute more, but make it possibly execute less. See the new comment for more
details.

Motivated by wasm GC, but this can help wasm MVP as well. In both cases loads from
memory can trap, which limits what the VM can do to optimize them past conditions,
but with trapsNeverHappen we can do that at the toolchain level:

    x = read();
    if (..) { .. }
    use(x);

    =>

    if (..) { .. }
    x = read(); // moved to here, and might not execute if the if did a break/return
    use(x);

Unlike in the legacy parser, we cannot depend on the folded text format to
determine how many values to return, so we determine that solely based on the
current function context.
To handle multivalue return correctly, fix a bug in which we could synthesize
new `unreachable`s and place them before existing unreachable instructions (like
returns) at the end of instruction sequences.

Poking through the git history, this was some kind of hack for a problem that appears
not to have existed for at least 5 years.

Logically, refinalize should never change a type from unreachable to none - or at
least, if we have a place that does this, we should manually handle the necessary
things around it, like updating the function's return type. The opposite, none (or
anything else) to unreachable, is the common case, where we use refinalize to
propagate that type upwards.

Fuzzing also finds no issues.

Doing so shortens the code by removing duplicate logic.
Also this will avoid a compile error in a future PR, as by inheriting from
Visitor we include functions like visitFunction which were otherwise
missing from OverriddenVisitor. We could duplicate those like we
duplicated the expression logic, but just removing all the duplication
seems best.
I manually verified OverriddenVisitor still provides the same error messages
as before.
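
A minimal sketch of the shape of the change, with simplified stand-in classes rather
than the real definitions:

    #include <cstdlib>

    template<typename SubType> struct Visitor {
      // Generic dispatch and non-expression hooks live here.
      void visitFunction() {}
    };

    // Before, OverriddenVisitor duplicated all of Visitor's logic; now it
    // inherits it, and only the expression handlers are re-declared so that
    // they fail loudly unless a subclass overrides them.
    template<typename SubType> struct OverriddenVisitor : Visitor<SubType> {
      void visitExpression() { abort(); } // "unimplemented" in the real code
    };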

ParseDefsCtx was the only client of the CRTP InstrParserCtx utility and the
separation between the two did not serve a real purpose. Simplify the code by
combining them.

Add parsing functions for `memarg`s, the offset and align fields of load and
store instructions. These fields are interesting because they are lexically
reserved words that need to be further parsed to extract their actual values. On
top of that, add support for parsing all of the load and store instructions.
This required fixing a buffer overflow problem in the generated parser code and
adding more information to the signatures of the SIMD load and store
instructions. `SIMDLoadStoreLane` instructions are particularly interesting
because they may require backtracking to parse correctly.
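
To illustrate the lexical wrinkle with a hypothetical helper (not the parser's
actual API): `offset=16` lexes as a single reserved word, so recovering the value
means splitting the word after the fact:

    #include <cstdint>
    #include <optional>
    #include <string_view>

    // Extract the integer from a reserved word like "offset=16" or "align=8".
    std::optional<uint64_t> parseMemargField(std::string_view word,
                                             std::string_view key) {
      if (word.size() <= key.size() + 1 || word.substr(0, key.size()) != key ||
          word[key.size()] != '=') {
        return std::nullopt;
      }
      uint64_t value = 0;
      for (char c : word.substr(key.size() + 1)) {
        if (c < '0' || c > '9') {
          return std::nullopt; // the real parser also handles hex, elided here
        }
        value = value * 10 + (c - '0');
      }
      return value;
    }
    // parseMemargField("offset=16", "offset") yields 16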

These are encoded as RefAs operations, and we have optimizations that assume those
trap on null, but Externalize/Internalize do not. Skip them there to avoid an error
about the type being incorrect later.

Since gen-s-parser.py is essentially a giant table mapping instruction names to
the information necessary to construct the corresponding IR nodes, there should
be no need to further parse instruction names after the code generated by
gen-s-parser.py runs. However, memory instruction parsing still parsed
instruction names to get information such as size and default alignment. The new
parser does not have the ability to parse that information out of instruction
names, so put it in the gen-s-parser.py table instead.

This wasn't noticed since we apparently only use module code scanning to find stuff
like function references atm (which can't be in a data segment). But newer passes will
need to scan everything (#5163).

Specifically, if a segment offset was a const, we checked that it made sense. But the
wasm spec doesn't do that, and it actually causes some issues (#5163).

In theory this extra validation might be useful - a compile-time error rather than a
runtime one - but if we want it, it should probably be optional, like an opt-in flag
or a --lint pass or such.

I believe all locations that create one already set it (or else we'd see errors), but it's not
easy to see that when reading the code. And other similar locations (like DataSegment)
do initialize to null, so do so for consistency.

Also add the ability to parse memory indexes to correctly handle the
multi-memory versions of these instructions. Add and use a conversion from
`Result` to `MaybeResult` as well.

Parse 32-bit and 64-bit memories, including their initial and max sizes. Shared
memories are left to a follow-up PR. The memory abbreviation that includes
inline data is parsed, but the associated data segment is not yet created. Also
do some minor simplifications in neighboring helper functions for other kinds of
module elements.

We already provided a specialization of `std::hash` for arbitrary pairs, so add
one for `std::tuple` as well. Use the new specialization where we were
previously using nested pairs just to be able to use the pair specialization.
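
A sketch of such a specialization, assuming a boost-style hash_combine (the actual
combiner used may differ):

    #include <cstddef>
    #include <functional>
    #include <tuple>

    static void hashCombine(std::size_t& seed, std::size_t value) {
      seed ^= value + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    }

    namespace std {
    template<typename... Ts> struct hash<tuple<Ts...>> {
      size_t operator()(const tuple<Ts...>& t) const {
        size_t seed = 0;
        // Fold each element's hash into the seed, left to right.
        apply([&](const Ts&... elems) {
          (hashCombine(seed, hash<Ts>{}(elems)), ...);
        }, t);
        return seed;
      }
    };
    } // namespace std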

`Push` expressions were removed in #2867, so we no longer need to make them.

When we read from a struct/array using a cone type, read from the types in the cone
and nothing else. Previously we used the declared type in the wasm, which might be
larger (both in the base type and in the depth). Likewise for writes.

To do this, extend ConeReadLocation with a depth (previously the depth there was
assumed to be infinite; now it can be limited).

After this we are fully utilizing cone types in GUFA, as the test changes show (or at
least I can't think of any other uses of cones).