| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
When IRBuilder builds an empty non-block scope such as a function body,
an if arm, a try block, etc, it needs to produce some expression to
represent the empty contents. Previously it produced a nop, but change
it to produce an empty block instead. The binary writer and printer have
special logic to elide empty blocks, so this produces smaller output.
Update J2CLOpts to recognize functions containing empty blocks as
trivial to avoid regressing one of its tests.
|
|
|
| |
Fixes #6690
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
#6587 was incorrect: It checked generativity early in an incremental manner, but
it did not accumulate that information as we do with hashes. As a result we
could end up optimizing something with a generative child, and sadly we lacked
testing for that case.
This adds incremental generativity computation alongside hashes. It also splits
out this check from isRelevant.
Also add a test for nested effects (as opposed to generativity), but that already
worked before this PR (as we compute effects and invalidation as we go, already).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Given:
(ORIGINAL)
(in between)
(COPY)
We want to change that to
(local.tee $temp (ORIGINAL))
(in between)
(local.get $temp)
It is fine if "in between" traps: then we never reach the new local.get. This is a safer
situation than most optimizations because we are not reordering anything, only
replacing known-equivalent code.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When printing Binaryen IR, we previously generated names for unnamed heap types
based on their structure. This was useful for seeing the structure of simple
types at a glance without having to separately go look up their definitions, but
it also had two problems:
1. The same name could be generated for multiple types. The generated names did
not take into account rec group structure or finality, so types that differed
only in these properties would have the same name. Also, generated type names
were limited in length, so very large types that shared only some structure
could also end up with the same names. Using the same name for multiple types
produces incorrect and unparsable output.
2. The generated names were not useful beyond the most trivial examples. Even
with length limits, names for nontrivial types were extremely long and visually
noisy, which made reading disassembled real-world code more challenging.
Fix these problems by emitting simple indexed names for unnamed heap types
instead. This regresses readability for very simple examples, but the trade off
is worth it.
This change also reduces the number of type printing systems we have by one.
Previously we had the system in Print.cpp, but we had another, more general and
extensible system in wasm-type-printing.h and wasm-type.cpp as well. Remove the
old type printing system from Print.cpp and replace it with a much smaller use
of the new system. This requires significant refactoring of Print.cpp so that
PrintExpressionContents object now holds a reference to a parent
PrintSExpression object that holds the type name state.
This diff is very large because almost every test output changed slightly. To
minimize the diff and ease review, change the type printer in wasm-type.cpp to
behave the same as the old type printer in Print.cpp except for the differences
in name generation. These changes will be reverted in much smaller PRs in the
future to generally improve how types are printed.
|
|
|
|
|
|
|
|
|
|
| |
This makes Binaryen's default type system match the WasmGC spec.
Update the way type definitions without supertypes are printed to reduce the
output diff for MVP tests that do not involve WasmGC. Also port some
type-builder.cpp tests from test/example to test/gtest since they needed to be
rewritten to work with isorecursive type anyway.
A follow-on PR will remove equirecursive types completely.
|
|
|
|
|
| |
(#5266)
This reverts commit 570007dbecf86db5ddba8d303896d841fc2b2d27.
|
|
|
|
|
| |
This reverts commit b2054b72b7daa89b7ad161c0693befad06a20c90.
It looks like the necessary V8 change has not rolled out everywhere yet.
|
|
|
|
| |
They were optional for a while to allow users to gracefully transition to using
them, but now make them mandatory to match the upstream WasmGC spec.
|
|
|
|
|
|
|
| |
Emit call_ref instructions with type annotations and a temporary opcode. Also
implement support for parsing optional type annotations on call_ref in the text
and binary formats. This is part of a multi-part graceful update to switch
Binaryen and all of its users over to using the type-annotated version of
call_ref without there being any breakage.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An overview of this is in the README in the diff here (conveniently, it is near the
top of the diff). Basically, we fix up nn locals after each pass, by default. This keeps
things easy to reason about - what validates is what is valid wasm - but there are
some minor nuances as mentioned there, in particular, we ignore nameless blocks
(which are commonly added by various passes; ignoring them means we can keep
more locals non-nullable).
The key addition here is LocalStructuralDominance which checks which local
indexes have the "structural dominance" property of 1a, that is, that each get has
a set in its block or an outer block that precedes it. I optimized that function quite
a lot to reduce the overhead of running that logic after each pass. The overhead
is something like 2% on J2Wasm and 0% on Dart (0%, because in this mode we
shrink code size, so there is less work actually, and it balances out).
Since we run fixups after each pass, this PR removes logic to manually call the
fixup code from various places we used to call it (like eh-utils and various passes).
Various passes are now marked as requiresNonNullableLocalFixups => false.
That lets us skip running the fixups after them, which we normally do automatically.
This helps avoid overhead. Most passes still need the fixups, though - any pass
that adds a local, or a named block, or moves code around, likely does.
This removes a hack in SimplifyLocals that is no longer needed. Before we
worked to avoid moving a set into a try, as it might not validate. Now, we just do it
and let fixups happen automatically if they need to: in the common code they
probably don't, so the extra complexity seems not worth it.
Also removes a hack from StackIR. That hack tried to avoid roundtrip adding a
nondefaultable local. But we have the logic to fix that up now, and opts will
likely keep it non-nullable as well.
Various tests end up updated here because now a local can be non-nullable -
previous fixups are no longer needed.
Note that this doesn't remove the gc-nn-locals feature. That has been useful for
testing, and may still be useful in the future - it basically just allows nn locals in
all positions (that can't read the null default value at the entry). We can consider
removing it separately.
Fixes #4824
|
|
|
| |
I was reading these tests and failing to find the names script.
|
|
|
|
|
|
|
| |
RTTs were removed from the GC spec and if they are added back in in the future,
they will be heap types rather than value types as in our implementation.
Updating our implementation to have RTTs be heap types would have been more work
than deleting them for questionable benefit since we don't know how long it will
be before they are specced again.
|
|
|
|
|
|
|
|
|
|
|
| |
This allows common patterns in J2CL to be optimized, where we write
to various array indices and get the values or the reference from a
struct.
It would be nice to do even better here, and look at actually specific
types, but I think we should be careful to keep the runtime constant.
That seems hard to do if we accumulate a list of types and do
Type::isSubType on them etc. But maybe someone has a better
idea than this PR?
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we replace
A
A
A
with
(local.set A)
(local.get)
(local.get)
then it is ok for A to trap (so long as it does so deterministically), as if
it does trap then the first appearance will do so, and the others not
be reached anyhow.
This helps GC code as often there are repeated struct.gets and such that
may trap.
|
|
Technically this is not a new pass, but it is a rewrite almost from scratch.
Local Common Subexpression Elimination looks for repeated patterns,
stuff like this:
x = (a + b) + c
y = a + b
=>
temp = a + b
x = temp + c
y = temp
The old pass worked on flat IR, which is inefficient, and was overly
complicated because of that. The new pass uses a new algorithm that
I think is pretty simple, see the detailed comment at the top.
This keeps the pass enabled only in -O4, like before - right after
flattening the IR. That is to make this as minimal a change as possible.
Followups will enable the pass in the main pipeline, that is, we will
finally be able to run it by default. (Note that to make the pass work
well after flatten, an extra simplify-locals is added - the old pass used
to do part of simplify-locals internally, which was one source of
complexity. Even so, some of the -O4 tests have changes, due to
minor factors - they are just minor orderings etc., which can be
seen by inspecting the outputs before and after using e.g.
--metrics)
This plus some followup work leads to large wins on wasm GC output.
On j2cl there is a common pattern of repeated struct.gets, so common
that this pass removes 85% of all struct.gets, which makes the total
binary 15% smaller. However, on LLVM-emitted code the benefit is
minor, less than 1%.
|