When IRBuilder builds an empty non-block scope such as a function body,
an if arm, a try block, etc., it needs to produce some expression to
represent the empty contents. Previously it produced a nop; change it to
produce an empty block instead. The binary writer and printer have
special logic to elide empty blocks, so this produces smaller output.
Update J2CLOpts to recognize functions containing empty blocks as
trivial to avoid regressing one of its tests.
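For context, the size win comes from the elision rule: an unnamed, empty block can be skipped entirely when writing a scope's contents, while a nop always costs an opcode. A minimal standalone sketch of that decision, using illustrative types rather than Binaryen's actual Expression classes:

```cpp
#include <string>
#include <vector>

// Illustrative stand-in for an IR node; not Binaryen's Expression hierarchy.
struct Expr {
  std::string kind;            // e.g. "block", "nop", "call"
  std::string label;           // block label, if any
  std::vector<Expr*> children; // block contents
};

// The rule the size win relies on: an unnamed block with no contents can be
// skipped entirely when emitting a scope's body, whereas a nop always costs
// an opcode, so representing empty scopes as empty blocks shrinks the output.
bool canElideScopeContents(const Expr& contents) {
  return contents.kind == "block" && contents.label.empty() &&
         contents.children.empty();
}
```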
`ModuleUtils::copyTable` was not copying the `indexType` property.
Support 5-segment source mappings, which add a name.
Reference:
https://github.com/tc39/source-map/blob/main/source-map-rev3.md#proposed-format
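For reference, each segment in a source map's `mappings` field is a run of base64-VLQ, delta-encoded numbers, and the optional fifth number is an index into the `names` array, which is what 5-segment support adds. A small decoding sketch to show the field layout (an illustration of the format, not Binaryen's actual source map reader):

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Decode one base64-VLQ segment of a source map "mappings" string into its
// numeric fields. Assumes well-formed input. A segment has 1, 4, or 5
// fields, all delta-encoded:
//   [generatedColumn, sourceIndex, sourceLine, sourceColumn, nameIndex?]
std::vector<int32_t> decodeSegment(const std::string& segment) {
  static const std::string kBase64 =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::vector<int32_t> fields;
  int32_t value = 0;
  int shift = 0;
  for (char c : segment) {
    int32_t digit = int32_t(kBase64.find(c));
    value |= (digit & 31) << shift;   // low 5 bits carry data
    if (digit & 32) {
      shift += 5;                     // continuation bit: more digits follow
    } else {
      int32_t magnitude = value >> 1; // low bit of the value is the sign
      fields.push_back((value & 1) ? -magnitude : magnitude);
      value = 0;
      shift = 0;
    }
  }
  return fields;
}
```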
Rec groups need to be topologically sorted for the output module to be
valid, but the specific order of rec groups also affects the module size
because types at lower indices require fewer bytes to reference. We
previously optimized for code size when gathering types by sorting the
list of groups before doing the topological sort. This was brittle,
though, and depended on implementation details of the topological sort
to be correct.
Replace the old topological sort with use of the new
`TopologicalSort::minSort` utility, which is a more principled method of
achieving a minimal topological sort with respect to some comparator.
Also draw inspiration from ReorderGlobals and apply an exponential
factor to take the users of a rec group into account when determining
its weight.
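As an illustration of the approach (not the actual `TopologicalSort::minSort` implementation), a comparator-driven minimal topological sort can be written as Kahn's algorithm with a priority queue; the `weight` values here stand in for the use-count-plus-exponential-factor heuristic described above:

```cpp
#include <cstddef>
#include <queue>
#include <vector>

// deps[i] lists the items that i depends on; heavier items are emitted
// earlier, so heavily referenced items receive the cheaper low indices.
std::vector<size_t> minTopologicalSort(
    const std::vector<std::vector<size_t>>& deps,
    const std::vector<double>& weight) {
  size_t n = deps.size();
  std::vector<size_t> remaining(n);
  std::vector<std::vector<size_t>> users(n);
  for (size_t i = 0; i < n; ++i) {
    remaining[i] = deps[i].size();
    for (size_t d : deps[i]) {
      users[d].push_back(i);
    }
  }
  // The priority queue surfaces the heaviest "ready" item first, i.e. the
  // heaviest item whose dependencies have all been emitted already.
  auto lighter = [&](size_t a, size_t b) { return weight[a] < weight[b]; };
  std::priority_queue<size_t, std::vector<size_t>, decltype(lighter)> ready(lighter);
  for (size_t i = 0; i < n; ++i) {
    if (remaining[i] == 0) {
      ready.push(i);
    }
  }
  std::vector<size_t> order;
  while (!ready.empty()) {
    size_t next = ready.top();
    ready.pop();
    order.push_back(next);
    // Emitting `next` may make some of its users ready.
    for (size_t user : users[next]) {
      if (--remaining[user] == 0) {
        ready.push(user);
      }
    }
  }
  return order;
}
```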
* Add interpreter support for exnref values.
* Fix optimization passes to support try_table.
* Enable the interpreter (but not in V8, see code) on exceptions.
The old ordering in that pass did a topological sort while sorting by uses
both within topological groups and between them. That could be suboptimal
in some cases, however, and on J2CL output this pass actually made the
binary larger, which is how we noticed this.
The problem is that such a topological sort keeps topological groups in
place, but it can be useful to interleave them sometimes. Imagine this:
  $c - $a
 /
$e
 \
  $d - $b
Here $e depends on $c, etc. The optimal order may interleave the two
arms here, e.g. $a, $b, $d, $c, $e. That is because the dependencies define
a partial order, and so the arms here are actually independent.
Sorting by topological depth first might help in some cases, but is also not
optimal in general, as we may want to mix topological depths:
$a, $c, $b, $d, $e does so, and it may be the best ordering.
This PR implements a natural greedy algorithm that picks the global with
the highest use count at each step, out of the set of possible globals, which
is the set of globals that have no unresolved dependencies. So we start by
picking the first global with no dependencies and adding it at the front; then
that unlocks anything that depended on it and we pick from that set, and
so forth.
This may also not be optimal, but it is easy to make it more flexible by
customizing the counts, and we consider 4 sorts here:
* Set all counts to 0. This means we only take into account dependencies,
and we break ties by the original order, so this is as close to the original
order as we can be.
* Use the actual use counts. This is the simple greedy algorithm.
* Set the count of each global to also contain the counts of its children,
so the count is the total that might be unlocked. This gives more weight
to globals that can unlock more later, so it is less greedy.
* Like the last, but weight children's counts lower in an exponential way, which
makes sense as they may depend on other globals too.
In practice it is simple to generate cases where 1, 2, or 3 is optimal (see
new tests), but on real-world J2CL I see that 4 (with a particular exponential
coefficient) is best, so the pass computes all 4 and picks the best. As a
result it will never worsen the size and it has a good chance of
improving.
The differences between these are small, so in theory we could pick any
of them, but given they are all modifications of a single algorithm it is
very easy to compute them all with little code complexity.
The benefits are rather small here, but this can save a few hundred
bytes on a multi-MB Java file. This comes at a tiny compile time cost, but
seems worth it for the new guarantee to never regress size.
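To make the fourth weighting concrete, here is a hedged sketch under the assumption that the globals' dependency graph is acyclic (which holds for wasm globals): each global's count is its own use count plus the exponentially discounted counts of the globals it unlocks. The names and the 0.5 coefficient are illustrative only; the pass itself tries several weightings and keeps whichever yields the smallest output.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// weight[g] = useCounts[g] + factor * sum of weight[d] over the globals d
// that directly depend on g (the globals that emitting g helps unlock).
std::vector<double> exponentialWeights(
    const std::vector<std::vector<size_t>>& dependents, // g -> its direct users
    const std::vector<size_t>& useCounts,
    double factor = 0.5) {                               // illustrative value
  size_t n = useCounts.size();
  std::vector<double> weight(n, -1.0);
  std::function<double(size_t)> compute = [&](size_t g) -> double {
    if (weight[g] >= 0) {
      return weight[g]; // already computed
    }
    double w = double(useCounts[g]);
    for (size_t user : dependents[g]) {
      w += factor * compute(user); // discounted: users may wait on others too
    }
    return weight[g] = w;
  };
  for (size_t g = 0; g < n; ++g) {
    compute(g);
  }
  return weight;
}
```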
The new text parser is faster and more standards-compliant than the old text
parser. Enable it by default in wasm-opt and update the tests to reflect the
slightly different results it produces. Besides following the spec, the new
parser differs from the old parser in that it:
- Does not synthesize `loop` and `try` labels unnecessarily
- Synthesizes different block names in some cases
- Parses exports in a different order
- Parses `nop`s instead of empty blocks for empty control flow arms
- Does not support parsing Poppy IR
- Produces different error messages
- Cannot parse `pop` except as the first instruction inside a `catch`
- Only write explicit function names.
- When merging modules, the names of types, globals, and tags in all
modules but the first were lost.
- Set name as explicit when copying a function with a new name.
The new parser, not yet enabled, requires that imports precede declared module
items except for types.
The checked in test outputs were out of sync with what the auto update script
produces.
Get as many of the lit tests as possible to parse with the new parser, mostly by
moving declared module items to be after imports. Also fix a bug in the new
parser's pop validation to allow supertypes of the expected type.
The two big issues that still prevent some lit tests from working correctly
under the new parser are missing support for symbolic field names and missing
support for source map annotations.
If the first module has a global that reads from a global that appears in a later
module, then we need to reorder the globals, because if we just append the
globals from the later module we'd end up with a global reading from another
that is not before it.
Changes to the existing renamings test are just due to the global sorting
pass that now runs (it not only fixes up validation errors but also tries to sort
in a more optimal order for size).
Fixes #6220
The new wat parser is much more strict than the legacy wat parser; the latter
accepts all sorts of things that the spec does not allow. To ease an eventual
transition to using the new wat parser by default, update the tests to use the
standard text format in many places where they previously did not. We do not yet
have a way to prevent new errors from being introduced into the test suite, but
at least there will now be many fewer errors when it comes time to make the
switch.
We previously supported a non-standard `(func "name" ...` syntax for declaring
functions exported with the quoted name. Since that is not part of the standard
text format, drop support for it, replacing it with the standard `(func $name
(export "name") ...` syntax instead.
Also replace our other usage of the quoted form in our text output, which was
where we quoted names containing characters that are not allowed to appear in
standard names. To handle that case, adjust our output from `"$name"` to
`$"name"`, which is the standards-track way of supporting such names. Also fix
how we detect non-standard name characters to match the spec.
Update the lit test output generation script to account for these changes,
including by making the `$` prefix on names mandatory. This causes the script to
stop interpreting declarative element segments with the `(elem declare ...`
syntax as being named "declare", so prevent our generated output from regressing
by counting "declare" as a name in the script.
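The relevant rule from the spec's text format grammar: a name can be printed bare as `$name` only if every character is an `idchar` (ASCII letters, digits, and a fixed set of punctuation); otherwise it must use the quoted `$"name"` form. A hedged sketch of that check, based on the grammar rather than on Binaryen's actual helper:

```cpp
#include <string>

// Characters allowed in a bare text-format identifier ($ idchar*): ASCII
// letters and digits plus a fixed set of punctuation, per the spec grammar.
bool isIdChar(char c) {
  if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') ||
      (c >= 'a' && c <= 'z')) {
    return true;
  }
  static const std::string punctuation = "!#$%&'*+-./:<=>?@\\^_`|~";
  return punctuation.find(c) != std::string::npos;
}

// A nonempty name made only of idchars can be printed as $name; anything
// else needs the quoted $"name" form (with string escaping as needed).
bool needsQuotes(const std::string& name) {
  if (name.empty()) {
    return true;
  }
  for (char c : name) {
    if (!isIdChar(c)) {
      return true;
    }
  }
  return false;
}
```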
That optimization uncovered some LLVM and Binaryen bugs.
If we export a function that just calls another function, we can export that one
instead. Then the one in the middle may be unused:
  function foo() {
    return bar();
  }
  export foo; // can be an export of bar
This saves a few bytes in rare cases, but probably more important is that it saves
the trampoline, so if this is on a hot path, we save a call.
Context: emscripten-core/emscripten#20478 (comment)
In general this is not needed as inlining helps us out by inlining foo() into the
caller (since foo is tiny, that always ends up happening). But exports are a case
the inliner cannot handle, so we do it here.
This PR is part of a series that adds basic support for the typed continuations proposal.
This PR relaxes the restriction that tags must not have results, only params. Tags with
results must not be used for exception handling and are only allowed if the typed
continuations feature is enabled.
As a minor point, this PR also changes the printing of tags without params: To make the
presentation consistent, (param) is omitted when printing a tag.
When printing Binaryen IR, we previously generated names for unnamed heap types
based on their structure. This was useful for seeing the structure of simple
types at a glance without having to separately go look up their definitions, but
it also had two problems:
1. The same name could be generated for multiple types. The generated names did
not take into account rec group structure or finality, so types that differed
only in these properties would have the same name. Also, generated type names
were limited in length, so very large types that shared only some structure
could also end up with the same names. Using the same name for multiple types
produces incorrect and unparsable output.
2. The generated names were not useful beyond the most trivial examples. Even
with length limits, names for nontrivial types were extremely long and visually
noisy, which made reading disassembled real-world code more challenging.
Fix these problems by emitting simple indexed names for unnamed heap types
instead. This regresses readability for very simple examples, but the trade-off
is worth it.
This change also reduces the number of type printing systems we have by one.
Previously we had the system in Print.cpp, but we had another, more general and
extensible system in wasm-type-printing.h and wasm-type.cpp as well. Remove the
old type printing system from Print.cpp and replace it with a much smaller use
of the new system. This requires significant refactoring of Print.cpp so that
the PrintExpressionContents object now holds a reference to a parent
PrintSExpression object that holds the type name state.
This diff is very large because almost every test output changed slightly. To
minimize the diff and ease review, change the type printer in wasm-type.cpp to
behave the same as the old type printer in Print.cpp except for the differences
in name generation. These changes will be reverted in much smaller PRs in the
future to generally improve how types are printed.
Start functions can have locals, which we previously ignored as we just
concatenated the bodies together. This makes us copy the second start
and call that, keeping them separate (the optimizer can then inline, if that
makes sense).
Fixes #5835
When a module item is imported and directly reexported by an
intermediate module, we need to perform several name lookups and use its
name in the initial module rather than the intermediate name when fusing
imports and exports.
The import information of Tags and Memories was not preserved.
The function type should be printed there just like for non-imported
functions.
We used to have a wasm-merge tool but removed it for a lack of use cases. Recently
use cases have been showing up in the wasm GC space and elsewhere, as people are
using more diverse toolchains together, for example a project might build some C++
code alongside some wasm GC code. Merging those wasm files together can allow
for nice optimizations like inlining and better DCE etc., so it makes sense to have a
tool for merging.
Background:
* Removal: #1969
* Requests:
  * wasm-merge - why it has been deleted #2174
  * Compiling and linking wat files #2276
  * wasm-link? #2767
This PR is a complete rewrite of wasm-merge, not a restoration of the original
codebase. The original code was quite messy (my fault), and also, since then
we've added multi-memory and multi-table which makes things a lot simpler.
The linking semantics are as described in the "wasm-link" issue #2767: all we do
is merge normal wasm files together and connect imports and exports. That is, we
have a graph of modules and their names, and each import to a module name can
be resolved to that module. Basically, this is what a JS bundler would do for JS;
in other words, we do the same operations that JS code would do to glue wasm
modules together at runtime, but at compile time. See the README update in this PR for a
concrete example.
There are no plans to do more than that simple bundling, so this should not
really overlap with wasm-ld's use cases.
This should be fairly fast as it works in linear time on the total input code. However,
it won't be as fast as wasm-ld, of course, as it does build Binaryen IR for each
module. An advantage to working on Binaryen IR is that we can easily do some
global DCE after merging, and further optimizations are possible later.
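As a toy sketch of the resolution step described above (not the real wasm-merge code): each input module is registered under a name, its exports are recorded, and any import whose module field matches a registered name is fused with the corresponding export; everything else remains an import of the merged module. The types and names below are hypothetical.

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>

// Map from (module name, export name) to the internal name of the exported
// item after that module has been merged in. Hypothetical structure.
using ExportMap = std::map<std::pair<std::string, std::string>, std::string>;

// Resolve an import against the registered modules. Returns the internal
// name to use instead of the import, or nullopt to keep it as an import of
// the final merged module.
std::optional<std::string> resolveImport(const ExportMap& exports,
                                         const std::string& importModule,
                                         const std::string& importBase) {
  auto it = exports.find({importModule, importBase});
  if (it == exports.end()) {
    return std::nullopt;
  }
  return it->second;
}
```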