| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
IRBuilder is a utility for turning arbitrary valid streams of Wasm
instructions into valid Binaryen IR. It is already used in the text
parser, so now use it in the binary parser as well. Since the IRBuilder
API for building each intruction requires only the information that the
binary and text formats include as immediates to that instruction, the
parser is now much simpler than before. In particular, it does not need
to manage a stack of instructions to figure out what the children of
each expression should be; IRBuilder handles this instead.
There are some differences between the IR constructed by IRBuilder and
the IR the binary parser constructed before this change. Most
importantly, IRBuilder generates better multivalue code because it
avoids eagerly breaking up multivalue results into individual components
that might need to be immediately reassembled into a tuple. It also
parses try-delegate more correctly, allowing the delegate to target
arbitrary labels, not just other `try`s. There are also a couple
superficial differences in the generated label and scratch local names.
As part of this change, add support for recording binary source
locations in IRBuilder.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When printing Binaryen IR, we previously generated names for unnamed heap types
based on their structure. This was useful for seeing the structure of simple
types at a glance without having to separately go look up their definitions, but
it also had two problems:
1. The same name could be generated for multiple types. The generated names did
not take into account rec group structure or finality, so types that differed
only in these properties would have the same name. Also, generated type names
were limited in length, so very large types that shared only some structure
could also end up with the same names. Using the same name for multiple types
produces incorrect and unparsable output.
2. The generated names were not useful beyond the most trivial examples. Even
with length limits, names for nontrivial types were extremely long and visually
noisy, which made reading disassembled real-world code more challenging.
Fix these problems by emitting simple indexed names for unnamed heap types
instead. This regresses readability for very simple examples, but the trade off
is worth it.
This change also reduces the number of type printing systems we have by one.
Previously we had the system in Print.cpp, but we had another, more general and
extensible system in wasm-type-printing.h and wasm-type.cpp as well. Remove the
old type printing system from Print.cpp and replace it with a much smaller use
of the new system. This requires significant refactoring of Print.cpp so that
PrintExpressionContents object now holds a reference to a parent
PrintSExpression object that holds the type name state.
This diff is very large because almost every test output changed slightly. To
minimize the diff and ease review, change the type printer in wasm-type.cpp to
behave the same as the old type printer in Print.cpp except for the differences
in name generation. These changes will be reverted in much smaller PRs in the
future to generally improve how types are printed.
|
|
|
|
|
|
|
|
|
|
| |
We already did this if the block was a child of a control flow structure, which is
the common case (see the new added comment around that code, which clarifies
why). This does the same for all other blocks. This is simple to do and a minor
optimization, but the main benefit from this is just to make our handling of blocks
uniform: after this, we never emit a block with no name. This will make 1a non-
nullable locals easier to handle (since they will be able to assume that property;
and not emitting such blocks avoids some work to handle non-nullable locals
in them).
|
|
|
|
| |
Fixes: #3226
|
|
|
| |
Implements the parts of the Extended Name Section Proposal that are trivially applicable to Binaryen, in particular table, memory and global names. Does not yet implement label, type, elem and data names.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BinaryenIRWriter was previously inconsistent about whether or not it
emitted an instruction if that instruction was not reachable.
Instructions that produced values were not emitted if they were
unreachable, but instructions that did not produce values were always
emitted. Additionally, blocks continued to emit their children even
after emitting an unreachable child.
Since it was not possible to tell whether an unreachable instruction's
parent would be emitted, BinaryenIRWriter had to be very defensive and
emit many extra `unreachable` instructions around unreachable code to
avoid type errors.
This PR unifies the logic for emitting all non-control flow
instructions and changes the behavior of BinaryenIRWriter so that it
never emits instructions that cannot be reached due to having
unreachable children. This means that extra `unreachable` instructions
now only need to be emitted after unreachable control flow
constructs. BinaryenIRWriter now also stops emitting instructions
inside blocks after the first unreachable instruction as an extra
optimization.
This change will also simplify Poppy IR stackification (see #3059) by
guaranteeing that instructions with unreachable children will not be
emitted into the stackifier. This makes satisfying the Poppy IR rule
against unreachable Pops trivial, whereas previously satisfying this
rule would have required about about 700 additional lines of code to
recompute the types of all unreachable children for any instruction.
|
|
|
|
|
|
|
|
| |
`BinaryIndexes` was only used in two places (Print.cpp and
wasm-binary.h), so it didn't seem to be a great fit for
module-utils.h. This change moves it to wasm-binary.h and removes its
usage in Print.cpp. This means that function indexes are no longer
printed, but those were of limited utility and were the source of
annoying noise when updating tests, anyway.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Function signatures were previously redundantly stored on Function
objects as well as on FunctionType objects. These two signature
representations had to always be kept in sync, which was error-prone
and needlessly complex. This PR takes advantage of the new ability of
Type to represent multiple value types by consolidating function
signatures as a pair of Types (params and results) stored on the
Function object.
Since there are no longer module-global named function types,
significant changes had to be made to the printing and emitting of
function types, as well as their parsing and manipulation in various
passes.
The C and JS APIs and their tests also had to be updated to remove
named function types.
|
|
|
|
|
|
|
|
|
| |
This is the start of a larger refactoring to remove FunctionType entirely and
store types and signatures directly on the entities that use them. This PR
updates BrOnExn and Events to remove their use of FunctionType and makes the
BinaryWriter traverse the module and collect types rather than using the global
FunctionType list. While we are collecting types, we also sort them by frequency
as an optimization. Remaining uses of FunctionType in Function, CallIndirect,
and parsing will be removed in a future PR.
|
|
|
|
|
|
| |
Automated renaming according to
https://github.com/WebAssembly/spec/issues/884#issuecomment-426433329.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the relooper would do some optimizations when deciding when to use an if vs a switch, how to group blocks, etc. This PR adds an additional pre-optimization phase with some basic but useful simplify-cfg style passes,
* Skip empty blocks when they have just one exit.
* Merge exiting branches when they are equivalent.
* Canonicalize block contents to make such comparisons more useful.
* Turn a trivial one-target switch into a simple branch.
This can help in noticeable ways when running the rereloop pass, e.g. on LLVM wasm backend output.
Also:
* Binaryen C API changes to the relooper, which now gets a Module for its constructor. It needs it for the optimizations, as it may construct new nodes.
* Many relooper-fuzzer improvements.
* Clean up HashType usage.
|
|
|
|
|
|
| |
* Error if there are more locals than browsers allow (50,000). We usually just warn about stuff like this, but we do need some limit (or else we hang or OOM), and if so, why not use the agreed-upon Web limit.
* Do not generate nice string names for locals in binary parsing - the name is just $var$x instead of $x, so not much benefit, and worse as our names are interned this is actually slow (which is why the fuzz testcase here hangs instead of OOMing).
Testcases and bugreport in #1663.
|
| |
|
|
|
|
|
|
| |
* don't emit a toplevel block if we don't need to, as in wasm it is a list context
* don't create unnecessary blocks in wasm reading
|
|
|
| |
Ignoring unreachable code in wasm binaries lets us avoid corner cases with unstructured code in wasm binaries that is a poor fit for Binaryen's structured IR.
|
|
|
|
| |
* emit optimal-size LEBs in section/subsection/function body sizes, instead of preallocating 5 bytes
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See https://github.com/WebAssembly/binaryen/issues/914.
* extensible name section support: read function names, too
* c-api-unused-mem.txt: change expected size to match new name section
* * check subsection size matches
* print warning for unknown name subsections (including the local
section)
|
|
|
|
| |
unruly (#928)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* an unreachable block is one with an unreachable child, plus no breaks
* document new difference between binaryen IR and wasm
* fix relooper missing finalize
* add a bunch of tests
* don't assume that test/*.wast files print to themselves exactly; print to from.wast. this allows wast tests with comments in them
* emit unreachable blocks as (block .. unreachable) unreachable
* if without else and unreachable ifTrue is still not unreachable, it should be none
* update wasm.js
* cleanups
* empty blocks have none type
|
|
|