| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
The only internal use was in wasm2js, which doesn't need it. Fix API
tests to explicitly drop expressions as necessary.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Poking in the git history, this was some kind of hack for a problem that appears to
no longer exist since at least 5 years ago.
Logically, refinalize should never change a type from unreachable to none, or at
least if we have a place that does this, we should manually do the necessary things
around that, like updating the function's return type. The opposite, none (or anything
else) to unreachable is the common case, where we use refinalize to propagate
that type upwards.
Fuzzing also finds no issues.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously only WalkerPasses had access to the `getPassRunner` and
`getPassOptions` methods. Move those methods to `Pass` so all passes can use
them. As a result, the `PassRunner` passed to `Pass::run` and
`Pass::runOnFunction` is no longer necessary, so remove it.
Also update `Pass::create` to return a unique_ptr, which is more efficient than
having it return a raw pointer only to have the `PassRunner` wrap that raw
pointer in a `unique_ptr`.
Delete the unused template `PassRunner::getLast()`, which looks like it was
intended to enable retrieving previous analyses and has been in the code base
since 2015 but is not implemented anywhere.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An overview of this is in the README in the diff here (conveniently, it is near the
top of the diff). Basically, we fix up nn locals after each pass, by default. This keeps
things easy to reason about - what validates is what is valid wasm - but there are
some minor nuances as mentioned there, in particular, we ignore nameless blocks
(which are commonly added by various passes; ignoring them means we can keep
more locals non-nullable).
The key addition here is LocalStructuralDominance which checks which local
indexes have the "structural dominance" property of 1a, that is, that each get has
a set in its block or an outer block that precedes it. I optimized that function quite
a lot to reduce the overhead of running that logic after each pass. The overhead
is something like 2% on J2Wasm and 0% on Dart (0%, because in this mode we
shrink code size, so there is less work actually, and it balances out).
Since we run fixups after each pass, this PR removes logic to manually call the
fixup code from various places we used to call it (like eh-utils and various passes).
Various passes are now marked as requiresNonNullableLocalFixups => false.
That lets us skip running the fixups after them, which we normally do automatically.
This helps avoid overhead. Most passes still need the fixups, though - any pass
that adds a local, or a named block, or moves code around, likely does.
This removes a hack in SimplifyLocals that is no longer needed. Before we
worked to avoid moving a set into a try, as it might not validate. Now, we just do it
and let fixups happen automatically if they need to: in the common code they
probably don't, so the extra complexity seems not worth it.
Also removes a hack from StackIR. That hack tried to avoid roundtrip adding a
nondefaultable local. But we have the logic to fix that up now, and opts will
likely keep it non-nullable as well.
Various tests end up updated here because now a local can be non-nullable -
previous fixups are no longer needed.
Note that this doesn't remove the gc-nn-locals feature. That has been useful for
testing, and may still be useful in the future - it basically just allows nn locals in
all positions (that can't read the null default value at the entry). We can consider
removing it separately.
Fixes #4824
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Updating wasm.h/cpp for DataSegments
* Updating wasm-binary.h/cpp for DataSegments
* Removed link from Memory to DataSegments and updated module-utils, Metrics and wasm-traversal
* checking isPassive when copying data segments to know whether to construct the data segment with an offset or not
* Removing memory member var from DataSegment class as there is only one memory rn. Updated wasm-validator.cpp
* Updated wasm-interpreter
* First look at updating Passes
* Updated wasm-s-parser
* Updated files in src/ir
* Updating tools files
* Last pass on src files before building
* added visitDataSegment
* Fixing build errors
* Data segments need a name
* fixing var name
* ran clang-format
* Ensuring a name on DataSegment
* Ensuring more datasegments have names
* Adding explicit name support
* Fix fuzzing name
* Outputting data name in wasm binary only if explicit
* Checking temp dataSegments vector to validateBinary because it's the one with the segments before we processNames
* Pass on when data segment names are explicitly set
* Ran auto_update_tests.py and check.py, success all around
* Removed an errant semi-colon and corrected a counter. Everything still passes
* Linting
* Fixing processing memory names after parsed from binary
* Updating the test from the last fix
* Correcting error comment
* Impl kripken@ comments
* Impl tlively@ comments
* Updated tests that remove data print when == 0
* Ran clang format
* Impl tlively@ comments
* Ran clang-format
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Merge similar functions that only differs constant values (like immediate
operand of const and call insts) by parameterization.
Performing this pass at post-link time can merge more functions across
objects. Inspired by Swift compiler's optimization which is derived from
LLVM's one:
https://github.com/apple/swift/blob/main/lib/LLVMPasses/LLVMMergeFunctions.cpp
https://github.com/llvm/llvm-project/blob/main/llvm/docs/MergeFunctions.rst
The basic ideas here are constant value parameterization and direct callee
parameterization by indirection.
Constant value parameterization is like below:
;; Before
(func $big-const-42 (result i32)
[[many instr 1]]
(i32.const 44)
[[many instr 2]]
)
(func $big-const-43 (result i32)
[[many instr 1]]
(i32.const 45)
[[many instr 2]]
)
;; After
(func $byn$mgfn-shared$big-const-42 (result i32)
[[many instr 1]]
(local.get $0) ;; parameterized!!
[[many instr 2]]
)
(func $big-const-42 (result i32)
(call $byn$mgfn-shared$big-const-42
(i32.const 42)
)
)
(func $big-const-43 (result i32)
(call $byn$mgfn-shared$big-const-42
(i32.const 43)
)
)
Direct callee parameterization is similar to the constant value parameterization,
but it parameterizes callee function i by ref.func instead. Therefore it is enabled
only when reference-types and typed-function-references features are enabled.
I saw 1 ~ 2 % reduction for SwiftWasm binary and Ruby's wasm port
using wasi-sdk, and 3 ~ 4.5% reduction for Unity WebGL binary when -Oz.
|
|
|
|
|
|
| |
This adds and tests the new method. It will be used in a new pass later, where
computing shallow hashes allows it to be done in linear time.
99% of the diff is whitespace.
|
|
|
|
|
|
|
|
|
|
|
| |
We have seen some cases in both Chrome and Firefox where
extremely large modules cause overhead,
#3730 (comment) (and link therein)
emscripten-core/emscripten#13899 (comment)
There is no "right" value to use as a limit here, but pick an
arbitrary one that is very high. (This setting is verified to have
no effect on the emscripten benchmark suite.)
|
|
|
|
|
|
|
|
|
| |
When using nominal types, func.ref of two functions with identical signatures
but different HeapTypes will yield different types. To preserve these semantics,
Functions need to track their HeapTypes, not just their Signatures.
This PR replaces the Signature field in Function with a HeapType field and adds
new utility methods to make it almost as simple to update and query the function
HeapType as it was to update and query the Function Signature.
|
|
|
|
|
|
|
|
|
|
|
| |
We recently decided to change 'event' to 'tag', and to 'event section'
to 'tag section', out of the rationale that the section contains a
generalized tag that references a type, which may be used for something
other than exceptions, and the name 'event' can be confusing in the web
context.
See
- https://github.com/WebAssembly/exception-handling/issues/159#issuecomment-857910130
- https://github.com/WebAssembly/exception-handling/pull/161
|
|
|
|
|
|
|
|
|
|
|
| |
These files are special in that they use define symbols that are not
defined within those files or other files included in those files; they
are supposed to be defined in source files that include these headers.
This has caused clang-tidy to fail every time these files have changed
because they are not compilable per se.
This PR solves the problem by changing their extension to `def`, which
is also used in LLVM codebase. LLVM has dozens of files like this whose
extension is `def`, which makes these not checked by clang-tidy.
|
|
|
|
|
|
|
| |
This is a partial revert of #3669, which removed the old implementation of
Type::getLeastUpperBound that did not correctly handle recursive types. The new
implementation in this PR uses a TypeBuilder to construct LUBs and for recursive
types, it returns a temporary HeapType that has not yet been fully constructed
to break what would otherwise be infinite recursions.
|
|
|
|
|
|
|
|
| |
Since correct LUB calculation for recursive types is complicated, stop depending
on LUBs throughout the code base. This also fixes a validation bug in which the
validator required blocks to be typed with the LUB of all the branch types, when
in fact any upper bound should have been valid. In addition to fixing that bug,
this PR simplifies the code for break handling by not storing redundant
information about the arity of types.
|
|
|
|
|
|
|
|
|
|
|
| |
Passive element segments do not belong to any table, so the link between
Table and elem needs to be weaker; i.e. an elem may have a table in case
of active segments, or simply be a collection of function references in
case of passive/declarative segments.
This PR takes Table::Segment out and turns it into a first class module
element just like tables and functions. It also implements early support
for parsing, printing, encoding and decoding passive/declarative elem
segments.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This updates `try`-`catch`-`catch_all` and `rethrow` instructions to
match the new spec. `delegate` is not included. Now `Try` contains not a
single `catchBody` expression but a vector of catch
bodies and events.
This updates most existing routines, optimizations, and tests modulo the
interpreter and the CFG traversal. Because the interpreter has not been
updated yet, the EH spec test is temporarily disabled in check.py. Also,
because the CFG traversal for EH is not yet updated, several EH tests in
`rse_all-features.wast`, which uses CFG traversal, are temporarily
commented out.
Also added a few more tests in existing EH test functions in
test/passes. In the previous spec, `catch` was catching all exceptions
so it was assumed that anything `try` body throws is caught by its
`catch`, but now we can assume the same only if there is a `catch_all`.
Newly added tests test cases when there is a `catch_all` and cases there
are only `catch`es separately.
|
|
|
| |
Also avoid needing to #undef DELEGATE all the time.
|
|
|
|
|
|
|
| |
These instructions are proposed in https://github.com/WebAssembly/simd/pull/350.
This PR implements them throughout Binaryen except in the C/JS APIs and in the
fuzzer, where it leaves TODOs instead. Right now these instructions are just
being implemented for prototyping so adding them to the APIs isn't critical and
they aren't generally available to be fuzzed in Wasm engines.
|
|
|
| |
NFC, except adding most of the boilerplate for the remaining GC instructions. Each implementation site is marked with a respective `TODO (gc): theInstruction` in between the typical boilerplate code.
|
|
|
| |
Adds the `i31.new` and `i31.get_s/u` instructions for creating and working with `i31ref` typed values. Does not include fuzzer integration just yet because the fuzzer expects that trivial values it creates are suitable in global initializers, which is not the case for trivial `i31ref` expressions.
|
|
|
| |
With `eqref` now integrated, the `ref.eq` instruction can be implemented. The only valid LHS and RHS value is `(ref.null eq)` for now, but implementation and fuzzer integration is otherwise complete.
|
|
|
| |
Aligns the internal representations of `memory.size` and `memory.grow` with other more recent memory instructions by removing the legacy `Host` expression class and adding separate expression classes for `MemorySize` and `MemoryGrow`. Simplifies related APIs, but is also a breaking API change.
|
|
|
|
|
|
|
|
|
| |
* Unifies internal hashing helpers to naturally integrate with std::hash
* Removes the previous custom implementation
* Computed hashes are now always size_t
* Introduces a hash_combine helper
* Fixes an overwritten partial hash in Relooper.cpp
|
| |
|
|
|
|
|
|
| |
Push and Pop have been superseded by tuples for their original
intended purpose of supporting multivalue. Pop is still used to
represent block arguments for exception handling, but there are no
plans to use Push for anything now or in the future.
|
|
|
|
|
|
|
|
|
| |
Implements parsing and emitting of tuple creation and extraction and tuple-typed control flow for both the text and binary formats.
TODO:
- Extend Precompute/interpreter to handle tuple values
- C and JS API support/testing
- Figure out how to lower in stack IR
- Fuzzing
|
|
|
|
|
| |
This adds AutoDrop (+ ReFinalize) support for Try. We don't have
`--autodrop` option so I can't add a separate test for this, but this is
basically the same as what If does.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for the reference type proposal. This includes support
for all reference types (`anyref`, `funcref`(=`anyfunc`), and `nullref`)
and four new instructions: `ref.null`, `ref.is_null`, `ref.func`, and
new typed `select`. This also adds subtype relationship support between
reference types.
This does not include table instructions yet. This also does not include
wasm2js support.
Fixes #2444 and fixes #2447.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Function signatures were previously redundantly stored on Function
objects as well as on FunctionType objects. These two signature
representations had to always be kept in sync, which was error-prone
and needlessly complex. This PR takes advantage of the new ability of
Type to represent multiple value types by consolidating function
signatures as a pair of Types (params and results) stored on the
Function object.
Since there are no longer module-global named function types,
significant changes had to be made to the printing and emitting of
function types, as well as their parsing and manipulation in various
passes.
The C and JS APIs and their tests also had to be updated to remove
named function types.
|
|
|
|
|
| |
This works more like llvm's unreachable handler in that is preserves
information even in release builds.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds the ability to create multivalue types from vectors of concrete value
types. All types are transparently interned, so their representation is still a
single uint32_t. Types can be extracted into vectors of their component parts,
and all the single value types expand into vectors containing themselves.
Multivalue types are not yet used in the IR, but their creation and inspection
functionality is exposed and tested in the C and JS APIs.
Also makes common type predicates methods of Type and improves the ergonomics of
type printing.
|
|
|
|
|
|
|
| |
Introduces a new instruction class, `SIMDLoad`. Implements encoding,
decoding, parsing, printing, and interpretation of the load and splat
instructions, including in the C and JS APIs. `v128.load` remains in
the `Load` instruction class for now because the interpreter code
expects a `Load` to be able to load any memory value type.
|
|
|
|
|
|
|
|
|
| |
Renames the SIMDBitselect class to SIMDTernary and adds the new
{f32x4,f64x2}.qfm{a,s} ternary instructions. Because the SIMDBitselect
class is no more, this is a backwards-incompatible change to the C
interface. The new instructions are not yet used in the fuzzer because
they are not yet implemented in V8.
The corresponding LLVM commit is https://reviews.llvm.org/rL370556.
|
|
|
|
|
|
|
| |
This adds `atomic.fence` instruction:
https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md#fence-operator
This also fix bugs in `atomic.wait` and `atomic.notify` instructions in
binaryen.js and adds tests for them.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds basic support for exception handling instructions, according
to the spec:
https://github.com/WebAssembly/exception-handling/blob/master/proposals/Exceptions.md
This PR includes support for:
- Binary reading/writing
- Wast reading/writing
- Stack IR
- Validation
- binaryen.js + C API
- Few IR routines: branch-utils, type-updating, etc
- Few passes: just enough to make `wasm-opt -O` pass
- Tests
This PR does not include support for many optimization passes, fuzzer,
or interpreter. They will be follow-up PRs.
Try-catch construct is modeled in Binaryen IR in a similar manner to
that of if-else: each of try body and catch body will contain a block,
which can be omitted if there is only a single instruction. This block
will not be emitted in wast or binary, as in if-else. As in if-else,
`class Try` contains two expressions each for try body and catch body,
and `catch` is not modeled as an instruction. `exnref` value pushed by
`catch` is get by `pop` instruction.
`br_on_exn` is special: it returns different types of values when taken
and not taken. We make `exnref`, the type `br_on_exn` pushes if not
taken, as `br_on_exn`'s type.
|
|
|
|
|
|
|
| |
This is the first stage of adding support for stacky/multivaluey things. It adds new push/pop instructions, and so far just shows that they can be read and written, and that the optimizer doesn't do anything immediately wrong on them.
No fuzzer support, since there isn't a "correct" way to use these yet. The current test shows some "incorrect" usages of them, which is nice to see that we can parse/emit them, but we should replace them with proper usages of push/pop once we actually have those (see comments in the tests).
This should be enough to unblock exceptions (which needs a pop in try-catches). It is also a step towards multivalue (I added some docs about that), but most of multivalue is left to be done.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for the event and the event section, as specified in
https://github.com/WebAssembly/exception-handling/blob/master/proposals/Exceptions.md#changes-to-the-binary-model.
Wasm events are features that suspend the current execution and transfer
the control flow to a corresponding handler. Currently the only
supported event kind is exceptions.
For events, this includes support for
- Binary file reading/writing
- Wast file reading/writing
- Binaryen.js API
- Fuzzer
- Validation
- Metadce
- Passes: metrics, minify-imports-and-exports,
remove-unused-module-elements
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Reflected new renamed instruction names in code and tests:
- `get_local` -> `local.get`
- `set_local` -> `local.set`
- `tee_local` -> `local.tee`
- `get_global` -> `global.get`
- `set_global` -> `global.set`
- `current_memory` -> `memory.size`
- `grow_memory` -> `memory.grow`
- Removed APIs related to old instruction names in Binaryen.js and added
APIs with new names if they are missing.
- Renamed `typedef SortedVector LocalSet` to `SetsOfLocals` to prevent
name clashes.
- Resolved several TODO renaming items in wasm-binary.h:
- `TableSwitch` -> `BrTable`
- `I32ConvertI64` -> `I32WrapI64`
- `I64STruncI32` -> `I64SExtendI32`
- `I64UTruncI32` -> `I64UExtendI32`
- `F32ConvertF64` -> `F32DemoteI64`
- `F64ConvertF32` -> `F64PromoteF32`
- Renamed `BinaryenGetFeatures` and `BinaryenSetFeatures` to
`BinaryenModuleGetFeatures` and `BinaryenModuleSetFeatures` for
consistency.
|
|
|
| |
Applies the changes in #2065, and temprarily disables the hook since it's too slow to run on a change this large. We should re-enable it in a later commit.
|
|
|
| |
Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
|
|
|
|
|
|
|
|
| |
This renames the following:
- `i32.wait` -> `i32.atomic.wait`
- `i64.wait` -> `i64.atomic.wait`
- `wake` -> `atomic.notify`
to match the spec.
|
|
|
|
|
| |
Trying to refactor the code to be simpler and less redundant, I ran into some perf issues that it seems like a small vector, with fixed-size storage and optional additional storage as needed, might help with. This implements that class and uses it in a few places.
This seems to help, I see some 1-2% fewer instructions and cycles in `perf stat`, but it's hard to tell if it really makes a noticeable difference.
|
|
|
|
|
|
| |
Bulk memory operations
The only parts missing are the interpreter implementation
and spec tests.
|
|
|
|
|
|
|
|
|
| |
Implement and test the following functionality for SIMD.
- Parsing and printing
- Assembling and disassembling
- Interpretation
- C API
- JS API
|
|
|
|
|
|
| |
If we refinalize after adding a value that flows out of a block, we need to fix up any branches that might exist without a value, which is possible if the branches were not taken in practice
Also refactor ReFinalize into a separate file.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the relooper would do some optimizations when deciding when to use an if vs a switch, how to group blocks, etc. This PR adds an additional pre-optimization phase with some basic but useful simplify-cfg style passes,
* Skip empty blocks when they have just one exit.
* Merge exiting branches when they are equivalent.
* Canonicalize block contents to make such comparisons more useful.
* Turn a trivial one-target switch into a simple branch.
This can help in noticeable ways when running the rereloop pass, e.g. on LLVM wasm backend output.
Also:
* Binaryen C API changes to the relooper, which now gets a Module for its constructor. It needs it for the optimizations, as it may construct new nodes.
* Many relooper-fuzzer improvements.
* Clean up HashType usage.
|
|
|
|
|
|
|
| |
Handle a corner case in ReFinalize, which incrementally re-types code after changes. The problem is that if we need to figure out the type of a block, we look to the last element flowing out, or to breaks with values. If there is no such last element, and the breaks are not taken - they have unreachable values - then they don't tell us the block's proper type. We asserted that in such a case the block still had a type, and didn't handle this.
To fix it, we could look on the parent to see what type would fit. However, it seem simpler to just remove untaken breaks/switches as part of ReFinalization - they carry no useful info anyhow. After removing them, if the block has no other signal of a concrete type, it can just be unreachable.
This bug existed for at least 1.5 years - I didn't look back further. I think it was noticed by the fuzzer now due to recent fuzzing improvements and optimizer improvements, as I just saw this bug found a second time.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes #1649
This moves us to a single object for functions, which can be imported or nor, and likewise for globals (as a result, GetGlobals do not need to check if the global is imported or not, etc.). All imported things now inherit from Importable, which has the module and base of the import, and if they are set then it is an import.
For convenient iteration, there are a few helpers like
ModuleUtils::iterDefinedGlobals(wasm, [&](Global* global) {
.. use global ..
});
as often iteration only cares about imported or defined (non-imported) things.
|
|
|
|
|
|
|
|
|
| |
Background: google/souper#323
This adds a --souperify pass, which emits Souper IR in text format. That can then be read by Souper which can emit superoptimization rules. We hope that eventually we can integrate those rules into Binaryen.
How this works is we emit an internal "DataFlow IR", which is an SSA-based IR, and then write that out into Souper text.
This also adds a --dfo pass, which stands for data-flow optimizations. A DataFlow IR is generated, like in souperify, and then performs some trivial optimizations using it. There are very few things that can do that our other optimizations can't already, but this is also good testing for the DataFlow IR, plus it is good preparation for using Souper's superoptimization output (which would also construct DataFlow IR, like here, but then do some matching on the Souper rules).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a new IR, "Stack IR". This represents wasm at a very low level, as a simple stream of instructions, basically the same as wasm's binary format. This is unlike Binaryen IR which is structured and in a tree format.
This gives some small wins on binary sizes, less than 1% in most cases, usually 0.25-0.50% or so. That's not much by itself, but looking forward this prepares us for multi-value, which we really need an IR like this to be able to optimize well. Also, it's possible there is more we can do already - currently there are just a few stack IR optimizations implemented,
DCE
local2stack - check if a set_local/get_local pair can be removed, which keeps the set's value on the stack, which if the stars align it can be popped instead of the get.
Block removal - remove any blocks with no branches, as they are valid in wasm binary format.
Implementation-wise, the IR is defined in wasm-stack.h. A new StackInst is defined, representing a single instruction. Most are simple reflections of Binaryen IR (an add, a load, etc.), and just pointers to them. Control flow constructs are expanded into multiple instructions, like a block turns into a block begin and end, and we may also emit extra unreachables to handle the fact Binaryen IR has unreachable blocks/ifs/loops but wasm does not. Overall, all the Binaryen IR differences with wasm vanish on the way to stack IR.
Where this IR lives: Each Function now has a unique_ptr to stack IR, that is, a function may have stack IR alongside the main IR. If the stack IR is present, we write it out during binary writing; if not, we do the same binaryen IR => wasm binary process as before (this PR should not affect speed there). This design lets us use normal Passes on stack IR, in particular this PR defines 3 passes:
Generate stack IR
Optimize stack IR (might be worth splitting out into separate passes eventually)
Print stack IR for debugging purposes
Having these as normal passes is convenient as then they can run in parallel across functions and all the other conveniences of our current Pass system. However, a downside of keeping the second IR as an option on Functions, and using normal Passes to operate on it, means that we may get out of sync: if you generate stack IR, then modify binaryen IR, then the stack IR may no longer be valid (for example, maybe you removed locals or modified instructions in place etc.). To avoid that, Passes now define if they modify Binaryen IR or not; if they do, we throw away the stack IR.
Miscellaneous notes:
Just writing Stack IR, then writing to binary - no optimizations - is 20% slower than going directly to binary, which is one reason why we still support direct writing. This does lead to some "fun" C++ template code to make that convenient: there is a single StackWriter class, templated over the "mode", which is either Binaryen2Binary (direct writing), Binaryen2Stack, or Stack2Binary. This avoids a lot of boilerplate as the 3 modes share a lot of code in overlapping ways.
Stack IR does not support source maps / debug info. We just don't use that IR if debug info is present.
A tiny text format comment (if emitting non-minified text) indicates stack IR is present, if it is ((; has Stack IR ;)). This may help with debugging, just in case people forget. There is also a pass to print out the stack IR for debug purposes, as mentioned above.
The sieve binaryen.js test was actually not validating all along - these new opts broke it in a more noticeable manner. Fixed.
Added extra checks in pass-debug mode, to verify that if stack IR should have been thrown out, it was. This should help avoid any confusion with the IR being invalid.
Added a comment about the possible future of stack IR as the main IR, depending on optimization results, following some discussion earlier today.
|