path: root/src
Commit message  Author  Age  Files  Lines
...
* Add a `tuple.drop` text pseudoinstruction (#6170)  Thomas Lively  2023-12-12  5  -1/+38
| | | | | | | | | | | | | | | | | We previously overloaded `drop` to mean both normal drops of single values and also drops of tuple values. That works fine in the legacy text parser since it can infer parent-child relationships directly from the s-expression structure of the input, so it knows that a drop should drop an entire tuple if the tuple-producing instruction is a child of the drop. The new text parser, however, is much more like the binary parser in that it uses instruction types to create parent-child instructions. The new parser always assumes that `drop` is meant to drop just a single value because that's what it does in WebAssembly. Since we want to continue to let `Drop` IR expressions consume tuples, and since we will need a way to write tests for that IR pattern that work with the new parser, introduce a new pseudoinstruction, `tuple.drop`, to represent drops of tuples. This pseudoinstruction only exists in the text format and it parses to normal `Drop` expressions. `tuple.drop` takes the arity of its operand as an immediate, which will let the new parser parse it correctly in the future.
* Update `tuple.make` text format to include arity (#6169)  Thomas Lively  2023-12-12  4  -4/+11
| | | | | | | | | | Previously, the number of tuple elements was inferred from the number of s-expression children of the `tuple.make` expression, but that scheme would not work in the new wat parser, where s-expressions are optional and cannot be semantically meaningful. Update the text format to take the number of tuple elements (i.e. the tuple arity) as an immediate. This new format will be able to be implemented in the new parser as follow-on work.
* Add J2CL optimization pass to binaryen. (#6151)  Goktug Gokdogan  2023-12-12  4  -0/+224
| This PR creates a new pass to optimize J2CL-specific patterns that would otherwise be difficult for other binaryen passes to recognize or prove generically.
| The pass currently handles fields that we call "constant-like". These fields are initialized once and unconditionally through the "clinit" function, and technically they have 2 observable states:
| - initial null/0 state
| - initialized state.
| However, you can only observe the initial null/0 state in contrived examples, not in real-world, correct applications. This pass moves such "clinit"-initialized fields to global initialization.
| The above pattern also matches other lazy-init constructs like String and Class literals (which binaryen already reduces to constant expressions), so the pass is generalized to include them as well (by matching any function with the name pattern "_@once_").
| In order for this pass to be effective:
| 1. It needs to run between O3 passes.
| 2. We need to stop inlining of "once" functions.
| Stopping inlining of the once functions is important to preserve their structure. This helps both the existing OnceReducer pass and the new J2CL pass to be a lot more effective. Also, it is not useful to inline these functions, as by definition they are only executed once. This can be achieved by passing a no-inline filter.
| Although inlining is generally disabled for these functions, it is still needed in some cases, since the inliner is effectively responsible for removing once functions that are simplified into empty or simple delegating functions. For this reason, the pass renames such trivial functions so that the no-inline filter no longer matches them.
| Also note that after all optimizations have completed, it does make sense to have a final stage where "partial inlining" of all once functions is allowed. This will speed them up by moving the initialization check to the call site.
* Inlining: Copy no-inline flags when copying a function (#6165)  Alon Zakai  2023-12-12  1  -0/+3
| | | | Those fields should be copied together with all the rest of the metadata that already is. This was just missed in the prior PR.
* [EH] Use random value for exnref encoding when legacy GC is used (#6166)  Heejin Ahn  2023-12-11  1  -6/+20
| | | | | Currently the legacy GC encoding's nullexternref encoding overlaps with exnref's. We assume the legacy GC encoding won't be used with the exnref for the moment and assign a random value to it to prevent the clash.
* [EH] Add exnref type back (#6149)  Heejin Ahn  2023-12-08  9  -11/+131
| At the Oct hybrid CG meeting, we decided to add back `exnref`, which was removed in 2020: https://github.com/WebAssembly/meetings/blob/main/main/2023/CG-10.md
| The new version of the proposal is reflected in the explainer: https://github.com/WebAssembly/exception-handling/blob/main/proposals/exception-handling/Exceptions.md
| While adding support for `exnref` in the current codebase, which has all the GC subtype hierarchies, I noticed we might need a `noexn` heap type for the bottom type of `exn`. We don't have it now, so I just set it to 0xff for the moment.
* [NFC] Use a reference instead of a pointer in Inlining (#6153)  Alon Zakai  2023-12-07  1  -11/+11
|
* [Outlining] Add loop instruction support  Ashley Nelson  2023-12-07  1  -0/+3
| | | | | | | | | | Adds support for the loop instruction to be outlined and a test showing a repeat loop being outlined. Reviewers: tlively Reviewed By: tlively Pull Request: https://github.com/WebAssembly/binaryen/pull/6141
* [Outlining] Improve debug logging  Ashley Nelson  2023-12-07  3  -22/+36
| - Change outlining debug logs to use std::cerr
| - Add controlFlowQueue push log
| - Fix build error with wasm-ir-builder log's use of ShallowExpression
| Reviewers: tlively
| Reviewed By: tlively
| Pull Request: https://github.com/WebAssembly/binaryen/pull/6140
* [Outlining] Fix outlining control flow  Ashley Nelson  2023-12-06  3  -10/+23
| Changes the controlFlowQueue used in stringify-walker to push values of Expression*. This ensures that we walk the Wasm module in the same order, regardless of whether the control flow expression is outlined.
| Reviewers: tlively
| Reviewed By: tlively
| Pull Request: https://github.com/WebAssembly/binaryen/pull/6139
* [Parser] Parse call_indirect and return_call_indirect (#6148)  Thomas Lively  2023-12-06  4  -3/+36
|
* [Parser] Parse tables and element segments (#6147)  Thomas Lively  2023-12-06  6  -20/+540
| | | | | | | These module fields are especially complex to parse because they contain both nontrivial types and instructions, so their parsing logic needs to be spread out across the ParseDecls, ParseModuleTypes, and ParseDefs phases of parsing. This applies to in-line elements in table definitions as well, which means we need to be able to match a table to its in-line element segment across multiple phases.
* Add no-inline IR annotation, and passes to set it based on function name (#6146)  Alon Zakai  2023-12-06  6  -8/+108
| | | | | | | | | | | | | Any function can now be annotated as not to be inlined fully (normally) or not to be inlined partially. In the future we'll want to read those annotations from the proposed wasm metadata section on code hints, and from wat text as well, but for now add trivial passes that set those fields based on function name wildcards, e.g.: --no-inline=*leave-alone* --inlining That will not inline any function whose name contains "leave-alone" in the name. --no-inline disables all inlining (full or partial) while --no-full-inline and --no-partial-inline affect only full or partial inlining.
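The name-based filtering described above can be pictured with a small sketch. This is not Binaryen's implementation: `matchesWildcard` and `demoNoInlineFilter` are hypothetical names, and the only assumption taken from the message is that `*` in a pattern stands for any run of characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not Binaryen's actual code: returns true if `name`
// matches `pattern`, where '*' matches any (possibly empty) run of characters.
bool matchesWildcard(const std::string& pattern, const std::string& name) {
  std::size_t p = 0, n = 0;
  std::size_t star = std::string::npos; // position of the last '*' seen
  std::size_t mark = 0;                 // name position where that '*' began
  while (n < name.size()) {
    if (p < pattern.size() && pattern[p] == '*') {
      star = p++;
      mark = n;
    } else if (p < pattern.size() && pattern[p] == name[n]) {
      ++p;
      ++n;
    } else if (star != std::string::npos) {
      // Mismatch: let the last '*' absorb one more character and retry.
      p = star + 1;
      n = ++mark;
    } else {
      return false;
    }
  }
  // Any trailing '*' in the pattern can match the empty string.
  while (p < pattern.size() && pattern[p] == '*') ++p;
  return p == pattern.size();
}

// Usage sketch with the pattern from the message above.
void demoNoInlineFilter() {
  assert(matchesWildcard("*leave-alone*", "please-leave-alone-fn"));
  assert(!matchesWildcard("*leave-alone*", "inline-me"));
}
```

A pass built on such a matcher would presumably walk the module's functions and set the no-inline flag on each one whose name matches; that wiring is left out here because it depends on Binaryen internals.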
* Inlining: Inline trivial calls (#6143)  Alon Zakai  2023-12-05  2  -13/+49
| | | | | | | | | | | | | | | | A trivial call is something like a function that just calls another immediately, function foo(x, y) { return bar(y, 15); } We can inline those and expect to benefit in most cases, though we might increase code size slightly. Hence it makes sense to inline such cases, even though in general we are careful and do not inline functions with calls in them; a "trampoline" like that likely has most of the work in the call itself, which we can avoid by inlining. Suggested based on findings in Java.
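To make the "trampoline" shape concrete, here is a minimal C++ sketch of the `foo`/`bar` pattern from the message, with a caller shown before and after inlining. The body of `bar` and both caller functions are made up for illustration; the real pass operates on Binaryen IR, not on C++ source.

```cpp
#include <cassert>

// Hypothetical callee that does the real work.
int bar(int a, int b) { return a * b; }

// A trivial call: foo only forwards to bar with rearranged and constant
// arguments, so almost all of its cost is the extra call itself.
int foo(int x, int y) {
  (void)x; // x is unused, as in the example in the message
  return bar(y, 15);
}

// Before inlining: the caller goes through the trampoline.
int callerBefore(int v) { return foo(v, v + 1); }

// After inlining foo: one call is gained at the call site but the call to
// foo disappears, so the code size change is small.
int callerAfter(int v) { return bar(v + 1, 15); }

// Usage sketch: both versions compute the same value.
void demoTrivialCall() { assert(callerBefore(3) == callerAfter(3)); }
```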
* wasm-metadce all the things (#6142)  Alon Zakai  2023-11-30  4  -141/+162
| | | | | | | | | | | | | | | Remove hardcoded paths for globals/functions/etc. in favor of general code paths that support all the module elements uniformly. As a result of that, we now support all parts of wasm, such as tables and element segments, that we didn't before. This refactoring is NFC aside from adding functionality. Note that this reduces the size of wasm-metadce by 10% while increasing its functionality - the benefits of writing generic code. To support this, add some trivial generic helpers to get or iterate over module elements using their kind in a dynamic manner. Using them might make wasm-metadce slightly slower, but I can't measure any difference.
* wasm-metadce: Improve name deduplication (#6138)  Alon Zakai  2023-11-30  1  -2/+5
| | | | | | | | | | | | Avoid adding suffixes when we don't need them to keep names unique. As background, the suffixes are not used by emcc at all, so they are just for internal use in the tool. How that works is that metadce gets as input the list of things the user cares about, with names for them, so it knows the proper names to give imports and exports, and makes up names for other things. Those made up names will not be read by the user, so we can make them prettier as this PR does without breaking anything. The main benefit of this PR is to make debugging easier.
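The idea of adding a suffix only when a name would otherwise collide can be sketched as follows. `uniqueName` is a hypothetical helper and the `$N` suffix format is an assumption chosen for the example; the message does not say what format wasm-metadce uses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical helper: returns `base` unchanged if it is not yet in `used`,
// otherwise appends the first numeric suffix that makes it unique. The chosen
// name is recorded in `used`.
std::string uniqueName(const std::string& base,
                       std::unordered_set<std::string>& used) {
  std::string candidate = base;
  // insert() reports whether the name was new; loop only on collisions.
  for (std::size_t i = 0; !used.insert(candidate).second; ++i) {
    candidate = base + '$' + std::to_string(i);
  }
  return candidate;
}

// Usage sketch: the first use of a name stays suffix-free.
void demoUniqueName() {
  std::unordered_set<std::string> used;
  assert(uniqueName("func", used) == "func");
  assert(uniqueName("func", used) == "func$0");
}
```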
* [Parser] Parse try/catch/catch_all/delegate (#6128)  Thomas Lively  2023-11-29  5  -44/+443
| Parse the legacy v3 syntax for try/catch/catch_all/delegate in both its folded and unfolded forms. The first source of significant complexity is the optional IDs after `catch` and `catch_all` in the unfolded form, which can be confused for tag indices and require backtracking to parse correctly. The second source of complexity is the handling of delegate labels, which are relative to the try's parent scope despite being parsed after the try's scope has already started. Handling this correctly requires punching a hole big enough to drive a truck through both the parser and IRBuilder abstractions.
* C API: Add BinaryenTableGetType and BinaryenTableSetType (#6137)  KinderGartenKiller  2023-11-30  2  -0/+11
| | | Fixes #6136
* [NFC] Move InstrumentedPass logic out and use it in another place (#6132)  Alon Zakai  2023-11-28  3  -75/+113
| | | | | | | | | | | | | | | | | | Asyncify gained a way to wrap a pass so that it only runs on a given set of functions, rather than on all functions, so the wrapper "filters" what the pass operates on. That was useful in Asyncify as we wanted to only do work on functions that Asyncify actually instrumented. There is another place in the code that needs such functionality, optimizeAfterInlining, which runs optimizations after we inline; again, we only want to optimize on the functions we know are relevant because they changed. To do that, move that logic out to a general place so it can be reused. This makes the code there a lot less hackish. While doing so make the logic only work on function-parallel passes. It never did anyhow, but now it asserts on that. (It can't run on a general pass because a general one does not provide an interface to affect which functions it operates on; a general pass is entirely opaque in that way.)
* [wasm-emscripten-finalize] Remove --separate-data-segments (#6091)  Sam Clegg  2023-11-27  1  -28/+0
| | | See #6088
* [Parser] Parse tags and throw (#6126)  Thomas Lively  2023-11-20  10  -48/+166
| | | | Also fix the parser to correctly error if an imported item appears after a non-imported item and make the corresponding fix to the test.
* Fix a bug with unreachable control flow in IRBuilder (#6130)  Thomas Lively  2023-11-20  2  -2/+11
| | | | | | | | | | | | When branches target control flow structures other than blocks or loops, the IRBuilder wraps those control flow structures with an extra block for the branches to target in Binaryen IR. Usually that block has the same type as the control flow structure it wraps, but when the control flow structure is unreachable because all its bodies are unreachable, the wrapper block may still need to have a non-unreachable type if it is targeted by branches. Previously the wrapper block would also be unreachable in that case. Fix the bug by tracking whether the wrapper block will be targeted by any branches and use the control flow structure's original, non-unreachable type if so.
* [IRBuilder] Add visitCallIndirect and makeCallIndirect (#6127)  Ashley Nelson  2023-11-21  2  -0/+15
| | | Adds support for call_indirect to wasm-ir-builder. Tests this works by outlining a sequence including call_indirect.
* Update IRBuilder to visit control flow correctly (#6124)  Thomas Lively  2023-11-16  3  -5/+85
| | | | | | | | | | | Besides If, no control flow structure consumes values from the stack. Fix a bug in IRBuilder that was causing it to pop control flow children. Also fix a follow on bug in outlining where it did not make the If condition available on the stack when starting to visit an If. This required making push() part of the public API of IRBuilder. As a drive-by, also add helpful debug logging to IRBuilder. Co-authored-by: Ashley Nelson <nashley@google.com>
* [Outlining] Adding more tests (#6117)  Ashley Nelson  2023-11-15  2  -13/+45
| Checking a couple of testing TODOs off and adding more tests of the outlining pass for outlining:
| - a sequence at the beginning of an existing function
| - a sequence that is outlined into a function that takes no arguments
| - multiple sequences from the same source function into different outlined functions
* [NFC] Refactor out subtyping discovery code (#6106)  Alon Zakai  2023-11-15  2  -259/+358
| This implements an idea I mentioned in the past, to extract the subtyping discovery code out of Unsubtyping so it could be reused elsewhere. Example possible uses: the validator could use it to remove a lot of code, and also a future PR of mine will need it. Separately from those, I think this is a nice refactoring as it makes Unsubtyping much smaller. This just moves the code out and adds some C++ template elbow grease as needed.
* Implement more TypeGeneralizing transfer functions (#6118)  Thomas Lively  2023-11-15  2  -77/+538
| | | | | | | Finish the transfer functions for all expressions except for string instructions, exception handling instructions, tuple instructions, and branch instructions that carry values. The latter require more work in the CFG builder because dropping the extra stack values happens after the branch but before the target block.
* SignatureRefining: Notice LUB requirements of intrinsic calls (#6122)  Alon Zakai  2023-11-14  1  -0/+26
| | | | | | call.without.effects implies a call to the function reference in the last parameter, so the values sent in the other parameters must be taken into account when computing LUBs for refining arguments, otherwise we might refine so much that the intrinsic call no longer validates.
* [Parser] Parse call_ref (#6103)  Thomas Lively  2023-11-15  4  -8/+37
| | | | Also mark array.new_elem as unimplemented as a drive-by; it previously had an incorrect implementation.
* [Parser] Parse array.new_fixed (#6102)  Thomas Lively  2023-11-15  5  -5/+40
|
* [Parser] Parse RefAs expressions (#6101)  Thomas Lively  2023-11-15  5  -3/+14
|
* [Parser] Parse BrOn expressions (#6100)  Thomas Lively  2023-11-15  5  -10/+33
|
* [Parser] Parse ref.test and ref.cast (#6099)  Thomas Lively  2023-11-15  5  -6/+32
|
* [Parser] Parse br_table (#6098)  Thomas Lively  2023-11-15  4  -8/+68
|
* [Parser] Parse ref.func (#6097)  Thomas Lively  2023-11-15  4  -5/+16
|
* [Outlining] Add SKIP_OUTLINING macro  Ashley Nelson  2023-11-14  2  -2/+5
| | | Allow outlining to be excluded from the command line on non-Emscripten builds.
* [NFC] Add LocalLocation for future use (#6105)  Alon Zakai  2023-11-13  2  -0/+27
| | | | | | This is not needed in GUFA as it tracks local values precisely (each set is connected to the gets that actually read from it), but in a future PR it will be useful to track local values per index (each set is connected to all gets for that index, i.e., each local index is a single "location").
* OptimizeAddedConstants: Handle a final added constant properly (#6115)  Alon Zakai  2023-11-13  1  -9/+11
| | | | | | | | We had an assert there that was wrong. In fact the assert is just in one of two code paths, and an optional one: the end situation is we have an expression and a constant to add to it, and the assert was in the case that the expression is a Const so we can do the add at compile time (the other code path does the add at runtime). This code path is optional as Precompute would do such compile-time addition anyhow, but it is nice to fix and leave that path so that this pass emits fully optimal code.
* [Outlining] Adds Outlining pass (#6110)  Ashley Nelson  2023-11-13  7  -38/+455
| | | Adds an outlining pass that performs outlining on a module end to end, and two tests.
* [NFC] Add explicit deduction guides for CTAD (#6094)  Thomas Lively  2023-11-09  7  -0/+25
| | | | | | | | | | | Class template argument deduction (CTAD) is a C++17 feature that allows variables to be declared with class template types without specifying the template parameters. Deduction guides are a mechanism by which template authors can control how the template parameters are inferred when CTAD is used. The Google style guide prohibits the use of CTAD except where template authors opt in to supporting it by providing explicit deduction guides. For compatibility with users adhering to Google style, set the compiler flag to check this condition and add the necessary deduction guides to make the compiler happy again.
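For readers unfamiliar with the feature, a minimal sketch of an explicit deduction guide is shown here. `Box` is a hypothetical type, not one from the Binaryen sources. Clang's `-Wctad-maybe-unsupported` is one warning that flags CTAD on templates without such a guide; whether that is the exact flag this commit enables is an assumption.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical wrapper type used only for illustration.
template<typename T> struct Box {
  T value;
  explicit Box(T v) : value(std::move(v)) {}
};

// Explicit deduction guide: the template author states how T is inferred from
// a constructor argument, which opts Box in to CTAD.
template<typename T> Box(T) -> Box<T>;

// Usage sketch: the template argument is deduced rather than spelled out.
void demoCTAD() {
  Box b(42); // deduced as Box<int> via the guide above
  static_assert(std::is_same_v<decltype(b), Box<int>>, "expected Box<int>");
  assert(b.value == 42);
}
```

In this example the guide deduces the same type the implicit guide would, so its role is purely to signal that CTAD is supported for `Box`.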
* [Parser][NFC] Filter out unused instructions in gen-s-parser.py (#6095)  Thomas Lively  2023-11-09  2  -110/+59
| | | | | | The new wat parser parses block, if, loop, then, and else keywords directly rather than depending on code generated from gen-s-parser.py. Filter these keywords out in gen-s-parser.py when generating the new wat parser and delete the stub functions that the removed generated code used to depend on.
* [NFC] Simplify LiteralUtils::canMakeZero (#6093)  Alon Zakai  2023-11-09  1  -13/+1
|
* [NFC] StackIR: Add comments on local2stack handling of tuples (#6092)  Alon Zakai  2023-11-09  1  -3/+7
| | | | Also add testcases to be comprehensive and notice changes if we ever decide to modify that behavior.
* Heap2Local: Fix an ordering issue with children having different interactions with a parent (#6089)  Alon Zakai  2023-11-09  1  -21/+33
| We had a simple rule that if we reach an expression twice then we give up, which makes sense for say a block: if one allocation flows out of it, then another can't - it would get mixed in with the other one, which is a case we don't optimize. However, there are cases where a parent has multiple children and different interactions with them, like a struct.set: the reference child does not escape, but the value child does. Before this PR if we reached the value child first, we'd mark the parent as seen, and then the reference child would see it isn't the first to get here, and not optimize. To fix this, reorder the code to handle this case. The manner of interaction between the child and the parent decides whether we mark the parent as seen and to be further avoided. Noticed by the determinism fuzzer, since the order of analysis mattered here.
* [analysis] Add an experimental TypeGeneralizing optimization (#6080)  Thomas Lively  2023-11-08  6  -0/+497
| This new optimization will eventually weaken casts by generalizing (i.e. un-refining) their output types. If a cast is weakened enough that its output type is a supertype of its input type, the cast will be able to be removed by OptimizeInstructions.
| Unlike refining cast inputs, generalizing cast outputs can break module validation. For example, if the result of a cast is stored to a local and the cast is weakened enough that its output type is no longer a subtype of that local's type, then the local.set after the cast will no longer validate. To avoid this validation failure, this optimization would have to generalize the type of the local as well. In general, the more we can generalize the types of program locations, the more we can weaken casts of values that flow into those locations.
| This initial implementation only generalizes the types of locals and does not actually weaken casts yet. It serves as a proof of concept for the analysis required to perform the full optimization, though. The analysis uses the new analysis framework to perform a reverse analysis tracking type requirements for each local and reference-typed stack value in a function.
| Planned and potential future work includes:
| - Implementing the transfer function for all kinds of expressions.
| - Tracking requirements on the dynamic types of each location to generalize allocations as well.
| - Making the analysis interprocedural and generalizing the types of more program locations.
| - Optimizing tuple-typed locations.
| - Generalizing only those locations necessary to eliminate at least one cast (although this would make the analysis bidirectional, so it is probably better left to separate passes).
* Move --separate-data-segments into a pass so it can be run from wasm-opt (#6088)  Sam Clegg  2023-11-08  7  -45/+89
| | | | | | | | Because we currently strip some data segments (i.e. EM_JS strings) during `--post-emscripten` this is too late as `--separate-data-segments` always runs in `wasm-emscripten-finalize`. Once emscripten switches over to using the pass directly we can remove the support from `wasm-emscripten-finalize`
* LocalCSE: Do not optimize small things like global.get (#6087)  Alon Zakai  2023-11-08  1  -3/+6
| | | | | | | | | LocalCSE is nice for large expressions, but for small things it has always been of unclear benefit since VMs also do GVN/CSE anyhow. So we are likely not speeding anything up, but hopefully we are reducing code size at least. Doing LocalCSE on something small like a global.get is very possibly going to increase code size, however (since we add a tee, and since the local gets are of similar size to global gets - depends on LUB sizes). On real-world Java code that overhead is noticeable, so this PR makes us more careful, and we skip things of size 1 (no children).
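The size rule at the end of the message can be sketched with a toy expression tree. `ToyExpr`, `measure`, and `isWorthCSE` are hypothetical names; Binaryen measures real IR nodes, and the only fact taken from the message is that expressions of size 1 (no children) are skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy expression node: only the tree shape matters for this sketch.
struct ToyExpr {
  std::vector<ToyExpr> children;
};

// Size of an expression: the node itself plus all of its descendants.
std::size_t measure(const ToyExpr& expr) {
  std::size_t size = 1;
  for (const ToyExpr& child : expr.children) {
    size += measure(child);
  }
  return size;
}

// Skip expressions of size 1, such as a global.get with no children:
// replacing them with a tee plus local gets is unlikely to save code size.
bool isWorthCSE(const ToyExpr& expr) { return measure(expr) > 1; }

// Usage sketch: a leaf is skipped, a binary operation on two leaves is not.
void demoLocalCSEFilter() {
  ToyExpr leaf;
  ToyExpr add;
  add.children.push_back(leaf);
  add.children.push_back(leaf);
  assert(!isWorthCSE(leaf));
  assert(isWorthCSE(add));
}
```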
* [Parser] Parse `call` and `return_call` (#6086)  Thomas Lively  2023-11-07  4  -5/+63
| | | | To support parsing calls, add support for parsing function indices and building calls with IRBuilder.
* Fix build failure on older Ubuntu (#6085)  Thomas Lively  2023-11-07  1  -80/+94
| | | | Update the C++20 builder to use Ubuntu 20.04 to catch problems building with its system compiler. Also fix such a problem in wasm-fuzz-lattices.cpp.
* Update CFGWalker to generate consolidated exit blocks (#6079)  Thomas Lively  2023-11-06  3  -21/+93
| Previously CFGWalker designated a particular block as the "exit" block, but it was just the block that happened to appear at the end of the function that returned values by implicitly flowing them out. That exit block was not tied in any way to other blocks that might end in returns, so analyses that needed to perform some action at the end of the function would have had to perform that action at the end of the designated exit block but also separately at any return instruction. Update CFGWalker to make the exit block a synthetic empty block that is a successor of all other blocks that implicitly or explicitly return from the function in case there are multiple such blocks, or to make the exit block the single returning block if there is only one. This means that analyses will only perform their end-of-function actions at the end of the exit block rather than additionally at every return instruction.
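The exit-block rule described above can be sketched on a toy CFG. `ToyBlock` and `consolidateExits` are hypothetical and do not use CFGWalker's real types; returning an empty optional when no block returns is also an assumption of the sketch, since the message does not cover that case.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy basic block: successor indices plus a flag saying whether the block
// returns from the function, implicitly or explicitly.
struct ToyBlock {
  std::vector<std::size_t> successors;
  bool returns = false;
};

// Returns the index of the exit block. With several returning blocks, a
// synthetic empty block is appended and made their common successor; with
// exactly one, that block is the exit and nothing is added.
std::optional<std::size_t> consolidateExits(std::vector<ToyBlock>& blocks) {
  std::vector<std::size_t> returning;
  for (std::size_t i = 0; i < blocks.size(); ++i) {
    if (blocks[i].returns) {
      returning.push_back(i);
    }
  }
  if (returning.empty()) {
    return std::nullopt; // assumption: no exit block if nothing returns
  }
  if (returning.size() == 1) {
    return returning.front();
  }
  std::size_t exitIndex = blocks.size();
  blocks.push_back(ToyBlock{});
  for (std::size_t r : returning) {
    blocks[r].successors.push_back(exitIndex);
  }
  return exitIndex;
}

// Usage sketch: two returning blocks get a shared synthetic exit.
void demoConsolidateExits() {
  std::vector<ToyBlock> cfg(3);
  cfg[0].successors = {1, 2};
  cfg[1].returns = true;
  cfg[2].returns = true;
  std::optional<std::size_t> exitIndex = consolidateExits(cfg);
  assert(exitIndex.has_value() && *exitIndex == 3);
  assert(cfg[3].successors.empty());
}
```

An analysis over such a CFG would then need its end-of-function action in only one place, the block at the returned index.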