summaryrefslogtreecommitdiff
path: root/src/passes/Inlining.cpp
Commit message (Collapse)AuthorAgeFilesLines
* [NFC-ish] Stop creating unneeded blocks around calls when inlining (#6969)Alon Zakai2024-09-261-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inlining was careful about nested calls like this: (call $a (call $b) ) If we inlined the outer call first, we'd have (block $inlined-code-from-a ..code.. (call $b) ) After that, the inner call is a child of a block, not of a call. That is, we've moved the inner call to another parent. To replace that inner call when we inline, we'd need to update the new parent, which would require work. To avoid that work, the pass simply created a block in the middle: (call $a (block (call $b) ) ) Now the inner call's immediate parent will not change when we inline the outer call. However, it turns out that this was entirely unnecessary. We find the calls using a post-order traversal, and we store the actions in a vector that we traverse in order, so we only ever process things in the optimal order of children before parents. And in that order there is no problem: inlining the inner call first leads to (call $a (block $inlined-code-from-b (..code..) ) ) That does not affect the outer call's parent. This PR removes the creation of the unnecessary blocks. This doesn't improve the final output as optimizations remove the unneeded blocks later anyhow, but it does make the code simpler and a little faster. It also makes debugging less confusing. But this is not truly NFC because --inlining (but not --inlining-optimizing) will actually emit fewer blocks now (but only --inlining-optimizing is used by default in production). The diff on tests here is very small when ignoring whitespace. The remaining differences are just emitting fewer obviously-unneeded blocks. There is also one test that needed manual changes, inlining-eh-legacy, because it tested that we do Pop fixups, but after emitting one fewer block, those fixups were not needed. I added a new test there with two nested calls, which does end up needing those fixups. I also added such a test in inlining_all-features so that we have coverage for such nested calls (we might remove the eh-legacy file some day, and other existing tests with nested calls that I found were more complex).
* [NFC-ish] Avoid repeated ReFinalize etc. when inlining (#6967)Alon Zakai2024-09-241-6/+24
| | | | | | | | | | | | | | | | | | | | | | | | | We may inline multiple times into a single function. Previously, if we did so, we did the "fixups" such as ReFinalize and non-nullable local fixes once per such inlining. But that is wasteful as each ReFinalize etc. scans the whole function, and could be done after we copy all the code from all the inlinings, which is what this PR does: it splits doInlining() into one function that inlines code and one that does the updates after, and the update is done after all inlinings. This turns out to be very important, a 5x speedup on two large real-world wasm files I am looking at. The reason is that we actually inline more than once in half the cases, and sometimes far more - in one case we inline over 1,000 times into a function! (and ReFinalized 1,000 times too many) This is practically NFC, but it turns out that there are some tiny noticeable differences between running ReFinalize once at the end vs. once after each inlining. These differences are not really functional or observable in the behavior of the code, and optimizations would remove them anyhow, but they are noticeable in two tests here. The changes to tests are, in order: * Different block names, just because the counter we use sees more things. * In a testcase with unreachable code, we inline twice into a function, and the first inlining brings in an unreachable, and ReFinalizing early will lead to it propagating differently than if we wait to ReFinalize. (It actually leads to another cycle of inlining in that case, as a fluke.)
* [NFC] Parallelize the actual inlining part of the Inlining pass (#6966)Alon Zakai2024-09-241-27/+86
| | | | | | | | | | | | | | We already parallelized collecting info about functions and finding inlining opportunities, but the actual inlining - copying the code into the target function - was done sequentially. It turns out that a lot of work happens there: this PR makes the pass over 2x faster. Details: * Move nameHint to InliningAction, so it is there when we apply the actions. * Add a DoInlining internal pass which calls doInlining(). * Refactor OptUtils a little to make it easy to run DoInlining before opts. * Also remove the return value of doInlining which was not used.
* Make FunctionInfo's assignment operator argument const (#6780)Derek Schuff2024-07-221-1/+1
| | | | | | | | Aside from the fact that there's no need for this to be non-const and this is the usual way to write an assignment operator, this is also needed because of a recent change to std::pair (https://github.com/llvm/llvm-project/pull/89652). This seems to be forcing pair to want the const version of the assignment operator of its members.
* [DebugInfo] Copy debug info in call-utils.h (#6652)Alon Zakai2024-06-121-2/+2
| | | | | | | | | | | | | | | | | | | | | | | We automatically copy debuginfo in replaceCurrent(), but there are a few places that do other operations than simple replacements. call-utils.h will turn a call_ref with a select target into two direct calls, and we were missing the logic to copy debuginfo from the call_ref to the calls. To make this work, refactor out the copying logic from wasm-traversal, into debuginfo.h, and use it in call-utils.h. debuginfo.h itself is renamed from debug.h (as now this needs to be included from wasm-traversal, which nearly everything does, and it turns out some files have internal stuff like a debug() helper that ends up conflicing with the old debug namespace). Also rename the old copyDebugInfo function to copyDebugInfoBetweenFunctions which is more explicit. That is also moved from the header to a cpp file because it depends on wasm-traversal (so we'd end up with recursive headers otherwise). That is fine, as that method is called after copying a function, which is not that frequent. The new copyDebugInfoToReplacement (which was refactored out of wasm-traversal) is in the header because it can be called very frequently (every single instruction we optimize) and we want it to get inlined.
* Improve inlining of `return_call*` (#6477)Jérôme Vouillon2024-04-101-28/+60
| | | | | Use the previous implementation when no return_call is in a try block. This avoids moving code around (as a sibling of the caller body or the inlined body), so that should allow more local optimizations after inlining.
* Handle return calls correctlyThomas Lively2024-04-081-58/+152
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a combined commit covering multiple PRs fixing the handling of return calls in different areas. The PRs are all landed as a single commit to ensure internal consistency and avoid problems with bisection. Original PR descriptions follow: * Fix inlining of `return_call*` (#6448) Previously we transformed return calls in inlined function bodies into normal calls followed by branches out to the caller code. Similarly, when inlining a `return_call` callsite, we simply added a `return` after the body inlined at the callsite. These transformations would have been correct if the semantics of return calls were to call and then return, but they are not correct for the actual semantics of returning and then calling. The previous implementation is observably incorrect for return calls inside try blocks, where the previous implementation would run the inlined body within the try block, but the proper semantics would be to run the inlined body outside the try block. Fix the problem by transforming inlined return calls to branches followed by calls rather than as calls followed by branches. For the case of inlined return call callsites, insert branches out of the original body of the caller and inline the body of the callee as a sibling of the original caller body. For the other case of return calls appearing in inlined bodies, translate the return calls to branches out to calls inserted as siblings of the original inlined body. In both cases, it would have been convenient to use multivalue block return to send call parameters along the branches to the calls, but unfortunately in our IR that would have required tuple-typed scratch locals to unpack the tuple of operands at the call sites. It is simpler to just use locals to propagate the operands in the first place. * Fix interpretation of `return_call*` (#6451) We previously interpreted return calls as calls followed by returns, but that is not correct both because it grows the size of the execution stack and because it runs the called functions in the wrong context, which can be observable in the case of exception handling. Update the interpreter to handle return calls correctly by adding a new `RETURN_CALL_FLOW` that behaves like a return, but carries the arguments and reference to the return-callee rather than normal return values. `callFunctionInternal` is updated to intercept this flow and call return-called functions in a loop until a function returns with some other kind of flow. Pull in the upstream spec tests return_call.wast, return_call_indirect.wast, and return_call_ref.wast with light editing so that we parse and validate them successfully. * Handle return calls in wasm-ctor-eval (#6464) When an evaluated export ends in a return call, continue evaluating the return-called function. This requires propagating the parameters, handling the case that the return-called function might be an import, and fixing up local indices in case the final function has different parameters than the original function. * Update effects.h to handle return calls correctly (#6470) As far as their surrounding code is concerned return calls are no different from normal returns. It's only from a caller's perspective that a function containing a return call also has the effects of the return-callee. To model this more precisely in EffectAnalyzer, stash the throw effect of return-callees on the side and only merge it in at the end when analyzing the effects of a full function body.
* Add sourcemap support to wasm-metadce and wasm-merge (#6372)Jérôme Vouillon2024-03-061-3/+1
|
* [NFC] Fix typo in Inlining (#6187)Alon Zakai2023-12-201-2/+2
|
* [NFC] Use a reference instead of a pointer in Inlining (#6153)Alon Zakai2023-12-071-11/+11
|
* Add no-inline IR annotation, and passes to set it based on function name (#6146)Alon Zakai2023-12-061-8/+14
| | | | | | | | | | | | | Any function can now be annotated as not to be inlined fully (normally) or not to be inlined partially. In the future we'll want to read those annotations from the proposed wasm metadata section on code hints, and from wat text as well, but for now add trivial passes that set those fields based on function name wildcards, e.g.: --no-inline=*leave-alone* --inlining That will not inline any function whose name contains "leave-alone" in the name. --no-inline disables all inlining (full or partial) while --no-full-inline and --no-partial-inline affect only full or partial inlining.
* Inlining: Inline trivial calls (#6143)Alon Zakai2023-12-051-9/+43
| | | | | | | | | | | | | | | | A trivial call is something like a function that just calls another immediately, function foo(x, y) { return bar(y, 15); } We can inline those and expect to benefit in most cases, though we might increase code size slightly. Hence it makes sense to inline such cases, even though in general we are careful and do not inline functions with calls in them; a "trampoline" like that likely has most of the work in the call itself, which we can avoid by inlining. Suggested based on findings in Java.
* [NFC] Move ModuleUtils copying and renaming logic from header to cpp (#5855)Alon Zakai2023-08-021-0/+1
| | | | | | | | None of that code is speed-sensitive, or at least doesn't need to be inlined to be fast. Move it to cpp for faster compile times. This caused a cascade of necessary header fixes (i.e. after removing unneeded header inclusions in module-utils.h, files that improperly depended on that stopped working and needed an added include).
* Gate all partial inlining behind the partial-inlining-ifs flag (#5710)Alon Zakai2023-05-101-6/+10
| | | | | | | | | #4191 meant to do that, I think, but only did so for "pattern B". This does it for all patterns, and adds assertions. In theory this could regress code that benefits from partial inlining of "pattern A" (since this PR stops doing it by default), but I did not see a significant difference on any benchmarks, and it is easy to re-enable that behavior by doing --partial-inlining-ifs=1.
* [NFC] Add some comments for split inlining crossing the normal limit (#5702)Alon Zakai2023-05-051-3/+46
|
* Generate unique block names when inlining (#5697)Alon Zakai2023-05-051-3/+16
| | | | | | | | | | | Each time we inline we put the contents in a block. Before we used the same name each time we inlined the same method, and as a result had many conflicts if a function was inlined many times. With this PR we emit a different name each time. This is not 100% NFC as it does change block names, which is observable in the IR (as can be seen in the test updates). This helps #5696 in speeding up UniqueNameManner.
* Fallback to direct inlining if the outline will be inlined. (#5698)Goktug Gokdogan2023-05-041-4/+38
| | | | | | | | | | | | | | | | This workarounds the extra work around the edge case where; - Function is too big to full-inline - It is a candidate for partial inline - Outlined version becomes eligible for full-inline. In such a case, binaryen would introduce a temporary state with partial inlined functions and later on inline them. J2CL hit this scenario for String literal which resulted in significant regressions in compilation time. This patch updates partial inlining analysis to identify the edge case and direct to full-inlining when that happens.
* [NFC] Start tracking InliningMode and memoize it to avoid re-analysis. (#5695)Goktug Gokdogan2023-05-021-73/+72
|
* [NFC] Inlining: Split maybeSplit into canSplit/doSplit (#5691)Goktug Gokdogan2023-04-281-115/+108
| | | This will make future improvements easier.
* Never flip a call from unreachable to reachable during inlining (#5493)Alon Zakai2023-02-161-8/+33
| | | | | | | | | | | | See example in the new comment. In general, we never want to go from unreachable to reachable (refinalize doesn't even try), as it misses out on DCE opportunities. Also it may have validation issues, which is how the fuzzer found this. Remove code in the same place that was redundant after the refinalize was added in #5492. That simplifies some existing testcases slightly, by removing an unneeded br we added, and now we add a new unreachable at the end in other cases that need it.
* Unreachability fixes for inlining (#5492)Alon Zakai2023-02-151-5/+10
| | | | | | | | | | | | We must refinalize as inlining unreachable code can lead to more things becoming unreachable. We also must uniquify label names before refinalizing, as the IR must be valid at that time, so that code is moved. This causes some minor changes to existing test code (some label changes, and refinalization makes more things unreachable), but only the two new tests show actual problems that needed to be fixed.
* Replace `RefIs` with `RefIsNull` (#5401)Thomas Lively2023-01-091-1/+1
| | | | | | | | | | | | | | | * Replace `RefIs` with `RefIsNull` The other `ref.is*` instructions are deprecated and expressible in terms of `ref.test`. Update binary and text parsing to parse those instructions as `RefTest` expressions. Also update the printing and emitting of `RefTest` expressions to emit the legacy instructions for now to minimize test changes and make this a mostly non-functional change. Since `ref.is_null` is the only `RefIs` instruction left, remove the `RefIsOp` field and rename the expression class to `RefIsNull`. The few test changes are due to the fact that `ref.is*` instructions are now subject to `ref.test` validation, and in particular it is no longer valid to perform a `ref.is_func` on a value outside of the `func` type hierarchy.
* Fix an Inlining bug with a name collision in a br nested in a call param (#5323)Alon Zakai2022-12-061-5/+19
|
* Use C++17's [[maybe_unused]]. NFC (#5309)Sam Clegg2022-12-021-2/+1
|
* Fix validation and inlining bugs (#5301)Thomas Lively2022-11-291-1/+3
| | | | | | | | | | | | | | Inlining had a bug where it gave return_calls in inlined callees concrete types even when they should have remained unreachable. This bug flew under the radar because validation had a bug where it allowed expressions to have concrete types when they should have been unreachable. The fuzzer found this bug by adding another pass after inlining where the unexpected types caused an assertion failure. Fix the bugs and add a test that would have triggered the inlining bug. Unfortunately the test would have also passed before this change due to the validation bug, but it's better than nothing. Fixes #5294.
* Switch from `typedef` to `using` in C++ code. NFC (#5258)Sam Clegg2022-11-151-1/+1
| | | | This is more modern and (IMHO) easier to read than that old C typedef syntax.
* Make `Name` a pointer, length pair (#5122)Thomas Lively2022-10-111-2/+3
| | | | | | | | | | | | | | | | | | | | | | | With the goal of supporting null characters (i.e. zero bytes) in strings. Rewrite the underlying interned `IString` to store a `std::string_view` rather than a `const char*`, reduce the number of map lookups necessary to intern a string, and present a more immutable interface. Most importantly, replace the `c_str()` method that returned a `const char*` with a `toString()` method that returns a `std::string`. This new method can correctly handle strings containing null characters. A `const char*` can still be had by calling `data()` on the `std::string_view`, although this usage should be discouraged. This change is NFC in spirit, although not in practice. It does not intend to support any particular new functionality, but it is probably now possible to use strings containing null characters in at least some cases. At least one parser bug is also incidentally fixed. Follow-on PRs will explicitly support and test strings containing nulls for particular use cases. The C API still uses `const char*` to represent strings. As strings containing nulls become better supported by the rest of Binaryen, this will no longer be sufficient. Updating the C and JS APIs to use pointer, length pairs is left as future work.
* Implement bottom heap types (#5115)Thomas Lively2022-10-071-10/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | These types, `none`, `nofunc`, and `noextern` are uninhabited, so references to them can only possibly be null. To simplify the IR and increase type precision, introduce new invariants that all `ref.null` instructions must be typed with one of these new bottom types and that `Literals` have a bottom type iff they represent null values. These new invariants requires several additional changes. First, it is now possible that the `ref` or `target` child of a `StructGet`, `StructSet`, `ArrayGet`, `ArraySet`, or `CallRef` instruction has a bottom reference type, so it is not possible to determine what heap type annotation to emit in the binary or text formats. (The bottom types are not valid type annotations since they do not have indices in the type section.) To fix that problem, update the printer and binary emitter to emit unreachables instead of the instruction with undetermined type annotation. This is a valid transformation because the only possible value that could flow into those instructions in that case is null, and all of those instructions trap on nulls. That fix uncovered a latent bug in the binary parser in which new unreachables within unreachable code were handled incorrectly. This bug was not previously found by the fuzzer because we generally stop emitting code once we encounter an instruction with type `unreachable`. Now, however, it is possible to emit an `unreachable` for instructions that do not have type `unreachable` (but are known to trap at runtime), so we will continue emitting code. See the new test/lit/parse-double-unreachable.wast for details. Update other miscellaneous code that creates `RefNull` expressions and null `Literals` to maintain the new invariants as well.
* Refactor interaction between Pass and PassRunner (#5093)Thomas Lively2022-09-301-15/+14
| | | | | | | | | | | | | | Previously only WalkerPasses had access to the `getPassRunner` and `getPassOptions` methods. Move those methods to `Pass` so all passes can use them. As a result, the `PassRunner` passed to `Pass::run` and `Pass::runOnFunction` is no longer necessary, so remove it. Also update `Pass::create` to return a unique_ptr, which is more efficient than having it return a raw pointer only to have the `PassRunner` wrap that raw pointer in a `unique_ptr`. Delete the unused template `PassRunner::getLast()`, which looks like it was intended to enable retrieving previous analyses and has been in the code base since 2015 but is not implemented anywhere.
* [Inlining] Preserve return_calls when possible (#4589)Thomas Lively2022-04-111-0/+10
| | | | | | | | | We can preserve return_calls in inlined functions when the inlined call site is itself a return_call, since the call result types must transitively match in that case. This solves a problem where the previous inlining logic could introduce stack exhaustion by downgrading recursive return_calls to normal calls. Fixes #4587.
* Use LiteralUtils::canMakeZero before calling makeZero (#4568)Alon Zakai2022-04-011-3/+3
| | | | | Fixes #4562 Fixes #4564
* [EH] Handle nested pops after inlining (#4404)Heejin Ahn2021-12-201-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inlining creates additional `block`s at inlined call sites, which can be inside a `catch`. For example: ```wast (try (do) (catch $tag (call $callee (pop i32) ) ) ) ``` After inlining, this becomes ```wast (try (do) (catch $tag (block $__inlined_func$callee (local.set $0 (pop i32) ;; Invalid!! ) (nop) ) ) ) ``` Now the `pop` is nested in a `block`, which makes this invalid. This PR runs `EHUtils::handleBlockNestedPops` at the end to assign the `pop` to a local right after the `catch`, making the code valid again: ```wast (try (do) (catch $tag (local.set $new ;; New local to store `pop` result (pop i32) ) (block $__inlined_func$callee (local.set $0 (local.get $new) ) (nop) ) ) ) ```
* Modernize code to C++17 (#3104)Max Graey2021-11-221-3/+1
|
* Change from storing Signature to HeapType on CallIndirect (#4352)Thomas Lively2021-11-221-7/+6
| | | | | | | | | | | | With nominal function types, this change makes it so that we preserve the identity of the function type used with call_indirect instructions rather than recreating a function heap type, which may or may not be the same as the originally parsed heap type, from the function signature during module writing. This will simplify the type system implementation by removing the need to store a "canonical" nominal heap type for each unique signature. We previously depended on those canonical types to avoid creating multiple duplicate function types during module writing, but now we aren't creating any new function types at all.
* Fix inlining name collision (#4206)Alon Zakai2021-10-041-0/+17
|
* Disable partial inlining by default and add a flag for it. (#4191)Alon Zakai2021-09-271-2/+1
| | | | | Locally I saw a 10% speedup on j2cl but reports of regressions have arrived, so let's disable it for now pending investigation. The option added here should make it easy to experiment.
* Inlining: Remove unneeded functions in linear time (#4190)Alon Zakai2021-09-271-2/+7
| | | | | | | | | | By mistake the recent partial inlining work introduced quadratic time into the compiler: erasing a function from the list of functions takes linear time, which is why we have removeFunctions that does a group at a time. This isn't noticeable on small programs, but on j2cl output this makes the inlining-optimizing step 2x faster. See #4165
* Fix an unused variable warning. NFC (#4170)walkingeyerobot2021-09-201-0/+1
|
* Partial inlining via function splitting (#4152)Alon Zakai2021-09-171-40/+588
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This PR helps with functions like this: function foo(x) { if (x) { .. lots of work here .. } } If "lots of work" is large enough, then we won't inline such a function. However, we may end up calling into the function only to get a false on that if and immediately exit. So it is useful to partially inline this function, basically by creating a split of it into a condition part that is inlineable function foo$inlineable(x) { if (x) { foo$outlined(); } } and an outlined part that is not inlineable: function foo$outlined(x) { .. lots of work here .. } We can then inline the inlineable part. That means that a call like foo(param); turns into if (param) { foo$outlined(); } In other words, we end up replacing a call and then a check with a check and then a call. Any time that the condition is false, this will be a speedup. The cost here is increased size, as we duplicate the condition into the callsites. For that reason, only do this when heavily optimizing for size. This is a 10% speedup on j2cl. This helps two types of functions there: Java class inits, which often look like "have I been initialized before? if not, do all this work", and also assertion methods which look like "if the input is null, throw an exception".
* Inlining: Track names over multiple iterations, not pointers (#4127)Alon Zakai2021-09-071-2/+6
| | | | | | | | It can be confusing during debugging to keep a map of pointers when we might have removed some of those functions from the module meanwhile (if you iterate over it in some additional debug logging). This change has no observable effect, however, as no bug could have actually occurred in practice given that nothing is done with the pointers in the actual code.
* Improve inlining limits for recursion (#4067)Alon Zakai2021-08-101-18/+50
| | | | | | | | | | | | | | Previously we would keep doing iterations of inlining until we hit the number of functions in the module (at which point, we would definitely know we are recursing). This prevents infinite recursion, but it can take a very very long time to notice that in a huge module with one tiny recursive function and 100,000 other ones. To do better than that, track how many times we've inlined into a function. After a fixed number of such inlinings, stop. Aside from avoding very slow compile times, such infinite recursion likely is not that beneficial to do a great many times, so anyhow it is best to stop after a few iterations.
* [Wasm GC] Add a isNonNullable() convenience method. NFC (#3978)Alon Zakai2021-07-121-1/+1
|
* Set an absolute maximum inlining limit (#3959)Alon Zakai2021-07-011-0/+21
| | | | | | | | | | | We have seen some cases in both Chrome and Firefox where extremely large modules cause overhead, #3730 (comment) (and link therein) emscripten-core/emscripten#13899 (comment) There is no "right" value to use as a limit here, but pick an arbitrary one that is very high. (This setting is verified to have no effect on the emscripten benchmark suite.)
* Preserve Function HeapTypes (#3952)Thomas Lively2021-06-301-4/+4
| | | | | | | | | When using nominal types, func.ref of two functions with identical signatures but different HeapTypes will yield different types. To preserve these semantics, Functions need to track their HeapTypes, not just their Signatures. This PR replaces the Signature field in Function with a HeapType field and adds new utility methods to make it almost as simple to update and query the function HeapType as it was to update and query the Function Signature.
* [Wasm GC] Fix inlining + non-nullable locals (#3945)Alon Zakai2021-06-181-1/+8
| | | | | | We don't need to assign a zero value for such locals (and we can't, as no default value exists for them). Fixes #3944
* [Wasm GC] Do not inline a function with an RTT parameter (#3808)Alon Zakai2021-04-141-2/+17
| | | | | Inlined parameters become locals, and rtts cannot be handled as locals, unlike non-nullable values which we can at least fix up. So do not inline functions with rtt params.
* Scan module-level code in necessary places (#3744)Alon Zakai2021-03-291-11/+3
| | | | | | | | | | | | | | | | | | Several old passes like DeadArgumentElimination and DuplicateFunctionElimination need to look at all ref.funcs, and they scanned functions for that, but that is not enough as such an instruction might appear in a global initializer. To fix this, add a walkModuleCode method. walkModuleCode is useful when doing the pattern of creating a function-parallel pass to scan functions quickly, but we also want to do the same scanning of code at the module level. This allows doing so in a single line. (It is also possible to just do walk() on the entire module, which will find all code, but that is not function-parallel. Perhaps we should have a walkParallel() option to simplify this further in a followup, and that would call walkModuleCode afterwards etc.) Also add some missing validation and comments in the validator about issues that I noticed in relation to the new testcases here.
* [RT] Support expressions in element segments (#3666)Abbas Mashayekh2021-03-241-5/+3
| | | | | | This PR adds support for `ref.null t` as a valid element segment item. The abbreviated format of `(elem ... func $f $g...)` is kept in both printing and binary emitting if all items are `ref.func`s. Public APIs aren't updated in this PR.
* [Wasm GC] Add support for non-nullable types, all except for locals (#3710)Alon Zakai2021-03-231-0/+2
| | | | | | | | | | | | | | | | | | | | | | After this PR we still do not support non-nullable locals. But we no longer turn all types into nullable upon load. In particular, we support non-nullable types on function parameters and struct fields, etc. This should be enough to experiment with optimizations in both binaryen and in VMs regarding non- nullability (since we expect that optimizing VMs can do well inside functions anyhow; it's non-nullability across calls and from data that the VM can't be expected to think about). Let is handled as before, by lowering it into gets and sets. In addition, we turn non-nullable locals into nullable ones, and add a ref.as_non_null on all their gets (to keep the type identical there). This is used not just for loading code with a let but also is needed after inlining. Most of the code changes here are removing FIXMEs for allowing non-nullable types. But there is also code to handle the issues mentioned above. Most of the test updates are removing extra nulls that we added before when we turned all types nullable. A few tests had actual issues, though, and also some new tests are added to cover the code changes here.
* [reference-types] Support passive elem segments (#3572)Abbas Mashayekh2021-03-051-5/+3
| | | | | | | | | | | Passive element segments do not belong to any table, so the link between Table and elem needs to be weaker; i.e. an elem may have a table in case of active segments, or simply be a collection of function references in case of passive/declarative segments. This PR takes Table::Segment out and turns it into a first class module element just like tables and functions. It also implements early support for parsing, printing, encoding and decoding passive/declarative elem segments.