forks/binaryen.git -

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	Cost analysis: Remove "Unacceptable" hack (#6782)	Alon Zakai	2024-07-25	1	-25/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We marked various expressions as having cost "Unacceptable", fixed at 100, to ensure we never moved them out from an If arm, etc. Giving them such a high cost avoids that problem - the cost is higher than the limit we have for moving code from conditional to unconditional execution - but it also means the total cost is unrealistic. For example, a function with one such instruction + an add (cost 1) would end up with cost 101, and removing the add would look insignificant, which causes issues for things that want to compare costs (like Monomorphization). To fix this, adjust some costs. The main change here is to give casts a cost of 5. I measured this in depth, see the attached benchmark scripts, and it looks clear that in both V8 and SpiderMonkey the cost of a cast is high enough to make it not worth turning an if with ref.test arm into a select (which would always execute the test). Other costs adjusted here matter a lot less, because they are on operations that have side effects and so the optimizer will anyhow not move them from conditional to unconditional execution, but I tried to make them a bit more realistic while I was removing "Unacceptable": * Give most atomic operations the 10 cost we've been using for atomic loads/ stores. Perhaps wait and notify should be slower, however, but it seems like assuming fast switching might be more relevant. * Give growth operations a cost of 20, and throw operations a cost of 10. These numbers are entirely made up as I am not even sure how to measure them in a useful way (but, again, this should not matter much as they have side effects).
*	[threads] Calculate shared heap type depths in subtypes.h (#6777)	Thomas Lively	2024-07-23	1	-7/+14
\| \| \|	Fixes #6776.
*	[NFC] Clarify and standardize order in flexibleCopy (#6749)	Alon Zakai	2024-07-16	2	-1/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	flexibleCopy always visited parents before children, but it visited vector children in reverse order: (call ;; 1 (call $a) ;; 3 (call $b) ;; 2 ) The order of children happened to not matter in any user of this code, and that's just what you get when you iterate over children in a vector and push them to a stack before visiting them, so this odd ordering was not noticed. For a new user I will introduce soon, however, it would be nice to have the normal pre-order: (call ;; 1 (call $a) ;; 2 (call $b) ;; 3 ) (2 & 3 swapped). This cannot be tested in the current code as it is NFC, but the later PR will depend on it and test it heavily.
*	Monomorphize dropped functions (#6734)	Alon Zakai	2024-07-12	3	-0/+139
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We now consider a drop to be part of the call context: If we see (drop (call $foo) ) (func $foo (result i32) (i32.const 42) ) Then we'd monomorphize to this: (call $foo_1) ;; call the specialized function instead (func $foo_1 ;; the specialized function returns nothing (drop ;; the drop was moved into here (i32.const 42) ) ) With the drop now in the called function, we may be able to optimize out unused work. Refactor a bit of code out of DAE that we can reuse here, into a new return-utils.h.
*	[threads] ref.i31_shared (#6735)	Thomas Lively	2024-07-12	1	-1/+2
\| \| \| \| \| \| \|	Implement `ref.i31_shared` the new instruction for creating references to shared i31s. Implement binary and text parsing and emitting as well as interpretation. Copy the upstream spec test for i31 and modify it so that all the heap types are shared. Comment out some parts that we do not yet support.
*	[wasm-split] Use a fresh table when reference types are enabled (#6726)	Thomas Lively	2024-07-11	1	-13/+18
\| \| \| \| \| \| \|	Rather than trying to trampoline primary-to-secondary calls through an existing table, just create a fresh table for this purpose. This ensures that modifications to the existing tables cannot interfere with primary-to-secondary calls and conversely that loading the secondary module cannot overwrite modifications to the tables.
*	Monomorphization: Optimize constants (#6711)	Alon Zakai	2024-07-11	3	-1/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously the pass would monomorphize a call when we were sending more refined types than the target expects. This generalizes the pass to also consider the case where we send a constant in a parameter. To achieve that, this refactors the pass to explicitly define the "call context", which is the code around the call (inputs and outputs) that may end up leading to optimization opportunities when combined with the target function. Also add comments about the overall design + roadmap. The existing test is mostly unmodified, and the diff there is smaller when ignoring whitespace. We do "regress" those tests by adding more local.set operations, as in the refactoring that makes things a lot simpler, that is, to handle the general case of an operand having either a refined type or be a constant, we copy it inside the function, which works either way. This "regression" is only in the testing version of the pass (the normal version runs optimizations, which would remove that extra code). This also enables the pass when GC is disabled. Previously we only handled refined types, so only GC could benefit. Add a test for MVP content specifically to show we operate there as well.
*	Rename external conversion instructions (#6716)	Jérôme Vouillon	2024-07-08	5	-9/+9
\| \| \| \| \| \| \| \| \|	Rename instructions `extern.internalize` into `any.convert_extern` and `extern.externalize` into `extern.convert_any` to follow more closely the spec. This was changed in https://github.com/WebAssembly/gc/issues/432. The legacy name is still accepted in text inputs and in the C and JS APIs.
*	[WasmGC] Add missing ArrayNew variants to Properties::isGenerative (#6691)	Alon Zakai	2024-06-24	1	-0/+2
\| \| \|	Fixes #6690
*	[threads] Shared basic heap types (#6667)	Thomas Lively	2024-06-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Implement binary and text parsing and printing of shared basic heap types and incorporate them into the type hierarchy. To avoid the massive amount of code duplication that would be necessary if we were to add separate enum variants for each of the shared basic heap types, use bit 0 to indicate whether the type is shared and replace `getBasic()` with `getBasic(Unshared)`, which clears that bit. Update all the use sites to record whether the original type was shared and produce shared or unshared output without code duplication.
*	[threads] Parse, build, and print shared composite types (#6654)	Thomas Lively	2024-06-12	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Parse the text format for shared composite types as described in the shared-everything thread proposal. Update the parser to use 'comptype' instead of 'strtype' to match the final GC spec and add the new syntactic class 'sharecomptype'. Update the type canonicalization logic to take sharedness into account to avoid merging shared and unshared types. Make the same change in the TypeMerging pass. Ensure that shared and unshared types cannot be in a subtype relationship with each other. Follow-up PRs will add shared abstract heap types, binary parsing and emitting for shared types, and fuzzer support for shared types.
*	[DebugInfo] Copy debug info in call-utils.h (#6652)	Alon Zakai	2024-06-12	4	-15/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We automatically copy debuginfo in replaceCurrent(), but there are a few places that do other operations than simple replacements. call-utils.h will turn a call_ref with a select target into two direct calls, and we were missing the logic to copy debuginfo from the call_ref to the calls. To make this work, refactor out the copying logic from wasm-traversal, into debuginfo.h, and use it in call-utils.h. debuginfo.h itself is renamed from debug.h (as now this needs to be included from wasm-traversal, which nearly everything does, and it turns out some files have internal stuff like a debug() helper that ends up conflicing with the old debug namespace). Also rename the old copyDebugInfo function to copyDebugInfoBetweenFunctions which is more explicit. That is also moved from the header to a cpp file because it depends on wasm-traversal (so we'd end up with recursive headers otherwise). That is fine, as that method is called after copying a function, which is not that frequent. The new copyDebugInfoToReplacement (which was refactored out of wasm-traversal) is in the header because it can be called very frequently (every single instruction we optimize) and we want it to get inlined.
*	Fix wasm-split bug in absence of active element segments (#6651)	Thomas Lively	2024-06-11	1	-7/+13
\| \| \| \| \| \| \| \|	The module splitting code incorrectly assumed that there would be at least one active element segment and failed to initialize the table slot manager with a function table if that was not the case. Fix the bug by setting the table even when there are no active segments and add a test. Fixes #6572 and #6637.
*	Effects: Add missing combining logic for MayNotReturn (#6635)	Alon Zakai	2024-06-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Without that logic we could end up dropping that particular effect. This actually made a test pass when it should not: the modified test here has a function with effects that are ok to remove, but it had a loop which adds MayNotReturn which we should actually not remove, so it was removed erroneously. To fix the test, add other effects there (local ones) that we can see are removable. Also add a function with a loop to test that we do not remove an infinite loop, which adds coverage for the fix here.
*	Avoid duplicate type names (#6633)	Alon Zakai	2024-05-30	1	-3/+20
\| \| \| \| \|	If we replace a type with another, use the original name for the new type, and give the old a unique name (for the rare cases in which it has uses).
*	SignaturePruning: Properly handle public types (#6630)	Alon Zakai	2024-05-29	2	-11/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The SignaturePruning pass optimizes away parameters that it proves are safe to remove. It turns out that that does not always match the definition of private types, which is more restrictive. Specifically, if say all the types are in one big rec group and one of them is used on an exported function then all of them are considered public (as the rec group is). However, in closed world, it would be ok to leave that rec group unchanged but to create a pruned version of that type and use it, in cases where we see it is safe to remove a parameter. (See the testcase for a concrete example.) To put it another way, SignaturePruning already proves that a parameter is safe to remove in all the ways that matter. Before this PR, however, the testcase in this PR would error - so this PR is not an optimization but a bugfix, really - because SignaturePruning would see that a parameter is safe to remove but then TypeUpdating would see the type is public and so it would leave it alone, leading to a broken module. This situation is in fact not that rare, and happens on real-world Java code. The reason we did not notice it before is that typically there are no remaining SignaturePruning opportunities late in the process (when other closed world optimizations have typically led to a single big rec group). The concrete fix here is to add additionalPrivateTypes to a few more places in TypeUpdating. We already supported that for cases where a pass knew better than the general logic what can be modified, and this adds that ability to the signature-rewriting logic there. Then SignaturePruning can send in all the types it has proven are safe to modify. * Also necessary here is to only add from additionalPrivateTypes if the type is not already in our list (or we'd end up with duplicates in the final rec group). * Also move newSignatures in SignaturePruning out of the top level, which was confusing (the pass has multiple iterations, and we want each to have a fresh instance).
*	Fix FlatTable for table64 (#6598)	Sam Clegg	2024-05-15	1	-1/+1
\|
*	[Strings] Remove operations not included in imported strings (#6589)	Thomas Lively	2024-05-15	3	-68/+15
\| \| \| \| \| \|	The stringref proposal has been superseded by the imported JS strings proposal, but the former has many more operations than the latter. To reduce complexity, remove all operations that are part of stringref but not part of imported strings.
*	[Strings] Remove stringview types and instructions (#6579)	Thomas Lively	2024-05-15	7	-126/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The stringview types from the stringref proposal have three irregularities that break common invariants and require pervasive special casing to handle properly: they are supertypes of `none` but not subtypes of `any`, they cannot be the targets of casts, and they cannot be used to construct nullable references. At the same time, the stringref proposal has been superseded by the imported strings proposal, which does not have these irregularities. The cost of maintaing and improving our support for stringview types is no longer worth the benefit of supporting them. Simplify the code base by entirely removing the stringview types and related instructions that do not have analogues in the imported strings proposal and do not make sense in the absense of stringviews. Three remaining instructions, `stringview_wtf16.get_codeunit`, `stringview_wtf16.slice`, and `stringview_wtf16.length` take stringview operands in the stringref proposal but cannot be removed because they lower to operations from the imported strings proposal. These instructions are changed to take stringref operands in Binaryen IR, and to allow a graceful upgrade path for users of these instructions, the text and binary parsers still accept but ignore `string.as_wtf16`, which is the instruction used to convert stringrefs to stringviews. The binary writer emits code sequences that use scratch locals and `string.as_wtf16` to keep the output valid. Future PRs will further align binaryen with the imported strings proposal instead of the stringref proposal, for example by making `string` a subtype of `extern` instead of a subtype of `any` and by removing additional instructions that do not have analogues in the imported strings proposal.
*	Source maps: Allow specifying that an expression has no debug info in text ↵	Jérôme Vouillon	2024-05-14	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \|	(#6520) ;;@ with nothing else (no source:line) can be used to specify that the following expression does not have any debug info associated to it. This can be used to stop the automatic propagation of debug info in the text parsers. The text printer has also been updated to output this comment when needed.
*	[NFC] Add printing for EffectAnalyzer (#6586)	Alon Zakai	2024-05-14	3	-0/+147
\| \| \| \| \| \| \| \| \|	With this you can do std::cout << effects and get something like EffectAnalyzer { writesMemory hasSideEffects }
*	LocalCSE: Check effects/generativity early (#6587)	Alon Zakai	2024-05-14	2	-17/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously we checked late, and as a result might end up failing to optimize when a sub-pattern could have worked. E.g. (call (A) ) (call (A) ) The call cannot be optimized, but the A pattern repeats. Before this PR we'd greedily focus on the entire call and then fail. After this PR we skip the call before we commit to which patterns to try to optimize, so we succeed. Add a isShallowlyGenerative helper here as we compute this step by step as we go. Also remove a parameter to the generativity code (it did not use the features it was passed).
*	[StackIR] Run StackIR during binary writing and not as a pass (#6568)	Alon Zakai	2024-05-09	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously we had passes --generate-stack-ir, --optimize-stack-ir, --print-stack-ir that could be run like any other passes. After generating StackIR it was stashed on the function and invalidated if we modified BinaryenIR. If it wasn't invalidated then it was used during binary writing. This PR switches things so that we optionally generate, optimize, and print StackIR only during binary writing. It also removes all traces of StackIR from wasm.h - after this, StackIR is a feature of binary writing (and printing) logic only. This is almost NFC, but there are some minor noticeable differences: 1. We no longer print has StackIR in the text format when we see it is there. It will not be there during normal printing, as it is only present during binary writing. (but --print-stack-ir still works as before; as mentioned above it runs during writing). 2. --generate/optimize/print-stack-ir change from being passes to being flags that control that behavior instead. As passes, their order on the commandline mattered, while now it does not, and they only "globally" affect things during writing. 3. The C API changes slightly, as there is no need to pass it an option "optimize" to the StackIR APIs. Whether we optimize is handled by --optimize-stack-ir which is set like other optimization flags on the PassOptions object, so we don't need the old option to those C APIs. The main benefit here is simplifying the code, so we don't need to think about StackIR in more places than just binary writing. That may also allow future improvements to our usage of StackIR.
*	Fuzzer: Stop emitting nullable stringviews (#6574)	Alon Zakai	2024-05-08	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	As of https://chromium-review.googlesource.com/c/v8/v8/+/5471674 V8 requires stringviews to be non-nullable. It might be possible to make that change in our IR, or to remove views entirely, but for now this PR makes the fuzzer stop emitting nullable stringviews as a workaround to allow us to fuzz current V8. There are still rare corner cases where this pattern is emitted, that we have not tracked down, and so this also makes the fuzzer ignore the error for now.
*	wasm-split: Handle RefFuncs (#6513)	Alon Zakai	2024-05-08	1	-8/+85
\| \| \| \| \|	When we have a ref.func that refers to the secondary module then make a trampoline that calls it directly. The trampoline's call is then fixed up like all direct calls to the secondary module.
*	Respect the Web limitation on Table size (#6567)	Alon Zakai	2024-05-01	1	-0/+1
\| \| \| \| \|	Without this the fuzzer can error on differences in behavior between V8 and us. Also move the limitations constants to their own header.
*	[jspi] - Support new version of JSPI for module splitting. (#6546)	Brendan Dahl	2024-04-29	1	-6/+20
\| \| \| \| \| \|	With the old version of JSPI, the JSPI pass was required to be run before splitting and would automatically add an export to be able to find the load_secondary_module function. Now that the pass is no longer needed, just add an import manually for the load_secondary_module function.
*	[Strings] Fix effects of string.compare and add fuzzing (#6547)	Alon Zakai	2024-04-25	1	-1/+8
\| \| \| \| \| \| \| \|	We added string.compare late in the spec process, and forgot to add effects for it. Unlike string.eq, it can trap. Also use makeTrappingRefUse in recent fuzzer string generation places that I forgot, which should reduce the amount of traps in fuzzer output.
*	GUFA: Handle bottom types in filterDataContents() (#6545)	Alon Zakai	2024-04-25	1	-1/+7
\| \| \| \| \| \|	Normally a bottom type cannot reach there, as we ignore unreachable GC operations early on. However, we can infer a bottom type later during the flow, so we need to handle that (just not error on it, and for clarity during debugging we also clear the contents).
*	[wasm-split] Do not split out functions referring to segments (#6517)	Thomas Lively	2024-04-23	1	-4/+62
\| \| \| \| \| \| \|	Since data and elem segments cannot be imported or exported, there is no way to access them from the secondary module, so functions that need to refer to them cannot be split out. Fixes #6512.
*	[Parser] Match legacy parser block naming (#6504)	Thomas Lively	2024-04-16	1	-3/+5
\| \| \| \|	To reduce the size of the test output diff when switching to the new text parser, update it to generate the same block names as the legacy parser.
*	[Parser] Pop past unreachables where possible (#6489)	Thomas Lively	2024-04-16	1	-0/+1118
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We previously would eagerly drop all concretely typed expressions on the stack when pushing an unreachable instruction. This was semantically correct and closely modeled the semantics of unreachable instructions, which implicitly drop the entire stack and start a new polymorphic stack. However, it also meant that the structure of the parsed IR did not match the structure of the folded input, which meant that tests involving unreachable children would not parse as intended, preventing the test from testing the intended behavior. For example, this wat: ```wasm (i32.add (i32.const 0) (unreachable) ) ``` Would previously parse into this IR: ```wasm (drop (i32.const 0) ) (i32.add (unreachable) (unreachable) ) ``` To fix this problem, we need to stop eagerly dropping stack values when encountering an unreachable instruction so we can still pop expressions pushed before the unreachable as direct children of later instructions. In the example above, we need to keep the `i32.const 0` on the stack so it is available to be popped and become a child of the `i32.add`. However, the naive solution of simply popping past unreachables would produce invalid IR in some cases. For example, consider this wat: ```wasm f32.const 0 unreachable i32.add ``` The naive solution would parse this wat into this IR: ```wasm (i32.add (f32.const 0) (unreachable) ) ``` But we do not want to parse an `i32.add` with an `f32`-typed child. Neither do we want to reject this input, since it is a perfectly valid Wasm fragment. In this case, we actually want the old behavior of dropping the `f32.const` and replacing it with another `unreachable` as the first child of the `i32.add`. To both match the input structure where possible and also gracefully fall back to the old behavior of dropping expressions prior to the unreachable, collect constraints on the types of each child for each kind of expression and compare them to the types of available expressions on the stack when an unreachable instruction will be popped. When the constraints are satisfied, pop expressions normally, even after popping the unreachable instruction. Otherwise, drop the instructions that precede the unreachable instruction to ensure we parse valid IR. To collect the constraints, add a new `ChildTyper` utility that calls a different callback for each kind of possible type constraint for each child. In the future, this utility can be used to simplify the validator as well.
*	Fix isGenerative on calls and test via improving ↵	Alon Zakai	2024-04-11	1	-6/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	OptimizeInstructions::areConsecutiveInputsEqual() (#6481) "Generative" is what we call something like a struct.new that may be syntactically identical to another struct.new, but each time a new value is generated. The same is true for calls, which can do anything, including return a different value for syntactically identical calls. This was not a bug because the main user of isGenerative, areConsecutiveInputsEqual(), was too weak to notice, that is, it gave up sooner, for other reasons. This PR improves that function to do a much better check, which makes the fix necessary to prevent regressions. This is not terribly important for itself, but will help a later PR that will add code that depends more heavily on areConsecutiveInputsEqual().
*	Fixes regarding explicit names (#6466)	Jérôme Vouillon	2024-04-11	2	-4/+16
\| \| \| \| \| \| \|	- Only write explicit function names. - When merging modules, the name of types, globals and tags in all modules but the first were lost. - Set name as explicit when copying a function with a new name.
*	GUFA: Fix signed reads of packed GC data (#6494)	Alon Zakai	2024-04-11	2	-1/+73
\| \| \| \| \| \| \|	GUFA already truncated packed fields on write, which is enough for unsigned gets, but for signed gets we also need to sign them on reads. Similar to #6493 but for GUFA. Also found by #6486
*	Fix ConstantFieldPropagation signed packed field handling and improve ↵	Alon Zakai	2024-04-11	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Heap2Local's (#6493) CFP already had logic for truncating but not for sign-extending, which this fixes. Use the new helper function in Heap2Local as well. This changes the model there from "truncate on set, sign-extend on get" to "truncate or sign-extend on get". That is both simpler by reusing the same logic as CFP but also more optimal: the idea to truncate on sets made sense since sets are rarer, but if we must then sign-extend on gets then we can end up doing more work overall (as the truncations on sets are not needed if all gets are signed). Found by #6486
*	Asyncify: Fix nondeterminism in verbose logging (#6479)	Alon Zakai	2024-04-09	1	-0/+7
\| \| \| \|	#6457 added a test that exposed existing nondeterminism.
*	Handle return calls correctly	Thomas Lively	2024-04-08	2	-35/+93
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a combined commit covering multiple PRs fixing the handling of return calls in different areas. The PRs are all landed as a single commit to ensure internal consistency and avoid problems with bisection. Original PR descriptions follow: * Fix inlining of `return_call` (#6448) Previously we transformed return calls in inlined function bodies into normal calls followed by branches out to the caller code. Similarly, when inlining a `return_call` callsite, we simply added a `return` after the body inlined at the callsite. These transformations would have been correct if the semantics of return calls were to call and then return, but they are not correct for the actual semantics of returning and then calling. The previous implementation is observably incorrect for return calls inside try blocks, where the previous implementation would run the inlined body within the try block, but the proper semantics would be to run the inlined body outside the try block. Fix the problem by transforming inlined return calls to branches followed by calls rather than as calls followed by branches. For the case of inlined return call callsites, insert branches out of the original body of the caller and inline the body of the callee as a sibling of the original caller body. For the other case of return calls appearing in inlined bodies, translate the return calls to branches out to calls inserted as siblings of the original inlined body. In both cases, it would have been convenient to use multivalue block return to send call parameters along the branches to the calls, but unfortunately in our IR that would have required tuple-typed scratch locals to unpack the tuple of operands at the call sites. It is simpler to just use locals to propagate the operands in the first place. Fix interpretation of `return_call` (#6451) We previously interpreted return calls as calls followed by returns, but that is not correct both because it grows the size of the execution stack and because it runs the called functions in the wrong context, which can be observable in the case of exception handling. Update the interpreter to handle return calls correctly by adding a new `RETURN_CALL_FLOW` that behaves like a return, but carries the arguments and reference to the return-callee rather than normal return values. `callFunctionInternal` is updated to intercept this flow and call return-called functions in a loop until a function returns with some other kind of flow. Pull in the upstream spec tests return_call.wast, return_call_indirect.wast, and return_call_ref.wast with light editing so that we parse and validate them successfully. Handle return calls in wasm-ctor-eval (#6464) When an evaluated export ends in a return call, continue evaluating the return-called function. This requires propagating the parameters, handling the case that the return-called function might be an import, and fixing up local indices in case the final function has different parameters than the original function. * Update effects.h to handle return calls correctly (#6470) As far as their surrounding code is concerned return calls are no different from normal returns. It's only from a caller's perspective that a function containing a return call also has the effects of the return-callee. To model this more precisely in EffectAnalyzer, stash the throw effect of return-callees on the side and only merge it in at the end when analyzing the effects of a full function body.
*	Asyncify-verbose: Show all reasons why a function is instrumented (#6457)	Dannii Willis	2024-04-08	1	-8/+16
\| \| \| \|	Helps emscripten-core/emscripten#17380 by logging all the reasons why we instrument a function, and not just the first as we did before.
*	[NFC] Refactor Heap2Local logic (#6473)	Alon Zakai	2024-04-06	2	-3/+13
\| \| \| \| \| \|	Separate out an EscapeAnalyzer class that does the escape analysis, and a Struct2Local one that does the optimization. Also make a few things const here to be safer.
*	[NFC] Remove unused variables (#6475)	Thomas Lively	2024-04-05	1	-2/+2
\| \| \|	These were causing build failures on the Emscripten builder.
*	[NFC] Generalize and simplify wasm-delegations-fields.h (#6465)	Alon Zakai	2024-04-03	2	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This removes the hard-coded generation of a switch and cases, and allows the user to define the boilerplate at the start and end of the main output, and of what is generated for each expression. By default we still emit a switch and cases. Also standardize the output by never emitting ; unnecessarily, which we were inconsistent about. This serves two goals: First, it will make using embind on Binaryen simpler as embind needs to generate C++ template logic for each expression, and not a switch (and we cannot have extra ; in embind notation). Second, this makes the format much simple to parse, which is a stepping stone for #6460, e.g. before we had case Expression::Id::LoopId: { DELEGATE_START(Loop); DELEGATE_FIELD_CHILD(Loop, body); DELEGATE_FIELD_SCOPE_NAME_DEF(Loop, name); DELEGATE_END(Loop); break; } and now we have DELEGATE_FIELD_CASE_START(Loop) DELEGATE_FIELD_CHILD(Loop, body) DELEGATE_FIELD_SCOPE_NAME_DEF(Loop, name) DELEGATE_FIELD_CASE_END(Loop) The main part of this diff was autogenerated by this python: for l in x.splitlines(): if l.startswith(' case'): id = l.split(':')[4][:-2] print(f'DELEGATE_FIELD_CASE_START({id})') if l.startswith(' DELEGATE_FIELD'): print(l) if l.startswith(' DELEGATE_END'): id = l[17:-2] print(f'DELEGATE_FIELD_CASE_END({id})') print()
*	GUFA: Fix hashing of GlobalInfo's type (#6455)	Alon Zakai	2024-03-29	1	-2/+6
\| \| \| \| \| \| \| \| \|	For a global we store the name and a type, and the type may be more precise than the global's type in the wasm. As a result, when hashing, it is not enough to hash only the name, so hash the type as well. Also add a random TODO as a comment.
*	GUFA: Fix nondeterminism in roots (#6456)	Alon Zakai	2024-03-29	1	-1/+4
\| \| \| \| \|	Found by the fuzzer. We already processed the work queue in a deterministic order, but the roots were unordered. The work queue's initial state is filled by the roots, so we must process the roots deterministically as well.
*	Typed continuations: suspend instructions (#6393)	Frank Emrich	2024-03-19	5	-1/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This PR is part of a series that adds basic support for the [typed continuations/wasmfx proposal](https://github.com/wasmfx/specfx). This particular PR adds support for the `suspend` instruction for suspending with a given tag, documented [here](https://github.com/wasmfx/specfx/blob/main/proposals/continuations/Overview.md#instructions). These instructions are of the form `(suspend $tag)`. Assuming that `$tag` is defined with _n_ `param` types `t_1` to `t_n`, the instruction consumes _n_ arguments of types `t_1` to `t_n`. Its result type is the same as the `result` type of the tag. Thus, the folded textual representation looks like `(suspend $tag arg1 ... argn)`. Support for the instruction is implemented in both the old and the new wat parser. Note that this PR does not implement validation of the new instruction. This PR also fixes finalization of `cont.new`, `cont.bind` and `resume` nodes in those cases where any of their children are unreachable.
*	[NFC] Refactor ChildLocalizer to handle unreachable code better (#6394)	Alon Zakai	2024-03-14	1	-17/+85
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is NFC in the current users, but is necessary functionality for a later PR. ChildLocalizer moves children into locals as needed. It used to stop when it saw the first unreachable. After this change we move such unreachable children out of the parent as well, making this more uniform: all interacting effects are moved out, and all that is left nested in the parent can be moved around and removed as desired. Also add a getReplacement helper that makes using this easier. This cannot be tested comprehensively with the current user as that user will not call this code path on an unreachable parent at all, so this just adds what can be tested. The later PR will have tests for all corner cases.
*	Fix EH fuzz bugs (#6381)	Thomas Lively	2024-03-07	1	-1/+1
\| \| \| \| \|	Due to a typo, the fuzzer was making externrefs when it should have been making exnrefs. Fix that and also let eh-utils.cpp know that TryTable exists to avoid an assertion failure.
*	Add sourcemap support to wasm-metadce and wasm-merge (#6372)	Jérôme Vouillon	2024-03-06	3	-6/+68
\|
*	Typed continuations: cont.bind instructions (#6365)	Frank Emrich	2024-03-04	6	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This PR is part of a series that adds basic support for the [typed continuations/wasmfx proposal](https://github.com/wasmfx/specfx). This particular PR adds support for the `cont.bind` instruction for partially applying continuations, documented [here](https://github.com/wasmfx/specfx/blob/main/proposals/continuations/Overview.md#instructions). In short, these instructions are of the form `(cont.bind $ct_before $ct_after)` where `$ct_before` and `$ct_after` are related continuation types. They must only differ in the number of arguments, where `$ct_before` has _n_ additional parameters as compared to `$ct_after`, for some _n_ ≥ 0. The idea is that `(cont.bind $ct_before $ct_after)` then takes a reference to a continuation of type `$ct_before` as well as _n_ operands and returns a (reference to a) continuation of type `$ct_after`. Thus, the folded textual representation looks like `(cont.bind $ct_before $ct_after arg1 ... argn c)`. Support for the instruction is implemented in both the old and the new wat parser. Note that this PR does not implement validation of the new instruction.
*	[NFC] Add some comments about flow in SubtypingDiscoverer and Unsubtyping ↵	Alon Zakai	2024-02-28	1	-0/+7
\| \| \| \| \| \| \|	(#6359) I audited all of SubtypingDiscoverer for flow/non-flow constraints and added some comments to clarify things for our future selves if we ever need to generalize it.