forks/binaryen.git -

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[threads] Shared basic heap types (#6667)	Thomas Lively	2024-06-19	3	-89/+138
\| \| \| \| \| \| \| \| \| \| \|	Implement binary and text parsing and printing of shared basic heap types and incorporate them into the type hierarchy. To avoid the massive amount of code duplication that would be necessary if we were to add separate enum variants for each of the shared basic heap types, use bit 0 to indicate whether the type is shared and replace `getBasic()` with `getBasic(Unshared)`, which clears that bit. Update all the use sites to record whether the original type was shared and produce shared or unshared output without code duplication.
*	[Parser] Fix bug in unreachable fallback logic (#6676)	Thomas Lively	2024-06-18	1	-13/+18
\| \| \| \| \| \| \| \| \|	When popping past an unreachable instruction would lead to popping from an empty stack or popping an incorrect type, we need to avoid popping and produce new Unreachable instructions instead to ensure we parse valid IR. The logic for this was flawed and made the synthetic Unreachable come before the popped unreachable child, which was not correct in the case that that popped unreachable was a branch or other non-trapping instruction. Fix and simplify the logic and re-enable the spec test that uncovered the bug.
*	Reject invalid section IDs (#6675)	Thomas Lively	2024-06-18	1	-6/+7
\| \| \| \| \| \|	Rather than treating them as custom sections. Also fix UB where invalid `Section` enum values could be used as keys in a map. Use the raw `uint8_t` section IDs as keys instead. Re-enable a disabled spec test that was failing because of this bug and UB.
*	Fix DataSegment name handling (#6673)	Alon Zakai	2024-06-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The code used i instead of index, as in this pseudocode: for i in range(num_names): index = readU32LEB() # index of the data segment to name name = readName() # name to give that segment data[i] = name # XXX 'i' should be 'index' That (funnily enough) happened to always work before since we write names in order. That is, normally given segments A,B,C we'd write then in the names section as A,B,C. Then the reader, which had the bug, would always have i and index identical in value anyhow. But if a wasm producer used different indexes, a problem could happen. To test this, add a binary file that has a reversed name section. Fixes #6672
*	[threads] Binary reading and writing of shared composite types (#6664)	Thomas Lively	2024-06-14	1	-0/+7
\| \| \| \|	Also update the parser so that implicit type uses are not matched with shared function types.
*	[threads] Add a "shared-everything" feature (#6658)	Thomas Lively	2024-06-14	4	-5/+39
\| \| \| \| \|	Add the feature and flags to enable and disable it. Require the new feature to be enabled for shared heap types to validate. To make the test work, update the validator to actually check features for global types.
*	[threads] Parse, build, and print shared composite types (#6654)	Thomas Lively	2024-06-12	1	-0/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Parse the text format for shared composite types as described in the shared-everything thread proposal. Update the parser to use 'comptype' instead of 'strtype' to match the final GC spec and add the new syntactic class 'sharecomptype'. Update the type canonicalization logic to take sharedness into account to avoid merging shared and unshared types. Make the same change in the TypeMerging pass. Ensure that shared and unshared types cannot be in a subtype relationship with each other. Follow-up PRs will add shared abstract heap types, binary parsing and emitting for shared types, and fuzzer support for shared types.
*	Fix scratch local optimizations when emitting string slice (#6649)	Thomas Lively	2024-06-11	1	-31/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The binary writing of `stringview_wtf16.slice` requires scratch locals to store the `start` and `end` operands while the string operand is converted to a stringview. To avoid unbounded binary bloat when round-tripping, we detect the case that `start` and `end` are already `local.get`s and avoid using scratch locals by deferring the binary writing of the `local.get` operands until after the stringview conversoins is emitted. We previously optimized the scratch locals for `start` and `end` independently, but this could produce incorrect code in the case where the `local.get` for `start` is deferred but its value is changed by a `local.set` in the code for `end`. Fix the problem by only optimizing to avoid scratch locals in the case where both `start` and `end` are already `local.get`s, so they will still be emitted in the original relative order and they cannot interfere with each other anyway.
*	Fix binary parser of declarative element segments (#6618)	Rikito Taniguchi	2024-06-03	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The parser was incorrectly handling the parsing of declarative element segments whose `init` is a `vec(expr)`. https://webassembly.github.io/spec/core/binary/modules.html#element-section Binry parser was simply reading a single `u32LEB` value for `init` instead of parsing a expression regardless `usesExpressions = true`. This commit updates the `WasmBinaryReader::readElementSegments` function to correctly parse the expressions for declarative element segments by calling `readExpression` instead of `getU32LEB` when `usesExpressions = true`. Resolves the parsing exception: "[parse exception: bad section size, started at ... not being equal to new position ...]" Related discussion: https://github.com/tanishiking/scala-wasm/issues/136
*	Remove obsolete parser code (#6607)	Thomas Lively	2024-05-29	3	-4137/+3
\| \| \| \| \|	Remove `SExpressionParser`, `SExpressionWasmBuilder`, and `cashew::Parser`. Simplify gen-s-parser.py. Remove the --new-wat-parser and --deprecated-wat-parser flags.
*	Fix fuzzer generation of a DataSegment + add validation that would have ↵	Alon Zakai	2024-05-23	1	-2/+56
\| \| \| \| \| \| \| \| \| \|	caught it (#6626) The DataSegment was manually added to .dataSegments, but we need to add it using addDataSegment so the maps are updated and getDataSegment(name) works. Also add validation that would have caught this earlier: check that each item in the item lists can be fetched by name.
*	Fuzzer: Better fuzzing of globals (#6611)	Alon Zakai	2024-05-21	1	-1/+16
\| \| \| \| \| \| \| \| \| \| \| \| \|	With this PR we generate global.gets in globals, which we did not do before. We do that by replacing makeConst (the only thing we did before, for the contents of globals) with makeTrivial, and add code to makeTrivial to sometimes make a global.get. When no suitable global exists, makeGlobalGet will emit a constant, so there is no danger in trying. Also raise the number of globals a little. Also explicitly note the current limitation of requiring all tuple globals to contain tuple.make and nothing else, including not global.get, and avoid adding such invalid global.gets in tuple globals in the fuzzer.
*	[table64] Preserve 64-bit table flag when writing binaries (#6610)	Sam Clegg	2024-05-20	1	-1/+1
\|
*	Rewrite wasm-shell to use new wast parser (#6601)	Thomas Lively	2024-05-17	2	-1/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the new wast parser to parse a full script up front, then traverse the parsed script data structure and execute the commands. wasm-shell had previously used the new wat parser for top-level modules, but it now uses the new parser for module assertions as well. Fix various bugs this uncovered. After this change, wasm-shell supports all the assertions used in the upstream spec tests (although not new kinds of assertions introduced in any proposals). Uncomment various `assert_exhaustion` tests that we can now execute. Other kinds of assertions remain commented out in our tests: wasm-shell now supports `assert_unlinkable`, but the interpreter does not eagerly check for the existence of imports, so those tests do not pass. Tests that check for NaNs also remain commented out because they do not yet use the standard syntax that wasm-shell now supports for canonical and arithmetic NaN results, and our interpreter would not pass all of those tests even if they did use the standard syntax.
*	Fix GlobalRefining's handling of gets in module code and add missing ↵	Alon Zakai	2024-05-17	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \|	validation (#6603) GlobalRefining did not traverse module code, so it did not update global.gets in other globals. Add missing validation that actually errors on that: We did not check global.get types. These could be separate PRs but it would be difficult to test them separately.
*	Fix binary emitting of br_if with a refined value by emitting a cast (#6510)	Alon Zakai	2024-05-16	1	-2/+157
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This makes us compliant with the wasm spec by adding a cast: we use the refined type for br_if fallthrough values, and the wasm spec uses the branch target. If the two differ, we add a cast after the br_if to make things match. Alternatively we could match the wasm spec's typing in our IR, but we hope the wasm spec will improve here, and so this is will only be temporary in that case. Even if not, this is useful because by using the most refined type in the IR we optimize in the best way possible, and only suffer when we emit fixups in the binary, but in practice those cases are very rare: br_if is almost always dropped rather than used, in real-world code (except for fuzz cases and exploits). We check carefully when a br_if value is actually used (and not dropped) and its type actually differs, and it does not already have a cast. The last condition ensures that we do not keep adding casts over repeated roundtripping.
*	Add table64 lowering pass (#6595)	Sam Clegg	2024-05-15	2	-29/+15
\| \| \| \| \|	Changes to wasm-validator.cpp here are mostly for consistency between elem and data segment validation.
*	[Strings] Remove operations not included in imported strings (#6589)	Thomas Lively	2024-05-15	6	-266/+67
\| \| \| \| \| \|	The stringref proposal has been superseded by the imported JS strings proposal, but the former has many more operations than the latter. To reduce complexity, remove all operations that are part of stringref but not part of imported strings.
*	[Strings] Remove stringview types and instructions (#6579)	Thomas Lively	2024-05-15	9	-462/+175
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The stringview types from the stringref proposal have three irregularities that break common invariants and require pervasive special casing to handle properly: they are supertypes of `none` but not subtypes of `any`, they cannot be the targets of casts, and they cannot be used to construct nullable references. At the same time, the stringref proposal has been superseded by the imported strings proposal, which does not have these irregularities. The cost of maintaing and improving our support for stringview types is no longer worth the benefit of supporting them. Simplify the code base by entirely removing the stringview types and related instructions that do not have analogues in the imported strings proposal and do not make sense in the absense of stringviews. Three remaining instructions, `stringview_wtf16.get_codeunit`, `stringview_wtf16.slice`, and `stringview_wtf16.length` take stringview operands in the stringref proposal but cannot be removed because they lower to operations from the imported strings proposal. These instructions are changed to take stringref operands in Binaryen IR, and to allow a graceful upgrade path for users of these instructions, the text and binary parsers still accept but ignore `string.as_wtf16`, which is the instruction used to convert stringrefs to stringviews. The binary writer emits code sequences that use scratch locals and `string.as_wtf16` to keep the output valid. Future PRs will further align binaryen with the imported strings proposal instead of the stringref proposal, for example by making `string` a subtype of `extern` instead of a subtype of `any` and by removing additional instructions that do not have analogues in the imported strings proposal.
*	Remove redundant ptrType from MemorySize/Grow instructions. NFC (#6590)	Sam Clegg	2024-05-15	3	-9/+5
\| \| \| \|	I recently add TableSize/Grow and noticed I didn't need these. It seems they are superfluous.
*	Source maps: Allow specifying that an expression has no debug info in text ↵	Jérôme Vouillon	2024-05-14	3	-23/+46
\| \| \| \| \| \| \| \| \| \| \| \|	(#6520) ;;@ with nothing else (no source:line) can be used to specify that the following expression does not have any debug info associated to it. This can be used to stop the automatic propagation of debug info in the text parsers. The text printer has also been updated to output this comment when needed.
*	Simplify scratch local calculation (#6583)	Thomas Lively	2024-05-13	1	-52/+66
\| \| \| \| \| \| \| \| \| \| \|	Change `countScratchLocals` to return the count and type of necessary scratch locals. It used to encode them as keys in the global map from scratch local types to local indices, which could not handle having more than one scratch local of a given type and was generally harder to reason about due to its use of global state. Take the opportunity to avoid emitting unnecessary scratch locals for `TupleExtract` expressions that will be optimized to not use them. Also simplify and better document the calculation of the mapping from IR indices to binary indices for all locals, scratch and non-scratch.
*	[memory64] Add table64 to existing memory64 support (#6577)	Sam Clegg	2024-05-10	2	-38/+67
\| \| \| \| \| \| \|	Tests is still very limited. Hopefully we can use the upstream spec tests soon and avoid having to write our own tests for `.set/.set/.fill/etc`. See https://github.com/WebAssembly/memory64/issues/51
*	[StackIR] Run StackIR during binary writing and not as a pass (#6568)	Alon Zakai	2024-05-09	5	-5/+527
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously we had passes --generate-stack-ir, --optimize-stack-ir, --print-stack-ir that could be run like any other passes. After generating StackIR it was stashed on the function and invalidated if we modified BinaryenIR. If it wasn't invalidated then it was used during binary writing. This PR switches things so that we optionally generate, optimize, and print StackIR only during binary writing. It also removes all traces of StackIR from wasm.h - after this, StackIR is a feature of binary writing (and printing) logic only. This is almost NFC, but there are some minor noticeable differences: 1. We no longer print has StackIR in the text format when we see it is there. It will not be there during normal printing, as it is only present during binary writing. (but --print-stack-ir still works as before; as mentioned above it runs during writing). 2. --generate/optimize/print-stack-ir change from being passes to being flags that control that behavior instead. As passes, their order on the commandline mattered, while now it does not, and they only "globally" affect things during writing. 3. The C API changes slightly, as there is no need to pass it an option "optimize" to the StackIR APIs. Whether we optimize is handled by --optimize-stack-ir which is set like other optimization flags on the PassOptions object, so we don't need the old option to those C APIs. The main benefit here is simplifying the code, so we don't need to think about StackIR in more places than just binary writing. That may also allow future improvements to our usage of StackIR.
*	[validator] Remove indexType helper function (#6576)	Sam Clegg	2024-05-09	1	-23/+18
\| \| \| \|	It seems like that each of the callsites already has looked up the `Memory` object so this helper is not doing anything useful.
*	Allow DWARF and multivalue together (#6570)	Heejin Ahn	2024-05-06	1	-9/+29
\| \| \| \| \| \| \| \| \|	This allows writing of binaries with DWARF info when multivalue is enabled. Currently we just crash when both are enabled together. This just assumes, unless we have run DWARF-invalidating passes, all locals added for tuples or scratch locals would have been added at the end of the local list, so just printing all locals in order would preserve the DWARF info. Tuple locals are expanded in place and scratch locals are added at the end.
*	Source map fixes (#6550)	Jérôme Vouillon	2024-05-02	2	-23/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Keep debug locations at function start The `fn_prolog_epilog.debugInfo` test is failing otherwise, since there was debug information associated to the nop instruction at the beginning of the function. * Do not clear the debug information when reaching the end of the source map The last segment should extend to the end of the function. * Propagate debug location from the function prolog to its first instruction * Fix printing of epilogue location The text parser no longer propagates locations to the epilogue, so we should always print the location if there is one. * Fix debug location smearing The debug location of the last instruction should not smear into the function epilogue, and a debug location from a previous function should not smear into the prologue of the current function.
*	Respect the Web limitation on Table size (#6567)	Alon Zakai	2024-05-01	1	-0/+1
\| \| \| \| \|	Without this the fuzzer can error on differences in behavior between V8 and us. Also move the limitations constants to their own header.
*	[StackIR] Support source maps and DWARF with StackIR (#6564)	Alon Zakai	2024-05-01	2	-2/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Helping #6509, this fixes debugging support for StackIR, which makes it more possible to use StackIR in more places. The fix is basically just to pass around some more state, and then to call the parent with "please write debug info" at the correct times, mirroring the similar calls in BinaryenIRWriter. The relevant Emscripten tests pass, and the source map test modified here produces identical output in StackIR and non-StackIR modes (the test is also simplified to remove --new-wat-parser which is no longer needed, after which the test can clearly show that StackIR has the same output as BinaryenIR).
*	[Parser] Re-use blocks instead of wrapping where possible (#6552)	Thomas Lively	2024-04-29	1	-6/+11
\| \| \| \| \| \| \|	When the input has branches to block scope, IR builder generally has to add a wrapper block with a label name for the branch to target. To reduce the parsed IR size, add a special case for when the wrapped expression is already an unnamed block. In that case we can simply add the label to the existing block instead of creating a new wrapper block.
*	[Strings] Work around ref.cast not working on string views, and add fuzzing ↵	Alon Zakai	2024-04-29	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \| \|	(#6549) As suggested in #6434 (comment) , lower ref.cast of string views to ref.as_non_null in binary writing. It is a simple hack that avoids the problem of V8 not allowing them to be cast. Add fuzzing support for the last three core string operations, after which that problem becomes very frequent. Also add yet another makeTrappingRefUse that was missing in that fuzzer code.
*	Improve return validation (#6551)	Thomas Lively	2024-04-29	1	-10/+18
\| \| \| \|	Disallow returns from having any children, even unreachable children, in function that do not return any values.
*	Fix a bug with unreachable control flow in IRBuilder (#6558)	Jérôme Vouillon	2024-04-29	1	-3/+8
\| \| \| \| \| \| \| \| \| \| \| \|	When branches target control flow structures other than blocks or loops, the IRBuilder wraps those control flow structures with an extra block for the branches to target in Binaryen IR. When the control flow structure is unreachable because all its bodies are unreachable, the wrapper block may still need to have a non-unreachable type if it is targeted by branches. This is achieved by tracking whether the wrapper block will be targeted by any branches and use the control flow structure's original, non-unreachable type if so. However, this was not properly tracked when moving into the `else` branch of an `if` or the `catch`/`cath_all` handlers of a `try` block.
*	[Parser] Do not eagerly lex parens (#6540)	Thomas Lively	2024-04-25	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \|	The lexer currently lexes tokens eagerly and stores them in a `Token` variant ahead of when they are actually requested by the parser. It is wasteful, however, to classify tokens before they are requested by the parser because it is likely that the next token will be precisely the kind the parser requests. The work of checking and rejecting other possible classifications ahead of time is not useful. To make incremental progress toward removing `Token` completely, lex parentheses on demand instead of eagerly.
*	[Parser] Enable the new text parser by default (#6371)	Thomas Lively	2024-04-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new text parser is faster and more standards compliant than the old text parser. Enable it by default in wasm-opt and update the tests to reflect the slightly different results it produces. Besides following the spec, the new parser differs from the old parser in that it: - Does not synthesize `loop` and `try` labels unnecessarily - Synthesizes different block names in some cases - Parses exports in a different order - Parses `nop`s instead of empty blocks for empty control flow arms - Does not support parsing Poppy IR - Produces different error messages - Cannot parse `pop` except as the first instruction inside a `catch`
*	Do not add an extra null character when reading files (#6538)	Thomas Lively	2024-04-24	1	-2/+1
\| \| \| \| \| \| \| \|	The new wat parser currently considers itself to be at the end of the file whenever it cannot lex another token. This is not quite right, but fixing it causes parser errors because of the extra null character we were appending to files when we read them. This null character is not useful since we can already read files as `std::string`, which always has an implicit null character, so remove it. Clean up some users of `read_file` while we're at it.
*	[Strings] Fix finalize() of StringNew on arrays (#6511)	Alon Zakai	2024-04-18	1	-1/+3
\|
*	[Parser] Preserve try labels (#6505)	Thomas Lively	2024-04-17	1	-37/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the standard text format, try scopes can be targeted by both normal branches and delegates, but in Binaryen IR we only allow them to be targeted by delegates, so we have to translate branches to try scopes into branches to wrapper blocks instead. These wrapper blocks must have different names than the try expressions they wrap, so we actually need to track two label names for try expressions: one for delegates and another for normal branches. We previously tried to avoid this complexity by tracking only the branch label and computing the delegate label from the branch label as necessary, but that produced unnecessary wrapper blocks and ugly label names that did not appear in the source. To produce better IR and minimize the diff when switching to the new text parser, bite the bullet and track the delegate and branch label names separately. This eliminates unnecessary wrapper blocks and keeps try names the same as in the wat source where possible.
*	[Parser] Match legacy parser block naming (#6504)	Thomas Lively	2024-04-16	1	-1/+5
\| \| \| \|	To reduce the size of the test output diff when switching to the new text parser, update it to generate the same block names as the legacy parser.
*	[Parser] Pop past unreachables where possible (#6489)	Thomas Lively	2024-04-16	2	-460/+510
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We previously would eagerly drop all concretely typed expressions on the stack when pushing an unreachable instruction. This was semantically correct and closely modeled the semantics of unreachable instructions, which implicitly drop the entire stack and start a new polymorphic stack. However, it also meant that the structure of the parsed IR did not match the structure of the folded input, which meant that tests involving unreachable children would not parse as intended, preventing the test from testing the intended behavior. For example, this wat: ```wasm (i32.add (i32.const 0) (unreachable) ) ``` Would previously parse into this IR: ```wasm (drop (i32.const 0) ) (i32.add (unreachable) (unreachable) ) ``` To fix this problem, we need to stop eagerly dropping stack values when encountering an unreachable instruction so we can still pop expressions pushed before the unreachable as direct children of later instructions. In the example above, we need to keep the `i32.const 0` on the stack so it is available to be popped and become a child of the `i32.add`. However, the naive solution of simply popping past unreachables would produce invalid IR in some cases. For example, consider this wat: ```wasm f32.const 0 unreachable i32.add ``` The naive solution would parse this wat into this IR: ```wasm (i32.add (f32.const 0) (unreachable) ) ``` But we do not want to parse an `i32.add` with an `f32`-typed child. Neither do we want to reject this input, since it is a perfectly valid Wasm fragment. In this case, we actually want the old behavior of dropping the `f32.const` and replacing it with another `unreachable` as the first child of the `i32.add`. To both match the input structure where possible and also gracefully fall back to the old behavior of dropping expressions prior to the unreachable, collect constraints on the types of each child for each kind of expression and compare them to the types of available expressions on the stack when an unreachable instruction will be popped. When the constraints are satisfied, pop expressions normally, even after popping the unreachable instruction. Otherwise, drop the instructions that precede the unreachable instruction to ensure we parse valid IR. To collect the constraints, add a new `ChildTyper` utility that calls a different callback for each kind of possible type constraint for each child. In the future, this utility can be used to simplify the validator as well.
*	[Strings] Add a string lowering pass using magic imports (#6497)	Thomas Lively	2024-04-15	1	-6/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The latest idea for efficient string constants is to encode the constants in the import names of their globals and implement fast paths in the engines for materializing those constants at instantiation time without needing to parse anything in JS. This strategy only works for valid strings (i.e. strings without unpaired surrogates) because only valid strings can be used as import names in the WebAssembly syntax. Add a new configuration of the StringLowering pass that encodes valid string contents in import names, falling back to the JSON custom section approach for invalid strings. To test this chang, update the printer to escape import and export names properly and update the legacy parser to parse escapes in import and export names properly. As a drive-by, remove the incorrect check in the parser that the import module and base names are non-empty.
*	Fixes regarding explicit names (#6466)	Jérôme Vouillon	2024-04-11	2	-12/+21
\| \| \| \| \| \| \|	- Only write explicit function names. - When merging modules, the name of types, globals and tags in all modules but the first were lost. - Set name as explicit when copying a function with a new name.
*	[Parser] Use unreachables to fulfill tuple requirements (#6488)	Thomas Lively	2024-04-11	1	-1/+3
\| \| \| \| \| \| \|	When we need to pop a tuple and the top value on the stack is unreachable, just pop the unreachable rather than producing a tuple.make. This always produces valid IR since an unreachable is always valid where a tuple would otherwise be expected. It also avoids bloating the parsed IR, since we would previously parse a `tuple.make` where all the children were unreachable in this case.
*	Handle return calls correctly	Thomas Lively	2024-04-08	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a combined commit covering multiple PRs fixing the handling of return calls in different areas. The PRs are all landed as a single commit to ensure internal consistency and avoid problems with bisection. Original PR descriptions follow: * Fix inlining of `return_call` (#6448) Previously we transformed return calls in inlined function bodies into normal calls followed by branches out to the caller code. Similarly, when inlining a `return_call` callsite, we simply added a `return` after the body inlined at the callsite. These transformations would have been correct if the semantics of return calls were to call and then return, but they are not correct for the actual semantics of returning and then calling. The previous implementation is observably incorrect for return calls inside try blocks, where the previous implementation would run the inlined body within the try block, but the proper semantics would be to run the inlined body outside the try block. Fix the problem by transforming inlined return calls to branches followed by calls rather than as calls followed by branches. For the case of inlined return call callsites, insert branches out of the original body of the caller and inline the body of the callee as a sibling of the original caller body. For the other case of return calls appearing in inlined bodies, translate the return calls to branches out to calls inserted as siblings of the original inlined body. In both cases, it would have been convenient to use multivalue block return to send call parameters along the branches to the calls, but unfortunately in our IR that would have required tuple-typed scratch locals to unpack the tuple of operands at the call sites. It is simpler to just use locals to propagate the operands in the first place. Fix interpretation of `return_call` (#6451) We previously interpreted return calls as calls followed by returns, but that is not correct both because it grows the size of the execution stack and because it runs the called functions in the wrong context, which can be observable in the case of exception handling. Update the interpreter to handle return calls correctly by adding a new `RETURN_CALL_FLOW` that behaves like a return, but carries the arguments and reference to the return-callee rather than normal return values. `callFunctionInternal` is updated to intercept this flow and call return-called functions in a loop until a function returns with some other kind of flow. Pull in the upstream spec tests return_call.wast, return_call_indirect.wast, and return_call_ref.wast with light editing so that we parse and validate them successfully. Handle return calls in wasm-ctor-eval (#6464) When an evaluated export ends in a return call, continue evaluating the return-called function. This requires propagating the parameters, handling the case that the return-called function might be an import, and fixing up local indices in case the final function has different parameters than the original function. * Update effects.h to handle return calls correctly (#6470) As far as their surrounding code is concerned return calls are no different from normal returns. It's only from a caller's perspective that a function containing a return call also has the effects of the return-callee. To model this more precisely in EffectAnalyzer, stash the throw effect of return-callees on the side and only merge it in at the end when analyzing the effects of a full function body.
*	Typed continuations: nocont and cont basic heap types (#6468)	Frank Emrich	2024-04-04	4	-6/+76
\| \| \| \| \| \| \| \|	This PR is part of a series that adds basic support for the typed continuations/wasmfx proposal. This particular PR adds cont and nocont as top and bottom types for continuation types, completely analogous to func and nofunc for function types (also: exn and noexn).
*	Fix writing of data segment names in name section (#6462)	Jérôme Vouillon	2024-04-02	1	-2/+2
\| \| \| \|	- Output segment names even when no memory is declared. - Only write explicit names.
*	Fix parsing of table imports (#6446)	Jérôme Vouillon	2024-03-27	1	-3/+6
\| \| \|	The types was ignored and funcref was always used instead.
*	Fix stringview subtyping (#6440)	Thomas Lively	2024-03-26	1	-7/+30
\| \| \| \| \| \|	The stringview types (`stringview_wtf8`, `stringview_wtf16`, and `stringview_iter`) are not subtypes of `any` even though they are supertypes of `none`. This breaks the type system invariant that types share a bottom type iff they share a top type, but we can work around that.
*	[Strings] Escape strings printed by fuzz-exec (#6441)	Thomas Lively	2024-03-26	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \|	Previously we printed strings as WTF-8 in the output of fuzz-exec, but this could produce invalid unicode output and did not make unprintable characters visible. Fix both these problems by escaping the output, using the JSON string escape procedure since the string to be escaped is WTF-16. Reimplement the same escaping procedure in fuzz_shell.js so that the way we print strings when running on a real JS engine matches the way we print them in our own fuzz-exec interpreter. Fixes #6435.
*	[Strings] Represent string values as WTF-16 internally (#6418)	Thomas Lively	2024-03-22	3	-8/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	WTF-16, i.e. arbitrary sequences of 16-bit values, is the encoding of Java and JavaScript strings, and using the same encoding makes the interpretation of string operations trivial, even when accounting for non-ascii characters. Specifically, use little-endian WTF-16. Re-encode string constants from WTF-8 to WTF-16 in the parsers, then back to WTF-8 in the writers. Update the constructor for string `Literal`s to interpret the string as WTF-16 and store a sequence of WTF-16 code units, i.e. 16-bit integers. Update `Builder::makeConstantExpression` accordingly to convert from the new `Literal` string representation back to a WTF-16 string. Update the interpreter to remove the logic for detecting non-ascii characters and bailing out. The naive implementations of all the string operations are correct now that our string encoding matches the JS string encoding.