summaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
...
* [EH] Fix missing outer block for catchless try (#6519)Heejin Ahn2024-04-241-8/+21
| | | | | | | | | | | | | | | | | | | When translating a `try` expression, we may need an 'outer' block that wraps the newly generated `try_table` so we can jump out of the expression when an exception does not occur. (The condition we use is when the `try` has any catches or if the `try` is a target of any inner `try-delegate`s: https://github.com/WebAssembly/binaryen/blob/219e668e87b012c0634043ed702534b8be31231f/src/passes/TranslateEH.cpp#L677) In case the `try` has either of `catch` or `delegate`, when we have the 'outer' block, we add the newly created `try_table` in the 'outer' block and replace the whole expression with the block: https://github.com/WebAssembly/binaryen/blob/219e668e87b012c0634043ed702534b8be31231f/src/passes/TranslateEH.cpp#L670 https://github.com/WebAssembly/binaryen/blob/219e668e87b012c0634043ed702534b8be31231f/src/passes/TranslateEH.cpp#L332-L340 But in case of a catchless `try`, we forgot to do that: https://github.com/WebAssembly/binaryen/blob/219e668e87b012c0634043ed702534b8be31231f/src/passes/TranslateEH.cpp#L388 So this PR fixes it.
* Precompute: Ignore mutable arrays in StringNew (#6522)Alon Zakai2024-04-231-0/+22
| | | | | All Struct/Array operations must ignore mutable fields in Precompute atm, which we did, but StringNew has an array variant that does an effective ArrayGet operation, which we didn't handle.
* OptimizeInstructions: Optimize subsequent struct.sets after ↵Alon Zakai2024-04-231-12/+21
| | | | | | | | | struct.new_with_default (#6523) Before we preferred not to add default values, as that increases code size. But since #6495 we turn more things into struct.new_with default, so it is important to handle this. It seems likely that in most cases the code size downside of adding default values is offset by avoiding a local.set later, so always do this (rather than add some kind of heuristic).
* [EH] Fix delegating to caller when func result is concrete (#6518)Heejin Ahn2024-04-231-1/+1
| | | | | | | | | | | hen the function return type is concrete, the translation result of a `try`-`delegate` that targets the caller includes a `return` that returns the whole function body: https://github.com/WebAssembly/binaryen/blob/219e668e87b012c0634043ed702534b8be31231f/src/passes/TranslateEH.cpp#L751-L763 We should do that based on the function's return type, not the function body's return type. The previous code didn't handle the case where the function's return type is concrete but the function body's return type is unreachable.
* [wasm-split] Do not split out functions referring to segments (#6517)Thomas Lively2024-04-231-4/+62
| | | | | | | Since data and elem segments cannot be imported or exported, there is no way to access them from the secondary module, so functions that need to refer to them cannot be split out. Fixes #6512.
* DebugLocationPropagation: pass debuglocation from parent node to chil… (#6500)许鑫权2024-04-215-45/+106
| | | | | | | | | | | | | | | | | | | This PR creates a pass to propagate debug location from parent node to child nodes which has no debug location with pre-order traversal. This is useful for compilers that use Binaryen API to generate WebAssembly modules. It behaves like `wasm-opt` read text format file: children are tagged with the debug info of the parent, if they have no annotation of their own. For compilers that use Binaryen API to generate WebAssembly modules, it is a bit redundant to add debugInfo for each expression, Especially when the compiler wrap expressions. With this pass, compilers just need to add debugInfo for the parent node, which is more convenient. For example: ``` (drop (call $voidFunc) ) ``` Without this pass, if the compiler only adds debugInfo for the wrapped expression `drop`, the `call` expression has no corresponding source code mapping in DevTools debugging, which is obviously not user-friendly.
* [Parser][NFC] Do less work when parsing function types (#6516)Thomas Lively2024-04-192-3/+11
| | | | | | | After the initial parsing pass to find the locations of all the module elements and after the type definitions have been parsed, the next phase of parsing is to visit all of the module elements and parse their types. This phase does not require parsing function bodies, but it previously parsed entire functions anyway for simplicity. To improve performance, skip that useless work.
* [Parser][NFC] Improve performance of idchar lexing (#6515)Thomas Lively2024-04-191-30/+18
| | | | | The parsing of idchars was hot enough to show up while profiling the parsing of a very large module. Optimize it to speed up the overall parse by about 16% in a very unscientific measurement.
* [Parser][NFC] Solve performance issue by adding maybeLabelidx (#6514)Thomas Lively2024-04-181-7/+21
| | | | | | | | | | | | | Creating an error in the parser is an extremely expensive operation for very large files because it has to traverse the input buffer and count newlines to compute the error message. Despite that, there are a few places were we create errors just to discard them and continue parsing. The most notable of these places was where we parsed the list of label index immediates for the br_table instruction. The parser determined the end of the list by intercepting the error produced when trying to parse one more label index. Fix this significant performance problem causing parsing to be quadratic by introducing and using `maybeLabelidx`, which tries to parse a label index but does not produce an error if it fails.
* [Strings] Fix finalize() of StringNew on arrays (#6511)Alon Zakai2024-04-181-1/+3
|
* OptimizeCasts: Also handle local.tee (#6507)Jérôme Vouillon2024-04-181-27/+26
| | | | | | | | | | | | | | | | | | Converts the following: (some.operation (ref.cast .. (local.tee $ref ..)) (local.get $ref) ) into: (some.operation (local.tee $temp (ref.cast .. (local.tee $ref ..)) ) (local.get $temp) )
* Remove unused options from wasm-shell (#6508)Thomas Lively2024-04-171-53/+10
| | | | | None of our tests exercised the --entry or --skip options in wasm-shell, and since wasm-shell is probably not used for anything outside our testing, there's no reason to keep them. Remove them.
* [Parser] Preserve try labels (#6505)Thomas Lively2024-04-172-43/+43
| | | | | | | | | | | | | | | | | | In the standard text format, try scopes can be targeted by both normal branches and delegates, but in Binaryen IR we only allow them to be targeted by delegates, so we have to translate branches to try scopes into branches to wrapper blocks instead. These wrapper blocks must have different names than the try expressions they wrap, so we actually need to track two label names for try expressions: one for delegates and another for normal branches. We previously tried to avoid this complexity by tracking only the branch label and computing the delegate label from the branch label as necessary, but that produced unnecessary wrapper blocks and ugly label names that did not appear in the source. To produce better IR and minimize the diff when switching to the new text parser, bite the bullet and track the delegate and branch label names separately. This eliminates unnecessary wrapper blocks and keeps try names the same as in the wat source where possible.
* [Parser] Match legacy parser block naming (#6504)Thomas Lively2024-04-163-7/+17
| | | | To reduce the size of the test output diff when switching to the new text parser, update it to generate the same block names as the legacy parser.
* [Parser] Pop past unreachables where possible (#6489)Thomas Lively2024-04-164-506/+1645
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We previously would eagerly drop all concretely typed expressions on the stack when pushing an unreachable instruction. This was semantically correct and closely modeled the semantics of unreachable instructions, which implicitly drop the entire stack and start a new polymorphic stack. However, it also meant that the structure of the parsed IR did not match the structure of the folded input, which meant that tests involving unreachable children would not parse as intended, preventing the test from testing the intended behavior. For example, this wat: ```wasm (i32.add (i32.const 0) (unreachable) ) ``` Would previously parse into this IR: ```wasm (drop (i32.const 0) ) (i32.add (unreachable) (unreachable) ) ``` To fix this problem, we need to stop eagerly dropping stack values when encountering an unreachable instruction so we can still pop expressions pushed before the unreachable as direct children of later instructions. In the example above, we need to keep the `i32.const 0` on the stack so it is available to be popped and become a child of the `i32.add`. However, the naive solution of simply popping past unreachables would produce invalid IR in some cases. For example, consider this wat: ```wasm f32.const 0 unreachable i32.add ``` The naive solution would parse this wat into this IR: ```wasm (i32.add (f32.const 0) (unreachable) ) ``` But we do not want to parse an `i32.add` with an `f32`-typed child. Neither do we want to reject this input, since it is a perfectly valid Wasm fragment. In this case, we actually want the old behavior of dropping the `f32.const` and replacing it with another `unreachable` as the first child of the `i32.add`. To both match the input structure where possible and also gracefully fall back to the old behavior of dropping expressions prior to the unreachable, collect constraints on the types of each child for each kind of expression and compare them to the types of available expressions on the stack when an unreachable instruction will be popped. When the constraints are satisfied, pop expressions normally, even after popping the unreachable instruction. Otherwise, drop the instructions that precede the unreachable instruction to ensure we parse valid IR. To collect the constraints, add a new `ChildTyper` utility that calls a different callback for each kind of possible type constraint for each child. In the future, this utility can be used to simplify the validator as well.
* CoalesceLocals: ReFinalize when we refine a type (#6502)Alon Zakai2024-04-161-12/+12
| | | | | | | | | | | The code had a different workaround, namely to add a block with an explicit type to avoid changing the type at all (i.e. the block declared the original type of the thing we replaced, not the refined type). But by adding a block we can end up with an invalid pop, as the fuzzer found, see the EH testcase here. To fix this, use the usual workaround of just running ReFinalize. That is simpler and also improves the code while we do it. It does add more work, but this is likely a very rare situation anyhow.
* Do not repeat types names in text output (#6499)Thomas Lively2024-04-161-2/+35
| | | | | | | | | | For types that do not have explicit names, we generate index-based names in the printer. However, we did not previously ensure that the generated types were not already used as explicit names, so it was possible to print the same name for multiple types, which is not valid. Fix the problem by skipping indices that are already used as type names. Fixes #6492.
* Fuzzer: Randomly pick which functions to use in RefFunc (#6503)Alon Zakai2024-04-151-10/+12
| | | | | | | Previously we chose the first with a proper type, and now we start to scan from a random index, giving later functions a chance too, so we should be emitting a greater variety of ref.func targets. Also remove some obsolete fuzzer TODOs.
* OptimizeInstructions: Optimize StructNew/ArrayNew forms (#6495)Alon Zakai2024-04-151-0/+159
| | | | | | | | | | | | | struct.new with default values can be struct.new_default. array.new with default values can be array.new_default. array.new of size 1 should be array.new_fixed (saves the Const). array.new_fixed with default values can be array.new_default. array.new_fixed with equal but non-default values can be array.new (basically use a Const to say how many copies we want, rather than copy).
* [Strings] Add a string lowering pass using magic imports (#6497)Thomas Lively2024-04-158-40/+96
| | | | | | | | | | | | | | | | | The latest idea for efficient string constants is to encode the constants in the import names of their globals and implement fast paths in the engines for materializing those constants at instantiation time without needing to parse anything in JS. This strategy only works for valid strings (i.e. strings without unpaired surrogates) because only valid strings can be used as import names in the WebAssembly syntax. Add a new configuration of the StringLowering pass that encodes valid string contents in import names, falling back to the JSON custom section approach for invalid strings. To test this chang, update the printer to escape import and export names properly and update the legacy parser to parse escapes in import and export names properly. As a drive-by, remove the incorrect check in the parser that the import module and base names are non-empty.
* When creating `dynCall` and `legalstub` function mark them as ↵Sam Clegg2024-04-122-0/+2
| | | | | | | | | hasExplicitName. NFC (#6496) This is temporary workaround for the current test failures on the emscripten waterfall. The real fix I believe will be to make `hasExplicitName` the default in these kinds of cases. See: #6466
* Fix isGenerative on calls and test via improving ↵Alon Zakai2024-04-112-34/+62
| | | | | | | | | | | | | | | | OptimizeInstructions::areConsecutiveInputsEqual() (#6481) "Generative" is what we call something like a struct.new that may be syntactically identical to another struct.new, but each time a new value is generated. The same is true for calls, which can do anything, including return a different value for syntactically identical calls. This was not a bug because the main user of isGenerative, areConsecutiveInputsEqual(), was too weak to notice, that is, it gave up sooner, for other reasons. This PR improves that function to do a much better check, which makes the fix necessary to prevent regressions. This is not terribly important for itself, but will help a later PR that will add code that depends more heavily on areConsecutiveInputsEqual().
* Fuzzer: Emit signed Struct/ArrayGet operations (#6486)Alon Zakai2024-04-112-5/+17
|
* Fixes regarding explicit names (#6466)Jérôme Vouillon2024-04-114-16/+37
| | | | | | | - Only write explicit function names. - When merging modules, the name of types, globals and tags in all modules but the first were lost. - Set name as explicit when copying a function with a new name.
* GUFA: Fix signed reads of packed GC data (#6494)Alon Zakai2024-04-112-1/+73
| | | | | | | GUFA already truncated packed fields on write, which is enough for unsigned gets, but for signed gets we also need to sign them on reads. Similar to #6493 but for GUFA. Also found by #6486
* Fix ConstantFieldPropagation signed packed field handling and improve ↵Alon Zakai2024-04-113-34/+34
| | | | | | | | | | | | | | | | Heap2Local's (#6493) CFP already had logic for truncating but not for sign-extending, which this fixes. Use the new helper function in Heap2Local as well. This changes the model there from "truncate on set, sign-extend on get" to "truncate or sign-extend on get". That is both simpler by reusing the same logic as CFP but also more optimal: the idea to truncate on sets made sense since sets are rarer, but if we must then sign-extend on gets then we can end up doing more work overall (as the truncations on sets are not needed if all gets are signed). Found by #6486
* [Parser] Use unreachables to fulfill tuple requirements (#6488)Thomas Lively2024-04-111-1/+3
| | | | | | | When we need to pop a tuple and the top value on the stack is unreachable, just pop the unreachable rather than producing a tuple.make. This always produces valid IR since an unreachable is always valid where a tuple would otherwise be expected. It also avoids bloating the parsed IR, since we would previously parse a `tuple.make` where all the children were unreachable in this case.
* Ensure printed tuple.extract arity is valid (#6487)Thomas Lively2024-04-111-1/+6
| | | | | | We previously printed the size of the tuple operand as the arity, but that printed `1` when the operand is unreachable. We don't allow our text input to use `1` as the arity, so don't print it, either. Instead, print the smallest valid arity, `2`, in this case.
* [Parser] Parse contref and nullcontref types (#6485)Thomas Lively2024-04-102-0/+16
|
* Heap2Local: Fix signed struct/array reads (#6484)Alon Zakai2024-04-101-5/+10
| | | | | | In #6480 I forgot that StructGet can be signed, which means we need to emit a sign-extend. Arrays already copied the field as part of Array2Struct.
* Improve inlining of `return_call*` (#6477)Jérôme Vouillon2024-04-102-28/+107
| | | | | Use the previous implementation when no return_call is in a try block. This avoids moving code around (as a sibling of the caller body or the inlined body), so that should allow more local optimizations after inlining.
* Heap2Local: Optimize packed fields (#6480)Alon Zakai2024-04-091-12/+26
| | | | | | Previously we did not optimize a struct or an array with a packed field. As a result a single packed field in a struct prevented the entire struct from being localized, which this fixes. This is also useful for arrays as packed arrays are common (e.g. for string data).
* Heap2Local: Optimize Arrays in addition to Structs (#6478)Alon Zakai2024-04-091-21/+322
| | | | | | | | | | | | To keep things simple, this adds a Array2Struct component to the pass. When we find a non-escaping array, we run that to turn it into a struct, and then run the existing Struct2Local to convert that to locals. This avoids refactoring Struct2Local to handle both structs and arrays (with the downside of making the optimization of arrays a little less efficient, but they are rarer, I suspect - that is certainly the case in Java output I've seen). The core EscapeAnalyzer logic is generalized to handle both arrays and structs, but the changes there are thankfully quite minor.
* Asyncify: Fix nondeterminism in verbose logging (#6479)Alon Zakai2024-04-092-4/+24
| | | | #6457 added a test that exposed existing nondeterminism.
* Handle return calls in CodeFolding (#6474)Thomas Lively2024-04-081-1/+21
| | | | Treat them the same as returns and test that they can be folded out of try-catch blocks because they do not have throws effects.
* Handle return calls correctlyThomas Lively2024-04-087-226/+476
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a combined commit covering multiple PRs fixing the handling of return calls in different areas. The PRs are all landed as a single commit to ensure internal consistency and avoid problems with bisection. Original PR descriptions follow: * Fix inlining of `return_call*` (#6448) Previously we transformed return calls in inlined function bodies into normal calls followed by branches out to the caller code. Similarly, when inlining a `return_call` callsite, we simply added a `return` after the body inlined at the callsite. These transformations would have been correct if the semantics of return calls were to call and then return, but they are not correct for the actual semantics of returning and then calling. The previous implementation is observably incorrect for return calls inside try blocks, where the previous implementation would run the inlined body within the try block, but the proper semantics would be to run the inlined body outside the try block. Fix the problem by transforming inlined return calls to branches followed by calls rather than as calls followed by branches. For the case of inlined return call callsites, insert branches out of the original body of the caller and inline the body of the callee as a sibling of the original caller body. For the other case of return calls appearing in inlined bodies, translate the return calls to branches out to calls inserted as siblings of the original inlined body. In both cases, it would have been convenient to use multivalue block return to send call parameters along the branches to the calls, but unfortunately in our IR that would have required tuple-typed scratch locals to unpack the tuple of operands at the call sites. It is simpler to just use locals to propagate the operands in the first place. * Fix interpretation of `return_call*` (#6451) We previously interpreted return calls as calls followed by returns, but that is not correct both because it grows the size of the execution stack and because it runs the called functions in the wrong context, which can be observable in the case of exception handling. Update the interpreter to handle return calls correctly by adding a new `RETURN_CALL_FLOW` that behaves like a return, but carries the arguments and reference to the return-callee rather than normal return values. `callFunctionInternal` is updated to intercept this flow and call return-called functions in a loop until a function returns with some other kind of flow. Pull in the upstream spec tests return_call.wast, return_call_indirect.wast, and return_call_ref.wast with light editing so that we parse and validate them successfully. * Handle return calls in wasm-ctor-eval (#6464) When an evaluated export ends in a return call, continue evaluating the return-called function. This requires propagating the parameters, handling the case that the return-called function might be an import, and fixing up local indices in case the final function has different parameters than the original function. * Update effects.h to handle return calls correctly (#6470) As far as their surrounding code is concerned return calls are no different from normal returns. It's only from a caller's perspective that a function containing a return call also has the effects of the return-callee. To model this more precisely in EffectAnalyzer, stash the throw effect of return-callees on the side and only merge it in at the end when analyzing the effects of a full function body.
* Asyncify-verbose: Show all reasons why a function is instrumented (#6457)Dannii Willis2024-04-083-16/+24
| | | | Helps emscripten-core/emscripten#17380 by logging all the reasons why we instrument a function, and not just the first as we did before.
* [NFC] Refactor Heap2Local logic (#6473)Alon Zakai2024-04-063-355/+402
| | | | | | Separate out an EscapeAnalyzer class that does the escape analysis, and a Struct2Local one that does the optimization. Also make a few things const here to be safer.
* [NFC] Remove unused variables (#6475)Thomas Lively2024-04-051-2/+2
| | | These were causing build failures on the Emscripten builder.
* Typed continuations: nocont and cont basic heap types (#6468)Frank Emrich2024-04-049-6/+103
| | | | | | | | This PR is part of a series that adds basic support for the typed continuations/wasmfx proposal. This particular PR adds cont and nocont as top and bottom types for continuation types, completely analogous to func and nofunc for function types (also: exn and noexn).
* [NFC] Generalize and simplify wasm-delegations-fields.h (#6465)Alon Zakai2024-04-033-762/+601
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This removes the hard-coded generation of a switch and cases, and allows the user to define the boilerplate at the start and end of the main output, and of what is generated for each expression. By default we still emit a switch and cases. Also standardize the output by never emitting ; unnecessarily, which we were inconsistent about. This serves two goals: First, it will make using embind on Binaryen simpler as embind needs to generate C++ template logic for each expression, and not a switch (and we cannot have extra ; in embind notation). Second, this makes the format much simple to parse, which is a stepping stone for #6460, e.g. before we had case Expression::Id::LoopId: { DELEGATE_START(Loop); DELEGATE_FIELD_CHILD(Loop, body); DELEGATE_FIELD_SCOPE_NAME_DEF(Loop, name); DELEGATE_END(Loop); break; } and now we have DELEGATE_FIELD_CASE_START(Loop) DELEGATE_FIELD_CHILD(Loop, body) DELEGATE_FIELD_SCOPE_NAME_DEF(Loop, name) DELEGATE_FIELD_CASE_END(Loop) The main part of this diff was autogenerated by this python: for l in x.splitlines(): if l.startswith(' case'): id = l.split(':')[4][:-2] print(f'DELEGATE_FIELD_CASE_START({id})') if l.startswith(' DELEGATE_FIELD'): print(l) if l.startswith(' DELEGATE_END'): id = l[17:-2] print(f'DELEGATE_FIELD_CASE_END({id})') print()
* Fix writing of data segment names in name section (#6462)Jérôme Vouillon2024-04-021-2/+2
| | | | - Output segment names even when no memory is declared. - Only write explicit names.
* Add an Asyncify option to propagate the addList (#5935)かめのこにょこにょこ2024-04-011-12/+42
| | | | | The new asyncify flag --pass-arg=asyncify-propagate-addlist changes the behavior of --pass-arg=asyncify-addlist : with it, callers of functions in the asyncify-addlist will be also instrumented.
* [Strings] string.new_wtf16_array should trap if the end index is less than ↵Alon Zakai2024-04-011-1/+2
| | | | the start (#6459)
* GUFA: Fix hashing of GlobalInfo's type (#6455)Alon Zakai2024-03-291-2/+6
| | | | | | | | | For a global we store the name and a type, and the type may be more precise than the global's type in the wasm. As a result, when hashing, it is not enough to hash only the name, so hash the type as well. Also add a random TODO as a comment.
* GUFA: Fix nondeterminism in roots (#6456)Alon Zakai2024-03-291-1/+4
| | | | | Found by the fuzzer. We already processed the work queue in a deterministic order, but the roots were unordered. The work queue's initial state is filled by the roots, so we must process the roots deterministically as well.
* Report timeout in interpretation of AtomicWait (#6452)Thomas Lively2024-03-291-1/+1
| | | | | | | To avoid slow-running fuzz cases, we report a host limit when interpreting atomic.wait with any non-zero timeout. However, in the allowed case where the timeout is zero, we were incorrectly interpreting the wait as returning 0, meaning that it was woken up, instead of 2, meaning that the timeout expired. Fix it to return 2.
* Remove the TRAVERSE_CALLS option in the ConstantExpressionRunner (#6449)Thomas Lively2024-03-294-44/+0
| | | | | | | | The implementation of calls with this option was incorrect because it cleared the locals before evaluating the call arguments. The likely explanation for why this was never noticed is that there are no users of this option, especially since it is exposed in the C and JS APIs but not used internally. Rather than try to fix the implementation, just remove the option.
* [Strings] Lower string.concat in StringLowering (#6453)Thomas Lively2024-03-291-0/+9
|
* wasm-merge: Check that the types of imports and exports match (#6437)Jérôme Vouillon2024-03-271-0/+108
|