path: root/src
Commit message (Collapse)  Author  Age  Files  Lines
* [Wasm GC] Fix TypeRefining on fallthrough values via tee (#4900)  Alon Zakai  2022-08-18  4  -10/+57
| A rather tricky corner case: we normally look at fallthrough values for
| copies of fields, so when we try to refine a field, we ignore stuff like
| this:
|
|     a.x = b.x;
|
| That copies the same field on the same type to itself, so refining is not
| limited by it. But if we have something else in the middle, and that thing
| cannot change type, then it is a problem, like this:
|
|     (struct.set
|       (..ref..)
|       (local.tee $temp
|         (struct.get)))
|
| tee has the type of the local, which does not change in this pass. So we
| can't look at just the fallthrough here and skip the tee: after refining
| the field, the tee's old type might not fit in the field's new type.
|
| We could perhaps add casts to fix things up, but those may have too big a
| cost. For now, just ignore the fallthrough.
* Fix DeadArgumentElimination + TrapsNeverHappen to not leave stale types (#4910)  Alon Zakai  2022-08-18  1  -5/+25
| DAE will normally not remove an unreachable parameter, because it checks
| for effects there. But in TrapsNeverHappen mode, we assume that an
| unreachable is an effect we can remove, so we are willing to remove it:
|
|     (func $foo (param $unused i32)
|       ;; never use $unused
|     )
|     (func $bar
|       (call $foo
|         (unreachable)))
|     ;;=> dae+tnh
|     (func $foo
|     )
|     (func $bar
|       (call $foo))
|
| But that transformation is invalid: the call's type was unreachable before
| but no longer is. What went wrong here is that, yes, it is valid to remove
| an unreachable, but we may need to update types while doing so, which we
| were not doing.
|
| This wasn't noticed before due to a combination of unfortunate factors:
|
| * The main reason is that this only happens in TrapsNeverHappen mode. We
|   don't fuzz that, because it's difficult: that mode can assume a trap
|   never happens, so a trap is undefined behavior really. On real-world
|   code this is great, but in the fuzzer it means that the output can seem
|   to change after optimizations.
| * The validator happened to be missing an error for a call that has type
|   unreachable but shouldn't: see "Validator: Validate unreachable calls
|   more carefully" (#4909). Without that, we'd only get an error if the bad
|   type influenced a subsequent pass in a confusing way - which is
|   possible, but difficult to achieve (what ended up happening in practice
|   is that SignatureRefining on J2Wasm relied on the unreachable and
|   refined a type too much).
| * Even with that fix, for the problem to be detected we'd need for the
|   validation error to happen in the final output, after running all the
|   passes. In practice, though, that's not likely, since other passes tend
|   to remove unreachables etc.
| * Pass-debug mode is very useful for finding stuff like this, as it
|   validates after every individual pass. Sadly it turns out that global
|   validation was off there: see "Validator: Validate globally by default"
|   (#4906). So it was catching the 99% of validation errors that are local,
|   but this particular error was in the remaining 1%.
|
| As a fix, simply ignore this case. It's not really worth the effort to
| optimize it, since DCE will just remove unreachables like that anyhow. So
| if we run again after a DCE we'd get a chance to optimize.
|
| This updates some existing tests to avoid (unreachable). That was used as
| an example of something with effects, but after this change it is treated
| more carefully. Replace those things with something else that has effects
| (a call).
* Validator: Validate unreachable calls more carefully (#4909)  Alon Zakai  2022-08-18  1  -0/+57
| Normally the validator will find stale types properly, by just running
| refinalize and seeing if the type has changed (if so, then some code
| forgot to refinalize). However, refinalize is a local operation, so it
| does not apply to calls: a call's proper type is determined by the global
| information of the function we are calling. As a result, we would not
| notice errors like this:
|
|     (call $foo) ;; type: unreachable
|
| Refinalizing that would not change the type from unreachable to the
| proper type, since that is global information.
|
| To validate this properly, validate that a call whose type is unreachable
| actually has an unreachable child. That rules out an invalid unreachable
| type here, which leaves concrete types, that we already have proper
| global validation for.
|
| The code here is generalized to handle non-call things as well, but it
| only helps expressions requiring global validation, so it likely only
| helps global.get and a few others.
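The rule described in this commit can be illustrated with a minimal sketch. The `Type` enum, the `Expr` struct and both function names below are hypothetical stand-ins invented for illustration; they are not Binaryen's real IR classes or validator code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified IR node; not Binaryen's real Expression class.
enum class Type { None, I32, Unreachable };

struct Expr {
  Type type;
  std::vector<Expr> children;
};

// True if at least one direct child is typed unreachable.
bool hasUnreachableChild(const Expr& e) {
  for (const auto& child : e.children) {
    if (child.type == Type::Unreachable) {
      return true;
    }
  }
  return false;
}

// A call typed unreachable is only accepted when a child justifies it.
// Concrete types are assumed to be checked by other (global) validation.
bool validUnreachableType(const Expr& call) {
  if (call.type != Type::Unreachable) {
    return true;
  }
  return hasUnreachableChild(call);
}
```

In this sketch, a childless call typed unreachable (the stale case from the commit message) is rejected, while the same call with an unreachable operand is accepted.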
* Avoid emitting a block in the binary format when it has no name (#4912)  Alon Zakai  2022-08-18  1  -1/+32
| We already did this if the block was a child of a control flow structure,
| which is the common case (see the newly added comment around that code,
| which clarifies why). This does the same for all other blocks. This is
| simple to do and a minor optimization, but the main benefit from this is
| just to make our handling of blocks uniform: after this, we never emit a
| block with no name. This will make 1a non-nullable locals easier to
| handle (since they will be able to assume that property; and not emitting
| such blocks avoids some work to handle non-nullable locals in them).
* Restore the `extern` heap type (#4898)  Thomas Lively  2022-08-17  17  -98/+206
| The GC proposal has split `any` and `extern` back into two separate
| types, so reintroduce `HeapType::ext` to represent `extern`. Before it
| was originally removed in #4633, externref was a subtype of anyref, but
| now it is not. Now that we have separate heap type hierarchies, make
| `HeapType::getLeastUpperBound` fallible as well.
* Multi-Memories Support in IR (#4811)  Ashley Nelson  2022-08-17  44  -1185/+2287
| This PR removes the single memory restriction in IR, adding support for a
| single module to reference multiple memories. To support this change, a
| new memory name field was added to 13 memory instructions in order to
| identify the memory for the instruction.
|
| It is a goal of this PR to maintain backwards compatibility with existing
| text and binary wasm modules, so memory indexes remain optional for
| memory instructions. Similarly, the JS API makes assumptions about which
| memory is intended when only one memory is present in the module.
|
| Another goal of this PR is that the behavior of existing tests be
| unaffected. That said, tests must now explicitly define a memory before
| invoking memory instructions or exporting a memory, and memory names are
| now printed for each memory instruction in the text format.
|
| There remain quite a few places where hardcoded references to the first
| memory persist (memory flattening, for example, will return early if more
| than one memory is present in the module). Many of these call-sites,
| particularly within passes, will require us to rethink how the
| optimization works in a multi-memories world. Other call-sites may
| necessitate more invasive code restructuring to fully convert away from
| relying on a globally available, single memory pointer.
* Validator: Validate globally by default (#4906)  Alon Zakai  2022-08-16  1  -1/+1
| I'm not sure why this defaulted to non-global. Perhaps because of
| limitations in the asm.js days. A better default is to validate globally,
| and this also applies in pass-debug mode (since that just uses the
| default there), so this will catch more problems there.
* Validator: More carefully check for stale types (#4907)  Alon Zakai  2022-08-16  1  -4/+8
| The validation logic to check for stale types (code where we forgot to
| run refinalize) had a workaround for a control flow issue. That
| workaround meant we didn't catch errors where a type was concrete but it
| should be unreachable. This PR makes that workaround only apply for
| control flow structures, so we can catch more errors.
* Revert "[Wasm GC] GC-prefixed opcodes are int8, not LEBs (#4889)" (#4895)  Alon Zakai  2022-08-16  2  -61/+61
| Reverts #4889.
|
| The spec is unclear on this, and that PR moved us to do what V8 does. But
| it sounds like we should clarify the spec to do things the other way, so
| this goes back to that.
* Validator: Validate intrinsics (#4880)  Alon Zakai  2022-08-16  1  -8/+55
| call.without.effects has a specific form, where the last parameter is a
| function reference, and that function reference must have the right type
| for the other parameters if called with them:
|
|     (call $call.without.effects
|       (..i32..)
|       (..f64..)
|       (..function reference, which takes params i32 and f64..)
|     )
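As a rough illustration of the shape being validated, the check can be modeled as comparing the operand types against the parameter types of the referenced function. Everything here (`ValType`, `IntrinsicCall`, `validIntrinsicCall`) is a hypothetical simplification, and exact equality stands in for whatever subtype-aware comparison the real validator performs.

```cpp
#include <cassert>
#include <vector>

enum class ValType { I32, I64, F32, F64 };

// Hypothetical model of a call to the intrinsic.
struct IntrinsicCall {
  // Types of all operands except the trailing function reference.
  std::vector<ValType> operandTypes;
  // Parameter types of the function named by the trailing reference.
  std::vector<ValType> targetParamTypes;
};

// The referenced function must accept exactly the operands passed along
// with it. (Assumption: exact equality; a real check would allow subtypes.)
bool validIntrinsicCall(const IntrinsicCall& call) {
  return call.operandTypes == call.targetParamTypes;
}
```

With this model, the example from the commit message (operands `i32`, `f64` and a function taking `i32`, `f64`) passes, while a function taking a different number or kind of parameters does not.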
* LegalizeJSInterface: Look for get/setTempRet0 as exports (#4881)  Sam Clegg  2022-08-15  3  -11/+55
| This allows emscripten to move these helper functions from JS library
| imports to native wasm exports.
|
| See https://github.com/emscripten-core/emscripten/issues/7273
* Function-level pass-debug mode 2 validation (#4897)  Alon Zakai  2022-08-12  3  -2/+53
| In BINARYEN_PASS_DEBUG=2 we save the module before each pass, and if
| validation fails afterwards, we print the module before. This PR does the
| same for function-parallel passes - in that case, we can actually show
| the specific function that broke validation, as opposed to the whole
| module.
* [EH] Pop should be supertype of tag type (#4901)  Heejin Ahn  2022-08-11  1  -1/+1
| `pop`'s type should be a supertype, not a subtype, of the tag's type
| within `catch`.
* [Strings] Fix string.new_wtf16_array (#4894)  Alon Zakai  2022-08-10  2  -2/+11
| Like the 8-bit array variants, it takes 3 parameters.
* [Strings] Linear memory string operations should emit a memory index (#4893)  Alon Zakai  2022-08-10  2  -12/+31
| For now this index is always 0, but we must emit it.
|
| Also clean up the wat test a little - we don't have validation yet, but
| we should not validate without a memory in that file.
* [Wasm GC] GC-prefixed opcodes are int8, not LEBs (#4889)  Alon Zakai  2022-08-09  2  -61/+61
| This starts to matter with strings, it turns out. This change should make
| us runnable in v8.
|
| Spec: https://github.com/WebAssembly/gc/blob/main/proposals/gc/MVP.md#instructions-1
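To make the int8-versus-LEB distinction concrete, here is a small sketch of both encodings. The function names are made up for illustration and this is not Binaryen's writer code. The two encodings produce the same single byte for values below 128 and diverge from 128 upward, which would explain why the choice only starts to matter once larger opcode values are in use (an inference; the commit message does not spell this out).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Unsigned LEB128 encoding of a 32-bit value: 7 payload bits per byte,
// with the high bit set on every byte except the last.
std::vector<uint8_t> encodeU32LEB(uint32_t value) {
  std::vector<uint8_t> out;
  do {
    uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0) {
      byte |= 0x80; // more bytes follow
    }
    out.push_back(byte);
  } while (value != 0);
  return out;
}

// Fixed-width encoding: always exactly one byte.
std::vector<uint8_t> encodeInt8(uint8_t value) {
  return {value};
}
```

For example, the value 0x7f is the single byte `7f` either way, whereas 0x80 is the single byte `80` as an int8 but the two bytes `80 01` as a LEB.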
* [Strings] string.new.array methods have start:end arguments (#4888)  Alon Zakai  2022-08-09  6  -2/+32
|
* RedundantSetElimination: ReFinalize when needed (#4877)  Alon Zakai  2022-08-09  1  -0/+21
|
* SimplifyLocals: ReFinalize when needed (#4878)  Alon Zakai  2022-08-09  1  -0/+25
|
* [Wasm GC] Fix SignaturePruning on CallWithoutEffects (#4882)  Alon Zakai  2022-08-08  1  -0/+13
| call.without.effects will turn into a normal call of the last parameter
| later,
|
|     (call $call.without.effects
|       A
|       B
|       (ref.func $foo)
|     )
|     ;; => intrinsic lowering
|     (call $foo
|       A
|       B
|     )
|
| SignaturePruning needs to be aware of that: we can't remove a parameter
| from $foo without also updating relevant calls to $call.without.effects.
| Rather than handle that, just skip such cases, and leave them to be
| optimized after intrinsics are lowered away.
* [GUFA] Fix readFromData on a function literal (#4883)  Alon Zakai  2022-08-08  1  -6/+9
| A function literal (ref.func) should never reach a struct or array get,
| but if there is a cast then it can look like they might arrive. We filter
| in ref.cast which avoids that (since casting a function to a data type
| will trap), but there is also br_on_cast which is not yet optimized. This
| PR adds code to avoid an assert in readFromData in that case.
* [Optimize Instructions] Fold eqz(eqz(x)) to not-equal of zero (#4855)  Max Graey  2022-08-08  1  -4/+17
| eqz(eqz(i32(x))) -> i32(x) != 0
| eqz(eqz(i64(x))) -> i64(x) != 0
|
| Only when shrinkLevel == 0 (prefer speed over binary size).
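The identity behind this fold can be checked with a short sketch. The helper names are invented for illustration; the only assumption about wasm semantics is that `eqz` yields an i32 that is 1 for a zero operand and 0 otherwise.

```cpp
#include <cassert>
#include <cstdint>

// Models of i32.eqz and i64.eqz: both produce an i32 result of 0 or 1.
int32_t eqz32(int32_t x) { return x == 0 ? 1 : 0; }
int32_t eqz64(int64_t x) { return x == 0 ? 1 : 0; }

// Before the fold: two eqz operations.
int32_t original32(int32_t x) { return eqz32(eqz32(x)); }
int32_t original64(int64_t x) { return eqz32(eqz64(x)); }

// After the fold: a single comparison against zero.
int32_t folded32(int32_t x) { return x != 0 ? 1 : 0; }
int32_t folded64(int64_t x) { return x != 0 ? 1 : 0; }
```

Both forms agree for every input, since negating "is zero" twice gives "is not zero" as a 0/1 value.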
* Remove metadata generation from wasm-emscripten-finalize (#4863)  Sam Clegg  2022-08-07  3  -269/+3
| This is no longer needed by emscripten as of:
| https://github.com/emscripten-core/emscripten/pull/16529
* wasm-emscripten-finalize: Remove em_js/em_asm start/stop symbols when stripping data segments. (#4876)  Sam Clegg  2022-08-05  1  -0/+9
| This avoids a fatal crash in `--post-emscripten` where it tries to remove
| data that is no longer part of the file.
|
| This fixes a bug introduced by #4871 that causes emscripten tests to fail.
* Remove RTTs (#4848)  Thomas Lively  2022-08-05  44  -1945/+432
| RTTs were removed from the GC spec and if they are added back in in the
| future, they will be heap types rather than value types as in our
| implementation. Updating our implementation to have RTTs be heap types
| would have been more work than deleting them for questionable benefit
| since we don't know how long it will be before they are specced again.
* Cleanup em_asm/em_js strings as part of PostEmscripten (#4871)  Sam Clegg  2022-08-04  2  -90/+185
| Rather than doing it as a side effect of dumping the metadata in
| wasm-emscripten-finalize.
* [NFC] Skip Directize pass if there are no tables (#4875)  Alon Zakai  2022-08-04  1  -0/+4
|
* Allow `BINARYEN_DEBUG` environment variable to be used in place of `--debug`. NFC (#4874)  Sam Clegg  2022-08-04  1  -0/+4
| For example I found it useful to be able to do something like this:
|
| ```
| $ BINARYEN_DEBUG=post-emscripten ./test/runner sometest
| ```
* [C-API] Add type builder C-API (#4803)  dcode  2022-08-04  2  -0/+308
| Introduces the necessary APIs to use the type builder from C. Enables
| construction of compound heap types (arrays, structs and signatures) that
| may be recursive, including assigning concrete names to the built types
| and, in case of structs, their fields.
* [NFC] Mark modifiesBinaryenIR=false in more places (#4869)  Alon Zakai  2022-08-03  7  -0/+14
|
* wasm-emscripten-finalize: Remove __start/__stop_em_js exports (#4870)  Sam Clegg  2022-08-03  1  -0/+3
| We already remove `__start_em_asm` and `__stop_em_asm`. This change is
| needed since I want to start exporting `__start_em_js` and `__stop_em_js`
| from emscripten without causing regressions.
|
| As a followup I'm planning on moving all of the em_js and em_asm
| stripping code into PostEmscripten.cpp.
* Remove support for parsing `let` (#4864)  Thomas Lively  2022-08-03  2  -90/+2
| It has been removed from the typed function references proposal, so we no
| longer need to support it. Maintaining the test for `let` was difficult
| because Binaryen could not emit either text or binary that actually used
| it.
* [Optimize Instructions] Refactor squared rules (#4840)  Max Graey  2022-08-02  2  -55/+135
| + Move these rules to a separate function;
| + Refactor them to use matches;
| + Add comments;
| + Handle rotational shifts as well;
| + Handle overflows for `<<`, `>>`, `>>>` shifts;
| + Add mixed rotate rules:
| ```rust
| rotl(rotr(x, C1), C2) => rotr(x, C1 - C2)
| rotr(rotl(x, C1), C2) => rotl(x, C1 - C2)
| ```
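The mixed rotate rules listed above can be sanity-checked with a short sketch. The helpers `rotl32` and `rotr32` are written out by hand here and are illustrative only; the rotate amount is taken modulo the bit width, which is the assumption that makes `C1 - C2` well-defined even when it wraps below zero.

```cpp
#include <cassert>
#include <cstdint>

// 32-bit rotate left; the amount is reduced modulo 32.
uint32_t rotl32(uint32_t x, uint32_t n) {
  n &= 31;
  return n == 0 ? x : (x << n) | (x >> (32 - n));
}

// 32-bit rotate right; the amount is reduced modulo 32.
uint32_t rotr32(uint32_t x, uint32_t n) {
  n &= 31;
  return n == 0 ? x : (x >> n) | (x << (32 - n));
}

// Left-hand and right-hand sides of the first rule.
uint32_t rotlOfRotr(uint32_t x, uint32_t c1, uint32_t c2) {
  return rotl32(rotr32(x, c1), c2);
}
uint32_t foldedRotr(uint32_t x, uint32_t c1, uint32_t c2) {
  return rotr32(x, c1 - c2); // unsigned wraparound, then reduced mod 32
}
```

Rotating right by `C1` and then left by `C2` is a net right rotation by `C1 - C2`; because the unsigned difference wraps modulo 2^32, which is a multiple of 32, masking it with 31 gives the correct amount even when `C2 > C1`.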
* Fix a new compiler warning (#4860)  Alon Zakai  2022-08-02  1  -0/+1
| Avoids a "may fall through" warning.
* [NFC] Refactor getDroppedChildrenAndAppend (#4849)  Max Graey  2022-08-01  1  -3/+3
|
* Update reference type Literal constructors to use HeapType (#4857)  Thomas Lively  2022-08-01  7  -18/+21
| We already require non-null literals to have non-null types, but with
| this change we can enforce that constraint by construction. Also remove
| the default behavior of creating a function reference literal with heap
| type `func`, since there is always a more specific function type to use.
* Add interpreter support for intrinsics (#4851)  Alon Zakai  2022-08-01  2  -3/+17
| This can give us some chance to catch bugs like #4839 in the fuzzer.
* [GUFA] Handle GUFA + Intrinsics (#4839)  Alon Zakai  2022-08-01  2  -17/+52
| Like RemoveUnusedModuleElements, places that build graphs of function
| reachability must special-case the call-without-effects intrinsic.
| Without that, it looks like a call to an import. Normally a call to an
| import is fine - it makes us be super-pessimistic, as we think things
| escape all the way out - but in GC for now we are assuming a closed
| world, and so we end up broken. To fix that, properly handle the
| intrinsic case.
* [NFC] wasm-reduce: Avoid wasted work on drops (#4850)  Alon Zakai  2022-07-29  1  -0/+7
| It was wasted work to see a drop and then check if we can replace it with
| a drop of its child, which is identical to the original state. This
| didn't cause any harm (we'd not reduce code size, and stop eventually)
| but it did slow us down.
* Refactor doIndent (#4847)  Max Graey  2022-07-29  3  -14/+3
| Refactor everywhere from:
| ```c++
| for (size_t i = 0; i < indent; i++) {
|   o << ' ';
| }
| ```
| to:
| ```c++
| o << std::string(indent, ' ');
| ```
|
| ### Motivation
|
| It is much simpler and should produce smaller code. See godbolt:
| https://godbolt.org/z/KMYMdn7z5
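A quick way to convince oneself that the two forms are interchangeable is to run both against a string stream. The two wrapper functions below are hypothetical test scaffolding, not the actual `doIndent` helper from the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Old form: write one space per loop iteration.
std::string indentWithLoop(size_t indent) {
  std::ostringstream o;
  for (size_t i = 0; i < indent; i++) {
    o << ' ';
  }
  return o.str();
}

// New form: build the run of spaces once and write it in one call.
std::string indentWithString(size_t indent) {
  std::ostringstream o;
  o << std::string(indent, ' ');
  return o.str();
}
```

Both functions produce identical output for any indent width, including zero, where `std::string(0, ' ')` is simply the empty string.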
* [JS Api] Reuse C-Api for emitText and emitStackIR (#4832)  Max Graey  2022-07-29  5  -41/+43
| Make the C API match the JS API and fix an old bug where extra newlines
| were emitted.
* [C/JS API] Add string reference types (#4810)  dcode  2022-07-27  3  -0/+44
|
* [Strings] Add interpreter stubs for string instructions (#4835)  Alon Zakai  2022-07-26  1  -35/+40
| The stubs let precompute skip over them without erroring.
|
| With this PR we can run the optimizer on strings code. We still can't run
| --fuzz-exec though, so we can't run the fuzzer.
|
| Also simplify the error strings in the earlier part of the file. All
| other code just has "unimp" so we might as well do the same and not
| mention full names there.
* Fix unreachable handling in getDroppedChildrenAndAppend (#4834)  Heejin Ahn  2022-07-26  1  -1/+1
| The previous code assumed that if `last`'s type is unreachable, it traps.
| That is not always the case: `last` can be another instruction, like
| `br`, whose type is unreachable but which does not necessarily trap.
|
| Context: https://github.com/WebAssembly/binaryen/pull/4827#discussion_r929395477
* wasm-reduce: Apply commandline features (#4833)  Alon Zakai  2022-07-26  1  -3/+11
| This lets wasm-reduce --enable-FOO work. Usually this is not needed as we
| do enable all features by default, but sometimes it is nice to disable
| features (e.g. to avoid reducing into a testcase that uses something the
| original wasm did not use).
* Make `GlobalTypeRewriter` work for isorecursive types (#4829)  Thomas Lively  2022-07-26  3  -4/+15
| There are two new potential problems that `GlobalTypeRewriter` can run
| into when working with isorecursive types instead of nominal types.
| First, the refined types may have replaced generic references with
| references to specific other types, potentially creating new recursions
| and making the existing recursion groups insufficient. Second, distinct
| types may be refined to structurally identical types and those distinct
| input types may map to the same output type, potentially changing cast
| behavior.
|
| Both of these problems are solved by putting all the new types in a
| single large recursion group. We do not currently account for the fact
| that types may be used in the external interface of the module, but when
| we do, externalized types will be excluded from optimizations and will
| not be affected by the creation of this single large rec group.
|
| Fixes #4816.
* Changing ref maps in wasm-binary to use a value of a vector of Name* (#4830)  Ashley Nelson  2022-07-26  2  -40/+17
| * Changing ref maps in wasm-binary to use a value of a vector of Name*
| * clang-format
| * Update src/wasm/wasm-binary.cpp
|
| Co-authored-by: Thomas Lively <7121787+tlively@users.noreply.github.com>
* [C/JS API] Expose string reference feature (#4831)  Max Graey  2022-07-26  3  -0/+5
|
* [OptimizeInstructions] Add folding for mixed left shift and mul with constants on RHS (#4808)  Max Graey  2022-07-26  1  -0/+17
| (x * C1) << C2 -> x * (C1 << C2)
| (x << C1) * C2 -> x * (C2 << C1)
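Both folds rely on the fact that a left shift is a multiplication by a power of two in wrapping arithmetic. The sketch below checks this for 32-bit values; the function names are illustrative, and the shift amount is masked to 5 bits on the assumption that this matches wasm's i32 shift semantics (it also avoids undefined behavior in C++).

```cpp
#include <cassert>
#include <cstdint>

// Left shift with the amount reduced modulo 32.
uint32_t shl32(uint32_t x, uint32_t n) { return x << (n & 31); }

// (x * C1) << C2 and its folded form x * (C1 << C2).
uint32_t mulThenShl(uint32_t x, uint32_t c1, uint32_t c2) {
  return shl32(x * c1, c2);
}
uint32_t foldedMulShl(uint32_t x, uint32_t c1, uint32_t c2) {
  return x * shl32(c1, c2);
}

// (x << C1) * C2 and its folded form x * (C2 << C1).
uint32_t shlThenMul(uint32_t x, uint32_t c1, uint32_t c2) {
  return shl32(x, c1) * c2;
}
uint32_t foldedShlMul(uint32_t x, uint32_t c1, uint32_t c2) {
  return x * shl32(c2, c1);
}
```

Since unsigned multiplication wraps modulo 2^32, the constants can be combined ahead of time without changing the result, even when intermediate products overflow.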
* Update reference type shorthands in binary output (#4828)  Thomas Lively  2022-07-25  1  -26/+31
| Add support for emitting the string type reference shorthands, which had
| previously been omitted accidentally due to the `default` case in that
| switch. Also avoid emitting shorthands for non-nullable reference types
| as a first step towards transitioning the shorthands to represent
| nullable types instead. Not emitting these shorthands at all will give V8
| the flexibility it needs to change its interpretation of the shorthands
| without breaking any workflows using Binaryen.