summaryrefslogtreecommitdiff
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
...
* Fix shareability handling in TypeSSA collision logic (#6798)Alon Zakai2024-08-011-0/+1
|
* Add a disjoint sets (union-find) utility (#6797)Thomas Lively2024-08-011-0/+87
| | | | This will be used in an upcoming type optimization pass and may be generally useful.
* [NFC] Avoid a temp local (#6800)Alon Zakai2024-08-011-3/+3
| | | | | The local was only used once, so it didn't really add much. And, it was causing some compilers to error on "unused variable" (when building without assertions, the use was removed).
* Use Names::getValidNameGivenExisting in binary reading (#6793)Alon Zakai2024-07-311-8/+3
| | | | | | We had a TODO to use it once Names was optimized, which it has been. The Names version is also far faster. When building https://github.com/JetBrains/kotlinconf-app it saves 70 seconds(!).
* Fix shareability of externalized nulls (#6791)Alon Zakai2024-07-301-2/+3
|
* Add a customizable title to Metrics reporting (#6792)Alon Zakai2024-07-302-1/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before the PR: $ bin/wasm-opt test/hello_world.wat --metrics total [exports] : 1 [funcs] : 1 [globals] : 0 [imports] : 0 [memories] : 1 [memory-data] : 0 [tables] : 0 [tags] : 0 [total] : 3 [vars] : 0 Binary : 1 LocalGet : 2 After the PR: $ bin/wasm-opt test/hello_world.wat --metrics Metrics total [exports] : 1 [funcs] : 1 ... Note the "Metrics" addition at the top. And the title can be customized: $ bin/wasm-opt test/hello_world.wat --metrics=text Metrics: text total [exports] : 1 [funcs] : 1 The custom title can be helpful when multiple invocations of metrics are used at once, e.g. --metrics=before -O3 --metrics=after.
* Add a Tarjan's Strongly Connected Component utilty (#6790)Thomas Lively2024-07-301-0/+262
| | | | | | | | | Implement a non-recursive version of Tarjan's Strongly Connected Component algorithm that consumes and produces iterators for maximum flexibility. This will be used in an optimization that transforms the heap type graph to use minimal recursion groups, which correspond to the strongly connected components of the type graph.
* Fix shareability of internalized nulls (#6789)Alon Zakai2024-07-291-2/+3
|
* Generalize Literal::externalize/internalize for strings and shareability (#6784)Alon Zakai2024-07-291-18/+12
|
* [wasm-reduce] Do not crash on non-func element segments (#6778)Thomas Lively2024-07-261-10/+5
| | | | Generalize the code for simplifying element segments to handle more than just null and funcref elements.
* Cost analysis: Remove "Unacceptable" hack (#6782)Alon Zakai2024-07-252-27/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We marked various expressions as having cost "Unacceptable", fixed at 100, to ensure we never moved them out from an If arm, etc. Giving them such a high cost avoids that problem - the cost is higher than the limit we have for moving code from conditional to unconditional execution - but it also means the total cost is unrealistic. For example, a function with one such instruction + an add (cost 1) would end up with cost 101, and removing the add would look insignificant, which causes issues for things that want to compare costs (like Monomorphization). To fix this, adjust some costs. The main change here is to give casts a cost of 5. I measured this in depth, see the attached benchmark scripts, and it looks clear that in both V8 and SpiderMonkey the cost of a cast is high enough to make it not worth turning an if with ref.test arm into a select (which would always execute the test). Other costs adjusted here matter a lot less, because they are on operations that have side effects and so the optimizer will anyhow not move them from conditional to unconditional execution, but I tried to make them a bit more realistic while I was removing "Unacceptable": * Give most atomic operations the 10 cost we've been using for atomic loads/ stores. Perhaps wait and notify should be slower, however, but it seems like assuming fast switching might be more relevant. * Give growth operations a cost of 20, and throw operations a cost of 10. These numbers are entirely made up as I am not even sure how to measure them in a useful way (but, again, this should not matter much as they have side effects).
* TupleOptimization: Properly handle subtyping in copies (#6786)Alon Zakai2024-07-251-2/+7
| | | | | | We used the target's type for the read from the source, but due to subtyping those might be different. Found by the fuzzer.
* Validate RefAsNonNull (#6785)Alon Zakai2024-07-242-3/+18
| | | Fixes #6781
* Properly validate ref.cast when lacking a common supertype (#6741)Alon Zakai2024-07-231-0/+15
| | | | | | | When lacking a common supertype the GLB operation makes the type of the cast unreachable, which errors on getHeapType in the later code. Fixes #6738
* [threads] Calculate shared heap type depths in subtypes.h (#6777)Thomas Lively2024-07-231-7/+14
| | | Fixes #6776.
* Make FunctionInfo's assignment operator argument const (#6780)Derek Schuff2024-07-221-1/+1
| | | | | | | | Aside from the fact that there's no need for this to be non-const and this is the usual way to write an assignment operator, this is also needed because of a recent change to std::pair (https://github.com/llvm/llvm-project/pull/89652). This seems to be forcing pair to want the const version of the assignment operator of its members.
* Heap2Local: Properly handle failing array casts (#6772)Alon Zakai2024-07-181-5/+39
| | | | | | | | Followup to #6727 which added support for failing casts in Struct2Local, but it turns out that it required Array2Struct changes as well. Specifically, when we turn an array into a struct then casts can look like they behave differently (what used to be an array input, becomes a struct), so like with RefTest that we already handled, check if the cast succeeds in the original form and handle that.
* [NFC] Add HeapType::isMaybeShared(BasicHeapType) utility (#6773)Thomas Lively2024-07-188-11/+13
| | | | | | | | | This abbreviates a common pattern where we first had to check whether a heap type was basic, then if it was, get its unshared version and compare it to some expected BasicHeapType. Suggested in https://github.com/WebAssembly/binaryen/pull/6771#discussion_r1683005495.
* [threads] Update the fuzzer for shared types (#6771)Thomas Lively2024-07-182-54/+90
| | | | | | | | Update the fuzzer to both handle shared types in initial contents and create and use new shared types without crashing or producing invalid modules. Since V8 does not have a complete implementation of shared-everything-threads yet, disable fuzzing V8 when shared-everything is enabled. To avoid losing too much coverage of V8, disable shared-everything in the fuzzer more frequently than other features.
* Validate features for types used in element segments (#6769)Thomas Lively2024-07-181-0/+8
|
* Monomorphization: Add a limit on the number of parameters (#6774)Alon Zakai2024-07-181-0/+20
|
* Validate features for types used in tables (#6768)Thomas Lively2024-07-181-13/+8
| | | | We previously special-cased things like GC types, but switch to a more general solution of detecting what features a table's type requires.
* [threads] ref.i31_shared requires shared-everything in validation (#6767)Thomas Lively2024-07-181-0/+6
|
* Monomorphize all the things (#6760)Alon Zakai2024-07-181-28/+260
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously call operands were monomorphized (considered as part of the call context, so we can create a specialized function with those operands fixed) if they were constant or had a different type than the function parameter's type. This generalizes that to pull in pretty much all the code we possibly can, including nested code. For example: (call $foo (struct.new $struct (i32.const 10) (local.get $x) (local.get $y) ) ) This can turn into (call $foo_mono (local.get $x) (local.get $y) ) The struct.new and even one of the struct.new's children is moved into the called function, replacing the original ref argument with two other ones. If the original called function was this: (func $foo (param $ref (ref ..)) .. ) then the monomorphized function then looks like this: (func $foo_mono (param $x i32) (param $y i32) (local $ref (ref ..)) (local.set $ref (struct.new $struct (i32.const 10) (local.get $x) (local.get $y) ) ) .. ) The struct.new and its constant child appear here, and we read the parameters. To do this, generalize the code that creates the call context to accept everything that is impossible to copy (like a local.get) or that would be tricky and likely unworthwhile (like another call or a tuple). Also check for effect interactions, as this code motion does some reordering. For this to work, we need to adjust how we compute the costs we compare when deciding what to monomorphize. Before we just compared the called function to the monomorphized called function, which was good enough when the call context only contained consts, but now it can contain arbitrarily nested code. The proper comparison is between these two: * Old function + call context * New monomorphized function Including the call context makes this a fair comparison. In the example above, the struct.new and the i32.const are part of the call context, and so they are in the monomorphized function, so if we didn't count them in other function we'd decide not to optimize anything with a large context. The new functionality is tested in a new file. A few parts of existing tests needed changes to not become pointless after this improvement, namely by replacing stuff that we now optimize with things that we don't like replacing an i32.eqz with a local.get. There are also a handful of test outcomes that change in CAREFUL mode due to the new cost analysis.
* [threads] Simplify and generalize reftype writing without GC (#6766)Thomas Lively2024-07-181-16/+8
| | | | | | Similar to #6765, but for types instead of heap types. Generalize the logic for transforming written reference types to types that are supported without GC so that it will automatically handle shared types and other new types correctly.
* [threads] Simplify and generalize heap type writing without GC (#6765)Thomas Lively2024-07-171-14/+1
| | | | | | | | | | We represent `ref.null`s as having bottom heap types, even when GC is not enabled. Bottom heap types are a feature of the GC proposal, so in that case the binary writer needs to write the corresponding top type instead. We previously had separate logic for this for each type hierarchy in the binary writer, but that did not handle shared types and would not have automatically handled other new types, either. Simplify and generalize the implementation and test that we can write `ref.null`s of shared types without GC enabled.
* [threads] Fix shared ref.eq and disallow mixed-shareability (#6763)Thomas Lively2024-07-172-1/+8
| | | | | | | Update the validator to reject mixed-shareability ref.eq, although this is still under discussion in https://github.com/WebAssembly/shared-everything-threads/issues/76. Fix the implementation of `Literal::operator==` to work properly with shared i31ref.
* Revert "[threads] Allow i31refs of mixed shareability to compare equal ↵Thomas Lively2024-07-171-9/+3
| | | | | | | | | | | | | | (#6752)" (#6761) Allowing Literals with different types to compare equal causes problems for passes that want equality to mean real equality, e.g. because they are using literals as map keys or because they otherwise need to use them interchangeably. At a minimum, we would need to differentiate a `refEq` operation where mixed-shareability i31refs can compare equal from physical equality on Literals, but there is also appetite to disallow mixed-shareability ref.eq at the spec level. See https://github.com/WebAssembly/shared-everything-threads/issues/76.
* Error more clearly on wasm components (#6751)Alon Zakai2024-07-171-1/+9
| | | | | | Component binary format: https://github.com/WebAssembly/component-model/blob/main/design/mvp/Binary.md#component-definitions Context: https://github.com/WebAssembly/binaryen/issues/6728#issuecomment-2231288924
* [threads] Validate all features required by ref.null (#6758)Thomas Lively2024-07-171-2/+4
| | | | | | | `ref.null` of shared types should only be allowed when shared-everything is enabled, but we were previously checking only that reference types were enabled when validating `ref.null`. Update the code to check all features required by the null type and factor out shared logic for printing lists of missing feature options in error messages.
* Make it possible to skip several passes (#6714)Jérôme Vouillon2024-07-171-1/+1
| | | --skip-pass can now be specified more than once on the commandline.
* [threads] Validate all features required by ref.null (#6757)Thomas Lively2024-07-161-15/+25
| | | | | | | `ref.null` of shared types should only be allowed when shared-everything is enabled, but we were previously checking only that reference types were enabled when validating `ref.null`. Update the code to check all features required by the null type and factor out shared logic for printing lists of missing feature options in error messages.
* [threads] Fix feature detection for shared basic heap types (#6756)Thomas Lively2024-07-161-4/+4
| | | | The logic for adding the shared-everything feature was not previously executed for shared basic heap types.
* Add C and JS APIs to control more pass options (#6713)Jérôme Vouillon2024-07-163-0/+147
| | | | | | | Add functions to: * Set and get the trapsNeverHappen, closedWorld, generateStackIR and optimizeStackIR flags * Manage the list of passes to skip.
* [threads] Update TypeSSA for shared types (#6753)Thomas Lively2024-07-161-1/+4
| | | | When creating a new subtype, make sure to copy the supertype's shareability.
* [threads] Allow i31refs of mixed shareability to compare equal (#6752)Thomas Lively2024-07-161-3/+9
| | | | | | | Normally, values of different types can never compare equal to each other, but since i31refs are not actually allocations, `ref.eq` has no way to differentiate a shared i31ref and an unshared i31ref with the same value, so it will report them as equal. Update the implementation of value equality to reflect this correctly.
* [NFC] Clarify and standardize order in flexibleCopy (#6749)Alon Zakai2024-07-162-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | flexibleCopy always visited parents before children, but it visited vector children in reverse order: (call ;; 1 (call $a) ;; 3 (call $b) ;; 2 ) The order of children happened to not matter in any user of this code, and that's just what you get when you iterate over children in a vector and push them to a stack before visiting them, so this odd ordering was not noticed. For a new user I will introduce soon, however, it would be nice to have the normal pre-order: (call ;; 1 (call $a) ;; 2 (call $b) ;; 3 ) (2 & 3 swapped). This cannot be tested in the current code as it is NFC, but the later PR will depend on it and test it heavily.
* Remove extra space printed in empty structs (#6750)Thomas Lively2024-07-161-4/+0
| | | | | | When we switched to the new type printing machinery, we inserted this extra space to minimize the diff in the test output compared with the previous type printer. Improve the quality of the printed output by removing it.
* [threads] Fix encoding of shared types (#6746)Thomas Lively2024-07-151-1/+1
|
* Simplify fuzzer generation of function references (#6745)Thomas Lively2024-07-151-17/+11
| | | | | | | | | | | | When creating a reference to `func`, fix the probability of choosing to continue on to choose some function other than the last one rather than making it depend on the number of functions. Then, do not eagerly pick from the rest of the candidate functions. Instead, fall through to the more general logic that will already pick a random candidate function. Also move the logic for coming up with a concrete signature down to where it is needed. These simplifications will make it easier to update the code to handle shared types.
* Allow different arguments for multiple instances of a pass (#6687)Christian Speckner2024-07-1520-79/+170
| | | | | | | | | | | | Each pass instance can now store an argument for it, which can be different. This may be a breaking change for the corner case of running a pass multiple times and setting the pass's argument multiple times as well (before, the last pass argument affected them all; now, it affects the last instance only). This only affects arguments with the name of a pass; others remain global, as before (and multiple passes can read them, in fact). See the CHANGELOG for details. Fixes #6646
* [threads] Fix struct op validation for shared null (#6742)Thomas Lively2024-07-131-1/+1
|
* Monomorphize dropped functions (#6734)Alon Zakai2024-07-125-36/+227
| | | | | | | | | | | | | | | | | | | | | | | | | We now consider a drop to be part of the call context: If we see (drop (call $foo) ) (func $foo (result i32) (i32.const 42) ) Then we'd monomorphize to this: (call $foo_1) ;; call the specialized function instead (func $foo_1 ;; the specialized function returns nothing (drop ;; the drop was moved into here (i32.const 42) ) ) With the drop now in the called function, we may be able to optimize out unused work. Refactor a bit of code out of DAE that we can reuse here, into a new return-utils.h.
* Remove non-standard `i31.new` (#6736)Thomas Lively2024-07-121-20/+9
| | | | The standard name for the instruction is `ref.i31`. Remove support for the non-standard name and update tests that were still using it.
* [threads] ref.i31_shared (#6735)Thomas Lively2024-07-1216-32/+76
| | | | | | | Implement `ref.i31_shared` the new instruction for creating references to shared i31s. Implement binary and text parsing and emitting as well as interpretation. Copy the upstream spec test for i31 and modify it so that all the heap types are shared. Comment out some parts that we do not yet support.
* SafeHeap: Handle overflows when adding the pointer and the size (#6409)Alon Zakai2024-07-121-11/+33
| | | | | | | | | | | | | E.g. loading 4 bytes from 2^32 - 2 should error: 2 bytes are past the maximum address. Before this PR we added 2^32 - 2 + 4 and overflowed to 2, which we saw as a low and safe address. This PR adds an extra check for an overflow in that add. Also add unreachables after calls to segfault(), which reduces the overhead of the extra check here (the unreachable apparently allows VMs to see that control flow ends, after the segfault() which is truly no-return). Fixes emscripten-core/emscripten#21557
* Do not abbreviate items in element segments (#6737)Thomas Lively2024-07-121-1/+2
| | | | | | | | The full syntax for an expression in an element syntax looks like `(item (ref.null none))`, but we have been printing the abbreviated version, which omits the `(item ...)`. This abbreviation is only valid when the item has only a single instruction, so it is not always correct to use it. Rather than determining whether or not to use the abbreviation on a case-by-case basis, always print the full syntax.
* Memory64Lowering: Handle -1 return value from memory.grow (#6733)Sam Clegg2024-07-111-2/+25
| | | This edge case make the lowering a little more tricky.
* Monomorphize: Use -O3 over -O1 + tweaks (#6732)Alon Zakai2024-07-111-22/+12
| | | | | Eventually we will need to do some tuning of compile time speed, but for now it is going to be simpler to do all the opts, in particular because it makes writing tests simpler.
* [threads] Shared polymorphism for extern conversions (#6730)Thomas Lively2024-07-112-11/+16
| | | | | `any.convert_extern` and `extern.convert_any` return references to shared heap types iff their operands are references to shared heap types.