path: root/src
Commit message | Author | Age | Files | Lines
* Heap2Local: Refinalize when removing a cast (#6012) | Alon Zakai | 2023-10-16 | 1 | -0/+14
|
* [wasm-split] Fix instrumentation to work with memory 64 (#6013) | Thomas Lively | 2023-10-16 | 1 | -21/+23
| | | | Correctly use the output memory's index type when generating the __write_profile function. Requires moving some code around, but is a very small fix.
* GUFA: Add missing set of optimized boolean (#6010) | Alon Zakai | 2023-10-16 | 1 | -0/+1
| | | | Without marking us as having optimized we didn't refinalize, and broke validation.
* Add an "unsubtyping" optimization (#5982) | Thomas Lively | 2023-10-10 | 8 | -3/+591
| | | | | | | | | | | | | | Add a new pass that analyzes the module to find the minimal subtyping relation that is necessary to maintain the validity and semantics of the program and rewrites the types to use this minimal relation. Besides eliminating references to otherwise-unused intermediate types, this optimization should unlock significant additional optimizing power in other type optimizations that are constrained by having to maintain supertype validity, since after this new optimization there are fewer and more general supertypes. The analysis works by visiting each expression and module element to collect the subtypings that are required to maintain its validity, then, using that as a starting point, iteratively adding new subtypings required by type definitions and casts until reaching a fixed point.
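The fixed-point step described above can be sketched in a few lines. This is a minimal illustration, not the pass itself: `close_subtypings`, the `fields` map, and the single rule used here (a required subtyping between two types implies one between their positionally matching fields) are assumptions made for the example.

```python
from collections import deque


def close_subtypings(seed, fields):
    """Grow a seed set of required (sub, super) pairs to a fixed point.

    `fields` maps a type name to the list of type names of its fields.
    Assumed rule: if `sub <: sup` must hold, each field they share by
    position must be related the same way.
    """
    required = set()
    work = deque(seed)
    while work:
        sub, sup = work.popleft()
        if sub == sup or (sub, sup) in required:
            continue  # trivial or already recorded
        required.add((sub, sup))
        # A newly required subtyping may imply further ones on fields.
        for f_sub, f_sup in zip(fields.get(sub, []), fields.get(sup, [])):
            work.append((f_sub, f_sup))
    return required


# Hypothetical module: some expression needs $B <: $A, and the first
# fields of $B and $A have types $Y and $X respectively.
fields = {"$A": ["$X"], "$B": ["$Y"], "$X": [], "$Y": []}
required = close_subtypings([("$B", "$A")], fields)
```

Any declared subtyping that does not end up in `required` would be a candidate for removal under this model.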
* Fix a bug printing and emitting empty, passive element segments (#6002) | Thomas Lively | 2023-10-09 | 1 | -7/+4
| | | | | | | | Empty, passive element segments were always emitted as having `func` type because all their elements trivially were RefFunc (because they have no elements) and because we were incorrectly checking table types if they existed instead of the element segment's type directly to see if it was non-func. Fix the bug by checking each element segment's type directly and add a test.
* Automatically discard global effects in the rare passes that add effects (#5999) | Alon Zakai | 2023-10-06 | 14 | -0/+53
| | | | | All logging/instrumentation passes need to do this, to avoid us using stale global effects that are too low (too high is not optimal either, but at least it cannot cause bugs).
* Compute full transitive closure in GlobalEffects (#5992) | Alon Zakai | 2023-10-06 | 1 | -33/+156
|
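As a rough illustration of what a full transitive closure over per-function effects means, the sketch below propagates effects from callees to callers until nothing changes. The function name, the string effect labels, and the call graph are all hypothetical; the commit does not describe its data structures.

```python
def transitive_effects(direct, calls):
    """Propagate callee effects to callers until a fixed point is reached.

    `direct` maps each function to the set of effects in its own body;
    `calls` maps each function to the functions it calls directly.
    """
    effects = {func: set(own) for func, own in direct.items()}
    changed = True
    while changed:
        changed = False
        for caller, callees in calls.items():
            for callee in callees:
                missing = effects[callee] - effects[caller]
                if missing:
                    effects[caller] |= missing
                    changed = True
    return effects


# Hypothetical call graph with a cycle between "a" and "b".
direct = {"a": set(), "b": set(), "c": {"writes-memory"}, "d": {"calls-import"}}
calls = {"a": ["b"], "b": ["c", "a"], "c": [], "d": []}
effects = transitive_effects(direct, calls)
```

The loop terminates because effect sets only grow and are bounded by the union of all direct effects, so cycles in the call graph are handled without special cases.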
* [typed-cont] Allow result types on tags (#5997) | Frank Emrich | 2023-10-05 | 2 | -8/+33
| | | | | | | | | | | This PR is part of a series that adds basic support for the typed continuations proposal. This PR relaxes the restriction that tags must not have results, only params. Tags with results must not be used for exception handling and are only allowed if the typed continuations feature is enabled. As a minor point, this PR also changes the printing of tags without params: To make the presentation consistent, (param) is omitted when printing a tag.
* [typed-cont] Add feature flag (#5996) | Frank Emrich | 2023-10-05 | 5 | -1/+16
| | | | | | | This PR is part of a series that adds basic support for the [typed continuations proposal](https://github.com/wasmfx/specfx). This particular PR simply extends `FeatureSet` with a corresponding entry for this proposal.
* [Outlining] Adds separator context (#5977) | Ashley Nelson | 2023-10-04 | 3 | -32/+118
| | | Adds a std::variant to represent the context of why a unique symbol was inserted in the stringified module. This allows us to pass necessary contextual data to subclasses of StringifyWalker in a structured manner.
* Work around a gcc 13 issue with signbit that made us not compute fmin of -0 properly (#5994) | Alon Zakai | 2023-10-04 | 1 | -4/+14
|
* [Outlining] Adds SuffixTree::RepeatSubstring dedupe test (#5972) | Ashley Nelson | 2023-10-04 | 3 | -0/+56
| | | This PR adds a StringProcessor struct intended to hold functions that filter vectors of SuffixTree::RepeatedSubstring, and a test of its first functionality, removing overlapping repeated substrings.
* RemoveUnusedBrs: Allow less unconditional work and in particular division (#5989) | Alon Zakai | 2023-10-03 | 2 | -20/+46
| Fixes #5983: The testcase from there is used here in a new testcase
| remove-unused-brs_levels in which we check if we are willing to
| unconditionally do a division operation. Turning an if with an arm that
| does a division into a select, which always does the division, is almost
| 5x slower, so we should probably be extremely careful about doing that.
|
| I took some measurements and have some suggestions for changes in this PR:
|
| * Raise the cost of div/rem to what I measure on my machine, which is 5x
|   slower than an add, or worse.
| * For some reason we added the if arms rather than take the max of them,
|   so fix that. This does not help the issue, but was confusing.
| * Adjust TooCostlyToRunUnconditionally in the pass from 9 to 8 (this helps
|   balance the last point).
| * Use half that value when not optimizing for size. That is, we allow only
|   4 units of extra unconditional work normally, and 8 in -Os, and when -Oz
|   then we allow any extra amount.
|
| Aside from the new testcases, some existing ones changed. They all appear
| to change in a reasonable way, to me.
|
| We should perhaps go even further than this, and not even run a division
| unconditionally in -Os, but I wasn't sure it makes sense to go that far as
| other benchmarks may be affected. For now, this makes the benchmark in
| #5983 run at full speed in -O3 or -Os, and it remains slow in -Oz. The
| modified version of the benchmark that only divides in the if (no other
| operations) is still fast in -O3, but it becomes slow in -Os as we do turn
| that if into a select (but again, I didn't want to go that far as to
| overfit on that one benchmark).
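The thresholds described in this message can be summarized with a small sketch. The constant 8, the halving when not optimizing for size, the unlimited budget in -Oz, and taking the max of the arms come from the message; the function name, the `COST` table, the numeric `shrink_level` encoding (0 for -O3, 1 for -Os, 2 for -Oz), and the use of `>=` for the comparison are assumptions.

```python
TOO_COSTLY_TO_RUN_UNCONDITIONALLY = 8  # value named in the commit message

# Hypothetical per-operation costs; the message only says div/rem is
# roughly 5x the cost of an add.
COST = {"add": 1, "div": 5}


def too_costly_to_run_unconditionally(arm_costs, shrink_level):
    """Decide whether turning an `if` into a `select` does too much work.

    `shrink_level` is assumed to be 0 (not optimizing for size),
    1 (-Os), or 2 (-Oz).
    """
    if shrink_level >= 2:
        return False  # -Oz: any amount of extra work is allowed
    limit = TOO_COSTLY_TO_RUN_UNCONDITIONALLY
    if shrink_level == 0:
        limit //= 2  # only 4 units when not optimizing for size
    # Use the max of the arms rather than their sum.
    return max(arm_costs) >= limit


arms = [COST["div"], COST["add"]]
```

With these assumed costs, an arm containing a single division is rejected when not optimizing for size but accepted in -Os, which matches the behavior the message reports for the divide-only benchmark.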
* Asyncify: Improve comments (#5987) | Heejin Ahn | 2023-10-03 | 3 | -43/+56
| | | | | | | | This fixes some outdated comments and typos in Asyncify and improves some other comments. This tries to make code comments more readable by making them more accurate and also by using the three states (normal, unwinding, and rewinding) consistently. Drive-by fix: Typo fixes in SimplifyGlobals and wasm-reduce option.
* Asyncify: Simpify if into i32.or (#5988) | Heejin Ahn | 2023-10-03 | 1 | -17/+26
| ```wast
| (if (result i32)
|   (expr0)
|   (i32.const 1)
|   (expr1)
| )
| ```
| can be written as
| ```wast
| (i32.or
|   (expr0)
|   (expr1)
| )
| ```
| Also this removes some unused variables and methods.
|
| This also adds an optimization for
| ```wast
| (i32.eqz
|   (global.get $__asyncify_state)
| )
| ```
| in `--mod-asyncify-always-and-only-unwind` to fix an unexpected regression caused by this.
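A quick way to see when the two forms agree is to model them on plain integers. This sketch is only a sanity check under stated assumptions: `if_form` and `or_form` are hypothetical stand-ins for the two expressions, and it ignores that `i32.or` always evaluates both operands, so it says nothing about side effects or cost of `expr1`.

```python
from itertools import product


def if_form(cond, other):
    # (if (result i32) cond (i32.const 1) other)
    return 1 if cond != 0 else other


def or_form(cond, other):
    # (i32.or cond other)
    return cond | other


def same_truthiness(values):
    """Check that both forms agree on zero vs. non-zero for all pairs."""
    return all(
        (if_form(c, o) != 0) == (or_form(c, o) != 0)
        for c, o in product(values, repeat=2)
    )


samples = [0, 1, 2, 7, 0x80000000, 0xFFFFFFFF]
```

The two forms produce identical values when both operands are 0 or 1. For other inputs they only agree on whether the result is zero, so the rewrite is presumably applied where the result is used as a condition or the operands are known booleans; the commit message does not say which.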
* [NFC] Mark operator== as const (#5990) | walkingeyerobot | 2023-10-03 | 1 | -1/+1
| | | | | C++20 will automatically generate an operator== with reversed operand order, which is ambiguous with the written operator== when one argument is marked const and the other isn't.
* [Parser] Parse labels and br (#5970) | Thomas Lively | 2023-10-02 | 5 | -29/+182
| | | | | | The parser previously parsed labels and could attach them to control flow structures, but did not maintain the context necessary to correctly parse branches. Support parsing labels as both names and indices in IRBuilder, handling shadowing correctly, and use that support to implement parsing of br.
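Resolving a branch target by name or by index, with shadowing, can be modeled with a stack of scopes. The sketch below is an illustration of that idea only; `LabelStack` and its methods are hypothetical and are not the IRBuilder API.

```python
class LabelStack:
    """Track open control-flow scopes and resolve branch targets."""

    def __init__(self):
        self._labels = []  # innermost scope is last; None means unnamed

    def push(self, name=None):
        self._labels.append(name)

    def pop(self):
        self._labels.pop()

    def resolve(self, label):
        """Return the relative depth of `label` (0 is the innermost scope)."""
        if isinstance(label, int):
            if not 0 <= label < len(self._labels):
                raise ValueError(f"label index {label} out of range")
            return label
        # Search from the innermost scope outward so inner names shadow
        # outer ones.
        for depth, name in enumerate(reversed(self._labels)):
            if name == label:
                return depth
        raise ValueError(f"unknown label {label}")


scopes = LabelStack()
scopes.push("$outer")
scopes.push("$inner")
scopes.push("$outer")  # shadows the first $outer
```

While the innermost scope is open, `$outer` resolves to depth 0; once it is popped, the same name resolves to the original outer scope again.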
* Refine ref.test's castType during refinalization (#5985) | Thomas Lively | 2023-10-02 | 2 | -0/+6
| | | | | | Just like we do with other casts, refine the cast type to be the greatest lower bound of its previous cast type and its input type. The difference is that the output type of ref.test remains i32, but it's still useful to retain more precise type information.
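For a single-inheritance hierarchy, the greatest lower bound of two types is simply the more specific of the two when they are related. The sketch below shows that under simplifying assumptions: it ignores nullability, uses a hypothetical `parent` map, and represents the bottom type with the placeholder string `"none"`.

```python
BOTTOM = "none"  # placeholder for the bottom type in this sketch


def is_subtype(sub, sup, parent):
    """Walk the declared supertype chain of `sub` looking for `sup`."""
    while sub is not None:
        if sub == sup:
            return True
        sub = parent.get(sub)
    return False


def greatest_lower_bound(a, b, parent):
    """Return the most general type that is a subtype of both `a` and `b`."""
    if is_subtype(a, b, parent):
        return a
    if is_subtype(b, a, parent):
        return b
    return BOTTOM  # unrelated types share only the bottom type


# Hypothetical hierarchy: $B and $C are siblings under $A.
parent = {"$A": None, "$B": "$A", "$C": "$A"}
```

In this model, a test for `$A` on an input already known to be `$B` could be refined to a test for `$B`.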
* wasm-s-parser: Add context in validation errors (#5981) | Alon Zakai | 2023-09-28 | 2 | -202/+192
| | | | Instead of just reporting the reason and line + column, also log out the element the error occurred at.
* [NFC] Refactor SupertypesFirst utility (#5979) | Thomas Lively | 2023-09-27 | 3 | -13/+15
| | | | | | Move the topological sort from the constructor to a separate method. This CRTP utility calls into its subclass to query supertype relationships, but doing so in the base class constructor means that the subclass has not been fully initialized yet, which can cause problems.
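The same hazard exists in other languages, so the shape of the fix can be shown outside C++. In this sketch the base class stores its inputs in the constructor and defers the sort to a separate `sort()` method, so the subclass hook is only called once the subclass is fully set up. The class and method names are hypothetical and only mirror the idea described above.

```python
class SupertypesFirst:
    """Order types so that every supertype precedes its subtypes."""

    def __init__(self, types):
        # Only store inputs here; do not call get_supertype() yet, because
        # a subclass may not have finished setting up its own state.
        self.types = list(types)

    def get_supertype(self, type_):
        raise NotImplementedError  # provided by the subclass

    def sort(self):
        order, seen = [], set()

        def visit(type_):
            if type_ in seen:
                return
            seen.add(type_)
            parent = self.get_supertype(type_)
            if parent is not None:
                visit(parent)  # emit the supertype first
            order.append(type_)

        for type_ in self.types:
            visit(type_)
        return order


class DeclaredSupertypes(SupertypesFirst):
    def __init__(self, types, declared):
        super().__init__(types)
        self.declared = declared  # set after the base constructor runs

    def get_supertype(self, type_):
        return self.declared.get(type_)


order = DeclaredSupertypes(["$C", "$B", "$A"], {"$C": "$B", "$B": "$A"}).sort()
```

Had the base constructor called `sort()` directly, `self.declared` would not exist yet when `get_supertype` ran.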
* ConstantFieldPropagation: Fully handle copies (#5969) | Alon Zakai | 2023-09-26 | 2 | -9/+21
| | | | | | | | | | | | | | | | If we see A->f0 = A->f0 then we might be copying fields not only between instances of A but also of any subtypes of A, and so if some subtype has value x then that x might now have reached any other subtype of A (even in a sibling type, so long as A is their parent). We already thought we were handling that, but the mechanism we used to do so (copying New info to Set info, and letting Set info propagate) was not enough. Also add a small constructor to save the work of computing subTypes again. Add TODOs for some cases that we could optimize regarding copies but do not, yet.
* [NFC] Refactor StackIR code to clarify local variable meanings (#5975) | Alon Zakai | 2023-09-26 | 1 | -10/+13
|
* Handle table.fill in Directize (#5974) | Alon Zakai | 2023-09-26 | 1 | -3/+15
| | | Like table.set, it can modify a table.
* StackIR local2stack: Make sure we do not break non-nullable validation (#5919) | Alon Zakai | 2023-09-22 | 1 | -1/+118
| local2stack removes a pair of
|
|   local.set 0
|   local.get 0
|
| when that set is not used anywhere else: whatever value is put into the
| local, we can just leave it on the stack to replace the get. However, we
| only handled actual uses of the set which we checked using LocalGraph.
| There may be code that does not actually use the set local, but needs that
| set purely for validation reasons:
|
|   local.set 0
|   local.get 0
|   block
|     local.set 0
|   end
|   local.get
|
| That last get reads the value set in the block, so the first set is not
| used by it. But for validation purposes, the inner set stops helping at
| the block end, so we do need that initial set.
|
| To fix this, check for gets that need our set to validate before removing
| any.
|
| Fixes #5917
* NameTypes and TypeSSA : Prefer _ over $ in names, and lint away _N suffixes (#5968) | Alon Zakai | 2023-09-22 | 2 | -2/+25
| Apparently $N (e.g. FooClass$5) is a convention in Java for anonymous
| classes, so our $N that we use to disambiguate could be confusing. As the
| way we disambiguate does not matter, switch to using _N. This PR does that
| in both TypeSSA and NameTypes.
|
| Also make NameTypes "lint" names as it goes. That pass tries to give types
| nice names, leaving existing ones that seem ok, and renaming long or
| unnamed ones. This PR makes it aware of the _N notation and it tries to
| remove it, if removing it does not cause a collision. An example of how
| that helps is if TypeSSA creates a subtype $Foo_0 and then we manage to
| remove $Foo, then we can use the shorter name for the subtype.
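The linting rule described here (drop a trailing `_N` when the shorter name is free) can be sketched as follows. `lint_names` and the regular expression are assumptions for illustration; the pass's other renaming rules for long or unnamed types are not modeled.

```python
import re


def lint_names(names):
    """Map each name to a version without its `_N` suffix where possible."""
    taken = set(names)
    renamed = {}
    for name in names:
        match = re.fullmatch(r"(.+)_\d+", name)
        if match and match.group(1) not in taken:
            shorter = match.group(1)
            # The old name is freed and the shorter one becomes taken.
            taken.discard(name)
            taken.add(shorter)
            renamed[name] = shorter
        else:
            renamed[name] = name  # no suffix, or removing it would collide
    return renamed


# $Foo no longer exists, so $Foo_0 can take its name; $Bar still exists,
# so $Bar_1 must keep its suffix.
result = lint_names(["$Foo_0", "$Bar", "$Bar_1"])
```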
* Support function contexts in IRBuilder (#5967) | Thomas Lively | 2023-09-22 | 5 | -42/+57
| | | | | | Add a `visitFunctionStart` function to IRBuilder and make it responsible for setting the function's body when the context is closed. This will simplify outlining, will be necessary to support branches to function scope properly, and removes an extra block around function bodies in the new wat parser.
* [Parser] Support loops (#5966) | Thomas Lively | 2023-09-21 | 4 | -33/+109
| | | Parse loops in the new wat parser and add support for them to the IRBuilder.
* [Parser] Allow any number of foldedinsts in `foldedinsts` (#5965) | Thomas Lively | 2023-09-21 | 3 | -60/+79
| | | | | | Somewhat counterintuitively, the text syntax for a folded `if` allows any number of folded instructions in the condition position, not just one. Update the corresponding `foldedinsts` parsing function to parse arbitrary sequences of folded instructions and add a test.
* [NFC][Parser] Simplify instruction handling (#5964) | Thomas Lively | 2023-09-21 | 5 | -2346/+1554
| | | | | | | | | | | The new wat parser previously returned InstrT types when parsing individual instructions and collected InstrsT types when parsing sequences of instructions. However, instructions were always actually tracked in the internal state of the parsing context, so these types never held any interesting or necessary data. Simplify the parser by removing these types and leaning into the pattern that the parser context will keep track of parsed instructions. This allows for a much cleaner separation between the `instrs` and `foldedinstrs` parser functions.
* [Parser] Parse if-else in the new wat parser and IRBuilder (#5963) | Thomas Lively | 2023-09-21 | 5 | -72/+341
| | | | | | Parse both the straight-line and folded versions of if, including the abbreviations that allow omitting the else clause. In the IRBuilder, generalize the scope stack to be able to track scopes other than blocks and add methods for visiting the beginnings of ifs and elses.
* Support i8/i16 mutable arrays as public types for string interop (#5814) | Alon Zakai | 2023-09-21 | 4 | -1/+47
| | | | | Probably any array of non-reference data can be allowed to be public and sent out of the module, as it is just data. For now, however, just special case the i8 and i16 array types which are useful already for string interop.
* Make heap2local work through casts (#5952) | Jérôme Vouillon | 2023-09-21 | 1 | -2/+26
| E.g.
|
|   (local $x (ref eq))
|   ...
|   (local.set $x
|     (struct.new $float
|       ...
|     )
|   )
|   (struct.get $float 0
|     (ref.cast (ref $float)
|       (local.get $x)
|     )
|   )
|
| This PR allows us to use heap2local, ignoring the passing cast.
|
| This is similar to existing handling of ref.as_non_null.
* Error on multivalue inputs that we do not handle (#5962) | Alon Zakai | 2023-09-20 | 1 | -2/+6
| | | | | | Before in getType() we silently dropped the params of a signature type. Now we verify that it is none, or we error. Helps #5950
* [NFC] RemoveUnusedModuleElements: Use delegations (#5961) | Alon Zakai | 2023-09-19 | 1 | -93/+41
| | | | NFC, but fixes a current fuzz bug on table.fill not having an entry in this file. After this PR, there is no need for such entries.
* [NFC] Split the new wat parser into multiple files (#5960) | Thomas Lively | 2023-09-19 | 13 | -1899/+2197
| | | | | | And put the new files in a new source directory, "parser". This is a rough split and is not yet expected to dramatically improve compile times. The exact organization of the new files is subject to change, but this splitting should be enough to make further parser development more pleasant.
* Reland "Optimize tuple.extract of gets in BinaryInstWriter" (#5955) | Thomas Lively | 2023-09-18 | 2 | -1/+47
| | | | | | | | | In general, the binary lowering of tuple.extract expects that all the tuple values are on top of the stack, so it inserts drops and possibly uses a scratch local to ensure only the extracted value is left. However, when the extracted tuple expression is a local.get, local.tee, or global.get, it's much more efficient to change the lowering of the get or tee to ensure that only the extracted value is on the stack to begin with. Implement that optimization in the binary writer.
* Do not optimize tuple locals in StackIR local2stack (#5958) | Thomas Lively | 2023-09-18 | 1 | -1/+5
| | | | This Stack IR optimization is not compatible with a much more powerful optimization we plan to do for tuples in the binary writer.
* Fix visitBlock and add visitBlockStart in IRBuilder (#5959) | Thomas Lively | 2023-09-19 | 2 | -9/+19
| | | | | | | | | | Visiting a block should push it onto the stack just like visiting any other expression, but we previously had a `visitBlock` that introduced a new scope instead. Fix `visitBlock` to behave as expected and introduce a new `visitBlockStart` method to introduce a new scope. Unfortunately this cannot be fully tested yet because the wat parser uses the `makeXYZ` API instead of the `visit` API, but at least I updated `makeBlock` to call `visitBlockStart`, so that is tested.
* Add passes to finalize or unfinalize types (#5944) | Alon Zakai | 2023-09-18 | 6 | -0/+108
| | | | | | | | | TypeFinalization finalizes all types that we can, that is, all private types that have no children. TypeUnFinalization unfinalizes (opens) all (private) types. These could be used by first opening all types, optimizing, and then finalizing, as that might find more opportunities. Fixes #5933
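The selection rules stated above are simple enough to express directly. This is a sketch of the rules only, with hypothetical names and inputs: `supertype_of` maps a type to its declared supertype and `public` is the set of types visible outside the module.

```python
def types_to_finalize(types, supertype_of, public):
    """Private types with no children can be marked final."""
    parents = {sup for sup in supertype_of.values() if sup is not None}
    return {t for t in types if t not in public and t not in parents}


def types_to_unfinalize(types, public):
    """All private types can be opened."""
    return {t for t in types if t not in public}


# Hypothetical module: $B extends $A; $P is public.
types = ["$A", "$B", "$P"]
supertype_of = {"$A": None, "$B": "$A", "$P": None}
public = {"$P"}
```

Here `$A` cannot be finalized because `$B` extends it, and `$P` is left alone in both directions because it is public.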
* TupleOptimization: Handle copies of different types in unreachable code (#5956) | Alon Zakai | 2023-09-18 | 1 | -4/+12
|
* Fix validation error message for table.fill (#5953) | Thomas Lively | 2023-09-18 | 1 | -4/+3
| | | table.fill requires bulk memory to be enabled, not reference types.
* Implement table.fill (#5949) | Thomas Lively | 2023-09-18 | 21 | -0/+175
| | | | | | | | This instruction was standardized as part of the bulk memory proposal, but we never implemented it until now. Leave similar instructions like table.copy as future work. Fixes #5939.
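For readers unfamiliar with the instruction, the runtime behavior of `table.fill` can be modeled on a Python list. This is a sketch of the semantics as the WebAssembly specification describes them (bounds are checked up front, so a trapping fill writes nothing); the `Trap` class and `table_fill` function are hypothetical names, not Binaryen code.

```python
class Trap(Exception):
    """Models a WebAssembly trap."""


def table_fill(table, dest, value, size):
    """Set `size` entries of `table`, starting at `dest`, to `value`."""
    # The whole range is checked before any write happens.
    if dest + size > len(table):
        raise Trap("out of bounds table access")
    for index in range(dest, dest + size):
        table[index] = value


table = [None] * 4
table_fill(table, 1, "f", 2)
```

Because it writes to a table, any analysis that reasons about table contents has to treat it like `table.set`.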
* Remove legacy type defintion text syntax (#5948) | Thomas Lively | 2023-09-18 | 1 | -40/+4
| | | | | | | Remove support for the "struct_subtype", "array_subtype", "func_subtype", and "extends" notations we used at various times to declare WasmGC types, leaving only support for the standard text format for declaring types. Update all the tests using the old formats and delete tests that existed solely to test the old formats.
* Revert "Optimize tuple.extract of gets in BinaryInstWriter (#5941)" (#5945) | Thomas Lively | 2023-09-14 | 2 | -47/+1
| | | | | This reverts commit 56ce1eaba7f500b572bcfe06e3248372e9672322. The binary writer optimization is not always correct when stack IR optimizations have run. Revert the change until we can fix it.
* Optimize tuple.extract of gets in BinaryInstWriter (#5941) | Thomas Lively | 2023-09-14 | 2 | -1/+47
| | | | | | | | | In general, the binary lowering of tuple.extract expects that all the tuple values are on top of the stack, so it inserts drops and possibly uses a scratch local to ensure only the extracted value is left. However, when the extracted tuple expression is a local.get, local.tee, or global.get, it's much more efficient to change the lowering of the get or tee to ensure that only the extracted value is on the stack to begin with. Implement that optimization in the binary writer.
* Add a simple tuple optimization pass (#5937) | Alon Zakai | 2023-09-14 | 4 | -0/+370
| | | | | | | | | | | In some cases tuples are obviously not needed, such as when they are only used in local operations and make/extract. Such tuples are not used as return values or in control flow structures, so we might as well lower them to individual locals per lane, which other passes can optimize a lot better. I believe LLVM does the same with its own tuples: it lowers them as much as possible, leaving only necessary ones. Fixes #5923
* Encode command line to UTF8 on Windows (#5671) | Derek Schuff | 2023-09-14 | 6 | -15/+98
| | | | | | | | | | | | | | | | This PR changes how file paths and the command line are handled. On startup on Windows, we process the wstring version of the command line (including the file paths) and re-encode it to UTF8 before handing it off to the rest of the command line handling logic. This means that all paths are stored in UTF8-encoded std::strings as they go through the program, right up until they are used to open files. At that time, they are converted to the appropriate native format with the new to_path function before passing to the stdlib open functions. This has the advantage that all of the non-file-opening code can use a single type to hold paths (which is good since std::filesystem::path has proved problematic in some cases), but has the disadvantage that someone could add new code that forgets to convert to_path before opening. That's somewhat mitigated by the fact that most of the code uses the ModuleIOBase classes for opening files. Fixes #4995
* OptimizeInstructions: Simplify tuple.extract of tuple.make (#5938) | Alon Zakai | 2023-09-14 | 1 | -0/+19
| E.g.
|
|   (tuple.extract 1
|     (tuple.make (A) (B) (C))
|   )
|   =>
|   (B)
|
| Modify some existing tests to not be in this trivial form, so that they do
| not stop testing what they should.
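On a toy expression tree, the rewrite looks like the sketch below. The tuple-based encoding and the `simplify` function are hypothetical, and the sketch assumes the unused operands have no side effects; the commit message does not describe how the real pass treats operands that do.

```python
def simplify(expr):
    """Rewrite tuple.extract(i, tuple.make(...)) to the i-th operand.

    Expressions are modeled as ("tuple.extract", index, operand) and
    ("tuple.make", [operands]); anything else is returned unchanged.
    Assumes the dropped operands are free of side effects.
    """
    if isinstance(expr, tuple) and expr[0] == "tuple.extract":
        _, index, operand = expr
        if isinstance(operand, tuple) and operand[0] == "tuple.make":
            return operand[1][index]
    return expr


before = ("tuple.extract", 1, ("tuple.make", ["A", "B", "C"]))
after = simplify(before)
```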
* Replace i31.new with ref.i31 everywhere (#5931) | Thomas Lively | 2023-09-13 | 7 | -19/+39
| | | | | Replace i31.new with ref.i31 in the printer, tests, and source code. Continue parsing i31.new for the time being to allow a graceful transition. Also update the JS API to reflect the new instruction name.
* Avoid off_t in small_vector.h (#5936) | Alon Zakai | 2023-09-13 | 1 | -1/+3
| | | | Fixes #5928: on FreeBSD, off_t is not defined in the headers we include.