| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This PR is part of a series that adds basic support for the typed continuations proposal.
This PR relaxes the restriction that tags must not have results , only params. Tags with
results must not be used for exception handling and are only allowed if the typed
continuations feature is enabled.
As a minor point, this PR also changes the printing of tags without params: To make the
presentation consistent, (param) is omitted when printing a tag.
|
|
|
|
|
|
|
| |
This PR is part of a series that adds basic support for the [typed continuations
proposal](https://github.com/wasmfx/specfx).
This particular PR simply extends `FeatureSet` with a corresponding entry for
this proposal.
|
|
|
| |
Adds a std::variant to represent the context of why a unique symbol was inserted in the stringified module. This allows us to pass necessary contextual data to subclasses of StringifyWalker in a structured manner.
|
|
|
|
| |
properly (#5994)
|
|
|
| |
This PR adds a StringProcessor struct intended to hold functions that filter vectors of SuffixTree::RepeatedSubstring, and a test of its first functionality, removing overlapping repeated substrings.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#5989)
Fixes #5983: The testcase from there is used here in a new testcase
remove-unused-brs_levels in which we check if we are willing to unconditionally
do a division operation. Turning an if with an arm that does a division into a
select, which always does the division, is almost 5x slower, so we should probably
be extremely careful about doing that.
I took some measurements and have some suggestions for changes in this PR:
* Raise the cost of div/rem to what I measure on my machine, which is 5x slower
than an add, or worse.
* For some reason we added the if arms rather than take the max of them, so
fix that. This does not help the issue, but was confusing.
* Adjust TooCostlyToRunUnconditionally in the pass from 9 to 8 (this helps
balance the last point).
* Use half that value when not optimizing for size. That is, we allow only 4 extra
unconditional work normally, and 8 in -Os, and when -Oz then we allow any
extra amount.
Aside from the new testcases, some existing ones changed. They all appear to
change in a reasonable way, to me.
We should perhaps go even further than this, and not even run a division
unconditionally in -Os, but I wasn't sure it makes sense to go that far as
other benchmarks may be affected. For now, this makes the benchmark in
#5983 run at full speed in -O3 or -Os, and it remains slow in -Oz. The
modified version of the benchmark that only divides in the if (no other
operations) is still fast in -O3, but it become slow in -Os as we do turn that
if into a select (but again, I didn't want to go that far as to overfit on that one
benchmark).
|
|
|
|
|
|
|
|
| |
This fixes some outdated comments and typos in Asyncify and improves
some other comments. This tries to make code comments more readable by
making them more accurate and also by using the three state (normal,
unwinding, and rewinding) consistently.
Drive-by fix: Typo fixes in SimplifyGlobals and wasm-reduce option.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
```wast
(if (result i32)
(expr0)
(i32.const 1)
(expr1)
)
```
can be written as
```wast
(i32.or
(expr0)
(expr1)
)
```
Also this removes some unused variables and methods.
This also adds an optimization for
```wast
(i32.eqz
(global.get $__asyncify_state)
)
```
in `--mod-asyncify-always-and-only-unwind` to fix an unexpected
regression caused by this.
|
|
|
|
|
| |
C++20 will automatically generate an operator== with reversed operand order,
which is ambiguous with the written operator== when one argument is marked
const and the other isn't.
|
|
|
|
|
|
| |
The parser previously parsed labels and could attach them to control flow
structures, but did not maintain the context necessary to correctly parse
branches. Support parsing labels as both names and indices in IRBuilder,
handling shadowing correctly, and use that support to implement parsing of br.
|
|
|
|
|
|
| |
Just like we do with other casts, refine the cast type to be the greatest lower
bound of its previous cast type and its input type. The difference is that the
output type of ref.test remains i32, but it's still useful to retain more
precise type information.
|
|
|
|
| |
Instead of just reporting the reason and line + column, also log out the element
the error occurred at.
|
|
|
|
|
|
| |
Move the topological sort from the constructor to a separate method. This CRTP
utility calls into its subclass to query supertype relationships, but doing so
in the base class constructor means that the subclass has not been fully
initialized yet, which can cause problems.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we see A->f0 = A->f0 then we might be copying fields not only between
instances of A but also of any subtypes of A, and so if some subtype has
value x then that x might now have reached any other subtype of A
(even in a sibling type, so long as A is their parent).
We already thought we were handling that, but the mechanism we used to
do so (copying New info to Set info, and letting Set info propagate) was not
enough.
Also add a small constructor to save the work of computing subTypes again.
Add TODOs for some cases that we could optimize regarding copies but
do not, yet.
|
| |
|
|
|
| |
Like table.set, it can modify a table.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
local2stack removes a pair of
local.set 0
local.get 0
when that set is not used anywhere else: whatever value is put into the local,
we can just leave it on the stack to replace the get. However, we only handled
actual uses of the set which we checked using LocalGraph. There may be code
that does not actually use the set local, but needs that set purely for validation
reasons:
local.set 0
local.get 0
block
local.set 0
end
local.get
That last get reads the value set in the block, so the first set is not used by it.
But for validation purposes, the inner set stops helping at the block end, so
we do need that initial set.
To fix this, check for gets that need our set to validate before removing any.
Fixes #5917
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#5968)
Apparently $N (e.g. FooClass$5) is a convention in Java for anonymous classes, so our
$N that we use to disambiguate could be confusing. As the way we disambiguate does
not matter, switch to using _N. This PR does that in both TypeSSA and NameTypes.
Also make NameTypes "lint" names as it goes. That pass tries to give types nice names,
leaving existing ones that seem ok, and renaming long or unnamed ones. This PR makes
it aware of the _N notation and it tries to remove it, if removing it does not cause a
collision. An example of how that helps is if TypeSSA creates a subtype $Foo_0 and then
we manage to remove $Foo, then we can use the shorter name for the subtype.
|
|
|
|
|
|
| |
Add a `visitFunctionStart` function to IRBuilder and make it responsible for
setting the function's body when the context is closed. This will simplify
outlining, will be necessary to support branches to function scope properly, and
removes an extra block around function bodies in the new wat parser.
|
|
|
| |
Parse loops in the new wat parser and add support for them to the IRBuilder.
|
|
|
|
|
|
| |
Somewhat counterintuitively, the text syntax for a folded `if` allows any number
of folded instructions in the condition position, not just one. Update the
corresponding `foldedinsts` parsing function to parse arbitrary sequences of
folded instructions and add a test.
|
|
|
|
|
|
|
|
|
|
|
| |
The new wat parser previously returned InstrT types when parsing individual
instructions and collected InstrsT types when parsing sequences of instructions.
However, instructions were always actually tracked in the internal state of the
parsing context, so these types never held any interesting or necessary data.
Simplify the parser by removing these types and leaning into the pattern that
the parser context will keep track of parsed instructions.
This allows for a much cleaner separation between the `instrs` and
`foldedinstrs` parser functions.
|
|
|
|
|
|
| |
Parse both the straight-line and folded versions of if, including the
abbreviations that allow omitting the else clause. In the IRBuilder, generalize
the scope stack to be able to track scopes other than blocks and add methods for
visiting the beginnings of ifs and elses.
|
|
|
|
|
| |
Probably any array of non-reference data can be allowed to be public and sent
out of the module, as it is just data. For now, however, just special case the i8
and i16 array types which are useful already for string interop.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
E.g.
(local $x (ref eq)
...
(local.set $x
(struct.new $float
...
)
)
(struct.get $float 0
(ref.cast (ref $float)
(local.get $x)
)
)
This PR allows us to use heap2local, ignoring the passing cast.
This is similar to existing handling of ref.as_non_null.
|
|
|
|
|
|
| |
Before in getType() we silently dropped the params of a signature type. Now we verify that
it is none, or we error.
Helps #5950
|
|
|
|
| |
NFC, but fixes a current fuzz bug on table.fill not having an entry in this file. After
this PR, there is no need for such entries.
|
|
|
|
|
|
| |
And put the new files in a new source directory, "parser". This is a rough split
and is not yet expected to dramatically improve compile times. The exact
organization of the new files is subject to change, but this splitting should be
enough to make further parser development more pleasant.
|
|
|
|
|
|
|
|
|
| |
In general, the binary lowering of tuple.extract expects that all the tuple
values are on top of the stack, so it inserts drops and possibly uses a scratch
local to ensure only the extracted value is left. However, when the extracted
tuple expression is a local.get, local.tee, or global.get, it's much more
efficient to change the lowering of the get or tee to ensure that only the
extracted value is on the stack to begin with. Implement that optimization in
the binary writer.
|
|
|
|
| |
This Stack IR optimization is not compatible with a much more powerful
optimization we plan to do for tuples in the binary writer.
|
|
|
|
|
|
|
|
|
|
| |
Visiting a block should push it onto the stack just like visiting any other
expression, but we previously had a `visitBlock` that introduced a new scope
instead. Fix `visitBlock` to behave as expected and introduce a new
`visitBlockStart` method to introduce a new scope.
Unfortunately this cannot be fully tested yet because the wat parser uses the
`makeXYZ` API intead of the `visit` API, but at least I updated `makeBlock` to
call `visitBlockStart`, so that is tested.
|
|
|
|
|
|
|
|
|
| |
TypeFinalization finalizes all types that we can, that is, all private types that have no
children. TypeUnFinalization unfinalizes (opens) all (private) types.
These could be used by first opening all types, optimizing, and then finalizing, as that
might find more opportunities.
Fixes #5933
|
| |
|
|
|
| |
table.fill requires bulk memory to be enabled, not reference types.
|
|
|
|
|
|
|
|
| |
This instruction was standardized as part of the bulk memory proposal, but we
never implemented it until now. Leave similar instructions like table.copy as
future work.
Fixes #5939.
|
|
|
|
|
|
|
| |
Remove support for the "struct_subtype", "array_subtype", "func_subtype", and
"extends" notations we used at various times to declare WasmGC types, leaving
only support for the standard text fromat for declaring types. Update all the
tests using the old formats and delete tests that existed solely to test the old
formats.
|
|
|
|
|
| |
This reverts commit 56ce1eaba7f500b572bcfe06e3248372e9672322. The binary writer
optimization is not always correct when stack IR optimizations have run. Revert
the change until we can fix it.
|
|
|
|
|
|
|
|
|
| |
In general, the binary lowering of tuple.extract expects that all the tuple
values are on top of the stack, so it inserts drops and possibly uses a scratch
local to ensure only the extracted value is left. However, when the extracted
tuple expression is a local.get, local.tee, or global.get, it's much more
efficient to change the lowering of the get or tee to ensure that only the
extracted value is on the stack to begin with. Implement that optimization in
the binary writer.
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases tuples are obviously not needed, such as when they are only used
in local operations and make/extract. Such tuples are not used as return values or
in control flow structures, so we might as well lower them to individual locals per
lane, which other passes can optimize a lot better.
I believe LLVM does the same with its own tuples: it lowers them as much as
possible, leaving only necessary ones.
Fixes #5923
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This PR changes how file paths and the command line are handled. On startup on Windows,
we process the wstring version of the command line (including the file paths) and re-encode
it to UTF8 before handing it off to the rest of the command line handling logic. This means
that all paths are stored in UTF8-encoded std::strings as they go through the program, right
up until they are used to open files. At that time, they are converted to the appropriate native
format with the new to_path function before passing to the stdlib open functions.
This has the advantage that all of the non-file-opening code can use a single type to hold paths
(which is good since std::filesystem::path has proved problematic in some cases), but has the
disadvantage that someone could add new code that forgets to convert to_path before
opening. That's somewhat mitigated by the fact that most of the code uses the ModuleIOBase
classes for opening files.
Fixes #4995
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
E.g.
(tuple.extract 1
(tuple.make (A) (B) (C))
=>
(B)
Modify some existing tests to not be in this trivial form, so that they do not
stop testing what they should.
|
|
|
|
|
| |
Replace i31.new with ref.i31 in the printer, tests, and source code. Continue
parsing i31.new for the time being to allow a graceful transition. Also update
the JS API to reflect the new instruction name.
|
|
|
|
| |
Fixes #5928 , on FreeBSD off_t is not defined in the headers we include.
|
|
|
|
|
|
|
|
| |
Globally replace the source string "I31New" with "RefI31" in preparation for
renaming the instruction from "i31.new" to "ref.i31", as implemented in the spec
in https://github.com/WebAssembly/gc/pull/422. This would be NFC, except that it
also changes the string in the external-facing C APIs.
A follow-up PR will make the corresponding behavioral change.
|
|
|
|
| |
The legacy encodings remain available for now by defining
USE_LEGACY_GC_ENCODINGS at build time.
|
|
|
|
| |
Remove the old forms of ref.test and ref.cast that took heap types instead of
ref types and remove the old array.init_static name for array.new_fixed.
|
|
|
|
|
|
| |
Previously, the printer incorrectly reconstructed imported functions' types from
their signatures instead of printing their types directly. This could cause the
printer to print uses of types that were never defined and did not exist in the
module. Fix the bug by printing imported functions' heap types directly.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Match the spec and parse the shorthand binary and text formats as final and emit
final types without supertypes using the shorthands as well. This is a
potentially-breaking change, since the text and binary shorthands can no longer
be used to define types that have subtypes.
Also make TypeBuilder entries final by default to better match the spec and
update the internal APIs to use the "open" terminology rather than "final"
terminology. Future changes will update the text format to use the standard "sub
open" rather than the current "sub final" keywords. The exception is the new wat
parser, which supporst "sub open" as of this change, since it didn't support
final types at all previously.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Almost entirely trivial, except for this part:
- if (nextDebugLocation.availablePos == 0 &&
- nextDebugLocation.previousPos <= pos) {
+ if (nextDebugLocation.availablePos == 0) {
return;
I believe removing the extra check has no effect. Removing it does not change
anything in the test suite, and logically, if we set availablePos to 0 then we
definitely want to return here - we set it to 0 to indicate there is nothing left
to read, which is what the code after it does.
As a result, we can remove the previousPos field entirely.
|