| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
This pulls out the subtype-exprs.h parts of #6108
These are NFC in the current codebase, but are fixes for that unlanded PR, and
another unrelated PR that will be opened shortly.
|
|
|
|
| |
Also add an end-to-end test using node to verify we can parse the escaped
content properly using TextDecoder+JSON.parse.
|
|
|
|
|
| |
allocateUTF8OnStack (#6324)
This avoids a warning on recent Emscripten.
|
|
|
|
|
|
|
|
|
|
|
| |
The lexer was previously an iterator over tokens, but that expressivity is not
actually used in the parser. Instead, we have `input.h` that adapts the token
iterator interface into an iterface that is actually useful.
As a first step toward simplifying the lexer implementation to no longer be an
iterator over tokens, update its interface by moving the adaptation from input.h
to the lexer itself. This requires extensive changes to the lexer unit tests,
which will not have to change further when we actually simplify the lexer
implementation.
|
| |
|
| |
|
|
|
| |
StringAs's output must be non-nullable, so add a cast.
|
| |
|
|
|
|
|
|
|
| |
This adds just enough support to be able to --fuzz-exec a small but realistic fuzz
testcase from Java.
To that end, just implement the minimal ops we need, which are all related to
JS-style strings.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The input module might use an array of 16-bit elements type that is somewhere in a
giant rec group, but that is not valid for imported strings: that array type is now on an
import and must match the expected ABI, which is to be in its own personal rec group.
The old array16 type remains in the module after this transformation, but all uses of it
are replaced with uses of the new array16 type.
Also move makeImports to after updateTypes: there are no types to update in the new
imports. That does not matter but it can make debugging less pleasant, so improve it.
|
|
|
|
| |
The LLVM wasm backend grows the stack downwards, and this pass did not
fully account for that before.
|
|
|
|
|
| |
Replacing the string heap type with extern is dangerous as they do not share top/bottom
types. In practice this works out almost everywhere except for a few ifs, which we can fix
up as a hack for now.
|
|
|
|
| |
We want to actually remove all stringref appearances, in both public and
private types.
|
|
|
| |
Arrays have immutable length, so we can optimize them like immutable fields.
|
| |
|
| |
|
|
|
|
| |
Construct a mapping from heap type and field name to field index, then use it
while parsing instructions.
|
|
|
|
|
|
|
|
|
| |
Get as many of the lit tests as possible to parse with the new parser, mostly by
moving declared module items to be after imports. Also fix a bug in the new
parser's pop validation to allow supertypes of the expected type.
The two big issues that still prevent some lit tests from working correctly
under the new parser are missing support for symbolic field names and missing
support for source map annotations.
|
|
|
|
| |
Removing support for the legacy syntax will allow us to avoid implementing
support for it in the new text parser.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SimplifyGlobals already does this, so this is a subset of that pass, and does not
add anything new. It is useful for testing, however.
In particular it allows testing that we propagate subsequent globals in a single
pass, that is if one global reads from another and becomes constant, then it
can be propagated as well. SimplifyGlobals runs multiple passes so this always
worked, but with this pass we can test that we do it efficiently in one pass.
This will also be useful for comparing stringref to imported strings, as it
allows gathered strings to be propagated to other globals (possible with
stringref, but not imported strings) but not anywhere else (which might have
downsides as it could lead to more allocations).
Also add an additional test for simplify-globals that we do not get confused by
an unoptimizable global.get in the middle (see last part).
|
|
|
| |
All those in the list from #6271 (comment)
|
|
|
|
|
|
|
|
|
|
| |
We previously had a bug where we would begin and end an IRBuilder context for
imported functions even though they don't have bodies. For functions that return
results, ending this empty scope should have produced an error except that we
had another bug where we only produced that error for multivalue functions. We
did not previously have imported multivalue functions in wat-kitchen-sink.wast,
so both of these bugs went undetected. Fix both bugs and update the test to
include an imported multivalue function so that it would have failed without
this fix.
|
|
|
|
|
| |
globals (#6285)
Before we propagated to the top level, but not to anything interior.
|
|
|
|
|
|
|
|
| |
The new parser enforces the rule that imports must come before declarations
(except for type declarations). The old parser does not enforce this rule, so
many of our tests did not follow it. Fix them to follow that rule and fix other
invalid syntax. Also add missing finalization of Load expressions in
wasm-builder.h that was causing a test to fail under the new parser and guard
against an error case in wasm-ir-builder.cpp that used to cause a segfault.
|
|
|
|
| |
Now that we have a .cpp file, none of the code that was in string.h needs to be
in a header any more.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update identifiers used in tests to use a format supported by the new text
parser, i.e. either the standard format with its limited set of allowed
characters or the non-standard `$"..."` format. Notably, any name containing
square or curly braces now uses the string format.
Input automatically updated with this script:
https://gist.github.com/tlively/4e22311736661849e641d02e521a0748
The printer is updated to properly escape names in more places as well. The
logic for escaping names is moved to a common location so that the type
printing logic in wasm-type.cpp can use it as well.
|
|
|
|
|
|
|
|
|
|
| |
In addition to normal identifiers, support parsing identifiers of the format
`$"..."`. This format is not yet allowed by the standard, but it is a popular
proposed extension (see https://github.com/WebAssembly/spec/issues/617 and
https://github.com/WebAssembly/annotations/issues/21).
Binaryen has historically allowed a similar format and has supported arbitrary
non-standard identifier characters, so it's much easier to support this extended
syntax than to fix everything to use the restricted standard syntax.
|
|
|
|
|
| |
They were previously optional to ease the transition to the standard text
format, but now we can make them mandatory to match the spec. This will simplify
the new text parser as well.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds `--experimental-new-eh` option to `wasm-opt`. The difference
between this and `--translate-to-new-eh` is, `--translate-to-new-eh`
just runs `TranslateToNewEH` pass, while `--experimental-new-eh`
attaches `TranslateToNewEH` pass at the end of the whole optimization
pipeline. So if no other passes or optimization options (`-On`) are
specified, it is equivalent to `--translate-to-new-eh`. If other
optimization passes are specified, it runs them and at the end run the
translator to ensure the new EH instructions are emitted. The reason we
are doing this this way is that the optimization pipeline as a whole
does not support the new EH instruction yet, but we would like to
provide an option to emit a reasonably OK code with the new EH
instructions.
This also means when the optimization level > 3, it will also run
the StackIR + local2stack optimization after the translation.
Not sure how to test the output of this option, given that there is not
much point in testing the default optimization passes, and it is also
not clear how to print the stack IR if the stack ir generation and
optimization runs as a part of the pipeline and not the explicit command
line options.
This is created in favor of #6267, which added the option to
`optimization-options.h`. It had a problem of running the translator
multiple times when `-On` was given multiple times in the command line,
which I learned was rather a common usage. This adds the option directly
to `wasm-opt.cpp`, which avoids the problem. With this, it is still
possible to create and optimize Stack IR unnecessarily, but that feels a
better alternative.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extends StringGathering by replacing the gathered string globals to imported
globals. It adds a custom section with the strings that the imports are expected to
provide. It also replaces the string type with extern.
This is a complete lowering of strings, except for string operations that are a TODO.
After running this, no strings remain in the wasm, and the outside JS is expected
to provide the proper imports, which it can do by processing the JSON of the
strings in the custom section "string.consts", which looks like
["foo", "bar", ..]
That is, an array of strings, which are imported as
(import "string.const" "0" (global $string.const_foo (ref extern))) ;; foo
(import "string.const" "1" (global $string.const_bar (ref extern))) ;; bar
|
|
|
|
|
|
|
| |
#6244 tried to do this but was not quite right. It treated a string like an array
or a struct, which means create a global for it. But just creating a global isn't
enough, as it needs to also be sorted in the right place etc. which requires
changes in other places. But there is a much simpler solution here: string
constants are just constants, which we can emit in-line, so do that.
|
| |
|
|
|
|
|
|
| |
Have a single implementation for lexing each of unsigned, signed, and
uninterpreted integers, each generic over the bit width of the integer. This
reduces duplication in the existing code and it will make it much easier to
support lexing more 8- and 16-bit integers.
|
|
|
| |
Followup to #6243 which handled empty ones.
|
| |
|
|
|
|
|
|
|
|
|
| |
Move from segment indexes to names. This is a breaking change to make the API more
capable and consistent. An effort has been made to reduce the burden on C API users
where possible (specifically, you can avoid providing names and let Binaryen make them
for you, which will basically be numbers that match the indexes from before).
Fixes #6247
|
| |
|
|
|
| |
We only noted the type but not the literal value.
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 9090ce56fcc67e15005aeedc59c6bc6773220f11.
This has the effect of once more propagating string constants from
globals to other places (and from non-globals too), which is useful
for various optimizations even if it isn't useful in the final output.
To fix the final output problem, #6257 added a pass that is run at the
end to collect string.const to globals, which allows us to once more
propagate strings in the optimizer, now without a downside.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass finds all string.const and creates globals for them. After this transform, no
string.const appears anywhere but in a global, and each string appears in one global
which is then global.get-ed everywhere.
This avoids overhead in VMs where executing a string.const is an allocation, and is
also a good step towards imported strings. For that, this pass will be extended from
gathering to a full lowering pass, which will first gather into globals as this pass does,
and then turn each of those globals with a string.const into an imported externref.
(For that reason this pass is in a file called StringLowering, as the two passes will
share much of their code, and the larger pass should decide the name I think.)
This pass runs in -O2 and above. Repeated executions have no downside (see
details in code).
|
|
|
|
| |
Specifically offsets larger than 2^32 which were being interpreted
misinterpreted here as very large int64_t values.
|
|
|
|
|
|
| |
The previous name feels too verbose and unwieldy.
This also removes the "new-to-old EH" placeholder. I think it'd be
better to add it back when it is actually added.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Parse pop expressions and check that they have the expected types, but do not
actually create new Pop expressions or push anything onto the stack because we
already create Pop expressions as necessary when visiting the beginning of catch
blocks.
Unlike the legacy text parser, the new text parser is not capable of parsing
pops in invalid locations in the IR. This means that the new text parser will
never be able to parse test/lit/catch-pop-fixup-eh-old.wast, which deliberately
parses invalid IR to check that the pops can be fixed up and moved to the
correct locations. It should be acceptable to delete that test when we turn on
the new parser by default, though, so that won't be a problem.
|
|
|
|
|
|
| |
Rather than `(pop valtype*)`, use `(pop valtype)`, where `valtype` is now
allowed to be a tuple. This will make it possible to parse un-folded multivalue
pops in the new text parser. The alternative would have been to put an arity in
the syntax like we have for other tuple instructions, but that's much uglier.
|
|
|
|
| |
These instructions always pop a single value, except when tuples are involved,
in which case they need special handling to know how many values to pop.
|