| Commit message (Collapse) | Author | Age | Files | Lines |
|
| |
Previously we were tracking whether integer tokens were signed but we did not
differentiate between positive and negative signs. Unfortunately, without
differentiating them, there's no way to tell the difference between an in-bounds
negative integer and a wildly out-of-bounds positive integer when trying to
perform bounds checks for s32 tokens. Fix the problem by tracking not only
whether there is a sign on an integer token, but also what the sign is.
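A minimal sketch of why the sign itself matters for the s32 bounds check. The names `Sign`, `IntTok`, and `getS32` are hypothetical, and storing the literal as a magnitude is an assumption made for illustration, not necessarily how the lexer represents tokens.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical token representation: the sign is tracked separately from the
// magnitude, and we record which sign was written, not just whether one was.
enum class Sign { None, Pos, Neg };

struct IntTok {
  uint64_t magnitude; // absolute value of the literal
  Sign sign;
};

// Return the token's value as an s32 if it is in bounds, otherwise nullopt.
std::optional<int32_t> getS32(const IntTok& tok) {
  constexpr uint64_t maxPos = uint64_t(std::numeric_limits<int32_t>::max());
  if (tok.sign == Sign::Neg) {
    // The most negative s32 has magnitude 2^31, one more than the max positive.
    if (tok.magnitude > maxPos + 1) {
      return std::nullopt;
    }
    return int32_t(-int64_t(tok.magnitude));
  }
  // Unsigned and explicitly positive literals share the same upper bound.
  if (tok.magnitude > maxPos) {
    return std::nullopt;
  }
  return int32_t(tok.magnitude);
}
```

With only a "has a sign" flag, the literals `-2147483648` and `+2147483648` would look identical here, even though only the first is a valid s32.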
|
| |
|
| |
|
|
|
|
|
|
| |
wat-parser-internal.h was already quite large after implementing just the lexer,
so it made sense to rename it to be lexer-specific and start a new file for the
higher-level parser. Also make it a proper .cpp file and split the testable
interface out into wat-lexer.h.
|
|
|
|
|
|
| |
calls (#4660)
This extends the existing call_indirect code to do the same for call_ref,
basically. The shared code is added to a new helper utility.
|
| |
This adds exported tags to the `exports` section of the wasm-emscripten-finalize
metadata so Emscripten can use them.
Also fixes a bug in the parser. Previously we recognized only this export
format:
```wasm
(tag $e2 (param f32))
(export "e2" (tag $e2))
```
and ignored this format:
```wasm
(tag $e1 (export "e1") (param i32))
```
Companion patch: https://github.com/emscripten-core/emscripten/pull/17064
|
|
|
|
| |
Improve comments and variable names to make it clear that we allocate and build
a separate string only when necessary to handle escape sequences.
|
| |
Rather than trying to actually implement the parsing of float values, which
cannot be done naively due to precision concerns, just parse the float grammar
then postprocess the parsed text into a form we can pass to `strtod` to do the
actual parsing of the value.
Since the float grammar reuses `num` and `hexnum` from the integer grammar but
does not care about overflow, add a mode to `LexIntCtx`, `num`, and `hexnum` to
allow parsing overflowing numbers.
For NaNs, store the payload as a separate value rather than as part of the
parsed double. The payload will be injected into the NaN at a higher level of
the parser once we know whether we are parsing an f64 or an f32 and therefore
know what the allowable payload values are.
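As a rough illustration of the "postprocess, then `strtod`" approach: the text format allows `_` separators in numbers, which `strtod` does not accept, so one plausible postprocessing step is to strip them. The helper name `parseLexedFloat` is hypothetical, and the sketch assumes its input has already matched the float grammar.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <string_view>

// Hypothetical helper: `text` is assumed to have already been validated
// against the float grammar, so only normalization is needed here.
double parseLexedFloat(std::string_view text) {
  std::string cleaned;
  cleaned.reserve(text.size());
  for (char c : text) {
    // Drop digit separators, which strtod would treat as the end of the number.
    if (c != '_') {
      cleaned.push_back(c);
    }
  }
  // strtod handles decimal and hexadecimal floats with correct rounding.
  return std::strtod(cleaned.c_str(), nullptr);
}
```

NaN payloads are not handled in this sketch; per the message above they are kept as a separate value.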
|
|
|
|
|
| |
Also include reserved words that look like keywords to avoid having to find and
enumerate all the valid keywords. Invalid keywords will be rejected at a higher
level in the parser instead.
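A minimal sketch of lexing anything shaped like a keyword, following the text format's `keyword ::= ('a'..'z') idchar*` production. The function names are hypothetical; deciding whether the result is a real keyword is deliberately left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Characters allowed in identifiers and keywords by the text format.
bool isIdChar(char c) {
  if (('0' <= c && c <= '9') || ('a' <= c && c <= 'z') ||
      ('A' <= c && c <= 'Z')) {
    return true;
  }
  return std::string_view("!#$%&'*+-./:<=>?@\\^_`|~").find(c) !=
         std::string_view::npos;
}

// Lex a keyword-shaped token from the start of `in`, if there is one.
// No table of valid keywords is consulted.
std::optional<std::string_view> lexKeyword(std::string_view in) {
  if (in.empty() || in[0] < 'a' || in[0] > 'z') {
    return std::nullopt;
  }
  size_t len = 1;
  while (len < in.size() && isIdChar(in[len])) {
    ++len;
  }
  return in.substr(0, len);
}
```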
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
We were missing CallRef in the CFG traversal code in a place where we
note possible exceptions. As a result we thought CallRef cannot throw, and
were missing some control flow edges.
To actually detect the problem, we need to validate non-nullable locals
properly, which we were not doing. This adds that as well.
|
| |
Begin implementing a new text format parser that will accept the
standard text format. Start with a lexer that can iterate over
tokens in an underlying text buffer. The initial supported tokens
are integers, parentheses, and whitespace including comments.
The implementation is in a new private internal header so it can
be included into a gtest source file even though it is not meant
to be a public API. Once the parser is more complete, there will
be an additional public header exposing a more concise public API
and the private header will be included into a source file that
implements that public API.
The new parser will improve on the existing text format parser
not only because it will accept the full standard text format,
but also because its code will be simpler and easier to maintain
and because it will hopefully be faster as well. The new parser
will be built out of small functions that closely mirror the
grammar productions given in the spec and will heavily use C++17
features like string_view, optional, and variant to provide more
self-documenting and efficient code.
Future PRs will add support for lexing other kinds of tokens
followed by support for parsing more complex constructs.
|
|
|
| |
Based on #3573 plus minor fixes
|
|
|
|
|
|
|
|
|
|
| |
Optionally avoid updating types in TypeUpdating::updateParamTypes(). That update
is incomplete if the function signature is also changing, which is the case in
SignatureRefining (but not DeadArgumentElimination). "Incomplete" means that
we updated the local.get type, but the function signature does not match yet. That
incomplete state can hit an internal error in GlobalTypeRewriter::updateSignatures
where it updates types. To avoid that, do the full update only there (in
GlobalTypeRewriter::updateSignatures).
|
|
|
|
|
|
| |
We were checking that nominal modules only had a single element in their type
sections, but that's not correct for the prototype nominal binary format we
still want to support. The test for this missed catching the bug because it
wasn't actually parsing in nominal mode.
|
|
|
|
|
|
|
|
|
|
| |
Share the logic for parsing imported and non-imported globals of the formats:
  (import "module" "base" (global $name? type))
  (global $name? type init)
This fixes #4676, since the deleted logic for parsing imported globals did not
handle parsing GC types correctly.
|
| |
|
|
|
|
|
|
| |
With only reference types but not GC, we cannot easily create a constant
for eqref for example. Only GC adds i31.new etc. To avoid assertions in
the fuzzer, avoid randomly picking (ref eq) etc., that is, keep it nullable
so that we can emit a (ref.null eq) if we need a constant value of that type.
|
|
|
|
|
|
| |
The old code would short-circuit and not do anything after we managed
any reduction in the loop here. That would end up doing entire iterations of
the whole pipeline before removing another element segment, which could
be slow.
|
| |
|
|
|
|
|
| |
Being a const reference allows writing insert({a, b}), which will be
useful in a future PR, and there is no reason to actually update the
reference.
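To illustrate the point about const references, here is a small sketch with a hypothetical `PairSet` type (the message does not name the actual class). A braced initializer creates a temporary, which can bind to a `const T&` parameter but not to a non-const `T&`.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Hypothetical container used only to show the parameter type's effect.
struct PairSet {
  std::set<std::pair<int, int>> items;

  // Taking a const reference lets callers pass temporaries, including
  // braced initializers such as insert({a, b}).
  bool insert(const std::pair<int, int>& item) {
    return items.insert(item).second;
  }
};
```

With a parameter of type `std::pair<int, int>&` instead, a call like `s.insert({1, 2})` would not compile.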
|
| |
|
|
|
|
|
|
| |
Also improve comments.
As suggested in #4647
|
|
|
|
|
|
|
| |
There's no reason not to allow growing by zero slots, but previously doing so
would trigger an assertion. This caused a crash when roundtripping a trivial
module.
Fixes #4667.
|
| |
Previously we could return different results depending on the order we
noted things:
  note(anyref.null);
  note(funcref.null);
  get() => anyref.null

  note(funcref.null);
  note(anyref.null);
  get() => funcref.null
This is correct, as nulls are equal anyhow, and any could be used in
the location we are optimizing. However, it can lead to nondeterminism
if the caller's order of notes is nondeterministic. That is the case in
DeadArgumentElimination, where we scan functions in parallel, then
merge them without special ordering.
To fix this, make the note operation symmetric. That seems simplest and
least likely to be confusing. We can use the LUB to do that.
To avoid duplicating the null logic, refactor note() to use combine().
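A toy sketch of making the note operation symmetric by taking the LUB. The `Heap` hierarchy, `lub`, and `NullNoter` are hypothetical simplifications (only three heap types, with `Func` and `Eq` assumed to be subtypes of `Any`), not the real LUBFinder.

```cpp
#include <cassert>
#include <optional>

// Toy hierarchy (an assumption): Func and Eq are both subtypes of Any.
enum class Heap { Any, Eq, Func };

// Least upper bound in the toy hierarchy. Symmetric by construction.
Heap lub(Heap a, Heap b) { return a == b ? a : Heap::Any; }

struct NullNoter {
  std::optional<Heap> noted;

  // Merge another noter's information into this one.
  void combine(const NullNoter& other) {
    if (!other.noted) {
      return;
    }
    noted = noted ? lub(*noted, *other.noted) : *other.noted;
  }

  // Noting a single null reuses combine(), so the null logic lives in one place.
  void note(Heap nullType) { combine(NullNoter{nullType}); }

  std::optional<Heap> get() const { return noted; }
};
```

Because `lub` is symmetric, the result no longer depends on the order in which notes arrive, which is what matters when the notes come from a parallel scan.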
|
|
|
|
|
| |
If we don't think that preventing copies in assignment makes sense by
itself (since we allow them on construction) then I think we can just
remove the restriction and also the implicit copy constructor.
|
|
|
|
| |
This also includes the type itself in the returned vector. This will be
useful in a future PR.
|
|
|
|
|
|
|
|
| |
Taking a Type is redundant as we only care about the heap type -
the nullability must be Nullable.
This avoids needing an assertion in the function, that is, it makes
the API more type-safe.
|
|
|
|
| |
This prevents new `RefCast` expressions that don't explicitly have their safety
set from getting an uninitialized safety value.
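One common way to get this guarantee in C++ is a default member initializer. The sketch below uses a hypothetical stand-in type and may not match how the actual change was made.

```cpp
#include <cassert>

// Hypothetical stand-in for an expression class with a safety field.
struct RefCastSketch {
  enum Safety { Safe, Unsafe };

  // The default member initializer ensures the field has a defined value
  // even when the creator never sets it explicitly.
  Safety safety = Safe;
};
```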
|
|
|
|
|
| |
Casts involve branches in the VM, so adding a cast in return for removing a branch
(like If=>Select) is not beneficial. We don't want to ever do any more casts than we
already are.
|
|
|
|
|
|
| |
This unsafe experimental instruction is semantically equivalent to
ref.cast_static, but V8 will unsafely turn it into a nop. This is meant to help
us measure cast overhead more precisely than we can by globally turning all
casts into nops.
|
|
|
|
|
|
|
| |
Do not prune parameters if there is a supertype that is a signature.
Without this we crash on an assertion in TypeBuilder when we try to
recreate the types (as we try to make a subtype with fewer fields
than the super).
|
|
|
|
|
|
|
|
|
|
| |
Diff without whitespace is smaller.
We can't emit HeapType::data without GC. Fixing that by switching to func,
another problem was uncovered: makeRefFuncConst had a TODO to handle
the case where we need a function to refer to but have created none yet. In
fact that TODO was done at the end of the function. Fix up the logic in
between to actually get there.
|
|
|
|
|
|
|
|
| |
A null target is not a valid name so nothing can branch to there. This just
saves the wasted work.
No existing code in the codebase benefits from this atm, but a later PR
will. In particular this lets callers call this without checking if the name
is non-null, which is more concise.
|
| |
|
|
|
|
|
| |
(#4629)" (#4646)
This reverts commit 4bcfba261cb8ee182261d26064453cab787d0df4.
|
|
|
|
|
|
| |
* Don't emit "i31" or "data" if GC is not enabled, as only the GC feature adds those.
* Don't emit "any" without GC either. While it is allowed, fuzzer limitations prevent
this atm (see details in comment - it's fixable).
|
|
|
|
|
|
| |
In f124a11ca3 we removed support for the prototype nominal binary format
entirely, but that means that we can no longer parse older binary modules that
used that format. Fix this regression by restoring the ability to parse the
prototype binary format.
|
|
|
|
|
|
| |
If we do not remove a param, we can try to remove the return value. We can do that
on a per-function basis, and not only if we removed no params from anywhere.
Also simplify tail call logic.
|
| |
|
|
|
|
|
|
| |
Remove `Type::externref` and `HeapType::ext` and replace them with uses of
anyref and any, respectively, now that we have unified these types in the GC
proposal. For backwards compatibility, continue to parse `extern` and
`externref` and maintain their relevant C API functions.
|
|
|
|
|
|
| |
V8 requires that supertypes come before subtypes when it parses
isorecursive (i.e. standards-track) type definitions. Since 2268f2a we are
emitting nominal types using the standard isorecursive format, so respect the
ordering requirement.
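The ordering requirement can be sketched as follows, assuming each type has at most one declared supertype and the hierarchy is acyclic. The index-based representation and the name `supertypesFirst` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical representation: supers[i] is the index of type i's declared
// supertype, or nullopt if it has none. Returns an emission order in which
// every supertype appears before its subtypes.
std::vector<size_t>
supertypesFirst(const std::vector<std::optional<size_t>>& supers) {
  std::vector<size_t> order;
  std::vector<bool> emitted(supers.size(), false);
  for (size_t i = 0; i < supers.size(); ++i) {
    // Walk up the supertype chain, collecting types not yet emitted.
    std::vector<size_t> chain;
    for (std::optional<size_t> cur = i; cur && !emitted[*cur];
         cur = supers[*cur]) {
      emitted[*cur] = true;
      chain.push_back(*cur);
    }
    // Emit the collected chain from the topmost supertype downwards.
    order.insert(order.end(), chain.rbegin(), chain.rend());
  }
  return order;
}
```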
|
|
|
|
|
| |
Without this Windows fails with:
'isdigit': is not a member of 'std'
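This error typically means `<cctype>` was only being included transitively on other platforms. A minimal sketch of code that compiles portably, assuming the fix is to include the header directly (the helper `allDigits` is hypothetical):

```cpp
#include <cassert>
#include <cctype> // declares std::isdigit; do not rely on transitive includes
#include <string_view>

// Hypothetical helper: true if `s` is non-empty and consists only of digits.
bool allDigits(std::string_view s) {
  for (char c : s) {
    // Cast to unsigned char: passing a negative char value is undefined.
    if (!std::isdigit(static_cast<unsigned char>(c))) {
      return false;
    }
  }
  return !s.empty();
}
```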
|
|
| |
We assume a closed world atm in the GC space, but the call.without.effects
intrinsic sort of breaks that: that intrinsic looks like an import, but we really
need to care about what is sent to it even in a closed world:
  (call $call-without-effects
    (ref.func $target-keep)
  )
That reference cannot be ignored, as logically it is called just as if there
were a call_ref there. This adds support for that, fixing the combination of
#4621 and using call.without.effects.
Also flip the vector of ref.func names to a set. I realized that in a very
large program we might see the same name many times.
|
|
|
|
|
|
|
|
|
|
| |
Print subtype declarations using the standards-track format with a vector of
supertypes followed by a normal type declaration rather than our interim nominal
format that used alternative versions of the func, struct, and array forms.
Desugar the nominal format to additionally emit all the types into a single
large recursion group. Currently V8 is performing this desugaring, but after
this change and a future change that fixes the order of nominal types to ensure
supertypes precede subtypes, it will no longer need to.
|
|
| |
in a function. (#4567)
* Lift the restriction in liveness-traversal.h that supported max 65535 locals in a function.
* Lint
* Fix typo
* Fix static
* Lint
* Lint
* Lint
* Add needed canRun function
* lint
* Use either a sparse or a dense matrix for tracking liveness copies, depending on the locals count.
* Lint
* Fix lint
* Lint
* Implement sparse_square_matrix class and use that as a backing.
* Lint
* Lint
* Lint #includes
* Lint
* Lint includes
* Remove unnecessary code
* Fix canonical accesses to copies matrix
* Lint
* Add missing variable update
* Remove canRun() function
* Address review
* Update expected test results
* Update test name
* Add asserts to sparse_square_matrix set and get functions that they are not out of bound.
* Lint includes
* Update test expectation
* Use .clear() + .resize() to reset totalCopies vector
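The idea of choosing dense or sparse storage based on the locals count can be sketched like this. The class name `SquareMatrixSketch`, the threshold, and the hash-map backing are assumptions for illustration and are not the actual `sparse_square_matrix` implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Sketch of an N x N matrix that stores small matrices densely and large
// ones sparsely, so memory does not grow quadratically with N.
template<typename T> class SquareMatrixSketch {
  static constexpr size_t DenseLimit = 1024; // arbitrary threshold (assumption)
  size_t n = 0;
  std::vector<T> dense;                   // used when n <= DenseLimit
  std::unordered_map<uint64_t, T> sparse; // used otherwise

  bool usingDense() const { return n <= DenseLimit; }

public:
  explicit SquareMatrixSketch(size_t size) { recreate(size); }

  // Reset to an all-default matrix of the given size.
  void recreate(size_t size) {
    n = size;
    dense.clear();
    sparse.clear();
    if (usingDense()) {
      dense.resize(n * n);
    }
  }

  void set(size_t i, size_t j, T value) {
    assert(i < n && j < n); // guard against out-of-bounds access
    if (usingDense()) {
      dense[i * n + j] = value;
    } else {
      sparse[uint64_t(i) * n + j] = value;
    }
  }

  T get(size_t i, size_t j) const {
    assert(i < n && j < n);
    if (usingDense()) {
      return dense[i * n + j];
    }
    auto it = sparse.find(uint64_t(i) * n + j);
    return it == sparse.end() ? T() : it->second;
  }
};
```

A dense 100000 x 100000 matrix of bytes would need about 10 GB, while the sparse form only pays for entries that were actually set.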
|
|
| |
If we see (ref.func $foo) that does not mean that $foo is reachable - we
must also see a (call_ref ..) of the proper type. Only after seeing both should
we mark the function as reachable, which this PR does.
This adds some complexity as we need to track intermediate state as we go,
since we could see the RefFunc before the CallRef or vice versa. We also
need to handle the case of a RefFunc without a CallRef properly: We cannot
remove the function, as the RefFunc must refer to it, but at least we can
empty out the body since we know it is never reached.
This removes an old wasm-opt test which is now superseded by a new lit
test.
On J2Wasm output this removes 3% of all functions, which account for
2.5% of total code size.
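The intermediate state described above might be tracked roughly as follows. This is a simplified sketch: functions and types are identified by strings, only exact type matches are considered (subtyping is ignored), and all names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

struct ReachabilitySketch {
  // Types for which we have seen a call_ref.
  std::set<std::string> calledTypes;
  // Functions seen in a ref.func whose type has not been called yet.
  std::map<std::string, std::vector<std::string>> uncalledRefFuncs;
  // Functions known to be reachable.
  std::set<std::string> reachable;

  void noteRefFunc(const std::string& func, const std::string& type) {
    if (calledTypes.count(type)) {
      // A matching call_ref was already seen, so the function is reachable.
      reachable.insert(func);
    } else {
      // Remember it in case a matching call_ref shows up later.
      uncalledRefFuncs[type].push_back(func);
    }
  }

  void noteCallRef(const std::string& type) {
    if (!calledTypes.insert(type).second) {
      return; // already processed this type
    }
    auto it = uncalledRefFuncs.find(type);
    if (it == uncalledRefFuncs.end()) {
      return;
    }
    // Every pending ref.func of this type is now reachable.
    reachable.insert(it->second.begin(), it->second.end());
    uncalledRefFuncs.erase(it);
  }
};
```

Whatever remains in `uncalledRefFuncs` at the end corresponds to functions that must be kept because a `ref.func` refers to them, but whose bodies can be emptied.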
|
|
|
|
|
|
|
|
|
| |
parallel analysis (#4620)
Normally ParallelFunctionAnalysis is just an analysis, and has no effects. However, in
SignatureRefining we actually do have side effects, due to an internal limitation of the
helper code it runs. This adds a template parameter to the class so users can note that
they do modify the IR. The parameter is added in the middle as it is easier to add this
param than to add the last one (the map).
|
|
|
|
| |
* We implemented specialization of field types (the TypeRefining pass).
* LUBFinder now handles nulls, so we need nothing extra for it in TypeRefining.
|