| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Avoid allocating there. This is both faster and ensures that we never modify
our internal data structure after our constructor.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than passing both a `Ctx` and a `ParseInput` to every parsing function,
pass only a `Ctx` with a `ParseInput` inside of it. This significantly reduces
verbosity in the parser. To handle cases where parsing needs to happen at
specific locations, which used to be handled by constructing a new `ParseInput`
independent from the ctx, introduce a new RAII utility for temporarily changing
the location of the `ParseInput` inside a context.
Also add a utility for generating an error at a particular location to avoid
having to construct new `ParseInput` objects just for that purpose. This
resolves a few TODOs about correcting error locations, but since we don't test
those yet, I still consider this NFC.
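A minimal sketch of the RAII idea described above, under stated assumptions: `ParseInput` and `Ctx` here are simplified stand-ins, and the names `WithPosition`, `errAt`, and `peekAt` are hypothetical, chosen for illustration rather than taken from the actual parser.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-ins for the parser types; the real ones differ.
struct ParseInput {
  std::string_view buffer;
  std::size_t pos = 0;
};

struct Ctx {
  ParseInput in;

  // Build an error message for an explicit position without constructing a
  // second ParseInput just for that purpose.
  std::string errAt(std::size_t pos, const std::string& msg) const {
    return std::to_string(pos) + ": error: " + msg;
  }
};

// RAII guard: moves the context's input to `pos` and restores the previous
// position when the guard goes out of scope.
class WithPosition {
  Ctx& ctx;
  std::size_t original;

public:
  WithPosition(Ctx& ctx, std::size_t pos) : ctx(ctx), original(ctx.in.pos) {
    ctx.in.pos = pos;
  }
  ~WithPosition() { ctx.in.pos = original; }
  WithPosition(const WithPosition&) = delete;
  WithPosition& operator=(const WithPosition&) = delete;
};

// Example use: read the character at `pos`, then return to where we were.
char peekAt(Ctx& ctx, std::size_t pos) {
  WithPosition guard(ctx, pos);
  return ctx.in.buffer[ctx.in.pos];
}
```

Because the destructor restores the position, early returns and exceptions inside the guarded scope cannot leave the context pointing at the wrong location.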
|
|
|
|
|
| |
When the heap types are not subtypes of each other, but a null is possible, the
intersection exists and is a null. That null must be the shared bottom type.
|
|
|
|
|
|
|
|
|
|
|
| |
A cone type is a PossibleContents that has a base type and a depth, and it
contains all subtypes up to that depth. So depth 0 is an exact type from
before, etc.
This only adds cone type computations when combining types, that is, when we
combine two exact types we might get a cone, etc. This does not yet use the
cone info in all places (like struct gets and sets), and it does not yet define roots
of cone types, all of which is left for later. IOW this is the MVP of cone types that
is just enough to add them + pass tests + test the new functionality.
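To illustrate how combining two exact types can yield a cone, here is a small sketch over a toy single-inheritance hierarchy. The `Hierarchy`, `Cone`, `combineExact`, and `contains` names are hypothetical and do not reflect the actual PossibleContents implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

// Toy hierarchy: maps each type to its direct supertype. Types without an
// entry are roots.
using Hierarchy = std::map<std::string, std::string>;

struct Cone {
  std::string base;
  int depth; // 0 means exactly `base`
};

// Number of subtyping steps from `type` up to `ancestor`, or -1 if `ancestor`
// is not a supertype of `type`.
int distanceUp(const Hierarchy& h, std::string type,
               const std::string& ancestor) {
  int steps = 0;
  while (type != ancestor) {
    auto it = h.find(type);
    if (it == h.end()) {
      return -1;
    }
    type = it->second;
    steps++;
  }
  return steps;
}

// Combine two exact types into the smallest cone that contains both.
// Assumes the types share an ancestor; otherwise h.at() throws.
Cone combineExact(const Hierarchy& h, const std::string& a,
                  const std::string& b) {
  std::string base = a;
  while (true) {
    int fromA = distanceUp(h, a, base); // always >= 0: base is above a
    int fromB = distanceUp(h, b, base);
    if (fromB >= 0) {
      return {base, std::max(fromA, fromB)};
    }
    base = h.at(base);
  }
}

// A cone contains every subtype of its base up to its depth.
bool contains(const Hierarchy& h, const Cone& cone, const std::string& type) {
  int d = distanceUp(h, type, cone.base);
  return d >= 0 && d <= cone.depth;
}
```

Note that the resulting cone can contain more than the two inputs: combining `D` (two levels below `A`) with `C` (one level below `A`) gives a cone of base `A` and depth 2, which also includes `A` itself and `D`'s parent.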
|
|
|
|
| |
Remove an obsolete error about null characters and test both binary and text
round tripping of a string constant containing an escaped zero byte.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the goal of supporting null characters (i.e. zero bytes) in strings,
rewrite the underlying interned `IString` to store a `std::string_view` rather
than a `const char*`, reduce the number of map lookups necessary to intern a
string, and present a more immutable interface.
Most importantly, replace the `c_str()` method that returned a `const char*`
with a `toString()` method that returns a `std::string`. This new method can
correctly handle strings containing null characters. A `const char*` can still
be had by calling `data()` on the `std::string_view`, although this usage should
be discouraged.
This change is NFC in spirit, although not in practice. It does not intend to
support any particular new functionality, but it is probably now possible to use
strings containing null characters in at least some cases. At least one parser
bug is also incidentally fixed. Follow-on PRs will explicitly support and test
strings containing nulls for particular use cases.
The C API still uses `const char*` to represent strings. As strings containing
nulls become better supported by the rest of Binaryen, this will no longer be
sufficient. Updating the C and JS APIs to use pointer, length pairs is left as
future work.
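The difference between a `std::string_view` and a `const char*` for strings with embedded zero bytes can be sketched as follows. The `Interned` class is a hypothetical miniature interner written for this example, not the actual `IString`.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical miniature interner, not Binaryen's IString.
class Interned {
  std::string_view str;
  explicit Interned(std::string_view s) : str(s) {}

public:
  static Interned intern(std::string_view s) {
    // unordered_set is node-based, so stored strings never move and views
    // into them stay valid. A single emplace both looks up and inserts.
    static std::unordered_set<std::string> storage;
    auto it = storage.emplace(s).first;
    return Interned(std::string_view(*it));
  }

  // Returns the full contents, including any embedded zero bytes.
  std::string toString() const { return std::string(str); }

  std::string_view view() const { return str; }

  // Interned strings with equal contents share storage, so comparing the
  // pointer and length is enough.
  bool operator==(const Interned& other) const {
    return str.data() == other.str.data() && str.size() == other.str.size();
  }
};
```

Treating `view().data()` as a C string stops at the first zero byte, which is why that usage loses information for strings containing nulls.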
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Parse folded expressions as described in the spec:
https://webassembly.github.io/spec/core/text/instructions.html#folded-instructions.
The old binaryen parser _only_ parses folded expressions, and furthermore
requires them to be folded such that a parent instruction consumes the values
produced by its children and only those values. The standard format is much more
general and allows folded instructions to have an arbitrary number of children
independent of dataflow.
To prevent the rest of the parser from having to know or care about the
difference between folded and unfolded instructions, parse folded instructions
after their children have been parsed. This means that a sequence of
instructions is always parsed in the order they would appear in a binary no
matter how they are folded (or not folded).
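As a rough sketch of the ordering described above, the following toy linearizer emits a folded instruction only after all of its children, so the output is in binary order. It is an illustration under simplifying assumptions: it handles only plain folded instructions (no `block`, `loop`, or `if`, whose folded forms work differently), and `tokenize`, `parseFolded`, and `linearize` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split on whitespace, treating '(' and ')' as their own tokens.
std::vector<std::string> tokenize(const std::string& text) {
  std::vector<std::string> tokens;
  std::string cur;
  auto flush = [&] {
    if (!cur.empty()) {
      tokens.push_back(cur);
      cur.clear();
    }
  };
  for (char c : text) {
    if (c == '(' || c == ')') {
      flush();
      tokens.push_back(std::string(1, c));
    } else if (c == ' ' || c == '\n' || c == '\t') {
      flush();
    } else {
      cur += c;
    }
  }
  flush();
  return tokens;
}

// Parse one folded instruction starting at tokens[pos] == "(" and append
// instructions to `out` in binary order: children first, then the parent.
// tokens.at() throws std::out_of_range on unbalanced input.
void parseFolded(const std::vector<std::string>& tokens, std::size_t& pos,
                 std::vector<std::string>& out) {
  assert(tokens.at(pos) == "(");
  ++pos;
  // The mnemonic and its immediates run until the first child or the ")".
  std::string instr;
  while (tokens.at(pos) != "(" && tokens.at(pos) != ")") {
    if (!instr.empty()) {
      instr += ' ';
    }
    instr += tokens.at(pos++);
  }
  // Any number of folded children, independent of dataflow.
  while (tokens.at(pos) == "(") {
    parseFolded(tokens, pos, out);
  }
  ++pos; // consume ")"
  out.push_back(instr);
}

std::vector<std::string> linearize(const std::string& text) {
  std::vector<std::string> tokens = tokenize(text);
  std::vector<std::string> out;
  std::size_t pos = 0;
  while (pos < tokens.size()) {
    parseFolded(tokens, pos, out);
  }
  return out;
}
```

For example, `(drop (nop) (i32.const 0))` linearizes to `nop`, `i32.const 0`, `drop`, even though `nop` produces no value for `drop` to consume.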
|
|
|
| |
Finishes work missed in #5126.
|
|
|
|
| |
As an NFC preliminary change that will minimize the diff in #5122, which moves
IString to the wasm namespace.
|
|
|
| |
Change wasm-validator so that Memory::kUnlimitedSize is properly treated as the unlimited case. The check for whether memory.initial < memory.max now only happens if memory.hasMax(), meaning memory.max is not set to kUnlimitedSize.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we treated each local index as a location, and every local.set to
that index could be read by every local.get. With this we connect only
relevant sets to gets.
Practically speaking, this removes LocalLocation which is what was just
described, and instead there is ParamLocation for incoming parameter
values. And local.get/set use normal ExpressionLocations to connect a
set to a get.
I was worried this would be slow, since computing LocalGraph takes time,
but it more than makes up for itself on J2Wasm, and we are actually faster,
I guess since we do less updating after local.sets.
This makes a noticeable change on the J2Wasm binary, and perhaps will
help with benchmarks.
|
|
|
|
|
| |
Unfortunately there isn't a single place where an error may occur. I tested on
several files with different flags and added sufficient warnings so that we warn
on them all.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These types, `none`, `nofunc`, and `noextern`, are uninhabited, so references to
them can only possibly be null. To simplify the IR and increase type precision,
introduce new invariants that all `ref.null` instructions must be typed with one
of these new bottom types and that `Literals` have a bottom type iff they
represent null values. These new invariants require several additional changes.
First, it is now possible that the `ref` or `target` child of a `StructGet`,
`StructSet`, `ArrayGet`, `ArraySet`, or `CallRef` instruction has a bottom
reference type, so it is not possible to determine what heap type annotation to
emit in the binary or text formats. (The bottom types are not valid type
annotations since they do not have indices in the type section.)
To fix that problem, update the printer and binary emitter to emit unreachables
instead of the instruction with undetermined type annotation. This is a valid
transformation because the only possible value that could flow into those
instructions in that case is null, and all of those instructions trap on nulls.
That fix uncovered a latent bug in the binary parser in which new unreachables
within unreachable code were handled incorrectly. This bug was not previously
found by the fuzzer because we generally stop emitting code once we encounter an
instruction with type `unreachable`. Now, however, it is possible to emit an
`unreachable` for instructions that do not have type `unreachable` (but are
known to trap at runtime), so we will continue emitting code. See the new
test/lit/parse-double-unreachable.wast for details.
Update other miscellaneous code that creates `RefNull` expressions and null
`Literals` to maintain the new invariants as well.
|
|
|
| |
Fixes emscripten-core/emscripten#17988
|
|
|
|
|
|
|
|
| |
The previous code was making emscripten-specific assumptions about
imports basically all coming from the `env` module.
I can't find a way to make this backwards compatible so may do a
combined roll with the emscripten-side change:
https://github.com/emscripten-core/emscripten/pull/17806
|
|
|
|
|
|
|
|
|
| |
The last parameter is the function to call, and we treated it like a normal parameter.
This is mostly only an issue during debugging, but in theory sending this extra value
could cause us to optimize less later (since it gets added to what the local of that
index can contain).
Also add assertions which would have caught this before.
|
|
|
| |
These are only needed for the metadata extraction in emcc.
|
|
|
|
|
| |
Annotations on array.get and array.set were not being counted and the code could
generally be simplified since `count` already ignores types that don't need to
be counted.
|
|
|
|
|
| |
The emscripten side is a little tricky but I've got some tests passing.
Currently blocked on:
https://github.com/emscripten-core/emscripten/issues/17969
|
|
|
| |
Change `standardizeNaN` to take a `Literal` to reduce usage verbosity.
|
|
|
|
| |
Previously it would randomly replace an expression with another one with the
exact same type. Allowing a subtype may give us more coverage.
|
|
|
|
|
|
|
| |
This is a pretty subtle point that was missed in #4811 - we need to first visit the
child, then compute the size, as the child may alter that size.
Found by the fuzzer.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes binaryen easier to call from other applications by making more errors recoverable instead of early-exiting.
The main thing it does is change three calls to exit on I/O errors into calls to Fatal(), which is an existing custom abstraction for handling unrecoverable errors. Currently Fatal's destructor calls _Exit(1).
My intent is to make it possible for Fatal to not exit, but to throw, allowing an embedding application to catch the exception.
Because the previous early exits were exiting with error code EXIT_FAILURE, I also changed Fatal to exit with EXIT_FAILURE. The test suite continues to pass so I assume this is ok.
Next I changed Fatal to buffer its error message until the destructor instead of immediately printing it to stderr. This is for ease of patching Fatal to throw instead.
Finally, I also included the patch I need to make Fatal throw when THROW_ON_FATAL is defined at compile time. I can carry this patch out of tree, but it is a small patch, so perhaps you will be willing to take it. I am happy to remove it.
Fixes #4938
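The buffering and throwing behavior described above might look roughly like the sketch below. It is a simplified stand-in, not the actual `Fatal` class: the "Fatal: " prefix and the `loadOrDie` helper are assumptions for the example, and `THROW_ON_FATAL` is defined inline here only so the throwing path is the one shown (normally it would come from the build flags).

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Defined here for demonstration; normally passed at compile time.
#define THROW_ON_FATAL 1

class Fatal {
  // The message is buffered and only reported in the destructor.
  std::stringstream buffer;

public:
  Fatal() { buffer << "Fatal: "; }

  template<typename T> Fatal& operator<<(T&& arg) {
    buffer << std::forward<T>(arg);
    return *this;
  }

#ifdef THROW_ON_FATAL
  // Destructors are noexcept by default, so throwing must be opted into.
  ~Fatal() noexcept(false) { throw std::runtime_error(buffer.str()); }
#else
  ~Fatal() {
    std::cerr << buffer.str() << std::endl;
    std::_Exit(EXIT_FAILURE);
  }
#endif
};

// Hypothetical caller: reports an I/O failure through Fatal instead of
// calling exit() directly.
std::string loadOrDie(const std::string& filename, bool exists) {
  if (!exists) {
    Fatal() << "Failed opening '" << filename << "'";
  }
  return "contents of " + filename;
}
```

An embedding application can then wrap calls in `try`/`catch (const std::runtime_error&)` and read the buffered message from `what()`.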
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously only WalkerPasses had access to the `getPassRunner` and
`getPassOptions` methods. Move those methods to `Pass` so all passes can use
them. As a result, the `PassRunner` passed to `Pass::run` and
`Pass::runOnFunction` is no longer necessary, so remove it.
Also update `Pass::create` to return a unique_ptr, which is more efficient than
having it return a raw pointer only to have the `PassRunner` wrap that raw
pointer in a `unique_ptr`.
Delete the unused template `PassRunner::getLast()`, which looks like it was
intended to enable retrieving previous analyses and has been in the code base
since 2015 but is not implemented anywhere.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It does not make sense to construct an `Expression` directly because all
expressions must be specific expressions. However, we previously allowed
constructing Expressions, and in particular we allowed them to be copy
constructed. Unrelatedly, `Fatal::operator<<` took its argument by value.
Together, these two facts produced UB when printing Expressions in fatal error
messages because a new Expression would be copy constructed with the original
expression ID but without any of the actual data from the original specific
expression. For example, when trying to print a Block, the printing code would
try to look at the expression list, but the expression list would be junk stack
data because the copied Expression does not contain an expression list.
Fix the problem by making Expression's constructors visible only to its
subclasses and making `Fatal::operator<<` take its argument by forwarding
reference instead of by value.
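The slicing problem and its fix can be reproduced in miniature. In this sketch, `Expression` and `Block` are heavily simplified stand-ins and `Message` is a hypothetical substitute for `Fatal`; only the two techniques named above (protected constructors and a forwarding-reference `operator<<`) are taken from the description.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

struct Expression {
  enum Id { BlockId, NopId } id;

protected:
  // Only subclasses may construct or copy, so a bare Expression can never be
  // sliced out of a specific expression.
  explicit Expression(Id id) : id(id) {}
  Expression(const Expression&) = default;
};

static_assert(!std::is_copy_constructible<Expression>::value,
              "generic Expressions must not be copyable from outside");

struct Block : Expression {
  std::vector<int> list;
  Block() : Expression(BlockId) {}
};

// Printing dispatches on the id and then reads subclass data, which is only
// safe if `e` really refers to the full object.
std::ostream& operator<<(std::ostream& o, const Expression& e) {
  if (e.id == Expression::BlockId) {
    o << "block of " << static_cast<const Block&>(e).list.size();
  } else {
    o << "nop";
  }
  return o;
}

// Hypothetical stand-in for Fatal: taking T&& forwards a reference to the
// original object instead of copying (and slicing) it.
struct Message {
  std::stringstream buffer;

  template<typename T> Message& operator<<(T&& arg) {
    buffer << std::forward<T>(arg);
    return *this;
  }
};
```

With a by-value `operator<<(T arg)` and a public copy constructor, passing an `Expression&` that refers to a `Block` would copy only the base part, and the `static_cast` in the printer would then read data that does not exist.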
|
|
|
|
| |
We ignored only unreachable conditions, but we must ignore the arms as well,
or else we could error.
|
|
|
|
| |
Avoid manually doing bitshifts etc. - leave combining to the core hash
logic, which can do a better job.
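For context, a shared combiner of the kind referred to above is sketched below. The Boost-style mixing formula and the names `hashCombine`, `rehash`, `Field`, and `hashField` are assumptions made for this example, not necessarily what the code base uses.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// One central place that knows how to mix hashes (a Boost-style formula).
inline void hashCombine(std::size_t& seed, std::size_t value) {
  seed ^= value + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hash `value` with std::hash and fold it into `seed`.
template<typename T> void rehash(std::size_t& seed, const T& value) {
  hashCombine(seed, std::hash<T>{}(value));
}

// Hypothetical struct being hashed.
struct Field {
  std::string name;
  int index;
};

// Callers just feed in each member; no hand-written shifts or XORs here.
std::size_t hashField(const Field& f) {
  std::size_t digest = std::hash<std::string>{}(f.name);
  rehash(digest, f.index);
  return digest;
}
```

Keeping the mixing in one function means it can be improved later without touching every call site.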
|
|
|
|
|
| |
We append to vectors of globals in a nondeterministically-ordered loop, which can lead to
different orderings of the vectors. It turns out this happens quite frequently in very large
J2Wasm files. As a solution, simply sort them after the nondeterministic stage.
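The pattern is easy to show in isolation. In this sketch the grouping by type and the name `groupGlobalsByType` are hypothetical; the point is only that vectors filled while iterating an unordered container are sorted afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Maps a global's name to the name of its type (illustrative data model).
using GlobalTypes = std::unordered_map<std::string, std::string>;

std::map<std::string, std::vector<std::string>>
groupGlobalsByType(const GlobalTypes& globalTypes) {
  std::map<std::string, std::vector<std::string>> byType;
  // Nondeterministic stage: unordered_map iteration order is unspecified, so
  // the order of appends can differ between runs or platforms.
  for (const auto& entry : globalTypes) {
    byType[entry.second].push_back(entry.first);
  }
  // Deterministic fixup: sort each vector once all appends are done.
  for (auto& entry : byType) {
    std::sort(entry.second.begin(), entry.second.end());
  }
  return byType;
}
```

Sorting once at the end is usually cheaper than keeping the vectors ordered during the loop, and it leaves the loop itself unchanged.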
|
|
|
|
|
|
|
|
|
|
|
|
| |
This does not actually add cone types, but it does NFC refactoring towards that.
Specifically it replaces the internal ExactType with ConeType, and the latter
has a depth, so a cone type of depth 0 is the old exact type.
Cone types with depth > 0 are not possible yet, keeping this NFC.
I believe this design with a depth for cone types has little overhead. It does add
to the size of ConeType, but the variant there is larger anyhow (it contains a
Literal). And things like isSubType need to loop anyhow, so looping up to the
depth etc. in checks won't make things slower.
|
|
|
|
|
| |
We compared types and not heap types, so a difference in nullability
confused us. But at that point in the code, we've ruled out nulls, so we
should focus on heap types only.
|
|
|
|
| |
This is the case for dynamic linking where the segment offsets are
derived from the `__memory_base` import.
|
|
|
| |
Move the logic to the GUFA pass.
|
|
|
|
|
|
|
|
|
| |
This moves the logic to add connections from signatures to functions from the top level
into the RefFunc logic. That way we only add those connections to functions that
actually have a RefFunc, which avoids us thinking that a function without one can be
reached by a call_ref of its type.
Has a small but non-zero benefit on j2wasm.
|
|
|
|
| |
If the PossibleContents for the two sides have no possible intersection then the
result must be 0.
|
|
|
|
| |
Make walkModuleCode set the module automatically, like walkModule already does.
Also remove some unneeded module settings when calling those methods.
|
|
|
|
|
|
|
| |
Emit call_ref instructions with type annotations and a temporary opcode. Also
implement support for parsing optional type annotations on call_ref in the text
and binary formats. This is part of a multi-part graceful update to switch
Binaryen and all of its users over to using the type-annotated version of
call_ref without there being any breakage.
|
|
|
| |
Fixes #5041
|
|
|
|
|
|
| |
The GC spec has been updated to have heap type annotations on call_ref and
return_call_ref. To avoid breaking users, we will have a graceful, multi-step
upgrade to the annotated version of call_ref, but since return_call_ref has no
users yet, update it in a single step.
|
|
|
|
|
|
|
| |
Previously when we parsed `string.const` payloads in the text format we were
using the text strings directly instead of un-escaping them. Fix that parsing,
and while we're editing the code, also add support for the `\r` escape allowed
by the spec. Remove a spurious nested anonymous namespace and spurious `static`s
in Print.cpp as well.
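A minimal sketch of un-escaping a text-format string payload follows. It is not the code from this change: it covers the single-character escapes and two-digit hex escapes (`\hh`) from the WebAssembly text format, omits `\u{...}` for brevity, and the names `hexValue` and `unescape` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

int hexValue(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  throw std::invalid_argument("bad hex digit in escape");
}

// Convert the raw characters between the quotes into the bytes they denote.
std::string unescape(std::string_view text) {
  std::string out;
  for (std::size_t i = 0; i < text.size(); ++i) {
    char c = text[i];
    if (c != '\\') {
      out += c;
      continue;
    }
    if (++i >= text.size()) {
      throw std::invalid_argument("dangling backslash");
    }
    switch (text[i]) {
      case 't': out += '\t'; break;
      case 'n': out += '\n'; break;
      case 'r': out += '\r'; break;
      case '"': out += '"'; break;
      case '\'': out += '\''; break;
      case '\\': out += '\\'; break;
      default: {
        // Otherwise expect two hex digits encoding one byte.
        if (i + 1 >= text.size()) {
          throw std::invalid_argument("truncated hex escape");
        }
        int value = hexValue(text[i]) * 16 + hexValue(text[i + 1]);
        out += static_cast<char>(value);
        ++i;
      }
    }
  }
  return out;
}
```

Returning a `std::string` rather than a C string matters here, since `\00` produces a zero byte in the middle of the result.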
|
|
|
|
|
| |
See #5062
Also add a require() workaround, see https://github.com/emscripten-core/emscripten/pull/17851
|
|
|
|
|
| |
TABLE_BASE usage was removed in #3211.
MEMORY_BASE usage was removed in #3089.
NEW_SIZE usage was removed in #3180.
|
|
|
|
|
|
| |
Slightly similar to ref.cast, but simpler.
Also update some TODO text.
|
| |
|
|
|
| |
This lets that pass optimize 64-bit offsets on memory64 loads and stores.
|
|
|
|
| |
This should make the CI green again. Also fix one of the errors. I haven't fixed
the other errors because I don't know how.
|
|
|
|
|
|
|
|
|
| |
floating points (#5034)
```
(-x) + y -> y - x
x + (-y) -> x - y
x - (-y) -> x + y
```
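These rewrites can be spot-checked numerically. The sketch below compares both sides of each rule for a given pair of doubles, treating any two NaNs as equivalent (an assumption of this check, since NaN sign and payload bits may differ) and distinguishing `+0.0` from `-0.0`. The names `equivalent` and `rewritesHold` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Same value and same sign of zero; any NaN matches any other NaN.
bool equivalent(double a, double b) {
  if (std::isnan(a) || std::isnan(b)) {
    return std::isnan(a) && std::isnan(b);
  }
  return a == b && std::signbit(a) == std::signbit(b);
}

// Check the three rewrites for one pair of inputs.
bool rewritesHold(double x, double y) {
  return equivalent((-x) + y, y - x) && // (-x) + y -> y - x
         equivalent(x + (-y), x - y) && // x + (-y) -> x - y
         equivalent(x - (-y), x + y);   // x - (-y) -> x + y
}
```

The rules hold for signed zeros and infinities because IEEE 754 subtraction behaves as addition of the negated operand. This check assumes strict floating-point semantics, so it should not be compiled with options like `-ffast-math`.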
|
|
|
| |
This finalizes the multi memories feature introduced in #4968.
|
| |
|
|
|
| |
Also fix some formatting issues in the file.
|