| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
This is the only instruction in the current spec proposal that had not
yet been implemnented in the tools.
|
|
|
|
|
|
| |
That is only for the old source maps logic, not DWARF, and it is
only useful to debug source maps (it's not actually useful for
regular users that see the message) which we do not plan to do
since DWARF is the future.
|
|
|
| |
As specified in https://github.com/WebAssembly/simd/pull/122.
|
|
|
| |
As described in the spec.
|
|
|
| |
Fixes #2751.
|
|
|
|
|
|
|
| |
Since it wasn't easy to support tuples in Asyncify's call support
using temporary functions, we decided to allow tuple-typed globals
after all. This PR adds support for parsing, printing, lowering, and
interpreting tuple globals and also adds validation ensuring that
imported and exported globals do not have tuple types.
|
|
|
|
| |
Update it from wasm-emscripten-finalize when we append
to the table.
|
|
|
| |
The version of V8 pulled in by JSVU recently updated to expect the new ordering of the event section, so this PR should fix the CI.
|
|
|
|
|
|
| |
Adds full support for the {i8x16,i16x8,i32x4}.abs instructions merged
to the SIMD proposal in https://github.com/WebAssembly/simd/pull/128
as well as the {i8x16,i16x8,i32x4}.bitmask instructions proposed in
https://github.com/WebAssembly/simd/pull/201.
|
|
|
|
|
| |
It should be a signed LEB128, not an unsigned LEB128. This bug was
causing modules to be invalid when the number of signatures in the
type section was large and multivalue blocks were present.
|
| |
|
|
|
|
|
|
|
|
|
| |
Implements parsing and emitting of tuple creation and extraction and tuple-typed control flow for both the text and binary formats.
TODO:
- Extend Precompute/interpreter to handle tuple values
- C and JS API support/testing
- Figure out how to lower in stack IR
- Fuzzing
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DWARF from LLVM can refer to the first byte belonging to the function,
where the size LEB is, or to the first byte after that, where the local
declarations are, or the end opcode, or to one byte past that which is
one byte past the bytes that belong to the function. We aren't sure why
LLVM does this, but track it all for now.
After this all debug line positions are identified. However,
in some cases a debug line refers to one past the end of the
function, which may be an LLVM bug. That location is ambiguous
as it could also be the first byte of the next function (what
made this discovery possible was when this happened to the
last function, after which there is another section).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Control flow structures have those in addition to the normal span of
(start, end), and we need to track them too.
Tracking them during reading requires us to track control flow
structures while parsing, so that we can know to which structure
an end/else/catch refers to.
We track these locations using a map on the side of instruction
to its "extra" locations. That avoids increasing the size of the
tracking info for the much more common non-control flow
instructions.
Note that there is one more 'end' location, that of the function
(not referring to any instruction). I left that to a later PR to
not increase this one too much.
|
|
|
|
|
| |
Instead of hackishly advancing the read position in the
binary buffer, call readExpression which will do that, and
also do all the debug info handling for us.
|
|
|
|
| |
This will make it easier to switch to something else for
offsets in wasm binaries if we get >4GB files.
|
|
|
|
|
|
|
| |
Update high_pc values. These are interesting as they
may be a relative offset compared to the low_pc.
For functions we already had both a start and an end. Add
such tracking for instructions as well.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Track the beginning and end of each function, both when reading
and writing.
We track expressions and functions separately, instead of having a single
big map of (oldAddr) => (newAddr) because of the potentially ambiguous case
of the final expression in a function: it's end might be identical in offset
to the end of the function. So we have two different things that map to the
same offset. However, if the context is "the end of the function" then the
updated address is the new end of the function, even if the function ends
with a different instruction now, as the old last instruction might have
moved or been optimized out. Concretely, we have getNewExprAddr
and getNewFuncAddr, so we can ask to update the location of either
an expression or a function, and use that contextual information.
This checks for the DIE tag in order to know what we are looking for.
To be safe, if we hit an unknown tag, we halt, so that we don't silently
miss things.
As the test updates show, the new things we can do thanks to this
PR are to update compile unit and subprogram low_pc locations.
Note btw that in the first test (dwarfdump_roundtrip_dwarfdump.bin.txt)
we change 5 to 0: that is correct since that test does not write out
DWARF (it intentionally has no -g), so we do not track binary
locations while writing, and so we have nothing to update to (the
other tests show actual updating).
Also fix the order in the python test runner code to show a diff
of expected to encountered, and not the reverse, which confused
me.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for the reference type proposal. This includes support
for all reference types (`anyref`, `funcref`(=`anyfunc`), and `nullref`)
and four new instructions: `ref.null`, `ref.is_null`, `ref.func`, and
new typed `select`. This also adds subtype relationship support between
reference types.
This does not include table instructions yet. This also does not include
wasm2js support.
Fixes #2444 and fixes #2447.
|
|
|
|
|
|
|
|
|
|
|
| |
Several type-related functions currently exist outside of `Type`
class and thus in the `wasm`, effectively global, namespace. This moves
these functions into `Type` class, making them either member functions
or static functions.
Also this renames `getSize` to `getByteSize` to make it not to be
confused with `size`, which returns the number of types in multiple
types. This also reorders the order of functions in `wasm-type.cpp` to
match that of `wasm-type.h`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this, we can update DWARF debug line info properly as
we write a new binary.
To do that we track binary locations as we write. Each
instruction is mapped to the location it is written to. We
must also adjust them as we move code around because
of LEB optimization (we emit a function or a section
with a 5-byte LEB placeholder, the maximal size; later
we shrink it which is almost always possible).
writeDWARFSections() now takes a second param, the new
locations of instructions. It then maps debug line info from the
original offsets in the binary to the new offsets in the binary
being written.
The core logic for updating the debug line section is in
wasm-debug.cpp. It basically tracks state machine logic
both to read the existing debug lines and to emit the new
ones. I couldn't find a way to reuse LLVM code for this, but
reading LLVM's code was very useful here.
A final tricky thing we need to do is to update the DWARF
section's internal size annotation. The LLVM YAML writing
code doesn't do that for us. Luckily it's pretty easy, in
fixEmittedSection we just update the first 4 bytes in place
to have the section size, after we've emitted it and know
the size.
This ignores debug lines with a 0 in the line, col, or addr,
see WebAssembly/debugging#9 (comment)
This ignores debug line offsets into the middle of
instructions, which LLVM sometimes emits for some
reason, see WebAssembly/debugging#9 (comment)
Handling that would likely at least double our memory
usage, which is unfortunate - we are run in an LTO manner,
where the entire app's DWARF is present, and it may be
massive. I think we should see if such odd offsets are
a bug in LLVM, and if we can fix or prevent that.
This does not emit "special" opcodes for debug lines. Those
are purely an optimization, which I wanted to leave for
later. (Even without them we decrease the size quite a lot,
btw, as many lines have 0s in them...)
This adds some testing that shows we can load and save
fib2.c and fannkuch.cpp properly. The latter includes more
than one function and has nontrivial code.
To actually emit correct offsets a few minor fixes are
done here:
* Fix the code section location tracking during reading -
the correct offset we care about is the body of the code
section, not including the section declaration and size.
* Fix wasm-stack debug line emitting. We need to update
in BinaryInstWriter::visit(), that is, right before writing
bytes for the instruction. That differs from
* BinaryenIRWriter::visit which is a recursive function
that also calls the children - so the offset there would be
of the first child. For some reason that is correct with
source maps, I don't understand why, but it's wrong for
DWARF...
* Print code section offsets in hex, to match other tools.
Remove DWARFUpdate pass, which was useful for testing
temporarily, but doesn't make sense now (it just updates without
writing a binary).
cc @yurydelendik
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Optionally track the binary format code section offsets,
that is, when loading a binary, remember where each IR
node was read from. This is necessary for DWARF
debug info, as these are the offsets DWARF refers to.
(Note that eventually we may want to do something
else, like first read the DWARF and only then add
debug info annotations into the IR in a more LLVM-like
manner, but this is more straightforward and should be
enough to update debug lines and ranges).
This tracking adds noticeable overhead - every single
IR node adds an entry in a map - so avoid it unless
actually necessary. Specifically, if the user passes in
-g and there are actually DWARF sections in the
binary, and we are not about to remove those sections,
then we need it.
Print binary format code section offsets in text, when
printing with -g. This will help debug and test dwarf
support. It looks like
;; code offset: 0x7
as an annotation right before each node.
Also add support for -g in wasm-opt tests (unlike
a pass, it has just one - as a prefix).
Helps #2400
|
|
|
| |
As specified in https://github.com/WebAssembly/simd/pull/126.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the current spec, `local.tee`'s return type should be the
same as its local's type. (Discussions on whether we should change this
rule is going on in WebAssembly/reference-types#55, but here I will
assume this spec does not change. If this changes, we should change many
parts of Binaryen transformation anyway...)
But currently in Binaryen `local.tee`'s type is computed from its
value's type. This didn't make any difference in the MVP, but after we
have subtype relationship in #2451, this can become a problem. For
example:
```
(func $test (result funcref) (local $0 anyref)
(local.tee $0
(ref.func $test)
)
)
```
This shouldn't validate in the spec, but this will pass Binaryen
validation with the current `local.tee` implementation.
This makes `local.tee`'s type computed from the local's type, and makes
`LocalSet::makeTee` get a type parameter, to which we should pass the
its corresponding local's type. We don't embed the local type in the
class `LocalSet` because it may increase memory size.
This also fixes the type of `local.get` to be the local type where
`local.get` and `local.set` pair is created from `local.tee`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Function signatures were previously redundantly stored on Function
objects as well as on FunctionType objects. These two signature
representations had to always be kept in sync, which was error-prone
and needlessly complex. This PR takes advantage of the new ability of
Type to represent multiple value types by consolidating function
signatures as a pair of Types (params and results) stored on the
Function object.
Since there are no longer module-global named function types,
significant changes had to be made to the printing and emitting of
function types, as well as their parsing and manipulation in various
passes.
The C and JS APIs and their tests also had to be updated to remove
named function types.
|
|
|
|
|
| |
This works more like llvm's unreachable handler in that is preserves
information even in release builds.
|
|
|
|
|
|
| |
This means that debugging/tracing can now be enabled and controlled
centrally without managing and passing state around the codebase.
|
|
|
|
|
|
|
|
|
|
| |
Create a new ParallelFunctionAnalysis helper, which lets us
run in parallel on all functions and collect info from them,
without manually handling locks etc.
Use that in the binary writing code's type collection logic,
avoiding a lock for each type increment.
Also add Signature printing which was useful to debug this.
|
|
|
|
|
|
|
|
|
| |
This is the start of a larger refactoring to remove FunctionType entirely and
store types and signatures directly on the entities that use them. This PR
updates BrOnExn and Events to remove their use of FunctionType and makes the
BinaryWriter traverse the module and collect types rather than using the global
FunctionType list. While we are collecting types, we also sort them by frequency
as an optimization. Remaining uses of FunctionType in Function, CallIndirect,
and parsing will be removed in a future PR.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds the ability to create multivalue types from vectors of concrete value
types. All types are transparently interned, so their representation is still a
single uint32_t. Types can be extracted into vectors of their component parts,
and all the single value types expand into vectors containing themselves.
Multivalue types are not yet used in the IR, but their creation and inspection
functionality is exposed and tested in the C and JS APIs.
Also makes common type predicates methods of Type and improves the ergonomics of
type printing.
|
|
|
|
|
| |
This experimental instruction is specified in
https://github.com/WebAssembly/simd/pull/127 and is being implemented
to enable further investigation of its performance impact.
|
|
|
| |
As proposed in https://github.com/WebAssembly/simd/pull/27.
|
|
|
|
| |
As specified at
https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#swizzling-using-variable-indices.
|
|
|
|
|
|
| |
Adds support for the new load and extend instructions. Also updates
from C++11 to C++17 in order to use generic lambdas in the interpreter
implementation.
|
|
|
|
|
| |
As specified at https://github.com/WebAssembly/simd/pull/102.
Also fixes bugs in the JS API for other SIMD bitwise operators.
|
|
|
|
|
|
|
| |
Introduces a new instruction class, `SIMDLoad`. Implements encoding,
decoding, parsing, printing, and interpretation of the load and splat
instructions, including in the C and JS APIs. `v128.load` remains in
the `Load` instruction class for now because the interpreter code
expects a `Load` to be able to load any memory value type.
|
| |
|
|
|
|
|
|
|
|
|
| |
Renames the SIMDBitselect class to SIMDTernary and adds the new
{f32x4,f64x2}.qfm{a,s} ternary instructions. Because the SIMDBitselect
class is no more, this is a backwards-incompatible change to the C
interface. The new instructions are not yet used in the fuzzer because
they are not yet implemented in V8.
The corresponding LLVM commit is https://reviews.llvm.org/rL370556.
|
|
|
|
|
|
|
| |
This adds `atomic.fence` instruction:
https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md#fence-operator
This also fix bugs in `atomic.wait` and `atomic.notify` instructions in
binaryen.js and adds tests for them.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Another round of trying to push upstream things from my fork.
This PR only adds support for anyref itself as an opaque type. It does NOT implement the full [reference types proposal](https://github.com/WebAssembly/reference-types/blob/master/proposals/reference-types/Overview.md)--so no table.get/set/grow/etc or ref.null, ref.func, etc.
Figured it was easier to review and merge as we go, especially if I did something fundamentally wrong.
***
I did put it under the `--enable-reference-types` flag as I imagine that even though this PR doesn't complete the full feature set, it probably is the right home. Lmk if not.
I'll also be adding a few github comments to places I want to point out/question.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds basic support for exception handling instructions, according
to the spec:
https://github.com/WebAssembly/exception-handling/blob/master/proposals/Exceptions.md
This PR includes support for:
- Binary reading/writing
- Wast reading/writing
- Stack IR
- Validation
- binaryen.js + C API
- Few IR routines: branch-utils, type-updating, etc
- Few passes: just enough to make `wasm-opt -O` pass
- Tests
This PR does not include support for many optimization passes, fuzzer,
or interpreter. They will be follow-up PRs.
Try-catch construct is modeled in Binaryen IR in a similar manner to
that of if-else: each of try body and catch body will contain a block,
which can be omitted if there is only a single instruction. This block
will not be emitted in wast or binary, as in if-else. As in if-else,
`class Try` contains two expressions each for try body and catch body,
and `catch` is not modeled as an instruction. `exnref` value pushed by
`catch` is get by `pop` instruction.
`br_on_exn` is special: it returns different types of values when taken
and not taken. We make `exnref`, the type `br_on_exn` pushes if not
taken, as `br_on_exn`'s type.
|
|
|
|
|
|
|
|
|
| |
The blacklist means "functions here are to be ignored and not instrumented, we can assume they never unwind." The whitelist means "only these functions, and no others, can unwind." I had hoped such lists would not be necessary, since Asyncify's overhead is much smaller than the old Asyncify and Emterpreter, but as projects have noticed, the overhead to size and speed is still significant. The lists give power users a way to reduce any unnecessary overhead.
A slightly tricky thing is escaping of names: we escape names from the names section (see #2261 #1646). The lists arrive in human-readable format, so we escape them before comparing to the internal escaped names. To enable that I refactored wasm-binary a little bit to provide the escaping logic, cc @yurydelendik
If both lists are specified, an error is shown (since that is meaningless). If a name appears in a list that is not in the module, we show a warning, which will hopefully help people debug typos etc. I had hoped to make this an error, but the problem is that due to inlining etc. a single list will not always work for both unoptimized and optimized builds (a function may vanish when optimizing, due to duplicate function elimination or inlining).
Fixes #2218.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously `StackWriter` and its subclasses had routines for all three
modes (`Binaryen2Binary`, `Binaryen2Stack`, and `Stack2Binary`) within a
single class. This splits routines for each in a separate class and
also factors out binary writing into a separate class
(`BinaryInstWriter`) so other classes can make use of it.
The new classes are:
- `BinaryInstWriter`:
Binary instruction writer. Only responsible for emitting binary
contents and no other logic
- `BinaryenIRWriter`: Converts binaryen IR into something else
- `BinaryenIRToBinaryWriter`: Writes binaryen IR to binary
- `StackIRGenerator`: Converts binaryen IR to stack IR
- `StackIRToBinaryWriter`: Writes stack IR to binary
|
|
|
|
| |
In WebAssembly/exception-handling#79 we agreed to rename `except_ref`
type to `exnref`.
|
|
|
|
|
|
|
|
|
|
|
| |
Including parsing, printing, assembling, disassembling.
TODO:
- interpreting
- effects
- finalization and typing
- fuzzing
- JS/C API
|
|
|
|
|
| |
Fix and test mutable globals support, replace string literals with
constants, and add a pass to emit the target features section.
|