| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
See https://reviews.llvm.org/D91803 - there are now -1 or -2 in places that
mark something that the linker removed as not existing/not relevant. We should
ignore those like we ignore 0s there.
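A minimal sketch of the check described above, assuming 32-bit addresses. The helper name `isTombstoneAddress` is hypothetical and is not taken from the Binaryen source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true if a DWARF address marks something the
// linker removed. Assumes 32-bit addresses, so -1 and -2 show up as
// 0xffffffff and 0xfffffffe. A 0 is treated the same way, as before.
inline bool isTombstoneAddress(uint32_t addr) {
  return addr == 0 || addr == uint32_t(-1) || addr == uint32_t(-2);
}
```

Code that updates debug info would skip any entry for which this returns true.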
|
|
|
|
|
|
|
|
|
|
| |
Defined types in wasm are really one of the "heap types": a signature type, or
(with GC) a struct or an array type. This refactors the binary and text parsers
to load the defined types into an array of heap types, so that we can start to
parse GC types. This replaces the existing array of signature types (which
could not support a struct or an array).
Locally this PR can parse and print as text simple GC types. For that it was
necessary to also fix Type::getFeatures for GC.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bugs (#3401)
* Count signatures in tuple locals.
* Count nested signature types (confirming @aheejin was right, that was missing).
* Inlining was using the wrong type.
* OptimizeInstructions should return -1 for unhandled types, not error.
* The fuzzer should check for ref types as well, not just typed function references,
similar to what GC does.
* The fuzzer now creates a function if it has no other option for creating a constant
expression of a function type, then does a ref.func of that.
* Handle unreachability in call_ref binary reading.
* S-expression parsing fixes in more places, and add a tiny fuzzer for it.
* Switch fuzzer test to just have the metrics, and not print all the fuzz output which
changes a lot. Also fix noprint handling which only worked on binaries before.
* Fix Properties::getLiteral() to use the specific function type properly, and make
Literal's function constructor require that, to prevent future bugs.
* Turn all input types into nullable types, for now.
|
|
|
|
|
|
|
| |
Although there is only one "type store" right now, a subsequent PR will add a
new "TypeBuilder" class that manages its own universe of temporary types. Rather
than duplicate all the logic behind type creation and canonicalization, it makes
more sense to encapsulate that logic in a class that TypeBuilder will be able to
reuse.
|
|
|
|
|
|
|
|
| |
Includes minimal support in various passes. Also includes actual optimization
work in Directize, which was easy to add.
Almost has fuzzer support, but the actual makeCallRef is just a stub so far.
Includes s-parser support for parsing typed function references types.
|
| |
types (#3388)
This adds the new feature and starts to use the new types where relevant. We
use them even without the feature being enabled, as we don't know the features
during wasm loading - but the hope is that given the type is a subtype, it should
all work out. In practice, if you print out the internal type you may see a typed
function reference-specific type for a ref.func for example, instead of a generic
funcref, but it should not affect anything else.
This PR does not support non-nullable types, that is, everything is nullable
for now. As suggested by @tlively this is simpler for now and leaves nullability
for later work (which will apparently require let or something else, and many
passes may need to be changed).
To allow this PR to work, we need to provide a type on creating a RefFunc. The
wasm-builder.h internal API is updated for this, as are the C and JS APIs,
which are breaking changes. cc @dcodeIO
We must also write and read function types properly. This PR improves
collectSignatures to find all the types, and also to sort them by the
dependencies between them (as we can't emit X in the binary if it depends
on Y, and Y has not been emitted - we need to give Y's index). This sorting
ends up changing a few test outputs.
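The dependency ordering can be sketched as a depth-first topological sort. This is an illustration only: the index-based model and the name `sortByDependencies` are assumptions, not the actual collectSignatures code, and the sketch assumes the dependencies contain no cycles.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical model: type i depends on the types listed in deps[i].
// Returns an order in which every type appears after all of its
// dependencies, so each dependency already has an index when it is needed.
std::vector<size_t>
sortByDependencies(const std::vector<std::vector<size_t>>& deps) {
  std::vector<size_t> order;
  std::vector<bool> visited(deps.size(), false);
  std::function<void(size_t)> visit = [&](size_t i) {
    if (visited[i]) {
      return;
    }
    visited[i] = true;
    // Emit everything this type depends on first.
    for (size_t d : deps[i]) {
      visit(d);
    }
    order.push_back(i);
  };
  for (size_t i = 0; i < deps.size(); i++) {
    visit(i);
  }
  return order;
}
```

With X depending on Y, Y is pushed before X, which matches the constraint described above.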
InstrumentLocals support for printing function types that are not funcref
is disabled for now, until we figure out how to make that work and/or
decide if it's important enough to work on.
The fuzzer has various fixes to emit valid types for things (mostly
whitespace there). Also two drive-by fixes to call makeTrivial where it
should be (when we fail to create a specific node, we can't just try to make
another node, in theory it could infinitely recurse).
Binary writing changes here to replace calls to a standalone function to
write out a type with one that is called on the binary writer object itself,
which maintains a mapping of type indexes (getFunctionSignatureByIndex).
|
|
|
|
|
|
|
|
|
| |
When Functions, Globals, Events, and Exports are added to a module, if they are
not already in std::unique_ptrs, they are wrapped in a new std::unique_ptr owned
by the Module. This adds an extra layer of indirection when accessing those
elements that can be avoided by allocating those elements as std::unique_ptrs.
This PR updates wasm-builder to allocate module elements via std::make_unique
rather than `new`. In the future, we should remove the raw pointer versions of
Module::add* to encourage using std::unique_ptrs more broadly.
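A rough sketch of the allocation pattern, using simplified stand-ins for Function and Module (the real classes have many more members, and `makeFunction` here is a hypothetical helper, not the wasm-builder API):

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for a module element.
struct Function {
  std::string name;
};

// Simplified stand-in for the owning module.
struct Module {
  std::vector<std::unique_ptr<Function>> functions;

  // Takes ownership directly, so no raw pointer has to be wrapped later.
  Function* addFunction(std::unique_ptr<Function> func) {
    functions.push_back(std::move(func));
    return functions.back().get();
  }
};

// Builder-style helper that allocates with std::make_unique, not `new`.
std::unique_ptr<Function> makeFunction(const std::string& name) {
  auto func = std::make_unique<Function>();
  func->name = name;
  return func;
}
```

A caller would write `module.addFunction(makeFunction("foo"))`, and ownership moves into the module without any raw pointer in between.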
|
|
|
|
|
|
|
|
| |
Call isFunction to check for a general function type instead of just
a funcref, in places where we care about both, and some other minor
miscellaneous typing fixes in preparation for typed function references
(this will be tested fully at that time).
Change is mostly whitespace.
|
| |
|
|
|
|
| |
function references (#3357)
|
|
|
|
|
| |
We will need this for typed function references support, as then we need
to know full function signatures for all functions when we reach a ref.func,
whose type is then that signature and not the generic funcref.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- atomic.notify -> memory.atomic.notify
- i32.atomic.wait -> memory.atomic.wait32
- i64.atomic.wait -> memory.atomic.wait64
See WebAssembly/threads#149.
This renames instruction name printing but not the internal data
structure names, such as `AtomicNotify`, which are not always the same
as printed instruction names anyway. This also does not modify C API.
But this fixes interface functions in binaryen.js because it seems
binaryen.js's interface functions all follow the corresponding
instruction names.
|
|
|
|
| |
We change the AddrSize which causes all DW_FORM_addr to be written differently.
Depends on https://reviews.llvm.org/D91395
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function does not return exact instruction names but more of
category names. But when there is a matching instruction, as in case of
`global.get/set` or `local.get/set`, it seems to return instruction
names. In that regard, this makes `getExpressionName`'s return values
similar to real instruction names when possible, in the case of
some atomic instructions and `memory.init/copy` and `data.drop`.
It is hard to make a test for this because this function is used in a
very limited way in the codebase, such as:
- When printing error messages
- When printing stack instruction names, but only for control flow
instructions
- When printing instruction names in Metrics
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We used to check if a load's sign matters before hashing it. If the load does
not extend, then the sign doesn't matter, and we ignored the value there. It
turns out that value could be garbage, as we didn't assign it in the binary
reader, if it wasn't relevant. In the rewrite this was missed, and actually it's
not really possible to do, since we have just a macro for the field, but not the
object it is on, of which there may be more than one.
To fix this, just always assign the field. This is simpler anyhow, and avoids
confusion not just here but probably when debugging.
The testcase here is reduced from the fuzzer, and is not a 100% guarantee
to catch a read of uninitialized memory, but it can't hurt, and with ASan it
may be pretty consistent.
|
| |
Adds the capability to programmatically split a module into a primary and
secondary module such that the primary module can be compiled and run before the
secondary module has been instantiated. All calls to secondary functions (i.e.
functions that have been split out into the secondary module) in the primary
module are rewritten to be indirect calls through the table. Initially, the
table slots of all secondary functions contain references to imported
placeholder functions. When the secondary module is instantiated, it will
automatically patch the table to insert references to the original functions.
The process of module splitting involves these steps:
1. Create the new secondary module.
2. Export globals, events, tables, and memories from the primary module and
import them in the secondary module.
3. Move the deferred functions from the primary to the secondary module.
4. For any secondary function exported from the primary module, export in
its place a trampoline function that makes an indirect call to its
placeholder function (and eventually to the original secondary function),
allocating a new table slot for the placeholder if necessary.
5. Rewrite direct calls from primary functions to secondary functions to be
indirect calls to their placeholder functions (and eventually to their
original secondary functions), allocating new table slots for the
placeholders if necessary.
6. For each primary function directly called from a secondary function, export
the primary function if it is not already exported and import it into the
secondary module.
7. Replace all references to secondary functions in the primary module's table
segments with references to imported placeholder functions.
8. Create new active table segments in the secondary module that will replace
all the placeholder function references in the table with references to
their corresponding secondary functions upon instantiation.
Functions can be used or referenced three ways in a WebAssembly module: they can
be exported, called, or placed in a table. The above procedure introduces a
layer of indirection to each of those mechanisms that removes all references to
secondary functions from the primary module but restores the original program's
semantics once the secondary module is instantiated. As more mechanisms that
reference functions are added in the future, such as ref.func instructions, they
will have to be modified to use a similar layer of indirection.
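The placeholder mechanism can be illustrated with a toy model in which the table is a vector of callables. None of this is Binaryen code, and all names below are hypothetical. It only shows how an indirect call keeps working before and after the secondary module patches the table.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Toy model of a wasm table: each slot holds a callable.
using Table = std::vector<std::function<int(int)>>;

// Stand-in for an imported placeholder function. Here it just returns -1
// to signal that the secondary module has not been loaded yet.
int placeholder(int) { return -1; }

// Stand-in for a function that was split out into the secondary module.
int secondaryDouble(int x) { return 2 * x; }

// Primary-module code: what used to be a direct call is now an indirect
// call through a table slot.
int callThroughTable(const Table& table, size_t slot, int arg) {
  return table[slot](arg);
}

// Models the secondary module's active segment: on instantiation it
// overwrites the placeholder slot with the real function.
void instantiateSecondary(Table& table, size_t slot) {
  table[slot] = secondaryDouble;
}
```

Before `instantiateSecondary` runs, a call through slot 0 reaches the placeholder. Afterwards the same call site reaches the original function, without the primary module changing.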
The code as currently written makes a few assumptions about the module that is
being split:
1. It assumes that mutable-globals is allowed. This could be worked around by
introducing wrapper functions for globals and rewriting secondary code that
accesses them, but now that mutable-globals is shipped on all browsers,
hopefully that extra complexity won't be necessary.
2. It assumes that all table segment offsets are constants. This simplifies the
generation of segments to actively patch in the secondary functions without
overwriting any other table slots. This assumption could be relaxed by 1)
having secondary segments re-write primary function slots as well, 2)
allowing addition in segment offsets, or 3) synthesizing a start function to
modify the table instead of using segments.
3. It assumes that each function appears in the table at most once. This isn't
necessarily true in general or even for LLVM output after function
deduplication. Relaxing this assumption would just require slightly more
complex code, so it is a good candidate for a follow up PR.
Future Binaryen work for this feature includes providing a command line tool
exposing this functionality as well as C API, JS API, and fuzzer support. We
will also want to provide a simple instrumentation pass for finding dead or
late-executing functions that would be good candidates for splitting out. It
would also be good to integrate that instrumentation with future function
outlining work so that dead or exceptional basic blocks could be split out into
a separate module.
|
|
|
| |
Specifically try to clean up use of asm_v_wasm.h and asmjs constants.
|
|
|
|
|
|
|
| |
Specifically, pick a simple positive canonical NaN as the NaN output, when the output
is a NaN. This is the same as what tools like wabt do.
This fixes a testcase found by the fuzzer on #3289 but it was not that PR's fault.
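A sketch of what canonicalizing a NaN output could look like for f32, assuming the canonical form is the positive quiet NaN with bit pattern 0x7fc00000. The function names are hypothetical and this is not the interpreter's actual code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Returns the raw bit pattern of an f32.
uint32_t bitsOf(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return bits;
}

// Builds an f32 from a raw bit pattern.
float floatFromBits(uint32_t bits) {
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

// If the value is a NaN, replace it with the assumed canonical NaN
// (sign bit clear, only the top mantissa bit set). Other values pass
// through unchanged.
float canonicalizeNaN(float f) {
  if (!std::isnan(f)) {
    return f;
  }
  return floatFromBits(0x7fc00000u);
}
```

Applying this to every floating-point result makes NaN outputs deterministic regardless of the sign or payload the host produced.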
|
|
|
| |
Fixed bug in memory64-lowering pass for memory.size/grow
|
|
|
|
|
|
| |
Emscripten no longer needs this information as of
https://github.com/emscripten-core/emscripten/pull/12643.
This also removes the need to export __data_end.
|
| |
|
|
|
|
|
|
|
| |
Including saturating, rounding Q15 multiplication as proposed in
https://github.com/WebAssembly/simd/pull/365 and extending multiplications as
proposed in https://github.com/WebAssembly/simd/pull/376. Since these are just
prototypes, skips adding them to the C or JS APIs and the fuzzer, as well as
implementing them in the interpreter.
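For reference, one lane of saturating, rounding Q15 multiplication can be sketched as below. This follows my understanding of the proposal's semantics, sat((a * b + 0x4000) >> 15). The function name is hypothetical, and this code is not part of the PR, which does not implement these instructions in the interpreter.

```cpp
#include <cassert>
#include <cstdint>

// One 16-bit lane of a saturating, rounding Q15 multiply.
// Q15 treats an int16_t as a fixed-point value in [-1, 1).
int16_t q15MulrSat(int16_t a, int16_t b) {
  // The full product fits comfortably in 32 bits.
  int32_t product = int32_t(a) * int32_t(b);
  // Add half of the divisor to round, then drop the 15 fractional bits.
  // Assumes >> on a negative value is an arithmetic shift, which holds on
  // mainstream compilers and is guaranteed from C++20 on.
  int32_t rounded = (product + 0x4000) >> 15;
  // Only the upper bound can be exceeded, by -32768 * -32768.
  if (rounded > INT16_MAX) {
    rounded = INT16_MAX;
  }
  return int16_t(rounded);
}
```

For example, 0.5 * 0.5 in Q15 is 16384 * 16384, which gives 8192 (0.25), while -1 * -1 saturates to 32767.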
|
| |
Previously when we processed a block for example, we'd do this:
    ;; start is here
    (block (result type)
      ;; end is here
      .. contents ..
    )
    ;; end delimiter is here
Note how this represents the block's start and end as the "header", and
uses an extra delimiter to mark the end.
I think this is wrong, and was an attempt to handle some offsets from
LLVM that otherwise made no sense, ones at the end of the "header".
But it turns out that this makes us completely incorrect on some things
where there is a low/high pc pair, and we need to understand that the
end of a block is at the end opcode at the very end, and not the end of
the header. This PR changes us to do that, i.e.
    ;; start is here
    (block (result type)
      .. contents ..
    )
    ;; end is here
This fixes a testcase already in the test suite,
test/passes/fib_nonzero-low-pc_dwarf.bin.txt
where you can see that lexical block now has a valid value for the end, and
not a 0 (the proper scope extends all the way to the end of the big block in
that function, and is now the same in the DWARF before and after we
process it). test/passes/fannkuch3_dwarf.bin.txt is also improved by
this.
To implement this, this removes the BinaryLocations::End delimiter. After
this we just need one type of delimiter actually, but I didn't refactor that any
more to keep this PR small (see TODO).
This removes an assertion in writeDebugLocationEnd() that is no longer
valid: the assert ensures that we wrote an end only if there was a 0 for
the end, but for a control flow structure, we write the end of the "header"
automatically like for any expression, and then overwrite it later when we
finish writing the children and the end marker. We could in theory special-case
control flow structures to avoid the first write, but it would add more complexity.
This uncovered what appears to be a possible bug in our debug_line
handling, see test/passes/fannkuch3_manyopts_dwarf.bin.txt. That needs
to be looked into more, but I suspect that was invalid info from when we
looked at the end of the "header" of control flow structures. Note that one
definite bug was uncovered here; it is fixed by the extra
} else if (locationUpdater.hasOldExprEnd(oldAddr)) {
that is added here.
|
|
|
|
|
|
| |
As proposed in https://github.com/WebAssembly/simd/pull/379. Since this
instruction is still being evaluated for inclusion in the SIMD proposal, this PR
does not add support for it to the C/JS APIs or to the fuzzer. This PR also
performs a drive-by fix for unrelated instructions in c-api-kitchen-sink.c
|
|
|
|
|
|
|
|
|
| |
This change makes matchers in OptimizeInstructions more compact and readable by
removing the explicit `Abstract::` namespace from individual operations. In some
cases, this makes multi-line matcher expressions fit on a single line.
This change is only possible because it also adds an explicit "RMW" prefix to
each element of the `AtomicRMWOp` enumeration. Without that, their names
conflicted with the names of Abstract ops.
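The name conflict can be shown with a reduced sketch. The enums below are cut-down, hypothetical versions of the real ones and `isAbstractAdd` is an illustrative helper. Without the RMW prefix, an unqualified `Add` inside the function would be ambiguous between the two enumerations.

```cpp
#include <cassert>

// Cut-down stand-in for the abstract operations.
namespace Abstract {
enum Op { Add, Sub, Mul };
} // namespace Abstract

// Unscoped enum in the enclosing scope. If its elements were named Add,
// Sub, ... they would clash with Abstract's names once those are brought
// into scope, hence the explicit prefix.
enum AtomicRMWOp { RMWAdd, RMWSub, RMWAnd, RMWOr, RMWXor, RMWXchg };

bool isAbstractAdd(Abstract::Op op) {
  // With the using-directive, operations can be written without the
  // `Abstract::` qualifier, which is what keeps matchers compact.
  using namespace Abstract;
  return op == Add;
}
```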
|
|
|
|
|
|
|
| |
These instructions are proposed in https://github.com/WebAssembly/simd/pull/350.
This PR implements them throughout Binaryen except in the C/JS APIs and in the
fuzzer, where it leaves TODOs instead. Right now these instructions are just
being implemented for prototyping so adding them to the APIs isn't critical and
they aren't generally available to be fuzzed in Wasm engines.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(#3253)
validateGlobally means that we can't do lookups on the module. A few places
were missing that, or had it wrong. I think the reason for the wrong usages is
that we used to have types on the module, and then removed that, so more is
now validatable actually.
This uncovered a real bug, where i64-to-32 would ignore an unreachable
parameter of a call_indirect. That's bad, since if the type is i64, we need
to replace it with two parameters. To fix that, just handle unreachability
there, using the existing logic (which skips the call_indirect entirely in
this case).
|
|
|
|
| |
The use of these passes was removed on the emscripten side
in https://github.com/emscripten-core/emscripten/pull/12536.
|
|
|
|
| |
Fixes: #3226
|
|
|
|
|
|
|
|
|
| |
The s-parser was assigning numbered names per type, whereas
the binary reader was using the global import count as the
number to append.
This change switches to a per-element count, which I think
is preferable as it increases the stability of the auto-generated
names. e.g. memory is now always named `$mimport0`.
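A small sketch of per-element counting, assuming one counter per name prefix. `ImportNamer` and the `fimport` prefix are hypothetical. Only the `mimport0` form comes from the message above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical name generator that keeps a separate counter for each
// kind of import, identified here by its name prefix.
struct ImportNamer {
  std::map<std::string, unsigned> counts;

  // Returns prefix + N, where N counts only earlier imports of this kind.
  std::string next(const std::string& prefix) {
    return prefix + std::to_string(counts[prefix]++);
  }
};
```

Because the memory counter is independent of how many functions were imported earlier, the first memory import always gets index 0.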
|
|
|
| |
Mentioning if it's a memory or a table segment is convenient.
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we are renaming invoke wrappers and `emscripten_longjmp_jmpbuf`
in the wasm backend, this deletes all related renaming routines and
relevant tests. Depends on #3192.
Addresses: #3043 and #3081
Companions:
https://reviews.llvm.org/D88697
emscripten-core/emscripten#12399
|
|
|
| |
When there are two versions of a function, one handling tuples and the other handling non-tuple values, the previous naming convention was to have "Single" in the name of the non-tuple handling function. This PR simplifies the convention and shortens function names by making the names plural for the tuple-handling version and singular for the non-tuple-handling version.
|
| |
|
|
|
|
|
|
| |
This moves dynCall generating functionality for invokes from
`EmscriptenGlueGenerator` to `GenerateDynCalls` pass. So now
`GenerateDynCalls` pass will take care of all cases where we need dynCalls:
functions in tables and invokes.
|
|
|
| |
This depends on https://github.com/emscripten-core/emscripten/pull/12391
|
|
|
|
|
| |
Use overloads instead of templates where applicable and change function names
from PascalCase to camelCase. Also puts the functions in the Bits namespace to
avoid naming conflicts.
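As an illustration of the overload style, here is a sketch of a population count written as two overloads in a `Bits` namespace. The exact set of functions and their signatures in the real header may differ, so treat the names as assumptions.

```cpp
#include <cassert>
#include <cstdint>

namespace Bits {

// Counts set bits by repeatedly clearing the lowest one.
inline int popCount(uint32_t v) {
  int count = 0;
  while (v) {
    v &= v - 1;
    count++;
  }
  return count;
}

// 64-bit overload of the same operation; no template parameter is needed
// at the call site.
inline int popCount(uint64_t v) {
  int count = 0;
  while (v) {
    v &= v - 1;
    count++;
  }
  return count;
}

} // namespace Bits
```

Callers pick the overload through the argument type, for example `Bits::popCount(uint32_t(0xff))`. A plain `int` argument would be ambiguous between the two overloads, so call sites pass a value of the exact width.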
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similar to clang and gcc, --fast-math makes us ignore corner cases of floating-point
math like NaN changes and (not done yet) lack of associativity and so forth.
In the future we may want to have separate fast math flags for each specific thing,
like gcc and clang do.
This undoes some changes (#2958 and #3096) where we assumed it was
ok to not change NaN bits, but @binji corrected us. We can only do such things in fast
math mode. This puts those optimizations behind that flag, adds tests for it, and
restores the interpreter to the simpler code from before with no special cases.
|
|
|
| |
SExpressionWasmBuilder was not applying default memory and table import names on the memory and table, unlike on functions, globals and events, where it applies them. Also aligns default import names to use the same shorter forms as in binary parsing.
|
|
|
| |
NFC, except adding most of the boilerplate for the remaining GC instructions. Each implementation site is marked with a respective `TODO (gc): theInstruction` in between the typical boilerplate code.
|
|
|
| |
Integrates `i31ref` types and instructions into the fuzzer, by assuming that `(i31.new (i32.const N))` is constant and hence suitable to be used in global initializers.
|
|
|
| |
Implements the parts of the Extended Name Section Proposal that are trivially applicable to Binaryen, in particular table, memory and global names. Does not yet implement label, type, elem and data names.
|
|
|
| |
Comparing and hashing literals previously depended on `getBits`, which was fine while there were only basic numeric types, but doesn't map well to reference types anymore. Hence this change limits the use of `getBits` to basic numeric types, and implements reference types-aware comparisons and hashing to deal with the newer types.
|
|
|
| |
details: https://github.com/WebAssembly/binaryen/issues/3149
|
|
|
|
|
| |
In that case LLVM emits the address of the declarations area (where locals are
declared) of the function, which is even earlier than the instructions' actual
first byte. I'm not sure why it does this, but it's easy to handle.
|
|
|
|
| |
This relaxation has made it to Chrome stable, so it makes sense that we would
allow it in the tools.
|
|
|
| |
Instructions `ref.null`, `ref.is_null`, `ref.func`, `try`, `throw`, `rethrow` and `br_on_exn` were previously missing explicit feature checks, and this change adds them. Note that some of these already didn't validate before for other reasons, like requiring the use of a type checked otherwise, but `ref.null` and `try` validated even in context of FeatureSet::MVP, so better to be sure.
|
|
|
| |
Adds the `i31.new` and `i31.get_s/u` instructions for creating and working with `i31ref` typed values. Does not include fuzzer integration just yet because the fuzzer expects that trivial values it creates are suitable in global initializers, which is not the case for trivial `i31ref` expressions.
|
|
|
| |
With `eqref` now integrated, the `ref.eq` instruction can be implemented. The only valid LHS and RHS value is `(ref.null eq)` for now, but implementation and fuzzer integration is otherwise complete.
|