summaryrefslogtreecommitdiff
path: root/src/wasm
Commit message (Collapse)AuthorAgeFilesLines
* [DWARF] Properly handle LLVM's new tombstone values (#3416)Alon Zakai2020-12-021-4/+15
| | | | | See https://reviews.llvm.org/D91803 - there are now -1 or -2 in places that mark something that the linker removed as not existing/not relevant. We should ignore those like we ignore 0s there.
* [GC types] Refactoring to allow future heap type parsing. NFC (#3409)Alon Zakai2020-12-023-101/+91
| | | | | | | | | | Defined types in wasm are really one of the "heap types": a signature type, or (with GC) a struct or an array type. This refactors the binary and text parsers to load the defined types into an array of heap types, so that we can start to parse GC types. This replaces the existing array of signature types (which could not support a struct or an array). Locally this PR can parse and print as text simple GC types. For that it was necessary to also fix Type::getFeatures for GC.
* [TypedFunctionReferences] Enable call_ref in fuzzer, and fix minor misc fuzz ↵Alon Zakai2020-11-254-20/+46
| | | | | | | | | | | | | | | | | | | | bugs (#3401) * Count signatures in tuple locals. * Count nested signature types (confirming @aheejin was right, that was missing). * Inlining was using the wrong type. * OptimizeInstructions should return -1 for unhandled types, not error. * The fuzzer should check for ref types as well, not just typed function references, similar to what GC does. * The fuzzer now creates a function if it has no other option for creating a constant expression of a function type, then does a ref.func of that. * Handle unreachability in call_ref binary reading. * S-expression parsing fixes in more places, and add a tiny fuzzer for it. * Switch fuzzer test to just have the metrics, and not print all the fuzz output which changes a lot. Also fix noprint handling which only worked on binaries before. * Fix Properties::getLiteral() to use the specific function type properly, and make Literal's function constructor require that, to prevent future bugs. * Turn all input types into nullable types, for now.
* [wasm-type][NFC] Encapsulate type canonicalization in a TypeStore (#3403)Thomas Lively2020-11-251-88/+96
| | | | | | | Although there is only one "type store" right now, a subsequent PR will add a new "TypeBuilder" class that manages its own universe of temporary types. Rather than duplicate all the logic behind type creation and canonicalization, it makes more sense to encapsulate that logic in a class that TypeBuilder will be able to reuse.
* [TypedFunctionReferences] Implement call_ref (#3396)Alon Zakai2020-11-246-111/+198
| | | | | | | | Includes minimal support in various passes. Also includes actual optimization work in Directize, which was easy to add. Almost has fuzzer support, but the actual makeCallRef is just a stub so far. Includes s-parser support for parsing typed function references types.
* [TypedFunctionReferences] Add Typed Function References feature and use the ↵Alon Zakai2020-11-236-26/+159
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | types (#3388) This adds the new feature and starts to use the new types where relevant. We use them even without the feature being enabled, as we don't know the features during wasm loading - but the hope is that given the type is a subtype, it should all work out. In practice, if you print out the internal type you may see a typed function reference-specific type for a ref.func for example, instead of a generic funcref, but it should not affect anything else. This PR does not support non-nullable types, that is, everything is nullable for now. As suggested by @tlively this is simpler for now and leaves nullability for later work (which will apparently require let or something else, and many passes may need to be changed). To allow this PR to work, we need to provide a type on creating a RefFunc. The wasm-builder.h internal API is updated for this, as are the C and JS APIs, which are breaking changes. cc @dcodeIO We must also write and read function types properly. This PR improves collectSignatures to find all the types, and also to sort them by the dependencies between them (as we can't emit X in the binary if it depends on Y, and Y has not been emitted - we need to give Y's index). This sorting ends up changing a few test outputs. InstrumentLocals support for printing function types that are not funcref is disabled for now, until we figure out how to make that work and/or decide if it's important enough to work on. The fuzzer has various fixes to emit valid types for things (mostly whitespace there). Also two drive-by fixes to call makeTrivial where it should be (when we fail to create a specific node, we can't just try to make another node, in theory it could infinitely recurse). Binary writing changes here to replace calls to a standalone function to write out a type with one that is called on the binary writer object itself, which maintains a mapping of type indexes (getFunctionSignatureByIndex).
* [wasm-builder] Construct module elements as unique_ptrs (#3391)Thomas Lively2020-11-191-11/+11
| | | | | | | | | When Functions, Globals, Events, and Exports are added to a module, if they are not already in std::unique_ptrs, they are wrapped in a new std::unique_ptr owned by the Module. This adds an extra layer of indirection when accessing those elements that can be avoided by allocating those elements as std::unique_ptrs. This PR updates wasm-builder to allocate module elements via std::make_unique rather than `new`. In the future, we should remove the raw pointer versions of Module::add* to encourage using std::unique_ptrs more broadly.
* [Types] Handle function types fully in more places (#3381)Alon Zakai2020-11-183-53/+61
| | | | | | | | Call isFunction to check for a general function type instead of just a funcref, in places where we care about both, and some other minor miscellaneous typing fixes in preparation for typed function references (this will be tested fully at that time). Change is mostly whitespace.
* [Types] Add type sorting for Signatures (#3366)Alon Zakai2020-11-151-9/+22
|
* [TypedFunctionReferences] Identify subtyping between funcref and all typed ↵Alon Zakai2020-11-141-2/+15
| | | | function references (#3357)
* [s-parsing] Store full function signatures (#3356)Alon Zakai2020-11-131-4/+4
| | | | | We will need this for typed function references support, as then we need to know full function signatures for all functions when we reach a ref.func, whose type is then that signature and not the generic funcref.
* Rename atomic.notify and *.atomic.wait (#3353)Heejin Ahn2020-11-131-3/+5
| | | | | | | | | | | | | | - atomic.notify -> memory.atomic.notify - i32.atomic.wait -> memory.atomic.wait32 - i64.atomic.wait -> memory.atomic.wait64 See WebAssembly/threads#149. This renames instruction name printing but not the internal data structure names, such as `AtomicNotify`, which are not always the same as printed instruction names anyway. This also does not modify C API. But this fixes interface functions in binaryen.js because it seems binaryen.js's interface functions all follow the corresponding instruction names.
* [wasm64] fix for Memory64Lowering affecting DWARF data (#3348)Wouter van Oortmerssen2020-11-131-2/+10
| | | | We change the AddrSize which causes all DW_FORM_addr to be written differently. Depends on https://reviews.llvm.org/D91395
* Quick followup to #3349 - avoid unnecessary allocations (#3354)Alon Zakai2020-11-131-6/+31
|
* Make getExpressionName resemble instruction names (#3352)Heejin Ahn2020-11-131-9/+9
| | | | | | | | | | | | | | | This function does not return exact instruction names but more of category names. But when there is a matching instruction, as in case of `global.get/set` or `local.get/set`, it seems to return instruction names. In that regard, this makes `getExpressionName`'s return values to similar to that of real instruction names when possible, in case of some atomic instructions and `memory.init/copy` and `data.drop`. It is hard to make a test for this because this function is used in a very limited way in the codebase, such as: - When printing error messages - When printing a stack instruction names, but only for control flow instructions - When printing instruction names in Metrics
* Fix a hashing regression from #3332 (#3349)Alon Zakai2020-11-131-28/+6
| | | | | | | | | | | | | | | We used to check if a load's sign matters before hashing it. If the load does not extend, then the sign doesn't matter, and we ignored the value there. It turns out that value could be garbage, as we didn't assign it in the binary reader, if it wasn't relevant. In the rewrite this was missed, and actually it's not really possible to do, since we have just a macro for the field, but not the object it is on - which there may be more than one. To fix this, just always assign the field. This is simpler anyhow, and avoids confusion not just here but probably when debugging. The testcase here is reduced from the fuzzer, and is not a 100% guarantee to catch a read of uninitialized memory, but it can't hurt, and with ASan it may be pretty consistent.
* Module splitting (#3317)Thomas Lively2020-11-121-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds the capability to programatically split a module into a primary and secondary module such that the primary module can be compiled and run before the secondary module has been instantiated. All calls to secondary functions (i.e. functions that have been split out into the secondary module) in the primary module are rewritten to be indirect calls through the table. Initially, the table slots of all secondary functions contain references to imported placeholder functions. When the secondary module is instantiated, it will automatically patch the table to insert references to the original functions. The process of module splitting involves these steps: 1. Create the new secondary module. 2. Export globals, events, tables, and memories from the primary module and import them in the secondary module. 3. Move the deferred functions from the primary to the secondary module. 4. For any secondary function exported from the primary module, export in its place a trampoline function that makes an indirect call to its placeholder function (and eventually to the original secondary function), allocating a new table slot for the placeholder if necessary. 5. Rewrite direct calls from primary functions to secondary functions to be indirect calls to their placeholder functions (and eventually to their original secondary functions), allocating new table slots for the placeholders if necessary. 6. For each primary function directly called from a secondary function, export the primary function if it is not already exported and import it into the secondary module. 7. Replace all references to secondary functions in the primary module's table segments with references to imported placeholder functions. 8. Create new active table segments in the secondary module that will replace all the placeholder function references in the table with references to their corresponding secondary functions upon instantiation. Functions can be used or referenced three ways in a WebAssembly module: they can be exported, called, or placed in a table. The above procedure introduces a layer of indirection to each of those mechanisms that removes all references to secondary functions from the primary module but restores the original program's semantics once the secondary module is instantiated. As more mechanisms that reference functions are added in the future, such as ref.func instructions, they will have to be modified to use a similar layer of indirection. The code as currently written makes a few assumptions about the module that is being split: 1. It assumes that mutable-globals is allowed. This could be worked around by introducing wrapper functions for globals and rewriting secondary code that accesses them, but now that mutable-globals is shipped on all browsers, hopefully that extra complexity won't be necessary. 2. It assumes that all table segment offsets are constants. This simplifies the generation of segments to actively patch in the secondary functions without overwriting any other table slots. This assumption could be relaxed by 1) having secondary segments re-write primary function slots as well, 2) allowing addition in segment offsets, or 3) synthesizing a start function to modify the table instead of using segments. 3. It assumes that each function appears in the table at most once. This isn't necessarily true in general or even for LLVM output after function deduplication. Relaxing this assumption would just require slightly more complex code, so it is a good candidate for a follow up PR. Future Binaryen work for this feature includes providing a command line tool exposing this functionality as well as C API, JS API, and fuzzer support. We will also want to provide a simple instrumentation pass for finding dead or late-executing functions that would be good candidates for splitting out. It would also be good to integrate that instrumentation with future function outlining work so that dead or exceptional basic blocks could be split out into a separate module.
* Remove dead code and unused includes. NFC. (#3328)Sam Clegg2020-11-084-18/+3
| | | Specifically try to cleanup use of asm_v_wasm.h and asmjs constants.
* Standardize NaNs in the interpreter, when there is nondeterminism (#3298)Alon Zakai2020-10-301-62/+72
| | | | | | | Specifically, pick a simple positive canonical NaN as the NaN output, when the output is a NaN. This is the same as what tools like wabt do. This fixes a testcase found by the fuzzer on #3289 but it was not that PR's fault.
* [Memory64] (#3302)Wouter van Oortmerssen2020-10-301-1/+5
| | | Fixed bug in memory64-lowering pass for memory.size/grow
* wasm-emscripten-finalize: Remove staticBump from metadata (#3300)Sam Clegg2020-10-291-2/+1
| | | | | | Emscripten no longer needs this information as of https://github.com/emscripten-core/emscripten/pull/12643. This also removes the need to export __data_end.
* Remove support for emscripten legacy PIC ABI (#3299)Sam Clegg2020-10-292-43/+0
|
* Prototype new SIMD multiplications (#3291)Thomas Lively2020-10-284-0/+157
| | | | | | | Including saturating, rounding Q15 multiplication as proposed in https://github.com/WebAssembly/simd/pull/365 and extending multiplications as proposed in https://github.com/WebAssembly/simd/pull/376. Since these are just prototypes, skips adding them to the C or JS APIs and the fuzzer, as well as implementing them in the interpreter.
* DWARF: Fix handling of the end of control flow instructions (#3288)Alon Zakai2020-10-273-14/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously when we processed a block for example, we'd do this: ;; start is here (block (result type) ;; end is here .. contents .. ) ;; end delimiter is here Not how this represents the block's start and end as the "header", and uses an extra delimiter to mark the end. I think this is wrong, and was an attempt to handle some offsets from LLVM that otherwise made no sense, ones at the end of the "header". But it turns out that this makes us completely incorrect on some things where there is a low/high pc pair, and we need to understand that the end of a block is at the end opcode at the very end, and not the end of the header. This PR changes us to do that, i.e. ;; start is here (block (result type) .. contents .. ) ;; end is here This fixes a testcase already in the test suite, test/passes/fib_nonzero-low-pc_dwarf.bin.txt where you can see that lexical block now has a valid value for the end, and not a 0 (the proper scope extends all the way to the end of the big block in that function, and is now the same in the DWARF before and after we process it). test/passes/fannkuch3_dwarf.bin.txt is also improved by this. To implement this, this removes the BinaryLocations::End delimeter. After this we just need one type of delimiter actually, but I didn't refactor that any more to keep this PR small (see TODO). This removes an assertion in writeDebugLocationEnd() that is no longer valid: the assert ensures that we wrote an end only if there was a 0 for the end, but for a control flow structure, we write the end of the "header" automatically like for any expression, and then overwrite it later when we finish writing the children and the end marker. We could in theory special-case control flow structures to avoid the first write, but it would add more complexity. This uncovered what appears to be a possible bug in our debug_line handling, see test/passes/fannkuch3_manyopts_dwarf.bin.txt. That needs to be looked into more, but I suspect that was invalid info from when we looked at the end of the "header" of control flow structures. Note that there was one definite bug uncovered here, fixed by the extra } else if (locationUpdater.hasOldExprEnd(oldAddr)) { that is added here, which was definitely a bug.
* Implement i8x16.popcnt (#3286)Thomas Lively2020-10-275-0/+13
| | | | | | As proposed in https://github.com/WebAssembly/simd/pull/379. Since this instruction is still being evaluated for inclusion in the SIMD proposal, this PR does not add support for it to the C/JS APIs or to the fuzzer. This PR also performs a drive-by fix for unrelated instructions in c-api-kitchen-sink.c
* [NFC] `using namespace Abstract` to make matchers more compact (#3284)Thomas Lively2020-10-263-8/+8
| | | | | | | | | This change makes matchers in OptimizeInstructions more compact and readable by removing the explicit `Abstract::` namespace from individual operations. In some cases, this makes multi-line matcher expressions fit on a single line. This change is only possible because it also adds an explicit "RMW" prefix to each element of the `AtomicRMWOp` enumeration. Without that, their names conflicted with the names of Abstract ops.
* Implement v128.{load,store}{8,16,32,64}_lane instructions (#3278)Thomas Lively2020-10-225-0/+228
| | | | | | | These instructions are proposed in https://github.com/WebAssembly/simd/pull/350. This PR implements them throughout Binaryen except in the C/JS APIs and in the fuzzer, where it leaves TODOs instead. Right now these instructions are just being implemented for prototyping so adding them to the APIs isn't critical and they aren't generally available to be fuzzed in Wasm engines.
* Fix validateGlobally usage in validator, and an i64-to-i32 bug hidden by it ↵Alon Zakai2020-10-191-6/+6
| | | | | | | | | | | | | | (#3253) validateGlobally means that we can't do lookups on the module. A few places were missing that, or had it wrong. I think the reason for the wrong usages is that we used to have types on the module, and then removed that, so more is now validatable actually. This uncovered a real bug, where i64-to-32 would ignore an unreachable parameter of a call_indirect. That's bad, since if the type is i64, we need to replace it with two parameters. To fix that, just handle unreachability there, using the existing logic (which skips the call_indirect entirely in this case).
* Remove now-redundant stack pointer manipulation passes (#3251)Sam Clegg2020-10-181-26/+0
| | | | The use of these passes was removed on the emscripten side in https://github.com/emscripten-core/emscripten/pull/12536.
* Only write explicit names to name section (#3241)Sam Clegg2020-10-152-20/+22
| | | | Fixes: #3226
* Assign import names consistently between text and binaryn reader (#3238)Sam Clegg2020-10-142-7/+14
| | | | | | | | | The s-parser was assigning numbers names per-type where as the binaryn reader was using the global import count as the number to append. This change switches to use per-element count which I think it preferable as it increases the stability of the auto-generated names. e.g. memory is now always named `$mimport0`.
* Slightly improve validator error text on segments (#3215)Alon Zakai2020-10-121-2/+2
| | | Mentioning if it's a memory or a table segment is convenient.
* Rename Emscripten EHSjLj functions in wasm backend (#3191)Heejin Ahn2020-10-101-129/+1
| | | | | | | | | | | Now that we are renaming invoke wrappers and `emscripten_longjmp_jmpbuf` in the wasm backend, this deletes all related renaming routines and relevant tests. Depends on #3192. Addresses: #3043 and #3081 Companions: https://reviews.llvm.org/D88697 emscripten-core/emscripten#12399
* Refactor naming convention for functions handling tuples (#3196)Max Graey2020-10-092-10/+38
| | | When there are two versions of a function, one handling tuples and the other handling non-tuple values, the previous naming convention was to have "Single" in the name of the non-tuple handling function. This PR simplifies the convention and shortens function names by making the names plural for the tuple-handling version and singular for the non-tuple-handling version.
* Validate memory64 (#3202)Alon Zakai2020-10-081-1/+5
|
* Let GenerateDynCalls generate dynCalls for invokes (#3192)Heejin Ahn2020-10-021-66/+0
| | | | | | This moves dynCall generating functionaity for invokes from `EmscriptenGlueGenerator` to `GenerateDynCalls` pass. So now `GenerateDynCalls` pass will take care of all cases we need dynCalls: functions in tables and invokes.
* Stop generating __growWasmMemory (#3180)Sam Clegg2020-10-012-14/+0
| | | This depends on https://github.com/emscripten-core/emscripten/pull/12391
* Clean up support/bits.h (#3177)Thomas Lively2020-09-303-12/+12
| | | | | Use overloads instead of templates where applicable and change function names from PascalCase to camelCase. Also puts the functions in the Bits namespace to avoid naming conflicts.
* Add --fast-math mode (#3155)Alon Zakai2020-09-301-37/+4
| | | | | | | | | | | | Similar to clang and gcc, --fast-math makes us ignore corner cases of floating-point math like NaN changes and (not done yet) lack of associativity and so forth. In the future we may want to have separate fast math flags for each specific thing, like gcc and clang do. This undoes some changes (#2958 and #3096) where we assumed it was ok to not change NaN bits, but @binji corrected us. We can only do such things in fast math mode. This puts those optimizations behind that flag, adds tests for it, and restores the interpreter to the simpler code from before with no special cases.
* Fix applying default / unify SExpr and Wasm builder names (#3179)Daniel Wirtz2020-09-301-5/+7
| | | SExpressionWasmBuilder was not applying default memory and table import names on the memory and table, unlike on functions, globals and events, where it applies them. Also aligns default import names to use the same shorter forms as in binary parsing.
* GC: Add stubs for the remaining instructions (#3174)Daniel Wirtz2020-09-295-0/+491
| | | NFC, except adding most of the boilerplate for the remaining GC instructions. Each implementation site is marked with a respective `TODO (gc): theInstruction` in between the typical boilerplate code.
* GC: Fuzzing support for i31 (#3169)Daniel Wirtz2020-09-292-11/+12
| | | Integrates `i31ref` types and instructions into the fuzzer, by assuming that `(i31.new (i32.const N))` is constant and hence suitable to be used in global initializers.
* Prototype extended-name-section proposal (#3162)Daniel Wirtz2020-09-291-23/+163
| | | Implements the parts of the Extended Name Section Proposal that are trivially applicable to Binaryen, in particular table, memory and global names. Does not yet implement label, type, elem and data names.
* Refactor literal equality and hashing to not depend on getBits (#3159)Daniel Wirtz2020-09-291-26/+47
| | | Comparing and hashing literals previously depended on `getBits`, which was fine while there were only basic numeric types, but doesn't map well to reference types anymore. Hence this change limits the use of `getBits` to basic numeric types, and implements reference types-aware comparisons and hashing do deal with the newer types.
* Fix regression in memory.fill due to Memory64 (#3176)Wouter van Oortmerssen2020-09-281-0/+11
| | | details: https://github.com/WebAssembly/binaryen/issues/3149
* Fix low_pc updating when the expression is right at the start (#3175)Alon Zakai2020-09-281-1/+1
| | | | | In that case LLVM emits the address of the declarations area (where locals are declared) of the function, which is even earlier than the instructions actual first byte. I'm not sure why it does this, but it's easy to handle.
* Allow atomics to validate with unshared memory (#3172)Thomas Lively2020-09-251-21/+0
| | | | This relaxation has made it to Chrome stable, so it makes sense that we would allow it in the tools.
* Fix missing feature validations (#3171)Daniel Wirtz2020-09-251-0/+25
| | | Instructions `ref.null`, `ref.is_null`, `ref.func`, `try`, `throw`, `rethrow` and `br_on_exn` were previously missing explicit feature checks, and this change adds them. Note that some of these already didn't validate before for other reasons, like requiring the use of a type checked otherwise, but `ref.null` and `try` validated even in context of FeatureSet::MVP, so better to be sure.
* GC: Add i31 instructions (#3154)Daniel Wirtz2020-09-246-9/+121
| | | Adds the `i31.new` and `i31.get_s/u` instructions for creating and working with `i31ref` typed values. Does not include fuzzer integration just yet because the fuzzer expects that trivial values it creates are suitable in global initializers, which is not the case for trivial `i31ref` expressions.
* GC: Add ref.eq instruction (#3145)Daniel Wirtz2020-09-215-0/+48
| | | With `eqref` now integrated, the `ref.eq` instruction can be implemented. The only valid LHS and RHS value is `(ref.null eq)` for now, but implementation and fuzzer integration is otherwise complete.