summaryrefslogtreecommitdiff
path: root/src/wasm-stack.h
Commit message (Collapse)AuthorAgeFilesLines
* Remove boilerplate in wasm-stack.h (#3347)Alon Zakai2020-11-121-64/+4
|
* Implement v128.{load,store}{8,16,32,64}_lane instructions (#3278)Thomas Lively2020-10-221-0/+1
| | | | | | | These instructions are proposed in https://github.com/WebAssembly/simd/pull/350. This PR implements them throughout Binaryen except in the C/JS APIs and in the fuzzer, where it leaves TODOs instead. Right now these instructions are just being implemented for prototyping so adding them to the APIs isn't critical and they aren't generally available to be fuzzed in Wasm engines.
* GC: Add stubs for the remaining instructions (#3174)Daniel Wirtz2020-09-291-0/+12
| | | NFC, except adding most of the boilerplate for the remaining GC instructions. Each implementation site is marked with a respective `TODO (gc): theInstruction` in between the typical boilerplate code.
* GC: Add i31 instructions (#3154)Daniel Wirtz2020-09-241-0/+2
| | | Adds the `i31.new` and `i31.get_s/u` instructions for creating and working with `i31ref` typed values. Does not include fuzzer integration just yet because the fuzzer expects that trivial values it creates are suitable in global initializers, which is not the case for trivial `i31ref` expressions.
* GC: Add ref.eq instruction (#3145)Daniel Wirtz2020-09-211-0/+1
| | | With `eqref` now integrated, the `ref.eq` instruction can be implemented. The only valid LHS and RHS value is `(ref.null eq)` for now, but implementation and fuzzer integration is otherwise complete.
* Initial implementation of "Memory64" proposal (#3130)Wouter van Oortmerssen2020-09-181-4/+4
| | | Also includes a lot of new spec tests that eventually need to go into the spec repo
* Refactor Host expression to MemorySize and MemoryGrow (#3137)Daniel Wirtz2020-09-171-1/+2
| | | Aligns the internal representations of `memory.size` and `memory.grow` with other more recent memory instructions by removing the legacy `Host` expression class and adding separate expression classes for `MemorySize` and `MemoryGrow`. Simplifies related APIs, but is also a breaking API change.
* Simplify BinaryenIRWriter (#3110)Thomas Lively2020-09-101-497/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | BinaryenIRWriter was previously inconsistent about whether or not it emitted an instruction if that instruction was not reachable. Instructions that produced values were not emitted if they were unreachable, but instructions that did not produce values were always emitted. Additionally, blocks continued to emit their children even after emitting an unreachable child. Since it was not possible to tell whether an unreachable instruction's parent would be emitted, BinaryenIRWriter had to be very defensive and emit many extra `unreachable` instructions around unreachable code to avoid type errors. This PR unifies the logic for emitting all non-control flow instructions and changes the behavior of BinaryenIRWriter so that it never emits instructions that cannot be reached due to having unreachable children. This means that extra `unreachable` instructions now only need to be emitted after unreachable control flow constructs. BinaryenIRWriter now also stops emitting instructions inside blocks after the first unreachable instruction as an extra optimization. This change will also simplify Poppy IR stackification (see #3059) by guaranteeing that instructions with unreachable children will not be emitted into the stackifier. This makes satisfying the Poppy IR rule against unreachable Pops trivial, whereas previously satisfying this rule would have required about about 700 additional lines of code to recompute the types of all unreachable children for any instruction.
* DWARF: Do not reorder locals in binary writing (#2959)Alon Zakai2020-07-231-6/+10
| | | | | | | | | | | | | | | | | | | | | The binary writer reorders locals unconditionally. I forgot about this, and so when I made DWARF disable optimization passes that reorder, this was left active. Optimally the writer would not do this, and the ReorderLocals pass would. But it looks like we need special logic for tuple locals anyhow, as they expand into multiple locals, so some amount of local order changes seems unavoidable atm. Test changes are mostly just lots of offsets, and can be ignored, but the new test test/passes/dwarf-local-order.* shows the issue. It prints $foo once, then after a roundtrip (showing no reordering), then it strips the DWARF section and prints after another roundtrip (which does show reordering). This also makes us avoid the Stack IR writer if DWARF is present, which matches what we do with source maps. This doesn't prevent any known bugs, but it's simpler this way and debugging + Stack IR opts is not an important combination.
* Remove `Push` (#2867)Thomas Lively2020-05-221-8/+0
| | | | | | Push and Pop have been superseded by tuples for their original intended purpose of supporting multivalue. Pop is still used to represent block arguments for exception handling, but there are no plans to use Push for anything now or in the future.
* Emit unreachable tuple.make properly (#2701)Thomas Lively2020-03-181-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We previously thought unreachable `tuple.make` instructions did not require special unreachable handling, but consider the following wast: ``` (module (func $foo (tuple.make (unreachable) (i32.const 42) ) ) ) ``` This validates because the only expression in the body is unreachable, but when it is emitted as a binary it becomes ``` unreachable i32.const 42 ``` This does not validate because it ends with an i32, but the function expected an empty stack at the end. The fix is to emit an extra `unreachable` after unreachable `tuple.make` instructions. Unfortunately it is impossible to write a test for this right now because the binary parser silently drops the `i32.const 42`, making the function valid again.
* Initial multivalue support (#2675)Thomas Lively2020-03-051-0/+30
| | | | | | | | | Implements parsing and emitting of tuple creation and extraction and tuple-typed control flow for both the text and binary formats. TODO: - Extend Precompute/interpreter to handle tuple values - C and JS API support/testing - Figure out how to lower in stack IR - Fuzzing
* Generalize binary writing for tuples (#2670)Thomas Lively2020-02-271-2/+2
| | | | | | | | | Updates `BinaryInstWriter::mapLocalsAndEmitHeader` so it no longer hardcodes each possible local type. Also adds a new inner loop over the elements of any local tuple type in the IR. Updates the map from IR local indices to binary indices to be additionally keyed on the index within a tuple type. Since we do not generate tuple types yet, this additional index is hardcoded to zero everywhere it is used for now. A later PR adding tuple creation operations will extend this functionality and add tests.
* DWARF: Track the positions of 'end', 'else', 'catch' binary locations (#2603)Alon Zakai2020-01-211-7/+9
| | | | | | | | | | | | | | | | | Control flow structures have those in addition to the normal span of (start, end), and we need to track them too. Tracking them during reading requires us to track control flow structures while parsing, so that we can know to which structure an end/else/catch refers to. We track these locations using a map on the side of instruction to its "extra" locations. That avoids increasing the size of the tracking info for the much more common non-control flow instructions. Note that there is one more 'end' location, that of the function (not referring to any instruction). I left that to a later PR to not increase this one too much.
* DWARF: high_pc computation (#2595)Alon Zakai2020-01-161-0/+3
| | | | | | | Update high_pc values. These are interesting as they may be a relative offset compared to the low_pc. For functions we already had both a start and an end. Add such tracking for instructions as well.
* [NFC] Enforce use of `Type::` on type names (#2434)Thomas Lively2020-01-071-27/+28
|
* Generate push/pop in stack IR (#2566)Heejin Ahn2020-01-031-4/+2
| | | | | | | | | | | We have not been generating push and pop instructions in the stack IR. Even though they are not written in binary, they have to be in the stack IR to match the number of inputs and outputs of instructions. Currently `BinaryenIRWriter` is used both for stack IR generation and binary generation, so we should emit those instructions in `BinaryenIRWriter`. `BinaryenIRToBinaryWriter`, which inherits `BinaryenIRWriter`, does not do anything for push and pop instructions, so they are still not emitted in binary.
* Add support for reference types proposal (#2451)Heejin Ahn2019-12-301-0/+30
| | | | | | | | | | | | This adds support for the reference type proposal. This includes support for all reference types (`anyref`, `funcref`(=`anyfunc`), and `nullref`) and four new instructions: `ref.null`, `ref.is_null`, `ref.func`, and new typed `select`. This also adds subtype relationship support between reference types. This does not include table instructions yet. This also does not include wasm2js support. Fixes #2444 and fixes #2447.
* DWARF debug line updating (#2545)Alon Zakai2019-12-201-4/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With this, we can update DWARF debug line info properly as we write a new binary. To do that we track binary locations as we write. Each instruction is mapped to the location it is written to. We must also adjust them as we move code around because of LEB optimization (we emit a function or a section with a 5-byte LEB placeholder, the maximal size; later we shrink it which is almost always possible). writeDWARFSections() now takes a second param, the new locations of instructions. It then maps debug line info from the original offsets in the binary to the new offsets in the binary being written. The core logic for updating the debug line section is in wasm-debug.cpp. It basically tracks state machine logic both to read the existing debug lines and to emit the new ones. I couldn't find a way to reuse LLVM code for this, but reading LLVM's code was very useful here. A final tricky thing we need to do is to update the DWARF section's internal size annotation. The LLVM YAML writing code doesn't do that for us. Luckily it's pretty easy, in fixEmittedSection we just update the first 4 bytes in place to have the section size, after we've emitted it and know the size. This ignores debug lines with a 0 in the line, col, or addr, see WebAssembly/debugging#9 (comment) This ignores debug line offsets into the middle of instructions, which LLVM sometimes emits for some reason, see WebAssembly/debugging#9 (comment) Handling that would likely at least double our memory usage, which is unfortunate - we are run in an LTO manner, where the entire app's DWARF is present, and it may be massive. I think we should see if such odd offsets are a bug in LLVM, and if we can fix or prevent that. This does not emit "special" opcodes for debug lines. Those are purely an optimization, which I wanted to leave for later. (Even without them we decrease the size quite a lot, btw, as many lines have 0s in them...) This adds some testing that shows we can load and save fib2.c and fannkuch.cpp properly. The latter includes more than one function and has nontrivial code. To actually emit correct offsets a few minor fixes are done here: * Fix the code section location tracking during reading - the correct offset we care about is the body of the code section, not including the section declaration and size. * Fix wasm-stack debug line emitting. We need to update in BinaryInstWriter::visit(), that is, right before writing bytes for the instruction. That differs from * BinaryenIRWriter::visit which is a recursive function that also calls the children - so the offset there would be of the first child. For some reason that is correct with source maps, I don't understand why, but it's wrong for DWARF... * Print code section offsets in hex, to match other tools. Remove DWARFUpdate pass, which was useful for testing temporarily, but doesn't make sense now (it just updates without writing a binary). cc @yurydelendik
* Remove 'none' type as a branch target in ReFinalize (#2492)Alon Zakai2019-12-041-1/+1
| | | | | | | | | | | | | | | | | That was needed for super-old wasm type system, where we allowed (block $x (br_if $x (unreachable) (nop) ) ) That is, we differentiated "taken" branches from "named" ones (just referred to by name, but not actually taken as it's in unreachable code). We don't need to differentiate those any more. Remove the ReFinalize code that considered it, and also remove the named/taken distinction in other places.
* vNxM.load_splat instructions (#2350)Thomas Lively2019-09-231-0/+12
| | | | | | | Introduces a new instruction class, `SIMDLoad`. Implements encoding, decoding, parsing, printing, and interpretation of the load and splat instructions, including in the C and JS APIs. `v128.load` remains in the `Load` instruction class for now because the interpreter code expects a `Load` to be able to load any memory value type.
* QFMA/QFMS instructions (#2328)Thomas Lively2019-09-031-6/+6
| | | | | | | | | Renames the SIMDBitselect class to SIMDTernary and adds the new {f32x4,f64x2}.qfm{a,s} ternary instructions. Because the SIMDBitselect class is no more, this is a backwards-incompatible change to the C interface. The new instructions are not yet used in the fuzzer because they are not yet implemented in V8. The corresponding LLVM commit is https://reviews.llvm.org/rL370556.
* Add atomic.fence instruction (#2307)Heejin Ahn2019-08-271-0/+7
| | | | | | | This adds `atomic.fence` instruction: https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md#fence-operator This also fix bugs in `atomic.wait` and `atomic.notify` instructions in binaryen.js and adds tests for them.
* Add basic exception handling support (#2282)Heejin Ahn2019-08-131-2/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds basic support for exception handling instructions, according to the spec: https://github.com/WebAssembly/exception-handling/blob/master/proposals/Exceptions.md This PR includes support for: - Binary reading/writing - Wast reading/writing - Stack IR - Validation - binaryen.js + C API - Few IR routines: branch-utils, type-updating, etc - Few passes: just enough to make `wasm-opt -O` pass - Tests This PR does not include support for many optimization passes, fuzzer, or interpreter. They will be follow-up PRs. Try-catch construct is modeled in Binaryen IR in a similar manner to that of if-else: each of try body and catch body will contain a block, which can be omitted if there is only a single instruction. This block will not be emitted in wast or binary, as in if-else. As in if-else, `class Try` contains two expressions each for try body and catch body, and `catch` is not modeled as an instruction. `exnref` value pushed by `catch` is get by `pop` instruction. `br_on_exn` is special: it returns different types of values when taken and not taken. We make `exnref`, the type `br_on_exn` pushes if not taken, as `br_on_exn`'s type.
* Fix extra unreachable generation (#2266)Heejin Ahn2019-07-271-45/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently various expressions handle this differently, and now we consistently follow this rules: --- For all non-control-flow value-returning instructions, if a type of an expression is unreachable, we emit an unreachable and don't emit the instruction itself. If we don't emit an unreachable, instructions that follow can have validation failure in wasm binary format. For example: ``` [unreachable] (f32.add [unreachable] (i32.eqz [unreachable] (unreachable) ) ... ) ``` This is a valid prgram in binaryen IR, because the unreachable type propagates out of an expression, making both i32.eqz and f32.add unreachable. But in binary format, this becomes: ``` unreachable i32.eqz f32.add ;; validation failure; it expects f32 but takes an i32! ``` And here f32.add causes validation failure in wasm validation. So in this case we add an unreachable to prevent following instructions to consume the current value (here i32.eqz). In actual tests, I used `global.get` to an f32 global, which does not return a value, instead of `f32.add`, because `f32.add` itself will not be emitted if one of argument is unreachable. --- So the changes are: - For instructions that don't return a value, removes unreachable emitting code if it exists. - Add the unreachable emitting code for value-returning instructions if there isn't one. - Check for unreachability only once after emitting all children for atomic instructions. Currently only atomic instructions check unreachability after visiting each children and bail out right after, which is valid, but not consistent with others. - Don't emit an extra unreachable after a return (and return_call). I guess it is unnecessary.
* Refactor stack IR / binary writer (NFC) (#2250)Heejin Ahn2019-07-231-1951/+405
| | | | | | | | | | | | | | | | Previously `StackWriter` and its subclasses had routines for all three modes (`Binaryen2Binary`, `Binaryen2Stack`, and `Stack2Binary`) within a single class. This splits routines for each in a separate class and also factors out binary writing into a separate class (`BinaryInstWriter`) so other classes can make use of it. The new classes are: - `BinaryInstWriter`: Binary instruction writer. Only responsible for emitting binary contents and no other logic - `BinaryenIRWriter`: Converts binaryen IR into something else - `BinaryenIRToBinaryWriter`: Writes binaryen IR to binary - `StackIRGenerator`: Converts binaryen IR to stack IR - `StackIRToBinaryWriter`: Writes stack IR to binary
* Rename except_ref type to exnref (#2224)Heejin Ahn2019-07-141-3/+3
| | | | In WebAssembly/exception-handling#79 we agreed to rename `except_ref` type to `exnref`.
* Initial tail call implementation (#2197)Thomas Lively2019-07-031-4/+6
| | | | | | | | | | | Including parsing, printing, assembling, disassembling. TODO: - interpreting - effects - finalization and typing - fuzzing - JS/C API
* Minimal Push/Pop support (#2207)Alon Zakai2019-07-031-2/+17
| | | | | | | This is the first stage of adding support for stacky/multivaluey things. It adds new push/pop instructions, and so far just shows that they can be read and written, and that the optimizer doesn't do anything immediately wrong on them. No fuzzer support, since there isn't a "correct" way to use these yet. The current test shows some "incorrect" usages of them, which is nice to see that we can parse/emit them, but we should replace them with proper usages of push/pop once we actually have those (see comments in the tests). This should be enough to unblock exceptions (which needs a pop in try-catches). It is also a step towards multivalue (I added some docs about that), but most of multivalue is left to be done.
* Reflect instruction renaming in code (#2128)Heejin Ahn2019-05-211-24/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | - Reflected new renamed instruction names in code and tests: - `get_local` -> `local.get` - `set_local` -> `local.set` - `tee_local` -> `local.tee` - `get_global` -> `global.get` - `set_global` -> `global.set` - `current_memory` -> `memory.size` - `grow_memory` -> `memory.grow` - Removed APIs related to old instruction names in Binaryen.js and added APIs with new names if they are missing. - Renamed `typedef SortedVector LocalSet` to `SetsOfLocals` to prevent name clashes. - Resolved several TODO renaming items in wasm-binary.h: - `TableSwitch` -> `BrTable` - `I32ConvertI64` -> `I32WrapI64` - `I64STruncI32` -> `I64SExtendI32` - `I64UTruncI32` -> `I64UExtendI32` - `F32ConvertF64` -> `F32DemoteI64` - `F64ConvertF32` -> `F64PromoteF32` - Renamed `BinaryenGetFeatures` and `BinaryenSetFeatures` to `BinaryenModuleGetFeatures` and `BinaryenModuleSetFeatures` for consistency.
* Add except_ref type (#2081)Heejin Ahn2019-05-071-0/+3
| | | | This adds except_ref type, which is a part of the exception handling proposal.
* clang-tidy braces changes (#2075)Alon Zakai2019-05-011-45/+90
| | | Applies the changes in #2065, and temprarily disables the hook since it's too slow to run on a change this large. We should re-enable it in a later commit.
* Apply format changes from #2048 (#2059)Alon Zakai2019-04-261-471/+1249
| | | Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
* Rename atomic wait/notify instructions (#1972)Heejin Ahn2019-03-301-5/+5
| | | | | | | | This renames the following: - `i32.wait` -> `i32.atomic.wait` - `i64.wait` -> `i64.atomic.wait` - `wake` -> `atomic.notify` to match the spec.
* Optimize stack writer on deeply nested blocks, fixes #1903 (#1905)Alon Zakai2019-02-121-43/+55
| | | | also remove some old debugging
* Bulk memory operations (#1892)Thomas Lively2019-02-051-8/+53
| | | | | | Bulk memory operations The only parts missing are the interpreter implementation and spec tests.
* Rename `idx` to `index` in SIMD code for consistency (#1836)Thomas Lively2018-12-181-3/+3
|
* SIMD (#1820)Thomas Lively2018-12-131-4/+207
| | | | | | | | | Implement and test the following functionality for SIMD. - Parsing and printing - Assembling and disassembling - Interpretation - C API - JS API
* Implement nontrapping float-to-int instructions (#1780)Thomas Lively2018-12-041-0/+8
|
* Add v128 type (#1777)Thomas Lively2018-11-291-1/+11
|
* Remove default cases (#1757)Thomas Lively2018-11-271-8/+7
| | | | | | Where reasonable from a readability perspective, remove default cases in switches over types and instructions. This makes future feature additions easier by making the compiler complain about each location where new types and instructions are not yet handled.
* Unify imported and non-imported things (#1678)Alon Zakai2018-09-191-11/+0
| | | | | | | | | | | | | | Fixes #1649 This moves us to a single object for functions, which can be imported or nor, and likewise for globals (as a result, GetGlobals do not need to check if the global is imported or not, etc.). All imported things now inherit from Importable, which has the module and base of the import, and if they are set then it is an import. For convenient iteration, there are a few helpers like ModuleUtils::iterDefinedGlobals(wasm, [&](Global* global) { .. use global .. }); as often iteration only cares about imported or defined (non-imported) things.
* Add debug information locations to the function prolog/epilog (#1674)Yury Delendik2018-09-171-0/+6
| | | | | | | The current patch: * Preserves the debug locations from function prolog and epilog * Preserves the debug locations of the nested blocks
* Print Stack IR in proper .wat format (#1630)Alon Zakai2018-08-141-45/+0
| | | This now makes --generate-stack-ir --print-stack-ir emit a fully valid .wat wasm file, in stacky format.
* Stack IR (#1623)Alon Zakai2018-07-301-0/+1244
This adds a new IR, "Stack IR". This represents wasm at a very low level, as a simple stream of instructions, basically the same as wasm's binary format. This is unlike Binaryen IR which is structured and in a tree format. This gives some small wins on binary sizes, less than 1% in most cases, usually 0.25-0.50% or so. That's not much by itself, but looking forward this prepares us for multi-value, which we really need an IR like this to be able to optimize well. Also, it's possible there is more we can do already - currently there are just a few stack IR optimizations implemented, DCE local2stack - check if a set_local/get_local pair can be removed, which keeps the set's value on the stack, which if the stars align it can be popped instead of the get. Block removal - remove any blocks with no branches, as they are valid in wasm binary format. Implementation-wise, the IR is defined in wasm-stack.h. A new StackInst is defined, representing a single instruction. Most are simple reflections of Binaryen IR (an add, a load, etc.), and just pointers to them. Control flow constructs are expanded into multiple instructions, like a block turns into a block begin and end, and we may also emit extra unreachables to handle the fact Binaryen IR has unreachable blocks/ifs/loops but wasm does not. Overall, all the Binaryen IR differences with wasm vanish on the way to stack IR. Where this IR lives: Each Function now has a unique_ptr to stack IR, that is, a function may have stack IR alongside the main IR. If the stack IR is present, we write it out during binary writing; if not, we do the same binaryen IR => wasm binary process as before (this PR should not affect speed there). This design lets us use normal Passes on stack IR, in particular this PR defines 3 passes: Generate stack IR Optimize stack IR (might be worth splitting out into separate passes eventually) Print stack IR for debugging purposes Having these as normal passes is convenient as then they can run in parallel across functions and all the other conveniences of our current Pass system. However, a downside of keeping the second IR as an option on Functions, and using normal Passes to operate on it, means that we may get out of sync: if you generate stack IR, then modify binaryen IR, then the stack IR may no longer be valid (for example, maybe you removed locals or modified instructions in place etc.). To avoid that, Passes now define if they modify Binaryen IR or not; if they do, we throw away the stack IR. Miscellaneous notes: Just writing Stack IR, then writing to binary - no optimizations - is 20% slower than going directly to binary, which is one reason why we still support direct writing. This does lead to some "fun" C++ template code to make that convenient: there is a single StackWriter class, templated over the "mode", which is either Binaryen2Binary (direct writing), Binaryen2Stack, or Stack2Binary. This avoids a lot of boilerplate as the 3 modes share a lot of code in overlapping ways. Stack IR does not support source maps / debug info. We just don't use that IR if debug info is present. A tiny text format comment (if emitting non-minified text) indicates stack IR is present, if it is ((; has Stack IR ;)). This may help with debugging, just in case people forget. There is also a pass to print out the stack IR for debug purposes, as mentioned above. The sieve binaryen.js test was actually not validating all along - these new opts broke it in a more noticeable manner. Fixed. Added extra checks in pass-debug mode, to verify that if stack IR should have been thrown out, it was. This should help avoid any confusion with the IR being invalid. Added a comment about the possible future of stack IR as the main IR, depending on optimization results, following some discussion earlier today.