summaryrefslogtreecommitdiff
path: root/src/wasm.h
Commit message (Collapse)AuthorAgeFilesLines
...
* Refactor module element related functions (NFC) (#2550)Heejin Ahn2019-12-231-1/+5
| | | | This does something similar to #2489 for more functions, removing boilerplate code for each module element using template functions.
* DWARF debug line updating (#2545)Alon Zakai2019-12-201-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With this, we can update DWARF debug line info properly as we write a new binary. To do that we track binary locations as we write. Each instruction is mapped to the location it is written to. We must also adjust them as we move code around because of LEB optimization (we emit a function or a section with a 5-byte LEB placeholder, the maximal size; later we shrink it which is almost always possible). writeDWARFSections() now takes a second param, the new locations of instructions. It then maps debug line info from the original offsets in the binary to the new offsets in the binary being written. The core logic for updating the debug line section is in wasm-debug.cpp. It basically tracks state machine logic both to read the existing debug lines and to emit the new ones. I couldn't find a way to reuse LLVM code for this, but reading LLVM's code was very useful here. A final tricky thing we need to do is to update the DWARF section's internal size annotation. The LLVM YAML writing code doesn't do that for us. Luckily it's pretty easy, in fixEmittedSection we just update the first 4 bytes in place to have the section size, after we've emitted it and know the size. This ignores debug lines with a 0 in the line, col, or addr, see WebAssembly/debugging#9 (comment) This ignores debug line offsets into the middle of instructions, which LLVM sometimes emits for some reason, see WebAssembly/debugging#9 (comment) Handling that would likely at least double our memory usage, which is unfortunate - we are run in an LTO manner, where the entire app's DWARF is present, and it may be massive. I think we should see if such odd offsets are a bug in LLVM, and if we can fix or prevent that. This does not emit "special" opcodes for debug lines. Those are purely an optimization, which I wanted to leave for later. (Even without them we decrease the size quite a lot, btw, as many lines have 0s in them...) This adds some testing that shows we can load and save fib2.c and fannkuch.cpp properly. The latter includes more than one function and has nontrivial code. To actually emit correct offsets a few minor fixes are done here: * Fix the code section location tracking during reading - the correct offset we care about is the body of the code section, not including the section declaration and size. * Fix wasm-stack debug line emitting. We need to update in BinaryInstWriter::visit(), that is, right before writing bytes for the instruction. That differs from * BinaryenIRWriter::visit which is a recursive function that also calls the children - so the offset there would be of the first child. For some reason that is correct with source maps, I don't understand why, but it's wrong for DWARF... * Print code section offsets in hex, to match other tools. Remove DWARFUpdate pass, which was useful for testing temporarily, but doesn't make sense now (it just updates without writing a binary). cc @yurydelendik
* Binary format code section offset tracking (#2515)Alon Zakai2019-12-191-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Optionally track the binary format code section offsets, that is, when loading a binary, remember where each IR node was read from. This is necessary for DWARF debug info, as these are the offsets DWARF refers to. (Note that eventually we may want to do something else, like first read the DWARF and only then add debug info annotations into the IR in a more LLVM-like manner, but this is more straightforward and should be enough to update debug lines and ranges). This tracking adds noticeable overhead - every single IR node adds an entry in a map - so avoid it unless actually necessary. Specifically, if the user passes in -g and there are actually DWARF sections in the binary, and we are not about to remove those sections, then we need it. Print binary format code section offsets in text, when printing with -g. This will help debug and test dwarf support. It looks like ;; code offset: 0x7 as an annotation right before each node. Also add support for -g in wasm-opt tests (unlike a pass, it has just one - as a prefix). Helps #2400
* SIMD {i8x16,i16x8}.avgr_u instructions (#2539)Thomas Lively2019-12-181-0/+2
| | | As specified in https://github.com/WebAssembly/simd/pull/126.
* Correctly clear memory / table info in clearModule (#2536)Heejin Ahn2019-12-171-0/+15
| | | | | | Currently `ModuleUtils::clearModule` does not clear `exists` flags in the memory and table, and running RoundTrip pass on any module that has a memory or a table fails as a result. This creates `clear` function in `Memory` and `Table` and makes `clearModule` call them.
* Make local.tee's type its local's type (#2511)Heejin Ahn2019-12-121-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to the current spec, `local.tee`'s return type should be the same as its local's type. (Discussions on whether we should change this rule is going on in WebAssembly/reference-types#55, but here I will assume this spec does not change. If this changes, we should change many parts of Binaryen transformation anyway...) But currently in Binaryen `local.tee`'s type is computed from its value's type. This didn't make any difference in the MVP, but after we have subtype relationship in #2451, this can become a problem. For example: ``` (func $test (result funcref) (local $0 anyref) (local.tee $0 (ref.func $test) ) ) ``` This shouldn't validate in the spec, but this will pass Binaryen validation with the current `local.tee` implementation. This makes `local.tee`'s type computed from the local's type, and makes `LocalSet::makeTee` get a type parameter, to which we should pass the its corresponding local's type. We don't embed the local type in the class `LocalSet` because it may increase memory size. This also fixes the type of `local.get` to be the local type where `local.get` and `local.set` pair is created from `local.tee`.
* Remove FunctionType (#2510)Thomas Lively2019-12-111-27/+2
| | | | | | | | | | | | | | | | | Function signatures were previously redundantly stored on Function objects as well as on FunctionType objects. These two signature representations had to always be kept in sync, which was error-prone and needlessly complex. This PR takes advantage of the new ability of Type to represent multiple value types by consolidating function signatures as a pair of Types (params and results) stored on the Function object. Since there are no longer module-global named function types, significant changes had to be made to the printing and emitting of function types, as well as their parsing and manipulation in various passes. The C and JS APIs and their tests also had to be updated to remove named function types.
* Refactor removing module elements (#2489)Heejin Ahn2019-12-021-0/+6
| | | | | | | | | | | This creates utility functions for removing module elements: removing one element by name, and removing multiple elements using a predicate function. And makes other parts of code use it. I think this is a light-handed approach than calling `Module::updateMaps` after removing only a part of module elements. This also fixes a bug in the inlining pass: it didn't call `Module::updateMaps` after removing functions. After this patch callers don't need to additionally call it anyway.
* Remove FunctionType from Event (#2466)Thomas Lively2019-11-251-10/+2
| | | | | | | | | This is the start of a larger refactoring to remove FunctionType entirely and store types and signatures directly on the entities that use them. This PR updates BrOnExn and Events to remove their use of FunctionType and makes the BinaryWriter traverse the module and collect types rather than using the global FunctionType list. While we are collecting types, we also sort them by frequency as an optimization. Remaining uses of FunctionType in Function, CallIndirect, and parsing will be removed in a future PR.
* Add i32x4.dot_i16x8_s (#2420)Thomas Lively2019-11-041-0/+1
| | | | | This experimental instruction is specified in https://github.com/WebAssembly/simd/pull/127 and is being implemented to enable further investigation of its performance impact.
* Add SIMD integer min and max instructions (#2416)Thomas Lively2019-11-011-0/+12
| | | As proposed in https://github.com/WebAssembly/simd/pull/27.
* v8x16.swizzle (#2368)Thomas Lively2019-10-031-0/+3
| | | | As specified at https://github.com/WebAssembly/simd/blob/master/proposals/simd/SIMD.md#swizzling-using-variable-indices.
* SIMD load and extend instructions (#2353)Thomas Lively2019-09-241-1/+7
| | | | | | Adds support for the new load and extend instructions. Also updates from C++11 to C++17 in order to use generic lambdas in the interpreter implementation.
* v128.andnot instruction (#2355)Thomas Lively2019-09-241-0/+1
| | | | | As specified at https://github.com/WebAssembly/simd/pull/102. Also fixes bugs in the JS API for other SIMD bitwise operators.
* vNxM.load_splat instructions (#2350)Thomas Lively2019-09-231-0/+22
| | | | | | | Introduces a new instruction class, `SIMDLoad`. Implements encoding, decoding, parsing, printing, and interpretation of the load and splat instructions, including in the C and JS APIs. `v128.load` remains in the `Load` instruction class for now because the interpreter code expects a `Load` to be able to load any memory value type.
* SIMD narrowing and widening operations (#2341)Thomas Lively2019-09-141-0/+16
|
* QFMA/QFMS instructions (#2328)Thomas Lively2019-09-031-7/+10
| | | | | | | | | Renames the SIMDBitselect class to SIMDTernary and adds the new {f32x4,f64x2}.qfm{a,s} ternary instructions. Because the SIMDBitselect class is no more, this is a backwards-incompatible change to the C interface. The new instructions are not yet used in the fuzzer because they are not yet implemented in V8. The corresponding LLVM commit is https://reviews.llvm.org/rL370556.
* Add atomic.fence instruction (#2307)Heejin Ahn2019-08-271-1/+13
| | | | | | | This adds `atomic.fence` instruction: https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md#fence-operator This also fix bugs in `atomic.wait` and `atomic.notify` instructions in binaryen.js and adds tests for them.
* Add basic exception handling support (#2282)Heejin Ahn2019-08-131-0/+50
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds basic support for exception handling instructions, according to the spec: https://github.com/WebAssembly/exception-handling/blob/master/proposals/Exceptions.md This PR includes support for: - Binary reading/writing - Wast reading/writing - Stack IR - Validation - binaryen.js + C API - Few IR routines: branch-utils, type-updating, etc - Few passes: just enough to make `wasm-opt -O` pass - Tests This PR does not include support for many optimization passes, fuzzer, or interpreter. They will be follow-up PRs. Try-catch construct is modeled in Binaryen IR in a similar manner to that of if-else: each of try body and catch body will contain a block, which can be omitted if there is only a single instruction. This block will not be emitted in wast or binary, as in if-else. As in if-else, `class Try` contains two expressions each for try body and catch body, and `catch` is not modeled as an instruction. `exnref` value pushed by `catch` is get by `pop` instruction. `br_on_exn` is special: it returns different types of values when taken and not taken. We make `exnref`, the type `br_on_exn` pushes if not taken, as `br_on_exn`'s type.
* Initial tail call implementation (#2197)Thomas Lively2019-07-031-0/+2
| | | | | | | | | | | Including parsing, printing, assembling, disassembling. TODO: - interpreting - effects - finalization and typing - fuzzing - JS/C API
* Minimal Push/Pop support (#2207)Alon Zakai2019-07-031-0/+27
| | | | | | | This is the first stage of adding support for stacky/multivaluey things. It adds new push/pop instructions, and so far just shows that they can be read and written, and that the optimizer doesn't do anything immediately wrong on them. No fuzzer support, since there isn't a "correct" way to use these yet. The current test shows some "incorrect" usages of them, which is nice to see that we can parse/emit them, but we should replace them with proper usages of push/pop once we actually have those (see comments in the tests). This should be enough to unblock exceptions (which needs a pop in try-catches). It is also a step towards multivalue (I added some docs about that), but most of multivalue is left to be done.
* Add event section (#2151)Heejin Ahn2019-05-311-0/+25
| | | | | | | | | | | | | | | | | | This adds support for the event and the event section, as specified in https://github.com/WebAssembly/exception-handling/blob/master/proposals/Exceptions.md#changes-to-the-binary-model. Wasm events are features that suspend the current execution and transfer the control flow to a corresponding handler. Currently the only supported event kind is exceptions. For events, this includes support for - Binary file reading/writing - Wast file reading/writing - Binaryen.js API - Fuzzer - Validation - Metadce - Passes: metrics, minify-imports-and-exports, remove-unused-module-elements
* Refactor type and function parsing (#2143)Heejin Ahn2019-05-241-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | - Refactored & fixed typeuse parsing rules so now the rules more closely follow the spec. There have been multiple parsing rules that were different in subtle ways, which are supposed to be the same according to the spec. - Duplicate types, i.e., types with the same signature, in the type section are allowed as long as they don't have the same given name. If a name is given, we use it; if type name is not given, we generate one in the form of `$FUNCSIG$` + signature string. If the same generated name already exists in the type section, we append `_` at the end. This causes most of the changes in the autogenerated type names in test outputs. - A typeuse has to be in the order of (type) -> (param) -> (result), if more than one of them exist. In case of function definitions, (local) has to be after all of these. Fixed some test cases that violate this rule. - When only (param)/(result) are given, its type will be the type with the smallest existing type index whose parameter and result are the same. If there's no such type, a new type will be created and inserted. - Added a test case `duplicate_types.wast` to test type namings for duplicate types. - Refactored `parseFunction` function. - Add more overrides to helper functions: `getSig` and `ensureFunctionType`.
* Reflect instruction renaming in code (#2128)Heejin Ahn2019-05-211-17/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | - Reflected new renamed instruction names in code and tests: - `get_local` -> `local.get` - `set_local` -> `local.set` - `tee_local` -> `local.tee` - `get_global` -> `global.get` - `set_global` -> `global.set` - `current_memory` -> `memory.size` - `grow_memory` -> `memory.grow` - Removed APIs related to old instruction names in Binaryen.js and added APIs with new names if they are missing. - Renamed `typedef SortedVector LocalSet` to `SetsOfLocals` to prevent name clashes. - Resolved several TODO renaming items in wasm-binary.h: - `TableSwitch` -> `BrTable` - `I32ConvertI64` -> `I32WrapI64` - `I64STruncI32` -> `I64SExtendI32` - `I64UTruncI32` -> `I64UExtendI32` - `F32ConvertF64` -> `F32DemoteI64` - `F64ConvertF32` -> `F64PromoteF32` - Renamed `BinaryenGetFeatures` and `BinaryenSetFeatures` to `BinaryenModuleGetFeatures` and `BinaryenModuleSetFeatures` for consistency.
* Apply format changes from #2048 (#2059)Alon Zakai2019-04-261-139/+397
| | | Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
* Finish bulk memory support (#2030)Thomas Lively2019-04-221-1/+0
| | | | | | | Implement interpretation of remaining bulk memory ops, add bulk memory spec tests with light modifications, fix bugs preventing the fuzzer from running correctly with bulk memory, and fix bugs found by the fuzzer.
* Change default feature set to MVP (#1993)Thomas Lively2019-04-161-1/+1
| | | | | In the absence of the target features section or command line flags. When there are command line flags, it is an error if they do not exactly match the target features section, except if --detect-features has been provided. Also adds a --print-features pass to print the command line flags for all enabled options and uses it to make the feature tests more rigorous.
* Move features from passOptions to Module (#2001)Thomas Lively2019-04-121-0/+7
| | | | | This allows us to emit a (potentially modified) target features section and conditionally emit other sections such as the DataCount section based on the presence of features.
* Better memory fuzzing (#1987)Alon Zakai2019-04-081-4/+4
| | | | | | | | Hash the contents of all of memory and log that out in random places in the fuzzer, so we are more sensitive there and can catch memory bugs. Fix UB that was uncovered by this in the binary writing code - if a segment is empty, we should not look at &vector[0], and instead use vector.data(). Add Builder::addExport convenience method.
* Passive segments (#1976)Thomas Lively2019-04-051-1/+8
| | | | | Adds support for the bulk memory proposal's passive segments. Uses a new (data passive ...) s-expression syntax to mark sections as passive.
* Rename atomic wait/notify instructions (#1972)Heejin Ahn2019-03-301-5/+5
| | | | | | | | This renames the following: - `i32.wait` -> `i32.atomic.wait` - `i64.wait` -> `i64.atomic.wait` - `wake` -> `atomic.notify` to match the spec.
* Validate that types match features (#1949)Thomas Lively2019-03-181-43/+1
| | | | | | Refactors features into a new wasm-features.h file and updates the validator to check that all types are allowed. Currently this is only relevant for the v128 SIMD type, but new types will be added in the future. The test for this change is in #1948.
* Add const specifiers (#1952)Ryoga2019-03-181-2/+13
| | | | | | With this we can write stuff like: const wasm::Expression* p; const wasm::Binary* q = p->cast<wasm::Binary>();
* Remove unnecessary semicolons (#1942)Ryoga2019-03-181-1/+1
| | | Removed semicolons that cause errors when compiling with -pedantic-errors.
* Add sign-ext feature (#1947)Thomas Lively2019-03-151-2/+4
|
* Consistently optimize small added constants into load/store offsets (#1924)Alon Zakai2019-03-011-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | See #1919 - we did not do this consistently before. This adds a lowMemoryUnused option to PassOptions. It can be passed on the commandline with --low-memory-unused. If enabled, we run the new optimize-added-constants pass, which does the real work here, replacing older code in post-emscripten. Aside from running at the proper time (unlike the old pass, see #1919), this also has a -propagate mode, which can do stuff like this: y = x + 10 [..] load(y) [..] load(y) => y = x + 10 [..] load(x, offset=10) [..] load(x, offset=10) That is, it can propagate such offsets to the loads/stores. This pattern is common in big interpreter loops, where the pointers are offsets into a big struct of state. The pass does this propagation by using a new feature of LocalGraph, which can verify which locals are in SSA mode. Binaryen IR is not SSA (intentionally, since it's a later IR), but if a local only has a single set for all gets, that means that local is in such a state, and can be optimized. The tricky thing is that all locals are initialized to zero, so there are at minimum two sets. But if we verify that the real set dominates all the gets, then the zero initialization cannot reach them, and we are safe. This PR also makes safe-heap aware of lowMemoryUnused. If so, we check for not just an access of 0, but the range 0-1023. This makes zlib 5% faster, with either the wasm backend or asm2wasm. It also makes it 0.5% smaller. Also helps sqlite (1.5% faster) and lua (1% faster)
* Fix memory leaks (#1925)Bohdan2019-02-281-0/+1
| | | | | | Fixes #1921 Signed-off-by: Bogdan Vaneev <warchantua@gmail.com>
* Simplify ExpressionAnalyzer (#1920)Alon Zakai2019-02-271-2/+2
| | | | | This refactors the hashing and comparison code to use a single immediate-value iterator. This makes us have a single place that knows the list of immediate fields in every node type, instead of 2. This also fixes a few bugs found by doing that. In particular, this makes us slightly slower than before since we are hashing more fields.
* Bulk memory operations (#1892)Thomas Lively2019-02-051-1/+55
| | | | | | Bulk memory operations The only parts missing are the interpreter implementation and spec tests.
* Code style improvements (#1868)Alon Zakai2019-01-151-33/+36
| | | | * Use modern T p = v; notation to initialize class fields * Use modern X() = default; notation for empty class constructors
* Require unique_ptr to Module::addFunctionType() (#1672)Paweł Bylica2019-01-101-1/+1
| | | | | This fixes the memory leak in WasmBinaryBuilder::readSignatures() caused probably the exception thrown there before the FunctionType object is safe. This also makes it clear that the Module becomes the owner of the FunctionType objects.
* Refactor Features code (#1848)Alon Zakai2019-01-021-1/+5
| | | Add features.h which centralizes all the feature detection code. (I'll need this in another place than the validator which is where it was til now.)
* Rename `idx` to `index` in SIMD code for consistency (#1836)Thomas Lively2018-12-181-2/+2
|
* SIMD (#1820)Thomas Lively2018-12-131-2/+108
| | | | | | | | | Implement and test the following functionality for SIMD. - Parsing and printing - Assembling and disassembling - Interpretation - C API - JS API
* Create API for feature dependent picking in fuzzer (#1821)Thomas Lively2018-12-121-12/+6
|
* Implement nontrapping float-to-int instructions (#1780)Thomas Lively2018-12-041-0/+3
|
* Feature options (#1797)Thomas Lively2018-12-031-6/+34
| | | | Add feature flags and struct interface. Default feature set has all feature enabled.
* Add --strip that removes debug info (#1787)Alon Zakai2018-12-031-0/+5
| | | | This is sort of like --strip on a native binary. The more specific use case for us is e.g. you link with a library that has -g in its CFLAGS, but you don't want debug info in your final executable (I hit this with poppler now). We can make emcc pass this to binaryen if emcc is not building an output with intended debug info.
* Add support for a mutable globals as a Feature (#1785)Sam Clegg2018-11-301-0/+1
| | | | | This picks up from #1644 and indeed borrows the test case from there.
* Support 4GB Memories (#1702)Alon Zakai2018-10-151-3/+7
| | | This fixes asm2wasm parsing of the max to allow 4GB, and also changes the internal Memory::kMaxValue values to reflect that. We used to use kMaxValue to also represent "no limit", so I split that out into kUnlimitedValue.