path: root/src/wasm/wasm-validator.cpp
Commit message [Author, Age, Files, Lines]
...
* Feature options (#1797) [Thomas Lively, 2018-12-03, 1, -9/+9]
| Add feature flags and struct interface. Default feature set has all features enabled.
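A feature set with a struct interface might look like the minimal sketch below. The `FeatureSet` name, its members, and the two listed features are assumptions made for illustration, not the code from #1797; the only detail taken from the commit message is that the default set has every feature enabled.

```cpp
#include <cstdint>

// Hypothetical feature set; names are illustrative, not Binaryen's real API.
struct FeatureSet {
  enum Feature : uint32_t {
    None = 0,
    Atomics = 1 << 0,
    MutableGlobals = 1 << 1,
    All = Atomics | MutableGlobals
  };

  // Default: every feature enabled, as the commit message describes.
  uint32_t bits = All;

  // True only if every bit of `feature` is present.
  bool has(Feature feature) const { return (bits & feature) == feature; }

  // Turn a feature on or off.
  void set(Feature feature, bool on = true) {
    bits = on ? (bits | feature) : (bits & ~uint32_t(feature));
  }
};
```

A tool flag that disables a feature would then call something like `features.set(FeatureSet::Atomics, false)` before validation.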
* Add support for mutable globals as a Feature (#1785) [Sam Clegg, 2018-11-30, 1, -0/+9]
| This picks up from #1644 and indeed borrows the test case from there.
* Add v128 type (#1777) [Thomas Lively, 2018-11-29, 1, -0/+1]
|
* standardize on 'template<' over 'template <' (i.e., remove a space) (#1782) [Alon Zakai, 2018-11-29, 1, -2/+2]
|
* Remove default cases (#1757) [Thomas Lively, 2018-11-27, 1, -4/+4]
| Where reasonable from a readability perspective, remove default cases in switches over types and instructions. This makes future feature additions easier by making the compiler complain about each location where new types and instructions are not yet handled.
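The effect described above can be seen in a small sketch. The `ValueType` enum and `byteSize` function are hypothetical stand-ins, not Binaryen's definitions; the relevant compiler behavior is GCC's and Clang's `-Wswitch` warning (enabled by `-Wall`), which fires when a switch over an enum has no `default` and misses an enumerator.

```cpp
#include <cstdint>

// Hypothetical value-type enum; not Binaryen's actual definition.
enum class ValueType { i32, i64, f32, f64, v128 };

// No `default:` case: if a new enumerator is added to ValueType, -Wswitch
// reports this switch until the new type is handled explicitly.
inline uint32_t byteSize(ValueType type) {
  switch (type) {
    case ValueType::i32: return 4;
    case ValueType::i64: return 8;
    case ValueType::f32: return 4;
    case ValueType::f64: return 8;
    case ValueType::v128: return 16;
  }
  return 0; // not reached for valid enumerators; avoids a missing-return warning
}
```

With a `default:` branch the same switch would compile silently after a new type is added, which is the situation the commit aims to avoid.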
* Fix segment size validation for imported memories (#1745) [Sam Clegg, 2018-11-15, 1, -2/+7]
| Without this wasm-opt can't operate on emscripten-produced SIDE_MODULEs, which have zero-sized memory imports. Technically it is not a validation failure if you have segments that are larger than your initial memory, regardless of whether you import them. For non-imported memories it can be helpful though, so leaving it in to catch those errors.
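The rule described above could be sketched as follows. `Segment`, `Memory`, and `segmentsFit` are hypothetical simplifications, not the validator's real types; the sketch only shows the idea of skipping the size check when the memory is imported.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for the IR types.
struct Segment { uint64_t offset; uint64_t size; };
struct Memory { uint64_t initialPages; bool imported; };

constexpr uint64_t kPageSize = 64 * 1024; // wasm page size in bytes

// For defined memories, flag segments that do not fit in the initial size.
// For imported memories the check is skipped, since the declared initial
// size (possibly zero) says little about the memory supplied at link time.
inline bool segmentsFit(const Memory& memory, const std::vector<Segment>& segments) {
  if (memory.imported) return true;
  const uint64_t limit = memory.initialPages * kPageSize;
  for (const auto& segment : segments) {
    if (segment.offset > limit || segment.size > limit - segment.offset) {
      return false;
    }
  }
  return true;
}
```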
* Support 4GB Memories (#1702) [Alon Zakai, 2018-10-15, 1, -1/+1]
| This fixes asm2wasm parsing of the max to allow 4GB, and also changes the internal Memory::kMaxValue values to reflect that. We used to use kMaxValue to also represent "no limit", so I split that out into kUnlimitedValue.
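The split between "largest valid maximum" and "no maximum declared" can be illustrated with a sketch. The constant names and the sentinel value below are assumptions for illustration and are not the actual `Memory::kMaxValue` / `kUnlimitedValue` definitions; only the arithmetic (65536 pages of 64 KiB is 4 GiB) is fixed by the wasm page size.

```cpp
#include <cstdint>

constexpr uint64_t kPageSize = 64 * 1024;     // wasm page size in bytes
constexpr uint64_t kMaxPages = 65536;         // 65536 * 64 KiB = 4 GiB
constexpr uint64_t kUnlimited = ~uint64_t(0); // hypothetical "no limit" sentinel

// A separate sentinel means a real maximum of exactly 4 GiB is no longer
// confused with "no maximum was declared".
inline bool hasMax(uint64_t maxPages) { return maxPages != kUnlimited; }

inline bool maxIsValid(uint64_t maxPages) {
  return !hasMax(maxPages) || maxPages <= kMaxPages;
}
```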
* properly handle unreachable atomic operations, fixes a regression from #1693 (#1696) [Alon Zakai, 2018-10-11, 1, -2/+2]
* No atomic float operations (#1693) [Alon Zakai, 2018-10-05, 1, -1/+7]
| SafeHeap was emitting them, but it looks like they are invalid according to the wasm-threads spec. Fixes kripken/emscripten#7208
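A check of this kind could look like the sketch below. The `Type` enum, `MemoryAccess` struct, and `validateAccess` function are hypothetical simplifications, not the validator's actual code.

```cpp
// Hypothetical, simplified stand-ins for the IR types.
enum class Type { i32, i64, f32, f64 };

struct MemoryAccess {
  Type type;
  bool isAtomic;
};

inline bool isInteger(Type type) {
  return type == Type::i32 || type == Type::i64;
}

// Atomic accesses must use an integer type; non-atomic ones may use any type.
inline bool validateAccess(const MemoryAccess& access) {
  return !access.isAtomic || isInteger(access.type);
}
```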
* Unify imported and non-imported things (#1678) [Alon Zakai, 2018-09-19, 1, -47/+16]
| Fixes #1649
| This moves us to a single object for functions, which can be imported or not, and likewise for globals (as a result, GetGlobals do not need to check if the global is imported or not, etc.). All imported things now inherit from Importable, which has the module and base of the import, and if they are set then it is an import.
| For convenient iteration, there are a few helpers like
|   ModuleUtils::iterDefinedGlobals(wasm, [&](Global* global) { .. use global .. });
| as often iteration only cares about imported or defined (non-imported) things.
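The design described above can be sketched with simplified types. The names `Importable`, `Global`, `ModuleUtils`, and `iterDefinedGlobals` come from the commit message, but the definitions below are assumed stand-ins written for illustration (strings instead of interned names, a single `globals` list), not the real Binaryen declarations.

```cpp
#include <memory>
#include <string>
#include <vector>

// Simplified stand-in: an item is an import when its module/base are set.
// This sketch tests only `module`; that choice is an assumption.
struct Importable {
  std::string module, base;
  bool imported() const { return !module.empty(); }
};

struct Global : Importable {
  std::string name;
};

struct Module {
  std::vector<std::unique_ptr<Global>> globals; // imported and defined together
};

namespace ModuleUtils {

// Visit only the globals that are defined in this module (not imported).
template<typename T>
void iterDefinedGlobals(Module& wasm, T visitor) {
  for (auto& global : wasm.globals) {
    if (!global->imported()) visitor(global.get());
  }
}

} // namespace ModuleUtils
```

Keeping one list and filtering in a helper means code that does not care about the distinction can iterate `globals` directly, while code that does can use the helper.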
* remove PageSize and HasFeature, which wasm removed a while back (#1667) [Alon Zakai, 2018-09-12, 1, -3/+1]
| From #1665 (a fuzz bug noticed they were not handled in stack.h).
* Stack IR (#1623) [Alon Zakai, 2018-07-30, 1, -0/+2]
| This adds a new IR, "Stack IR". This represents wasm at a very low level, as a simple stream of instructions, basically the same as wasm's binary format. This is unlike Binaryen IR which is structured and in a tree format.
|
| This gives some small wins on binary sizes, less than 1% in most cases, usually 0.25-0.50% or so. That's not much by itself, but looking forward this prepares us for multi-value, which we really need an IR like this to be able to optimize well. Also, it's possible there is more we can do already - currently there are just a few stack IR optimizations implemented:
| * DCE
| * local2stack - check if a set_local/get_local pair can be removed, which keeps the set's value on the stack, which if the stars align it can be popped instead of the get.
| * Block removal - remove any blocks with no branches, as they are valid in wasm binary format.
|
| Implementation-wise, the IR is defined in wasm-stack.h. A new StackInst is defined, representing a single instruction. Most are simple reflections of Binaryen IR (an add, a load, etc.), and just pointers to them. Control flow constructs are expanded into multiple instructions, like a block turns into a block begin and end, and we may also emit extra unreachables to handle the fact Binaryen IR has unreachable blocks/ifs/loops but wasm does not. Overall, all the Binaryen IR differences with wasm vanish on the way to stack IR.
|
| Where this IR lives: Each Function now has a unique_ptr to stack IR, that is, a function may have stack IR alongside the main IR. If the stack IR is present, we write it out during binary writing; if not, we do the same binaryen IR => wasm binary process as before (this PR should not affect speed there).
|
| This design lets us use normal Passes on stack IR, in particular this PR defines 3 passes:
| * Generate stack IR
| * Optimize stack IR (might be worth splitting out into separate passes eventually)
| * Print stack IR for debugging purposes
|
| Having these as normal passes is convenient as then they can run in parallel across functions and all the other conveniences of our current Pass system. However, a downside of keeping the second IR as an option on Functions, and using normal Passes to operate on it, means that we may get out of sync: if you generate stack IR, then modify binaryen IR, then the stack IR may no longer be valid (for example, maybe you removed locals or modified instructions in place etc.). To avoid that, Passes now define if they modify Binaryen IR or not; if they do, we throw away the stack IR.
|
| Miscellaneous notes:
| * Just writing Stack IR, then writing to binary - no optimizations - is 20% slower than going directly to binary, which is one reason why we still support direct writing. This does lead to some "fun" C++ template code to make that convenient: there is a single StackWriter class, templated over the "mode", which is either Binaryen2Binary (direct writing), Binaryen2Stack, or Stack2Binary. This avoids a lot of boilerplate as the 3 modes share a lot of code in overlapping ways.
| * Stack IR does not support source maps / debug info. We just don't use that IR if debug info is present.
| * A tiny text format comment (if emitting non-minified text) indicates stack IR is present, if it is ((; has Stack IR ;)). This may help with debugging, just in case people forget. There is also a pass to print out the stack IR for debug purposes, as mentioned above.
| * The sieve binaryen.js test was actually not validating all along - these new opts broke it in a more noticeable manner. Fixed.
| * Added extra checks in pass-debug mode, to verify that if stack IR should have been thrown out, it was. This should help avoid any confusion with the IR being invalid.
| * Added a comment about the possible future of stack IR as the main IR, depending on optimization results, following some discussion earlier today.
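The rule that a pass which modifies Binaryen IR causes the stack IR to be thrown away can be sketched as below. All types and names here (`StackInst` as a plain struct, `Pass::modifiesBinaryenIR`, `runPass`, and the two example passes) are assumptions made for illustration and are far simpler than the real pass system.

```cpp
#include <memory>
#include <vector>

// Hypothetical, heavily simplified stand-ins.
struct StackInst { int op; };
using StackIR = std::vector<StackInst>;

struct Function {
  std::unique_ptr<StackIR> stackIR; // null when no stack IR has been generated
};

struct Pass {
  virtual ~Pass() = default;
  // Conservative default: assume the pass changes the main IR.
  virtual bool modifiesBinaryenIR() const { return true; }
  virtual void run(Function& func) = 0;
};

// Example pass that only reads the function, so stack IR stays valid.
struct PrintOnlyPass : Pass {
  bool modifiesBinaryenIR() const override { return false; }
  void run(Function&) override {}
};

// Example pass that is assumed to rewrite the main IR.
struct RewritingPass : Pass {
  void run(Function&) override {}
};

// After a pass that may have changed the main IR, any stack IR generated
// earlier could be stale, so it is discarded.
inline void runPass(Pass& pass, Function& func) {
  pass.run(func);
  if (pass.modifiesBinaryenIR()) func.stackIR.reset();
}
```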
* Optimize validation of many nested blocks (#1576) [Alon Zakai, 2018-05-30, 1, -44/+49]
| On the testcase from https://github.com/tweag/asterius/issues/19#issuecomment-393052653 this makes us almost 3x faster, and use 25% less memory.
| The main improvement here is to simplify and optimize the data structures the validator uses to validate br targets: use unordered maps, and use one less of them. Also some speedups from using that map more effectively (use of iterators to avoid multiple lookups).
| Also move the duplicate-node checks to the internal IR validation section, which makes more sense anyhow (it's not wasm validation, it's internal IR validation, which like the check for stale internal types, we do only if debugging).
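The "use of iterators to avoid multiple lookups" idea can be sketched as follows. `BreakInfo` and `noteBreak` are hypothetical names and the arity check is an assumed example of per-target bookkeeping; the point is only that one `find` yields an iterator that serves both the existence check and the update.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical per-target record kept while validating a function body.
struct BreakInfo {
  int arity = 0;
  bool seen = false;
};

// Records a break to `name` carrying `arity` values. Returns false if the
// target does not exist or if breaks to it disagree on arity.
inline bool noteBreak(std::unordered_map<std::string, BreakInfo>& targets,
                      const std::string& name, int arity) {
  auto iter = targets.find(name); // single hash lookup
  if (iter == targets.end()) return false;
  BreakInfo& info = iter->second; // reuse the iterator instead of targets[name]
  if (info.seen && info.arity != arity) return false;
  info.arity = arity;
  info.seen = true;
  return true;
}
```

Writing `targets.count(name)` followed by `targets[name]` would hash the name twice per break, which adds up when many blocks are nested.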
* Optimize equivalent locals (#1540) [Alon Zakai, 2018-05-10, 1, -0/+1]
| If locals are known to contain the same value, we can
| * Pick which local to use for a get_local of any of them. Makes sense to prefer the most common, to increase the chance of one dropping to zero uses.
| * Remove copies between a local and one that we know contains the same value.
| This is a consistent win, small though, around 0.1-0.2%.
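The first choice above, preferring the most commonly read local, could be sketched like this. `pickMostUsed` and its inputs are hypothetical; the real pass works on the IR rather than on plain vectors.

```cpp
#include <cstddef>
#include <vector>

// `equivalent` lists local indices known to hold the same value (assumed
// non-empty); `getCounts[i]` is how many get_locals currently read local i.
// Returns the equivalent local that is already read most often; ties go to
// the earliest entry.
inline std::size_t pickMostUsed(const std::vector<std::size_t>& equivalent,
                                const std::vector<unsigned>& getCounts) {
  std::size_t best = equivalent.front();
  for (std::size_t index : equivalent) {
    if (getCounts[index] > getCounts[best]) best = index;
  }
  return best;
}
```

Redirecting every get to the winner pushes the other locals toward zero uses, at which point their sets can be dropped.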
* Fix MSVC warnings when compiling the binaryen target (#1535) [Daniel Wirtz, 2018-05-09, 1, -4/+4]
|
* Fix bad param/var type error handling (#1499) [Alon Zakai, 2018-04-10, 1, -0/+6]
| Improve error handling, validation, and assertions for having a non-concrete type in an inappropriate place. Fixes a fuzz testcase.
* Function pointer cast emulation (#1468) [Alon Zakai, 2018-03-13, 1, -0/+6]
| This adds a pass that implements "function pointer cast emulation" - allows indirect calls to go through even if the number of arguments or their types is incorrect. That is undefined behavior in C/C++ but in practice somehow works in native archs. It is even relied upon in e.g. Python.
| Emscripten already has such emulation for asm.js, which also worked for asm2wasm. This implements something like it in binaryen which also allows the wasm backend to use it. As a result, Python should now be portable using the wasm backend.
| The mechanism used for the emulation is to make all indirect calls use a fixed number of arguments, all of type i64, and a return type of also i64. Thunks are then placed in the table which translate the arguments properly for the target, basically by reinterpreting to i64 and back. As a result, receiving an i64 when an i32 is sent will have the upper bits all zero, and the reverse would truncate the upper bits, etc.
| (Note that this is different than emscripten's existing emulation, which converts (as signed) to a double. That makes sense for JS where doubles can contain all numeric values, but in wasm we have i64s. Also, bitwise conversion may be more like what native archs do anyhow. It is enough for Python.)
| Also adds validation for a function's type matching the function's actual params and result (surprised we didn't have that before, but we didn't, and there was even a place in the test suite where that was wrong).
| Also simplifies the build script by moving two cpp files into the wasm/ subdir, so they can be built once and shared between the various tools.
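The "reinterpret to i64 and back" step can be modeled on the host with a sketch. The helper names below are hypothetical and this is ordinary C++ acting on a 64-bit "slot", not the thunk code the pass generates; it only demonstrates the zero-extension and truncation behavior the message describes.

```cpp
#include <cstdint>
#include <cstring>

// Every argument travels through the emulated call as one i64 "slot".

// i32 -> slot: the upper 32 bits are zero.
inline uint64_t i32ToSlot(uint32_t value) { return value; }

// slot -> i32: the upper 32 bits are discarded.
inline uint32_t slotToI32(uint64_t slot) { return static_cast<uint32_t>(slot); }

// f64 <-> slot: a bitwise reinterpretation, not a numeric conversion.
inline uint64_t f64ToSlot(double value) {
  uint64_t slot;
  std::memcpy(&slot, &value, sizeof slot);
  return slot;
}

inline double slotToF64(uint64_t slot) {
  double value;
  std::memcpy(&value, &slot, sizeof value);
  return value;
}
```

A caller that passes an i32 to a callee expecting an i64 therefore sees the original value with zeroed upper bits, and the opposite mismatch keeps only the low 32 bits.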
* Fuzz fix: if global does not exist, report error and don't run the rest of the checks (#1461) [Alon Zakai, 2018-03-07, 1, -37/+38]
* Rename WasmType => Type (#1398) [Alon Zakai, 2018-02-02, 1, -29/+29]
| * rename WasmType to Type. it's in the wasm:: namespace anyhow, and without Wasm- it fits in better alongside Index, Address, Expression, Module, etc.
* Validation fixes for #1317 (#1347) [Alon Zakai, 2018-01-03, 1, -0/+16]
| * add get_global/set_global validation
| * validate get_local index
| * update builds
| * fix tests
* allow exporting an import (#1326) [Alon Zakai, 2017-12-08, 1, -9/+6]
|
* accept overlapping segments (#1289) [Alon Zakai, 2017-11-14, 1, -5/+2]
|
* Restrict validation output to just validation errors in the API (#1253) [Daniel Wirtz, 2017-11-01, 1, -2/+0]
| Do not print the entire and possibly very large module when validation fails. Leave printing to tools using the validator, instead of always doing it in the validator where it can't be overridden.
* Add Features enum to IR (#1250) [Derek Schuff, 2017-10-27, 1, -1/+10]
| This enum describes which wasm features the IR is expected to include. The validator should reject operations which require excluded features, and passes should avoid producing IR which requires excluded features. This makes it easier to catch possible errors in Binaryen producers (e.g. emscripten).
| Asm2wasm has a flag to enable or disable atomics. Other tools currently just accept all features (as, dis and opt are just for inspecting or modifying existing modules, so it would be annoying to have to use flags with those tools and I expect the risk of accidentally introducing atomics to be low).
* notation change: AST => IR (#1245) [Alon Zakai, 2017-10-24, 1, -2/+2]
| The IR is indeed a tree, but not an "abstract syntax tree" since there is no language for which it is the syntax (except in the most trivial and meaningless sense).
* Atomics support in interpreter + optimizer + fuzz fixes for that (#1227) [Alon Zakai, 2017-10-20, 1, -3/+4]
|
* Refactor validator API to use enums (#1209) [Alon Zakai, 2017-10-03, 1, -5/+5]
| * refactor validator API to use enums
* Fast validation (#1204) [Alon Zakai, 2017-10-02, 1, -192/+476]
| This makes wasm validation parallel (the function part). This makes loading+validating tanks (a 12MB wasm file) 2.3x faster on a 4-core machine (from 3.5 to 1.5 seconds). It's a big speedup because most of loading+validating was actually validating.
| It's also noticeable during compilation, since we validate by default at the end. 8% faster on -O2 and 23% on -O0. So actually fairly significant on -O0 builds.
| As a bonus, this PR also moves the code from being 99% in the header to being 1% in the header.
* Update text syntax for shared memory limits (#1197) [Derek Schuff, 2017-09-22, 1, -0/+1]
| Following WebAssembly/threads#58, e.g. (memory $0 23 256 shared) is now (memory $0 (shared 23 256))
* Expressions should not appear twice in the ast (#1191) [Alon Zakai, 2017-09-18, 1, -0/+18]
|
* Add support for sign-extension operators from threading proposal (#1167) [Derek Schuff, 2017-09-06, 1, -2/+11]
| These are not atomic operations, but are added with the atomic operations to keep from having to define atomic versions of all the sign-extending loads (an atomic zero-extending load + signext operation can be used instead).
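The "zero-extending load + signext" combination can be modeled with a sketch. The function names are hypothetical, and this is host-side C++ that mimics the arithmetic of an 8-bit sign extension on an i32, not validator or interpreter code.

```cpp
#include <cstdint>

// Models a zero-extending 8-bit load: the byte lands in the low bits of an
// i32 and the upper 24 bits are zero.
inline uint32_t load8ZeroExtended(uint8_t byteInMemory) { return byteInMemory; }

// Models an 8-bit sign extension on an i32: the low 8 bits are treated as a
// signed byte. XOR-ing with 0x80 and subtracting 0x80 maps 0x80..0xFF to
// -128..-1 without relying on narrowing conversions.
inline int32_t extend8S(uint32_t value) {
  return static_cast<int32_t>((value & 0xFFu) ^ 0x80u) - 0x80;
}

// A sign-extending load expressed as the two steps above.
inline int32_t load8SignExtended(uint8_t byteInMemory) {
  return extend8S(load8ZeroExtended(byteInMemory));
}
```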
* clean up untaken => unreachable, as well as unnecessary named stuff in validation that was from when we differentiated reachable from unreachable breaks (#1166) [Alon Zakai, 2017-09-06, 1, -6/+1]
* Return to more structured type rules for block and if (#1148) [Alon Zakai, 2017-09-05, 1, -9/+11]
| * if a block has a concrete final element (or a break with a value), then even if it has an unreachable child, keep it with that concrete type. this means we no longer allow the silly case of a block with an unreachable in the middle and a concrete as the final element while the block is unreachable - after this change, the block would have the type of the final element
| * if an if has a concrete element in one arm, make it have that type as a result, even if the if condition is unreachable, to parallel block
| * make type rules for brs and switches simpler, ignore whether they are reachable or not. whether they are dead code should not affect how they influence other types in our IR.
* wasm-reduce tool (#1139) [Alon Zakai, 2017-09-01, 1, -7/+12]
| Reduce an interesting wasm to a smaller still interesting wasm. This takes an arbitrary command to run, and reduces the wasm as much as it can while keeping the behavior of that command fixed. This can be used to reduce compiler bugs in an arbitrary VM, etc.
* Add support for atomic wait and wake operators (#1140) [Derek Schuff, 2017-08-24, 1, -0/+14]
| According to spec at https://github.com/WebAssembly/threads/blob/master/proposals/threads/Overview.md#wait-and-wake-operators
* improve WasmValidator::validateMemBytes, check for unreasonable sizes even if type is unreachable (#1102) [Alon Zakai, 2017-07-19, 1, -6/+6]
* Merge pull request #1095 from WebAssembly/fuzz-3 [Alon Zakai, 2017-07-18, 1, -2/+5]
|\  More fuzz fixes
| * fix validation of memBytes, if the load type is unreachable, we can't and shouldn't try to validate [Alon Zakai (kripken), 2017-07-13, 1, -2/+5]
* | Validation for AtomicRMW and cmpxchg (#1092) [Derek Schuff, 2017-07-14, 1, -1/+25]
|/  Also fix cases where fail() had its arguments backwards. Wasn't an error because lol templates. Also fix printModuleComponent template to SFINAE on Expression* so we properly get the specialized version.
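Selecting an overload by SFINAE on `Expression*` can be sketched as below. The `describe` function, the `IsExpressionPtr` trait, and the stub `Expression`/`Const` types are hypothetical and written for illustration; they are not the `printModuleComponent` template from the commit.

```cpp
#include <string>
#include <type_traits>

// Stub IR types for the sketch.
struct Expression { virtual ~Expression() = default; };
struct Const : Expression {};

// Trait: true only for pointers to Expression or a class derived from it.
template<typename T> struct IsExpressionPtr : std::false_type {};
template<typename T>
struct IsExpressionPtr<T*> : std::is_base_of<Expression, T> {};

// Generic version: participates in overload resolution only when T is not
// an Expression pointer.
template<typename T,
         typename std::enable_if<!IsExpressionPtr<T>::value, int>::type = 0>
std::string describe(T) { return "generic"; }

// Specialized version: chosen for Expression* and pointers to subclasses,
// so a Const* does not fall through to the generic template.
template<typename T,
         typename std::enable_if<IsExpressionPtr<T>::value, int>::type = 0>
std::string describe(T) { return "expression"; }
```

Without the constraint, a call with a `Const*` would deduce `T = Const*` exactly and pick a generic template over an overload taking `Expression*`, since the latter needs a derived-to-base conversion.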
* Merge pull request #1087 from WebAssembly/fuzz-2 [Alon Zakai, 2017-07-12, 1, -15/+19]
|\  Fuzz fixes
| * refactor and improve break validation. break names are unique, so we don't need a stack, and break targets must exist even if they are not actually taken [Alon Zakai (kripken), 2017-07-11, 1, -15/+19]
* | Refactor validation failure and printing, validate atomic memory (#1090) [Derek Schuff, 2017-07-12, 1, -10/+42]
| |
* | add docs and error hints when a Call should be a CallImport (#1081) [Alon Zakai, 2017-07-11, 1, -1/+6]
|/  * add docs and error hints when a Call should be a CallImport
|   * fix binaryen API docs in docs/
* Factor wasm validator into a cpp file (#1086) [Derek Schuff, 2017-07-10, 1, -0/+639]
  Also small cleanup to CMake libraries