path: root/src
Commit message [Author, Age, Files, Lines]
* Consistently optimize small added constants into load/store offsets (#1924) [Alon Zakai, 2019-03-01, 16 files, -101/+382]
|   See #1919 - we did not do this consistently before.
|
|   This adds a lowMemoryUnused option to PassOptions. It can be passed on the commandline with --low-memory-unused. If enabled, we run the new optimize-added-constants pass, which does the real work here, replacing older code in post-emscripten.
|
|   Aside from running at the proper time (unlike the old pass, see #1919), this also has a -propagate mode, which can do stuff like this:
|
|     y = x + 10
|     [..]
|     load(y)
|     [..]
|     load(y)
|   =>
|     y = x + 10
|     [..]
|     load(x, offset=10)
|     [..]
|     load(x, offset=10)
|
|   That is, it can propagate such offsets to the loads/stores. This pattern is common in big interpreter loops, where the pointers are offsets into a big struct of state.
|
|   The pass does this propagation by using a new feature of LocalGraph, which can verify which locals are in SSA mode. Binaryen IR is not SSA (intentionally, since it's a later IR), but if a local only has a single set for all gets, that means that local is in such a state, and can be optimized. The tricky thing is that all locals are initialized to zero, so there are at minimum two sets. But if we verify that the real set dominates all the gets, then the zero initialization cannot reach them, and we are safe.
|
|   This PR also makes safe-heap aware of lowMemoryUnused. If so, we check for not just an access of 0, but the range 0-1023.
|
|   This makes zlib 5% faster, with either the wasm backend or asm2wasm. It also makes it 0.5% smaller. Also helps sqlite (1.5% faster) and lua (1% faster).
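A minimal sketch of the "single set that dominates all gets" check described in the message above. It is not taken from the commit: the `Instr` type and `isSSALike` function are hypothetical, and it only handles straight-line code, where "dominates" reduces to "appears earlier". The real LocalGraph works on Binaryen IR with control flow.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy instruction: a set or a get of a local, by index.
struct Instr {
  enum Kind { Set, Get } kind;
  unsigned local;
};

// Returns true if `local` has exactly one explicit set and that set comes
// before every get of it. A get that precedes the set could observe the
// implicit zero initialization, so the local is not SSA-like in that case.
bool isSSALike(const std::vector<Instr>& code, unsigned local) {
  std::size_t sets = 0;
  bool seenSet = false;
  for (const auto& instr : code) {
    if (instr.local != local) {
      continue;
    }
    if (instr.kind == Instr::Set) {
      sets++;
      seenSet = true;
    } else if (!seenSet) {
      // The zero initialization reaches this get.
      return false;
    }
  }
  return sets == 1;
}
```

Only for a local that passes such a check would it be safe to rewrite `load(y)` into `load(x, offset=10)`, since `y` is then known to hold `x + 10` at every use.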
* Fix memory leaks (#1925) [Bohdan, 2019-02-28, 3 files, -12/+26]
|   Fixes #1921
|   Signed-off-by: Bogdan Vaneev <warchantua@gmail.com>
* Optimize normally with debug info (#1927) [Alon Zakai, 2019-02-28, 2 files, -11/+20]
|   * optimize normally with debug info - some of it may be removed, but that's the price of higher optimization levels, and by optimizing normally in profiling and -g2 etc. builds they are more comparable to normal ones, yielding better data
|   * copy debug locations automatically in replaceCurrent in wasm-traversal, so optimization passes at least by default will preserve debuggability
* Remove reference to old __wasm_nullptr function (#1928) [Sam Clegg, 2019-02-28, 1 file, -5/+1]
|
* Simplify ExpressionAnalyzer (#1920) [Alon Zakai, 2019-02-27, 4 files, -497/+351]
|   This refactors the hashing and comparison code to use a single immediate-value iterator. This makes us have a single place that knows the list of immediate fields in every node type, instead of 2.
|   This also fixes a few bugs found by doing that. In particular, this makes us slightly slower than before since we are hashing more fields.
* Dead return value elimination in DeadArgumentElimination (#1917) [Alon Zakai, 2019-02-26, 3 files, -12/+111]
|   * Finds functions whose return value is always dropped, and removes the return.
|   * Run multiple iterations of the pass, as one can enable others.
|   * Do not run DeadArgumentElimination at all if debug info is present (with these improvements, it became much more likely to destroy debug info).
|
|   Saves 2.5% on hello world, because of some simple libc calls.
* Vacuum unused values (#1918) [Alon Zakai, 2019-02-25, 1 file, -9/+36]
|   Checks if a value is being dropped higher up, like
|
|   ```
|   (drop
|     (block i32
|       (block i32
|         (i32.const 1)
|       )
|     )
|   )
|   ```
|
|   Handling this forces us to be careful in that pass about whether a value is used, and whether the type matters (for example, we can't replace a unary with its child in all cases, if the return value matters).
* SmallVector (#1912) [Alon Zakai, 2019-02-25, 5 files, -9/+187]
|   Trying to refactor the code to be simpler and less redundant, I ran into some perf issues that it seems like a small vector, with fixed-size storage and optional additional storage as needed, might help with. This implements that class and uses it in a few places.
|   This seems to help, I see some 1-2% fewer instructions and cycles in `perf stat`, but it's hard to tell if it really makes a noticeable difference.
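A rough sketch of the idea named in the message above: fixed-size inline storage first, with additional heap storage only once that is full. The class name and members here are assumptions for illustration and do not reproduce the class added by the commit. It also assumes `T` is default-constructible and copy-assignable.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical small vector: the first N items live inline in `fixed`,
// anything beyond that spills into a heap-backed std::vector.
template<typename T, std::size_t N>
class SmallVectorSketch {
  std::array<T, N> fixed{};
  std::size_t usedFixed = 0;
  std::vector<T> flexible;

public:
  void push_back(const T& x) {
    if (usedFixed < N) {
      fixed[usedFixed++] = x;
    } else {
      flexible.push_back(x);
    }
  }

  std::size_t size() const { return usedFixed + flexible.size(); }

  // No bounds checking, like std::vector::operator[].
  T& operator[](std::size_t i) { return i < N ? fixed[i] : flexible[i - N]; }

  // True once at least one item has spilled to the heap.
  bool usesHeap() const { return !flexible.empty(); }
};
```

When most instances hold at most N items, this avoids a heap allocation per container, which is the kind of saving the `perf stat` numbers in the message point at.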
* add an option to not fuzz memory (#1915) [Alon Zakai, 2019-02-25, 2 files, -1/+23]
|
* NaN fuzzing improvements (#1913) [Alon Zakai, 2019-02-19, 6 files, -10/+68]
|   * make DE_NAN avoid creating nan literals in the first place
|   * add a reducer option `--denan` to not introduce nans in destructive reduction
|   * add a `Literal::isNaN()` method
|   * also remove the default exception logging from the fuzzer js glue, which is a source of non-useful VM differences (like nan nondeterminism)
|   * added an option `--no-fuzz-nans` to make it easy to avoid nans when fuzzing (without hacking the source and recompiling).
|
|   Background: trying to get fuzzing on jsc working despite this open issue: https://bugs.webkit.org/show_bug.cgi?id=175691
* if no output is specified to wasm-opt, warn that we are emitting nothing (#1908) [Alon Zakai, 2019-02-15, 1 file, -1/+3]
|   A user that just does
|
|   ```
|   wasm-opt input.wasm -O
|   ```
|
|   may assume that the input file should have been optimized. But without `-o` we don't emit any output.
|
|   Often you may not want any output, like if you just want to run a pass like `--metrics`. But for most users wasm-opt is probably going to be used as an optimizer of files. So this PR suggests we emit a warning in that case.
|
|   For comparison, `llvm-opt` would print to the console, but it avoids printing a binary there so it issues a warning. Instead of this warning, perhaps we should do the same? That would also not be confusing.
|
|   Closes #1907
* respect --no-validation in pass-debug mode (#1904) [Alon Zakai, 2019-02-12, 1 file, -15/+18]
|
* Optimize stack writer on deeply nested blocks, fixes #1903 (#1905) [Alon Zakai, 2019-02-12, 1 file, -43/+55]
|   also remove some old debugging
* legalize invokes even when doing minimal legalization, as js needs them (#1883) [Alon Zakai, 2019-02-08, 1 file, -40/+47]
|   See emscripten-core/emscripten#7679
* avoid the deprecated and removed Pointer_stringify (#1906) [Alon Zakai, 2019-02-07, 1 file, -21/+21]
|
* Fix a fuzz bug with peeking forward in binary reading. (#1902) [Alon Zakai, 2019-02-07, 1 file, -2/+1]
|   Fixes #1900
* wasm-emscripten-finalize: separateDataSegments() fix (#1897) [Alon Zakai, 2019-02-06, 3 files, -3/+10]
|   We should emit a file with only the data segments, starting from the global base, and not starting from zero (the data before is unneeded, and the emscripten loading code assumes it isn't there).
|   Also fix the auto updater to work properly on .mem test updating.
* fix breakage (#1901) [Alon Zakai, 2019-02-06, 1 file, -10/+0]
|   We landed two PRs that had a logic conflict but not a source conflict (bulk memory added ops, comparison optimization removed the need for PUSH ops that bulk memory added).
* fix printing of unreachable atomics, and add print fuzzing (#1899) [Alon Zakai, 2019-02-06, 1 file, -2/+2]
|
* fix binaryen.js bindings handling of literals (#1896) [Alon Zakai, 2019-02-06, 3 files, -14/+26]
|   The hardcoded 16 size was no longer valid. This was broken for a while, but happened to not overwrite important memory. Testing with the wasm backend did hit breakage.
* use iteration in ExpressionAnalyzer::flexibleEqual, for less manual work on each new node (#1895) [Alon Zakai, 2019-02-06, 1 file, -70/+16]
|\
| * use iteration in ExpressionAnalyzer::flexibleEqual, for less manual work on each new node [Alon Zakai, 2019-02-04, 1 file, -70/+16]
| |
* | throw an early error in s-expr-parsing makeBlock, if not inside a function (#1894) [Alon Zakai, 2019-02-06, 1 file, -0/+1]
| |   Fixes #1893
* | Bulk memory operations (#1892) [Thomas Lively, 2019-02-05, 23 files, -19/+810]
| |   Bulk memory operations
| |   The only parts missing are the interpreter implementation and spec tests.
* | Fix EM_ASM+pthreads (#1891) [Alon Zakai, 2019-02-04, 1 file, -6/+9]
| |   To calculate the metadata, we must look at the segments. If we split them out earlier (which we do for threads), they aren't there.
|/
* Strip the producers section in --strip-producers (#1875) [Alon Zakai, 2019-01-31, 6 files, -13/+32]
|   WebAssembly/tool-conventions#93 has a summary of emscripten's current thinking on this. For Binaryen, we don't want to do anything to the producers section by default, but do want it to be possible to optionally remove it.
|
|   To achieve that, this PR
|   * creates a --strip-producers pass that removes that section.
|   * creates a --strip-debug pass that removes debug info, same as the old --strip, which is still around but deprecated.
|
|   A followup in emscripten will use this pass by default.
* wasm-emscripten-finalize: Emit illegal dynCalls, and legalize them (#1890) [Alon Zakai, 2019-01-29, 2 files, -21/+15]
|   Before this, we just did not emit illegal dynCalls. This was wrong as we do need them (e.g. if a function with a setjmp call calls a function with an i64 param - we'll have an invoke with that signature there). We just need to legalize them.
|   This fixes that by first emitting them, and second by running legalization late, after dynCalls have been generated, so it legalizes them too.
* Increase FuncCastEmulation NUM_PARAMS (#1884) [Will Glynn, 2019-01-29, 1 file, -1/+1]
|   FuncCastEmulation supports a hardcoded number of parameters:
|
|     // This should be enough for everybody. (As described above, we need this
|     // to match when dynamically linking, and also dynamic linking is why we
|     // can't just detect this automatically in the module we see.)
|     static const int NUM_PARAMS = 15;
|
|   Turns out 15 is not enough for everybody: Ruby 2.6.0 needs NUM_PARAMS = 16.
|
|   This patch is necessary to support Ruby 2.6.0 in WebAssembly, and in fact is the only patch needed to make the relevant build process work with an otherwise normal emscripten toolchain.
* Handle EM_ASM/EM_JS in LLVM wasm backend O0 output (#1888) [Alon Zakai, 2019-01-28, 2 files, -18/+57]
|   See emscripten-core/emscripten#7928 - we have been optimizing all wasms until now, and noticed this when the wasm object file path did not do so. When not optimizing, our methods of handling EM_ASM and EM_JS fail since the patterns are different.
|
|   Specifically, for EM_ASM we hunt for emscripten_asm_const(X, where X is a constant, but without opts it may be a get of a local. For EM_JS, the function body may not just contain a const, but a block with a set of the const and a return of a get later.
|
|   This adds logic to track gets and sets in basic blocks, which is sufficient to handle this.
* validate all function indexes in binary reading (#1887) [Alon Zakai, 2019-01-24, 1 file, -3/+3]
|   fixes bug reported in comment on e63c4a7, #1885 (comment), #1879 (comment)
* Validate unique local names, and use validation in wasm2js. Fixes #1885 (#1886) [Alon Zakai, 2019-01-23, 5 files, -13/+27]
|   * Also fixes some bugs in wasm2js tests that did not validate.
|   * Rename FeatureOptions => ToolOptions, as they now contain all the basic stuff each tool needs for commandline options (validation yes or no, and which features if so).
* More misc ASAN fixes (#1882) [Alon Zakai, 2019-01-22, 4 files, -0/+11]
|   * fix buffer overflow in simple_ast.h printing.
|   * check wasm binary format reading of function export indexes for errors.
|   * check if s-expr format imports have a non-empty module and base.
|
|   Fixes #1876
|   Fixes #1877
|   Fixes #1879
* Show a proper error on an invalid type in binary reading; fixes #1872 (#1874) [Alon Zakai, 2019-01-19, 1 file, -2/+2]
|
* Emscripten stack simplification (#1870) [Alon Zakai, 2019-01-16, 4 files, -20/+26]
|   This takes advantage of the recent memory simplification in emscripten, where JS static allocation is done at compile time. That means we know the stack's initial location at compile time, and can apply it. This is the binaryen side of that:
|   * asm2wasm support for asm.js globals with an initial value var X = Y; where Y is not 0 (which is what the stack now is).
|   * wasm-emscripten-finalize support for a flag --initial-stack-pointer=X, and remove the old code to import the stack's initial location.
* Misc minor ASAN fixes (#1869) [Alon Zakai, 2019-01-16, 2 files, -7/+14]
|   * handle end of input in skipWhitespace in s-parser. fixes #1863
|   * ignore debug locations when not in a function; fixes #1867
|   * error properly on invalid user section sizes; fixes #1866
|   * throw a proper error on invalid call offsets in binary reading; fixes #1865
* Code style improvements (#1868) [Alon Zakai, 2019-01-15, 23 files, -102/+108]
|   * Use modern T p = v; notation to initialize class fields
|   * Use modern X() = default; notation for empty class constructors
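For illustration, the two idioms named above look roughly like this. The struct and its fields are hypothetical and are not taken from the files the commit touched.

```cpp
#include <string>

// Hypothetical options struct, used only to show the two idioms.
struct ExampleOptions {
  // Default member initializers: the default value sits next to the
  // declaration, so every constructor starts from the same state.
  int optimizeLevel = 0;
  bool debugInfo = false;
  std::string outputName = "a.wasm";

  // Explicitly defaulted constructor, replacing an empty
  // `ExampleOptions() {}` body.
  ExampleOptions() = default;
};
```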
* Compare binaryen fuzz-exec to JS VMs (#1856) [Alon Zakai, 2019-01-10, 4 files, -39/+61]
|   The main fuzz_opt.py script compares JS VMs, and separately runs binaryen's fuzz-exec that compares the binaryen interpreter to itself (before and after opts). This PR lets us directly compare binaryen's interpreter output to JS VMs. This found a bunch of minor things we can do better on both sides, giving more fuzz coverage.
|
|   To enable this, a bunch of tiny fixes were needed:
|   * Add --fuzz-exec-before which is like --fuzz-exec but just runs the code before opts are run, instead of before and after.
|   * Normalize double printing (so JS and C++ print comparable things). This includes negative zero in JS, which we never printed properly til now.
|   * Various improvements to how we print fuzz-exec logging - remove unuseful things, and normalize the others across JS and C++.
|   * Properly legalize the wasm when --emit-js-wrapper (i.e., we will run the code from JS), and use that in the JS wrapper code.
* Fix copying of globals (#1854) [Alon Zakai, 2019-01-10, 1 file, -0/+2]
|   This broke when we refactored imports, as now Global has two more fields.
|   Test is on --func-metrics, which depends on copying to compute some things.
* Fix build on macOS High Sierra 10.13.1 and Xcode 9.2 (9C40b), which does not have aligned_alloc() (not sure if newer macOS/Xcodes do, or if this is an issue with old macOS/Xcode version) (#1862) [juj, 2019-01-10, 1 file, -0/+4]
|
* Require unique_ptr to Module::addFunctionType() (#1672) [Paweł Bylica, 2019-01-10, 9 files, -26/+23]
|   This fixes the memory leak in WasmBinaryBuilder::readSignatures(), probably caused by the exception thrown there before the FunctionType object is safe.
|   This also makes it clear that the Module becomes the owner of the FunctionType objects.
* Aligned allocation fixes. Fixes #1845 (#1846) [Alon Zakai, 2019-01-09, 2 files, -8/+64]
|   The error in #1845 shows:
|
|     /<<PKGBUILDDIR>>/src/mixed_arena.h: In member function 'void* MixedArena::allocSpace(size_t, size_t)':
|     /<<PKGBUILDDIR>>/src/mixed_arena.h:125:43: error: 'new' of type 'MixedArena::Chunk' {aka 'std::aligned_storage<32768, 16>::type'} with extended alignment 16 [-Werror=aligned-new=]
|        chunks.push_back(new Chunk[numChunks]);
|                                            ^
|     /<<PKGBUILDDIR>>/src/mixed_arena.h:125:43: note: uses 'void* operator new [](std::size_t)', which does not have an alignment parameter
|     /<<PKGBUILDDIR>>/src/mixed_arena.h:125:43: note: use '-faligned-new' to enable C++17 over-aligned new support
|
|   It turns out I had misread the aligned_storage docs, and they don't actually do what we need, which is a convenient cross-platform way to do aligned allocation, since new itself doesn't support that. Sadly it seems there is no cross-platform way to do it right now, so I added a header in support which abstracts over the windows and everything-else ways.
|
|   Also add some ctest testing, which runs on windows, so we get basic windows coverage in our CI.
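One way such a header could abstract over "the windows and everything-else ways" is sketched below. This is an assumption about the general shape, not the header the commit added: the function names are hypothetical, and the non-Windows branch uses `posix_memalign`, which requires the alignment to be a power of two and a multiple of `sizeof(void*)`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#if defined(_WIN32)
#include <malloc.h>
#endif

// Hypothetical wrapper: returns `size` bytes aligned to `align`,
// or nullptr on failure.
inline void* alignedMallocSketch(std::size_t align, std::size_t size) {
#if defined(_WIN32)
  return _aligned_malloc(size, align);
#else
  void* ptr = nullptr;
  if (posix_memalign(&ptr, align, size) != 0) {
    return nullptr;
  }
  return ptr;
#endif
}

// Hypothetical counterpart: memory from _aligned_malloc must be released
// with _aligned_free, while posix_memalign memory is released with free.
inline void alignedFreeSketch(void* ptr) {
#if defined(_WIN32)
  _aligned_free(ptr);
#else
  free(ptr);
#endif
}
```

Pairing the allocation and free functions in one place matters because mixing them up (for example calling `free` on `_aligned_malloc` memory) is undefined behavior on Windows.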
* Remove interp and fix tests (#1858) [Alon Zakai, 2019-01-08, 1 file, -548/+0]
|   Updates tests to the latest notation changes, and also remove wasm.js (see kripken/emscripten#7831) as we'd need to either rebuild it or update it for the new notation as well, and it's not used at this point.
* determinism fix for code-folding (#1852) [Alon Zakai, 2019-01-08, 1 file, -4/+12]
|   Don't depend on the hash values for ordering - use a fixed order based on order of appearance.
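A generic sketch of that technique, not the code-folding pass itself: a hash map is still used for lookup, but results are emitted from a vector that records groups in order of first appearance, so the output never depends on hash iteration order. The function name and the use of strings as keys are assumptions for illustration.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Groups the indexes of equal items. Groups are returned in the order in
// which each distinct item first appears, which is deterministic, unlike
// iterating over the unordered_map directly.
std::vector<std::vector<std::size_t>>
groupInOrder(const std::vector<std::string>& items) {
  std::unordered_map<std::string, std::size_t> groupOf; // item -> group index
  std::vector<std::vector<std::size_t>> groups;
  for (std::size_t i = 0; i < items.size(); i++) {
    auto it = groupOf.find(items[i]);
    if (it == groupOf.end()) {
      // First appearance: open a new group at the end.
      it = groupOf.emplace(items[i], groups.size()).first;
      groups.emplace_back();
    }
    groups[it->second].push_back(i);
  }
  return groups;
}
```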
* Massive renaming (#1855) [Thomas Lively, 2019-01-07, 36 files, -627/+641]
|   Automated renaming according to https://github.com/WebAssembly/spec/issues/884#issuecomment-426433329.
* determinism fix for SSAify::computeGetsAndPhis (#1850) [Alon Zakai, 2019-01-03, 1 file, -3/+3]
|
* Don't emit simd in fuzzer unless requested (some code paths we missed) (#1849) [Alon Zakai, 2019-01-02, 1 file, -17/+28]
|
* Determinism fix for SSA pass (#1841) [Alon Zakai, 2019-01-02, 1 file, -7/+6]
|   We iterated over a set. Instead, iterate over the relevant items in their order in the IR.
* Refactor Features code (#1848) [Alon Zakai, 2019-01-02, 3 files, -4/+183]
|   Add features.h which centralizes all the feature detection code. (I'll need this in another place than the validator which is where it was til now.)
* Minor code style cleanups (#1844) [Alon Zakai, 2019-01-02, 2 files, -26/+25]
|
* Fix fuzzing JS glue code (#1843) [Alon Zakai, 2018-12-27, 3 files, -1/+21]
|   After we added logging to the fuzzer, we forgot to add to the JS glue code the necessary imports so it can be run there too.
|   Also adds legalization for the JS glue code imports and exports.
|   Also adds a missing validator check on imports having a function type (the fuzzing code was missing one).
|
|   Fixes #1842