path: root/src/support
Commit message  [Author, Age, Files, Lines]
...
* standardize on 'template<' over 'template <' (i.e., remove a space) (#1782)  [Alon Zakai, 2018-11-29, 5, -11/+11]
|
* Add wasm-emscripten-finalize flag to separate data segments into a file (#1741)  [Derek Schuff, 2018-11-14, 1, -0/+4]
| This writes the data section into a file suitable for use with emscripten's --memory-init-file flag
* Don't call static destructors when Fatal() errors occur (#1722)  [Sam Clegg, 2018-11-02, 2, -3/+6]
| This was causing a deadlock while destroying the thread pool.
* DeadArgumentElimination Pass (#1641)  [Alon Zakai, 2018-09-05, 1, -1/+15]
| This adds a pass to remove unnecessary call arguments in an LTO-like manner, that is:
|
|  * If a parameter is not actually used in a function, we don't need to send anything, and can remove it from the function's declaration. Concretely,
|
|      (func $a (param $x i32) ..no uses of $x.. )
|      (func $b (call $a (..)) )
|    =>
|      (func $a ..no uses of $x.. )
|      (func $b (call $a) )
|
|  And
|
|  * If a parameter is only ever sent the same constant value, we can just set that constant value in the function (which then means that the values sent from the outside are no longer used, as in the previous point). Concretely,
|
|      (func $a (param $x i32) ..may use $x.. )
|      (func $b (call $a (i32.const 1)) (call $a (i32.const 1)) )
|    =>
|      (func $a (local $x i32) (set_local $x (i32.const 1)) ..may use $x.. )
|      (func $b (call $a) (call $a) )
|
| How much this helps depends on the codebase, but sometimes it is pretty useful. For example, it shrinks Unity by 0.72% and Mono by 0.37%. Note that those numbers include not just the optimization itself, but the other optimizations it then enables - in particular the second point from earlier leads to inlining a constant value, which often allows constant propagation, and also removing parameters may enable more duplicate function elimination, etc. - which explains how this can shrink Unity by almost 1%.
|
| Implementation is pretty straightforward, but there is some work to make the heavy part of the pass parallel, and a bunch of corner cases to avoid (can't change a function that is exported or in the table, etc.). Like the Inlining pass, there is both a standard and an "optimizing" version of this pass - the latter also optimizes the functions it changes since, as with Inlining, it's useful to not need to re-run all function optimizations on the whole module.
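The second case above (a parameter that only ever receives one constant) can be illustrated with a minimal sketch. It is not Binaryen's implementation: `Arg`, `Call`, and `commonConstant` are hypothetical names, and call arguments are modeled only as "known constant" or "anything else".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one call argument: a known constant, or unknown.
struct Arg {
  bool isConst;
  int32_t value; // only meaningful when isConst is true
};

// The arguments of a single call site, in parameter order.
using Call = std::vector<Arg>;

// Returns true and sets `out` if every call passes the same constant for
// parameter `param`. With no calls at all there is nothing to agree on,
// so this sketch returns false.
bool commonConstant(const std::vector<Call>& calls, std::size_t param,
                    int32_t& out) {
  bool seen = false;
  int32_t value = 0;
  for (const Call& call : calls) {
    assert(param < call.size());
    const Arg& arg = call[param];
    if (!arg.isConst) return false;                // non-constant argument
    if (seen && arg.value != value) return false;  // two different constants
    seen = true;
    value = arg.value;
  }
  if (seen) out = value;
  return seen;
}
```

When this returns true, the pass described above could turn the parameter into a local that is set to the constant, after which the first case (an unused argument) applies at every call site.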
* Stack IR (#1623)  [Alon Zakai, 2018-07-30, 1, -5/+7]
| This adds a new IR, "Stack IR". This represents wasm at a very low level, as a simple stream of instructions, basically the same as wasm's binary format. This is unlike Binaryen IR which is structured and in a tree format.
|
| This gives some small wins on binary sizes, less than 1% in most cases, usually 0.25-0.50% or so. That's not much by itself, but looking forward this prepares us for multi-value, for which we really need an IR like this to be able to optimize well. Also, it's possible there is more we can do already - currently there are just a few stack IR optimizations implemented,
|
|  * DCE
|  * local2stack - check if a set_local/get_local pair can be removed, which keeps the set's value on the stack; if the stars align, it can be popped where the get was.
|  * Block removal - remove any blocks with no branches, as they are valid in wasm binary format.
|
| Implementation-wise, the IR is defined in wasm-stack.h. A new StackInst is defined, representing a single instruction. Most are simple reflections of Binaryen IR (an add, a load, etc.), and just pointers to them. Control flow constructs are expanded into multiple instructions, like a block turns into a block begin and end, and we may also emit extra unreachables to handle the fact Binaryen IR has unreachable blocks/ifs/loops but wasm does not. Overall, all the Binaryen IR differences with wasm vanish on the way to stack IR.
|
| Where this IR lives: Each Function now has a unique_ptr to stack IR, that is, a function may have stack IR alongside the main IR. If the stack IR is present, we write it out during binary writing; if not, we do the same binaryen IR => wasm binary process as before (this PR should not affect speed there).
|
| This design lets us use normal Passes on stack IR, in particular this PR defines 3 passes:
|
|  * Generate stack IR
|  * Optimize stack IR (might be worth splitting out into separate passes eventually)
|  * Print stack IR for debugging purposes
|
| Having these as normal passes is convenient as then they can run in parallel across functions and all the other conveniences of our current Pass system. However, a downside of keeping the second IR as an option on Functions, and using normal Passes to operate on it, is that we may get out of sync: if you generate stack IR, then modify binaryen IR, then the stack IR may no longer be valid (for example, maybe you removed locals or modified instructions in place etc.). To avoid that, Passes now define if they modify Binaryen IR or not; if they do, we throw away the stack IR.
|
| Miscellaneous notes:
|
|  * Just writing Stack IR, then writing to binary - no optimizations - is 20% slower than going directly to binary, which is one reason why we still support direct writing. This does lead to some "fun" C++ template code to make that convenient: there is a single StackWriter class, templated over the "mode", which is either Binaryen2Binary (direct writing), Binaryen2Stack, or Stack2Binary. This avoids a lot of boilerplate as the 3 modes share a lot of code in overlapping ways.
|  * Stack IR does not support source maps / debug info. We just don't use that IR if debug info is present.
|  * A tiny text format comment (if emitting non-minified text) indicates stack IR is present, if it is ((; has Stack IR ;)). This may help with debugging, just in case people forget. There is also a pass to print out the stack IR for debug purposes, as mentioned above.
|  * The sieve binaryen.js test was actually not validating all along - these new opts broke it in a more noticeable manner. Fixed.
|  * Added extra checks in pass-debug mode, to verify that if stack IR should have been thrown out, it was. This should help avoid any confusion with the IR being invalid.
|  * Added a comment about the possible future of stack IR as the main IR, depending on optimization results, following some discussion earlier today.
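The difference between a tree IR and a flat instruction stream can be illustrated with a toy example. This is only a sketch under assumed names (`Expr`, `leaf`, `binary`, and `flatten` are hypothetical, not the types in wasm-stack.h), and it covers plain expressions only; control flow, which the message says expands into begin/end instructions, is left out.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature tree IR: an opcode plus child operands.
struct Expr {
  std::string op;
  std::vector<std::unique_ptr<Expr>> children;
};

// Helper: a node with no operands (e.g. a constant).
std::unique_ptr<Expr> leaf(const std::string& op) {
  std::unique_ptr<Expr> e(new Expr());
  e->op = op;
  return e;
}

// Helper: a node with two operands (e.g. an add).
std::unique_ptr<Expr> binary(const std::string& op, std::unique_ptr<Expr> left,
                             std::unique_ptr<Expr> right) {
  std::unique_ptr<Expr> e = leaf(op);
  e->children.push_back(std::move(left));
  e->children.push_back(std::move(right));
  return e;
}

// Post-order walk: emit the operands first, then the node itself. The
// result is a flat stream in the order a stack machine would execute it.
void flatten(const Expr& e, std::vector<std::string>& out) {
  for (const auto& child : e.children) flatten(*child, out);
  out.push_back(e.op);
}
```

Flattening an `i32.add` of two constants yields the two constants followed by the add, which is the order the binary format uses.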
* duplicate-function-elimination improvements (#1590)  [Alon Zakai, 2018-06-07, 1, -3/+6]
| On a codebase with 370K functions, 160K were in fact duplicate (!)... and it took many many passes to figure that out, over 2 minutes in fact (!), as A and B may be identical only after we see that the functions C1, C2 that they call are identical (so there can be long "chains" here).
|
| To avoid this, limit how many passes we do. In -O1, just do one pass - that gets most duplicates. In -O2, do 10 passes - that gets almost all of it on this codebase. And in -O3 (or -Os/-Oz) do as many passes as necessary (i.e., the old behavior). This at least lets iteration builds (-O1) be nice and fast.
|
| This PR also refactors the hashing code used in that pass, moving it to nicer header files for clearer readability. Also some other minor cleanups in hashing code that helped debug this.
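The "chain" effect and the pass limit can be sketched on a toy model. Everything here is an assumption for illustration: `Func` and `mergeDuplicates` are hypothetical, a function is reduced to a body string plus the indices of its callees, and exact comparison via `std::map` stands in for the hashing the real pass uses.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a function: its own code plus who it calls.
struct Func {
  std::string body;
  std::vector<std::size_t> callees; // indices into the function list
};

// Returns, for each function, the index of the function that replaces it.
// maxPasses limits the number of merge rounds; 0 means "until no change".
std::vector<std::size_t> mergeDuplicates(const std::vector<Func>& funcs,
                                         std::size_t maxPasses) {
  std::vector<std::size_t> rep(funcs.size());
  for (std::size_t i = 0; i < funcs.size(); i++) rep[i] = i;
  for (std::size_t pass = 0; maxPasses == 0 || pass < maxPasses; pass++) {
    // Each pass only sees merges found in earlier passes.
    const std::vector<std::size_t> prev = rep;
    std::map<std::pair<std::string, std::vector<std::size_t>>, std::size_t> seen;
    bool changed = false;
    for (std::size_t i = 0; i < funcs.size(); i++) {
      if (prev[i] != i) continue; // already merged away
      std::vector<std::size_t> targets;
      for (std::size_t c : funcs[i].callees) targets.push_back(prev[c]);
      auto key = std::make_pair(funcs[i].body, targets);
      auto it = seen.find(key);
      if (it == seen.end()) {
        seen[key] = i;
      } else {
        rep[i] = it->second; // duplicate of an earlier function
        changed = true;
      }
    }
    // Redirect functions whose replacement was itself merged in this pass.
    for (std::size_t i = 0; i < funcs.size(); i++) rep[i] = rep[rep[i]];
    if (!changed) break;
  }
  return rep;
}
```

With A calling C1 and B calling C2, one pass merges only C1 and C2; A and B become identical, and mergeable, in the second pass, which is why a pass limit trades completeness for speed.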
* refactor Path utils: store the bin/ dir so that all users of the API can use it by the standard calls, even if it was modified by user input (move it out of just being in wasm-reduce.cpp) (#1489)  [Alon Zakai, 2018-03-30, 3, -24/+79]
* Support wasm-reduce for Windows (#1488)  [Michael Ferris, 2018-03-26, 1, -3/+5]
|
* More simple math opts (#1414)  [Alon Zakai, 2018-02-14, 2, -5/+10]
|  * optimize more simple math operations: mul of 0, or of 0, and of 0, mul of 1, mul of a power of 2, urem of a power of 2
|  * fix asm2wasm callImport parsing: the optimizer may get rid of the added offset to a function table
|  * update js builds
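Two of the listed cases rest on standard bit identities for unsigned 32-bit values. The sketch below shows the arithmetic only, not the optimizer's code; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// True when x has exactly one bit set.
bool isPowerOfTwo(uint32_t x) { return x != 0 && (x & (x - 1)) == 0; }

// log2 of a power of two, found by counting how far the set bit is shifted.
uint32_t shiftFor(uint32_t x) {
  assert(isPowerOfTwo(x));
  uint32_t shift = 0;
  while ((x >> shift) != 1) shift++;
  return shift;
}

// x * c, for c a power of two, expressed as a left shift.
uint32_t mulByPowerOfTwo(uint32_t x, uint32_t c) { return x << shiftFor(c); }

// Unsigned x % c, for c a power of two, expressed as a bit mask.
uint32_t uremByPowerOfTwo(uint32_t x, uint32_t c) {
  assert(isPowerOfTwo(c));
  return x & (c - 1);
}
```

Both rewrites hold with 32-bit wraparound, which matches wasm's i32 semantics for `mul` and `rem_u`.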
* wasm-reduce tweaks and improvements (#1405)  [Alon Zakai, 2018-02-11, 1, -1/+1]
|  * wasm-reduce tweaks and improvements: better error messages, better validation, better function removal, etc.
* 'std::string &' => 'std::string& ' (#1403)  [Alon Zakai, 2018-02-05, 3, -13/+13]
| The & on the type is the proper convention.
* Simplify ThreadPool::isRunning (#1391)  [Alon Zakai, 2018-01-30, 2, -3/+2]
|  * simplify ThreadPool::isRunning: it doesn't need to be static and to go through the global unique_ptr
|  * it's undefined behavior to access the threadpool from a shutting down thread, as the parent is being destroyed
* ThreadPool refactoring (#1389)  [Alon Zakai, 2018-01-26, 2, -33/+43]
| Refactor ThreadPool code for clarity and to fix some bugs with using the pool from different threads in parallel.
|
| We have a singleton pool, and need to ensure it is created only once and used only by one thread at a time. This model is a simple way to ensure we use a number of threads equal to the number of cores, more or less (a pool per Module might lead to number of cores * number of Modules being optimized).
|
| This refactoring adds a parent pointer in the worker threads (giving them direct access to the pool makes it simpler to make sure that pool and thread creation and teardown are threadsafe). This commit also adds proper locking around pool creation and pool usage.
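The "create the singleton only once" part can be sketched with a mutex around the check-and-create step. This is a minimal illustration, not the ThreadPool class itself: `Pool` is a hypothetical name, and it only records how many workers it would use rather than spawning threads.

```cpp
#include <memory>
#include <mutex>
#include <thread>

class Pool {
public:
  // Returns the single shared pool, creating it on first use. The lock
  // makes sure two threads calling get() at once cannot both create it.
  static Pool* get() {
    std::lock_guard<std::mutex> lock(creationMutex());
    std::unique_ptr<Pool>& pool = instance();
    if (!pool) {
      // hardware_concurrency() may return 0 when the count is unknown.
      unsigned cores = std::thread::hardware_concurrency();
      pool.reset(new Pool(cores == 0 ? 1 : cores));
    }
    return pool.get();
  }

  unsigned size() const { return numWorkers; }

private:
  explicit Pool(unsigned n) : numWorkers(n) {}

  static std::mutex& creationMutex() {
    static std::mutex m;
    return m;
  }
  static std::unique_ptr<Pool>& instance() {
    static std::unique_ptr<Pool> p;
    return p;
  }

  unsigned numWorkers; // workers this sketch would start, one per core
};
```

A single process-wide pool sized to the core count is what keeps the thread count from multiplying with the number of modules being optimized.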
* Threading fixes (#1377)  [Alon Zakai, 2018-01-24, 2, -6/+21]
|  * threading fixes, be careful when creating the pool (more than one thread may try to) and don't create it just to check if it's running in the thread constructor assertions
|  * child threads will call ::get() - don't do initialize() under the lock
* Refactor optimization defaults (#1366)  [Alon Zakai, 2018-01-17, 1, -24/+0]
| Followup to #1357. This moves the optimization settings into pass.h, and uses it from there in the various places.
|
| This also splits up huge lines from the tracing code, which put all block children (whose number can be arbitrarily large) on one line. This seems to have caused random errors on the bots, I suspect from overflowing a buffer. Anyhow, it's much more clear to split the lines at a reasonable length.
* Add optimize, shrink level and debug info options to C/JS (#1357)  [Daniel Wirtz, 2018-01-17, 1, -0/+24]
|  * Add optimize, shrink level and debug info options to C/JS
|  * Add instantiate functionality for creating additional unique instances of the API
|  * Use a workaround when running tests in node
|    Tests misuse a module as a script by concatenating, so instead of catching this case in the library, catch it there
|  * Update sieve test
|    Seems optimized output changed due to running with optimize levels 2/1 now
|  * Use the options with all pass runners
|  * Update relooper-fuzz C-API test
|  * Share defaults between tools and the C-API
|  * Add a test for optimize levels
|  * Unify node test support in check.py and auto_update_tests.py
|  * Also add getters for optimize levels and test them
|  * Also test debugInfo
|  * Add debug info to C tests that used it as well
|  * Fix missing NODEJS import in auto_update_tests
|  * Detect node.js version (WASM support)
|  * Update hello-world JS test (now also runs with node)
|  * feature-test WebAssembly in node instead
|  * Document that these options apply globally, and where
|  * Make sure hello-world.js output doesn't differ between mozjs/node
* Redundant Set Elimination pass (#1344)  [Alon Zakai, 2018-01-05, 2, -1/+66]
| This optimizes #1343. It looks for stores of a value that is already present in the local, which in particular can remove the initial set to 0 of loops starting at zero, since all locals are initialized to that already. This helps in real-world code, but is not super-common since coalescing means we tend to have assigned something else to it anyhow before we need it to be zero, so this mainly helps in small functions (and running this before coalescing would extend live ranges in potentially bad ways).
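The core idea can be shown on straight-line code with constant values. This is a deliberately narrow sketch: `Set` and `removeRedundantSets` are hypothetical, it assumes non-parameter locals (which start at zero), and it ignores control flow and non-constant values, which a real pass has to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of "set local N to a constant".
struct Set {
  std::size_t local;
  int32_t value;
};

// Drops every set that writes the value the local already holds.
std::vector<Set> removeRedundantSets(std::size_t numLocals,
                                     const std::vector<Set>& sets) {
  // Non-parameter locals are zero-initialized on function entry.
  std::vector<int32_t> current(numLocals, 0);
  std::vector<Set> kept;
  for (const Set& s : sets) {
    assert(s.local < numLocals);
    if (current[s.local] == s.value) continue; // value already present
    current[s.local] = s.value;
    kept.push_back(s);
  }
  return kept;
}
```

An initial `set_local $i (i32.const 0)` before a loop is the typical removable case, since the local holds zero already.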
* SpillPointers pass (#1339)  [Alon Zakai, 2017-12-30, 1, -0/+103]
| This is an experiment to help with Boehm-style GC. It will spill things that could be pointers to the C stack, so that they can be seen by conservative garbage collection.
|
| The spills add code size and runtime overhead, but actually less than I thought: 10% slower (smaller than the difference between VMs), 15% larger gzip size. We can do even better with more optimizations for this, like a dead store elimination pass.
|
| This PR does the following:
|
|  * Add the new pass.
|  * Create an abi/ dir, with info about the pointer size and stack manipulation utilities.
|  * Separate out the liveness analysis from CoalesceLocals, so that other passes can use it (like SpillPointers).
|  * Refactor out the SortedVector class from the liveness analysis to a separate file (just seems nicer that way).
* wasm-metadce tool (#1320)  [Alon Zakai, 2017-12-06, 2, -1/+403]
| This adds a new tool for better dead code elimination. The problem this helps overcome is when the wasm module is part of something larger, like a wasm+JS combination, and therefore doing DCE in either one is not sufficient as it can't remove a cycle spanning the wasm and JS worlds.
|
| Concretely, when binaryen performs DCE by itself, it can never remove an export, because it considers those roots - but in the larger ("meta") space outside, they may actually be removable.
|
| To solve that, this tool receives a description of the outside graph (in very abstract form), including which nodes are roots. It then adds to that graph nodes from the wasm, so that we have a single graph representing the entire space (the outside + wasm + connections between them). It then performs DCE, finding what is not reachable from the roots, and cleaning it up from the wasm. It of course can't clean up things from the outside, since all it has is the abstract representation of those things in the graph, but it prints out the ids of the removable nodes, which an outside tool can use.
|
| This tool is written in as general a way as possible, hopefully it can have multiple uses. The use I have in mind is to write something in emscripten that uses this to DCE the JS+wasm combination that we emit.
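The reachability step on the combined graph can be sketched as a plain worklist traversal. The names here (`Graph`, `findUnreachable`, and the `js:`/`wasm:` node labels used in examples) are hypothetical; the tool's real input format and node kinds are not shown in this log.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical combined graph: node name -> names of nodes it references.
using Graph = std::map<std::string, std::vector<std::string>>;

// Returns the nodes of `graph` that cannot be reached from any root.
std::set<std::string> findUnreachable(const Graph& graph,
                                      const std::vector<std::string>& roots) {
  std::set<std::string> reached(roots.begin(), roots.end());
  std::vector<std::string> work(roots.begin(), roots.end());
  while (!work.empty()) {
    std::string node = work.back();
    work.pop_back();
    auto it = graph.find(node);
    if (it == graph.end()) continue; // node with no outgoing edges listed
    for (const std::string& next : it->second) {
      // insert() reports whether the node is newly reached.
      if (reached.insert(next).second) work.push_back(next);
    }
  }
  std::set<std::string> dead;
  for (const auto& entry : graph) {
    if (reached.count(entry.first) == 0) dead.insert(entry.first);
  }
  return dead;
}
```

A cycle such as a JS function calling a wasm export that calls an import implemented by that same JS function is reported as dead when no root reaches it, which neither side could conclude on its own.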
* support fixed (non-relocatable) segments in wasm-merge. also a few printing fixes for multiple segments, which we never really printed that prettily (#1316)  [Alon Zakai, 2017-12-05, 1, -2/+5]
* Fix wasm-reduce testing out of tree (#1284)  [Alon Zakai, 2017-11-21, 1, -0/+58]
|  * fix wasm-reduce when out-of-tree: do not use a hardcoded bin/wasm-opt, instead add a Path namespace with utilities to get the proper path, and use BINARYEN_ROOT which our test setup code ensures
* Thread fixes (#1205)  [Alon Zakai, 2017-10-02, 1, -1/+8]
|  * don't use multiple threads in torture tests, which are parallel anyhow
|  * if we fail to create a thread, don't use multiple threads
* disambiguate hash usage (#1182)  [Alon Zakai, 2017-09-13, 1, -1/+1]
|
* Const hoisting (#1176)  [Alon Zakai, 2017-09-12, 1, -0/+5]
| A pass that hoists repeating constants to a local, and replaces their uses with a get of that local. This can reduce binary size, but can also *increase* gzip size, so it's mostly for experimentation and not used by default.
* wasm-reduce tool (#1139)  [Alon Zakai, 2017-09-01, 2, -1/+20]
| Reduce an interesting wasm to a smaller still interesting wasm. This takes an arbitrary command to run, and reduces the wasm as much as it can while keeping the behavior of that command fixed. This can be used to reduce compiler bugs in an arbitrary VM, etc.
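The general shape of such a reducer can be sketched on a much simpler input. This is an analogy, not wasm-reduce's algorithm: the input is a list of integers instead of a wasm module, the `interesting` callback stands in for "the command still behaves the same", and `reduce` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

using Input = std::vector<int>;

// Greedily removes elements one at a time, keeping each removal only if
// the result is still "interesting", and repeats until nothing changes.
Input reduce(Input input, const std::function<bool(const Input&)>& interesting) {
  assert(interesting(input)); // the starting point must be interesting
  bool progress = true;
  while (progress) {
    progress = false;
    for (std::size_t i = 0; i < input.size();) {
      Input candidate = input;
      candidate.erase(candidate.begin() + static_cast<std::ptrdiff_t>(i));
      if (interesting(candidate)) {
        input = std::move(candidate); // keep the smaller input, retry index i
        progress = true;
      } else {
        i++; // this element is needed, move on
      }
    }
  }
  return input;
}
```

The outer loop matters because removing one element can make another removable that was previously needed.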
* Get wasm2asm building again (#1107)  [Thomas Lively, 2017-08-02, 1, -0/+7]
|  * Get wasm2asm building again
|    Updates CMakeLists.txt to have wasm2asm built by default, updates wasm2asm.h to account for recent interface changes, and restores JSPrinter functionality.
|  * Implement splice for array values
|  * Clean up wasm2asm testing
|  * Print semicolons after statements in blocks
|  * Cleanups and semicolons for condition arms
|  * Prettify semicolon emission
* Fix wasm::read_file() to read correctly sized input strings in text mode. (#1088)  [juj, 2017-07-18, 1, -0/+6]
* SSA pass (#1049)  [Alon Zakai, 2017-06-13, 1, -0/+55]
|  * Add SSA pass which ensures a single assign for each local, except for merged locals where we ensure exactly a single assign from one of the paths leading to that use
|  * Also add InstrumentLocals pass, useful for debugging locals (similar to InstrumentMemory but for locals)
|  * Fix a PickLoadSigns bug with tees not being ignored, which was not noticed until now because we ran it on flatter output by default, but the ssa pass uncovered the bug
* Log callImport fatal error to cerr so it is not buffered. (#1036)  [Sam Clegg, 2017-06-12, 1, -1/+3]
| Use Fatal() rather than stdout to report callImport error
|
| Without this the write to stdout can be lost (since the following line aborts)
* Fix build with gcc 7 (#957)  [Morris Hafner, 2017-03-29, 1, -0/+1]
|  1. Add a missing <functional> include
|  2. Put the // fallthrough comment after the closing bracket so the compiler does not emit an implicit fallthrough warning.
* New binaryen.js (#922)  [Alon Zakai, 2017-03-24, 1, -0/+4]
| New binaryen.js implementation, based on the C API underneath and with a JS-friendly API on top. See docs under docs/ for API details.
* Fully handle EM_ASM in s2wasm (#910)  [jgravelle-google, 2017-02-23, 1, -0/+58]
|  * Fully handle EM_ASM in s2wasm
|  * Iterate with size_ts, remember to erase from importsMap as well
|  * Fix dot_s test EM_ASM signatures
|  * Move Name out to its own file, support/name.h
|  * Move removeImportsWithSubstring out of Module class
* optimize linear sums (#904)  [Alon Zakai, 2017-02-16, 1, -1/+1]
|
* Remove unused captures to fix warnings/errors when compiling with Clang (#896)  [Eric Holk, 2017-02-03, 1, -2/+1]
|
* more consistent placement of & and *, on the type (#848)  [Alon Zakai, 2016-11-28, 1, -15/+15]
|
* Wrap description (#839)  [Loo Rong Jie, 2016-11-28, 3, -5/+32]
|
* Fix Windows colors (#833)  [Loo Rong Jie, 2016-11-11, 1, -3/+2]
|  * Fix Windows colors and update README.md
* Use steady_clock to measure code execution time (#776)  [Loo Rong Jie, 2016-10-17, 1, -3/+3]
|
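A minimal sketch of timing with `std::chrono::steady_clock` follows. The helper name `secondsTaken` is hypothetical; the point is that a steady clock never goes backwards, so the measured interval cannot be distorted by system clock adjustments.

```cpp
#include <chrono>
#include <utility>

// Runs `work` once and returns how long it took, in seconds.
template<typename F>
double secondsTaken(F&& work) {
  const auto start = std::chrono::steady_clock::now();
  std::forward<F>(work)();
  const auto stop = std::chrono::steady_clock::now();
  // duration<double> has a period of one second.
  return std::chrono::duration<double>(stop - start).count();
}
```

A caller would wrap the code to measure in a lambda, for example `secondsTaken([&] { runPass(); })`, where `runPass` is a placeholder for whatever is being timed.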
* Fix crash when loading archive files, dereferencing iterator .end() is undefined behavior. (#769)  [juj, 2016-10-14, 1, -7/+3]
* Fix crash on loading archives, firstRegularData member field was not initialized to null which caused dereferencing a garbage pointer. (#770)  [juj, 2016-10-13, 1, -1/+1]
* ensure we create the OptimizeInstructions database on demand, avoiding global ctors  [Alon Zakai, 2016-09-07, 1, -0/+18]
* if we don't recognize the platform in colors.h, just do nothing for colors  [Alon Zakai, 2016-09-07, 1, -0/+9]
|
* Color support for Windows (#693)  [Loo Rong Jie, 2016-09-07, 2, -8/+34]
|
* Replace std::unique_ptr<T>(new T()) with make_unique<T>().  [Logan Chien, 2016-08-26, 1, -3/+4]
| This commit modernizes the code base by replacing:
|
|     std::unique_ptr<T>(new T(...))
|
| with:
|
|     make_unique<T>(...)
|
| or:
|
|     wasm::make_unique<T>(...)
|
| This is a step closer to adopting C++14 std::make_unique<T>(...).
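For a C++11 code base, a helper of this kind is commonly written as below. This is a generic sketch placed in a made-up `sketch` namespace, not a copy of the helper the commit added, and it covers single objects only (no array form).

```cpp
#include <memory>
#include <utility>

namespace sketch {

// C++11 stand-in for C++14's std::make_unique (non-array form): forwards
// the constructor arguments and wraps the new object in a unique_ptr.
template<typename T, typename... Args>
std::unique_ptr<T> make_unique(Args&&... args) {
  return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}

} // namespace sketch
```

Usage reads like the standard version, e.g. `auto p = sketch::make_unique<std::pair<int, int>>(1, 2);`, so a later switch to `std::make_unique` is a rename.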
* add support for symbol assignments, closes #4422 (#615)  [Dominic Chen, 2016-07-11, 1, -1/+1]
| Adds support for aliases to objects, to go along with the existing support for aliases to functions.
* Build fixes/workarounds to support Visual Studio 2013 build, which has trouble with some new C++11 constructs. (#581)  [juj, 2016-06-21, 3, -3/+4]
* use WASM_UNUSED in some places to fix compiler warning/error on unused variables we only use in asserts (#579)  [Alon Zakai, 2016-06-08, 1, -0/+2]
* refactor a getNumCores method  [Alon Zakai, 2016-06-02, 2, -5/+12]
|
* add hash utility, and support for hashing and comparing expressions  [Alon Zakai, 2016-05-28, 1, -0/+39]
|
* tweak learning index picking  [Alon Zakai, 2016-05-17, 1, -2/+1]
|