forks/binaryen.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	Don't include `$` with names unless outputting to wat format (#2506)	Sam Clegg	2019-12-06	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \|	The `$` is not actually part of the name, its the marker that starts a name in the wat format. It can be confusing to see it show up when doing `cerr << name`, for example. This change has Print.cpp add the `$` which seem like the right place to do this. Plus it revealed a bunch of places where were not calling printName to escape all the names we were printing.
*	Add some tracing to wasm-emscripten-finalize (#2505)	Sam Clegg	2019-12-05	1	-1/+1
\| \| \| \| \|	Also fix, but in splitting the names of the trace channels. Obviously I can't write string.split correctly in C first time around.
*	Add string parameter to WASM_UNREACHABLE (#2499)	Sam Clegg	2019-12-05	4	-1/+62
\| \| \| \| \|	This works more like llvm's unreachable handler in that is preserves information even in release builds.
*	Add BYN_ENABLE_ASSERTSION option to allow assertions to be disabled. (#2500)	Sam Clegg	2019-12-04	3	-2/+15
\| \| \| \| \| \| \| \|	We always enable assertions by default, but this options allows for a build without them. Fix all errors in the ASSERTIONS=OFF build, even though we don't normally build this its good to keep it building.
*	cmake: Convert to using lowercase for and functions/macros (#2495)	Sam Clegg	2019-12-04	1	-2/+2
\| \| \|	This is line with modern cmake conventions is much less SHOUTY!
*	Convert to using DEBUG macros (#2497)	Sam Clegg	2019-12-04	2	-35/+21
\| \| \| \| \| \|	This means that debugging/tracing can now be enabled and controlled centrally without managing and passing state around the codebase.
*	Add BYN_DEBUG/BYN_TRACE macros similar to LLVM's debug system (#2496)	Sam Clegg	2019-12-04	4	-2/+119
\| \| \| \| \| \| \| \| \|	This allows for debug trace message to be split my channel. So you can pass `--debug` to simply debug everything, or `--debug=opt` to only debug wasm-opt. This change is the initial introduction but as a followup I hope to convert all tracing over to this new system so we can more easily control the debug output.
*	Collect all object files from the object libraries in a CMake variable (#2477)	Immanuel Haffner	2019-11-26	1	-2/+1
\| \| \| \| \| \| \| \| \|	using the `$<TARGET_OBJECTS:objlib>` syntax. Use this variable when adding `libbinaryen` as static or shared library. Additionally, use the variable with the object files to simplify the `TARGET_LINK_LIBRARIES` commands: add the object libraries to the sources of executables and drop the use of our libraries in `TARGET_LINK_LIBRARIES`. (Object libraries cannot be linked but must be used as sources. See https://cmake.org/pipermail/cmake/2018-June/067721.html)
*	Revert "Build libbinaryen as a monolithic statically/shared library (#2463)" ↵	Alon Zakai	2019-11-25	1	-1/+1
\| \| \| \| \|	(#2474) This reverts commit bf8f36c31c0b8e6213bce840be66937dd6d0f6af.
*	Build libbinaryen as a monolithic statically/shared library (#2463)	Immanuel Haffner	2019-11-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	* Transform libraries created in subdirectories from statically linked libraries to CMake object libraries. * Link object libraries as `PRIVATE` to `libbinaryen`. According to CMake documentation: "Libraries and targets following PRIVATE are linked to, but are not made part of the link interface." This is exactly what we want, as we only want the C API to be part of the interface.
*	Fix autoreducing when not in the binaryen directory (#2390)	Alon Zakai	2019-10-17	2	-8/+31
\| \| \| \|	This uses argv[0] as the default way to find the location of the wasm binaries (wasm-reduce needs to call wasm-opt).
*	asyncify: support *-matching in whitelist and blacklist (#2344)	Beuc	2019-09-23	1	-5/+6
\| \| \|	See emscripten-core/emscripten#9381 for rationale.
*	Support response files, and use that in Asyncify (#2319)	Alon Zakai	2019-08-30	3	-0/+24
\| \| \|	See emscripten-core/emscripten#9206, the asyncify names can need complex escaping, so this provides an escape hatch.
*	Allow all features on wasm2js and add atomic tests (#2311)	Heejin Ahn	2019-08-28	1	-4/+0
\| \| \| \| \| \|	This adds `-all` argument to wasm2js testing and fixes wasm2js to actually take that argument (currently it doesn't, when it takes a wast file). This also adds a wasm2js test for `atomic.fence` instruction that was added in #2307.
*	Support --version argument in command line tools (#2304)	Sam Clegg	2019-08-20	1	-0/+9
\|
*	Proper Asyncify list name handling (#2275)	Alon Zakai	2019-07-31	1	-0/+42
\| \| \| \| \|	The lists are comma separated, but the names can have internal commas since they are human-readable. This adds awareness of bracketing things, so void foo(int, double) is parsed as a single function name, properly. Helps emscripten-core/emscripten#9128
*	Bysyncify: allow wildcard endings in import list (#2190)	Alon Zakai	2019-06-30	2	-3/+72
\| \| \|	This allows us to do things in emscripten like note that all env.invoke_* functions are important.
*	Enable compiling on GCC < 5 (#2149)	Matt Topol	2019-06-12	1	-1/+1
\| \| \|	_ISOC11_SOURCE is the preprocessor flag that specifies whether or not aligned_alloc is defined and exists. While GCC versions lower than 5 do include C++11 and C++14 constructs, they do not include std::aligned_alloc, so this check allows compiling on those versions of GCC by defaulting down to posix_memalign in those situations appropriately.
*	Allow color API to enable and disable colors (#2111)	Siddharth	2019-05-17	2	-5/+7
\| \| \| \| \| \|	This is useful for front-ends which wish to selectively enable or disable coloring. Also expose these APIs from the C API.
*	clang-tidy braces changes (#2075)	Alon Zakai	2019-05-01	9	-32/+62
\| \| \|	Applies the changes in #2065, and temprarily disables the hook since it's too slow to run on a change this large. We should re-enable it in a later commit.
*	Apply format changes from #2048 (#2059)	Alon Zakai	2019-04-26	26	-370/+411
\| \| \|	Mass change to apply clang-format to everything. We are applying this in a PR by me so the (git) blame is all mine ;) but @aheejin did all the work to get clang-format set up and all the manual work to tidy up some things to make the output nicer in #2048
*	Move features from passOptions to Module (#2001)	Thomas Lively	2019-04-12	1	-0/+3
\| \| \| \| \|	This allows us to emit a (potentially modified) target features section and conditionally emit other sections such as the DataCount section based on the presence of features.
*	Wasm2js refactoring (#1997)	Alon Zakai	2019-04-11	1	-0/+66
\| \| \| \| \| \| \| \| \| \| \| \| \|	Early work for #1929 * Leave core wasm module - the "asm.js function" - to Wasm2JSBuilder, and add Wasm2JSGlue which emits the code before and after that. Currently that's some ES6 code, but we may want to change that later. * Add add AssertionEmitter class for the sole purpose of emitting modules + assertions for testing. This avoids some hacks from before like starting from index 1 (assuming the module at first position was already parsed and printed) and printing of the f32Equal etc. functions not at the very top (which was due to technical limitations before). Logic-wise, there should be no visible change, except some whitespace and reodering, and that I made the exceptions print out the source of the assertion that failed from the wast: -if (!check2()) fail2(); +if (!check2()) throw 'assertion failed: ( assert_return ( call add ( i32.const 1 ) ( i32.const 1 ) ) ( i32.const 2 ) )'; (fail2 etc. did not exist, and seems to just have given a unique number for each assertion?)
*	Use target features section in wasm-opt (#1967)	Thomas Lively	2019-04-03	1	-2/+2
\| \| \| \| \| \| \|	If the user does not supply features explicitly on the command line, read and use the features in the target features section for validation and passes. If the user does supply features explicitly, error if they are not a superset of the features marked as used in the target features section and the user does not explicitly handle this.
*	Allow tools to read from stdin (#1950)	Thomas Lively	2019-03-18	2	-1/+13
\| \| \| \|	This is necessary to write tests that don't require temporary files, such as in #1948, and is generally useful.
*	Remove unnecessary semicolons (#1942)	Ryoga	2019-03-18	1	-1/+1
\| \| \|	Removed semicolons that cause errors when compiling with -pedantic-errors.
*	Use stdout for --help message (#1937)	Sam Clegg	2019-03-06	1	-9/+9
\| \| \| \|	Noramlly --help is considered normal output not error output. For example its normally to pipe the output of --help to a pager.
*	Simplify ExpressionAnalyzer (#1920)	Alon Zakai	2019-02-27	1	-0/+1
\| \| \| \| \|	This refactors the hashing and comparison code to use a single immediate-value iterator. This makes us have a single place that knows the list of immediate fields in every node type, instead of 2. This also fixes a few bugs found by doing that. In particular, this makes us slightly slower than before since we are hashing more fields.
*	SmallVector (#1912)	Alon Zakai	2019-02-25	1	-0/+172
\| \| \| \| \|	Trying to refactor the code to be simpler and less redundant, I ran into some perf issues that it seems like a small vector, with fixed-size storage and optional additional storage as needed, might help with. This implements that class and uses it in a few places. This seems to help, I see some 1-2% fewer instructions and cycles in `perf stat`, but it's hard to tell if it really makes a noticeable difference.
*	Code style improvements (#1868)	Alon Zakai	2019-01-15	3	-6/+6
\| \| \| \|	* Use modern T p = v; notation to initialize class fields * Use modern X() = default; notation for empty class constructors
*	Fix build on macOS High Sierra 10.13.1 and Xcode 9.2 (9C40b), which does not ↵	juj	2019-01-10	1	-0/+4
\| \| \| \|	have aligned_alloc() (not sure if newer macOS/Xcodes do, or if this an issue with old macOS/Xcode version) (#1862)
*	Aligned allocation fixes. Fixes #1845 (#1846)	Alon Zakai	2019-01-09	1	-0/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The error in #1845 shows: /<<PKGBUILDDIR>>/src/mixed_arena.h: In member function 'void* MixedArena::allocSpace(size_t, size_t)': /<<PKGBUILDDIR>>/src/mixed_arena.h:125:43: error: 'new' of type 'MixedArena::Chunk' {aka 'std::aligned_storage<32768, 16>::type'} with extended alignment 16 [-Werror=aligned-new=] chunks.push_back(new Chunk[numChunks]); ^ /<<PKGBUILDDIR>>/src/mixed_arena.h:125:43: note: uses 'void* operator new [](std::size_t)', which does not have an alignment parameter /<<PKGBUILDDIR>>/src/mixed_arena.h:125:43: note: use '-faligned-new' to enable C++17 over-aligned new support It turns out I had misread the aligned_storage docs, and they don't actually do what we need, which is a convenient cross-platform way to do aligned allocation, since new itself doesn't support that. Sadly it seems there is no cross-platform way to do it right now, so I added a header in support which abstracts over the windows and everything-else ways. Also add some ctest testing, which runs on windows, so we get basic windows coverage in our CI.
*	Consistent spacing around the namespace keyword (#1829)	Alon Zakai	2018-12-15	1	-0/+2
\|
*	standardize on 'template<' over 'template <' (i.e., remove a space) (#1782)	Alon Zakai	2018-11-29	5	-11/+11
\|
*	Add wasm-emscripten-finalize flag to separate data segments into a file (#1741)	Derek Schuff	2018-11-14	1	-0/+4
\| \| \| \|	This writes the data section into a file suitable for use with emscripten's --memory-init-file flag
*	Don't call static desructors when Fatal() errors occur (#1722)	Sam Clegg	2018-11-02	2	-3/+6
\| \| \| \| \|	This was causing a deadlock while destroying the thread pool.
*	DeadArgumentElimination Pass (#1641)	Alon Zakai	2018-09-05	1	-1/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a pass to remove unnecessary call arguments in an LTO-like manner, that is: * If a parameter is not actually used in a function, we don't need to send anything, and can remove it from the function's declaration. Concretely, (func $a (param $x i32) ..no uses of $x.. ) (func $b (call $a (..)) ) => (func $a ..no uses of $x.. ) (func $b (call $a) ) And * If a parameter is only ever sent the same constant value, we can just set that constant value in the function (which then means that the values sent from the outside are no longer used, as in the previous point). Concretely, (func $a (param $x i32) ..may use $x.. ) (func $b (call $a (i32.const 1)) (call $a (i32.const 1)) ) => (func $a (local $x i32) (set_local $x (i32.const 1) ..may use $x.. ) (func $b (call $a) (call $a) ) How much this helps depends on the codebase obviously, but sometimes it is pretty useful. For example, it shrinks 0.72% on Unity and 0.37% on Mono. Note that those numbers include not just the optimization itself, but the other optimizations it then enables - in particular the second point from earlier leads to inlining a constant value, which often allows constant propagation, and also removing parameters may enable more duplicate function elimination, etc. - which explains how this can shrink Unity by almost 1%. Implementation is pretty straightforward, but there is some work to make the heavy part of the pass parallel, and a bunch of corner cases to avoid (can't change a function that is exported or in the table, etc.). Like the Inlining pass, there is both a standard and an "optimizing" version of this pass - the latter also optimizes the functions it changes, as like Inlining, it's useful to not need to re-run all function optimizations on the whole module.
*	Stack IR (#1623)	Alon Zakai	2018-07-30	1	-5/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a new IR, "Stack IR". This represents wasm at a very low level, as a simple stream of instructions, basically the same as wasm's binary format. This is unlike Binaryen IR which is structured and in a tree format. This gives some small wins on binary sizes, less than 1% in most cases, usually 0.25-0.50% or so. That's not much by itself, but looking forward this prepares us for multi-value, which we really need an IR like this to be able to optimize well. Also, it's possible there is more we can do already - currently there are just a few stack IR optimizations implemented, DCE local2stack - check if a set_local/get_local pair can be removed, which keeps the set's value on the stack, which if the stars align it can be popped instead of the get. Block removal - remove any blocks with no branches, as they are valid in wasm binary format. Implementation-wise, the IR is defined in wasm-stack.h. A new StackInst is defined, representing a single instruction. Most are simple reflections of Binaryen IR (an add, a load, etc.), and just pointers to them. Control flow constructs are expanded into multiple instructions, like a block turns into a block begin and end, and we may also emit extra unreachables to handle the fact Binaryen IR has unreachable blocks/ifs/loops but wasm does not. Overall, all the Binaryen IR differences with wasm vanish on the way to stack IR. Where this IR lives: Each Function now has a unique_ptr to stack IR, that is, a function may have stack IR alongside the main IR. If the stack IR is present, we write it out during binary writing; if not, we do the same binaryen IR => wasm binary process as before (this PR should not affect speed there). This design lets us use normal Passes on stack IR, in particular this PR defines 3 passes: Generate stack IR Optimize stack IR (might be worth splitting out into separate passes eventually) Print stack IR for debugging purposes Having these as normal passes is convenient as then they can run in parallel across functions and all the other conveniences of our current Pass system. However, a downside of keeping the second IR as an option on Functions, and using normal Passes to operate on it, means that we may get out of sync: if you generate stack IR, then modify binaryen IR, then the stack IR may no longer be valid (for example, maybe you removed locals or modified instructions in place etc.). To avoid that, Passes now define if they modify Binaryen IR or not; if they do, we throw away the stack IR. Miscellaneous notes: Just writing Stack IR, then writing to binary - no optimizations - is 20% slower than going directly to binary, which is one reason why we still support direct writing. This does lead to some "fun" C++ template code to make that convenient: there is a single StackWriter class, templated over the "mode", which is either Binaryen2Binary (direct writing), Binaryen2Stack, or Stack2Binary. This avoids a lot of boilerplate as the 3 modes share a lot of code in overlapping ways. Stack IR does not support source maps / debug info. We just don't use that IR if debug info is present. A tiny text format comment (if emitting non-minified text) indicates stack IR is present, if it is ((; has Stack IR ;)). This may help with debugging, just in case people forget. There is also a pass to print out the stack IR for debug purposes, as mentioned above. The sieve binaryen.js test was actually not validating all along - these new opts broke it in a more noticeable manner. Fixed. Added extra checks in pass-debug mode, to verify that if stack IR should have been thrown out, it was. This should help avoid any confusion with the IR being invalid. Added a comment about the possible future of stack IR as the main IR, depending on optimization results, following some discussion earlier today.
*	duplicate-function-elimination improvements (#1590)	Alon Zakai	2018-06-07	1	-3/+6
\| \| \| \| \| \| \|	On a codebase with 370K functions, 160K were in fact duplicate (!)... and it took many many passes to figure that out, over 2 minutes in fact (!), as A and B may be identical only after we see that the functions C1, C2 that they call are identical (so there can be long "chains" here). To avoid this, limit how many passes we do. In -O1, just do one pass - that gets most duplicates. In -O2, do 10 passes - that gets almost all of it on this codebase. And in -O3 (or -Os/-Oz) do as many passes as necessary (i.e., the old behavior). This at least lets iteration builds (-O1) be nice and fast. This PR also refactors the hashing code used in that pass, moving it to nicer header files for clearer readability. Also some other minor cleanups in hashing code that helped debug this.
*	refactor Path utils: store the bin/ dir so that all users of the API can use ↵	Alon Zakai	2018-03-30	3	-24/+79
\| \| \| \|	it by the standard calls, even if it was modified by user input (move it out of just being in wasm-reduce.cpp) (#1489)
*	Support wasm-reduce for Windows (#1488)	Michael Ferris	2018-03-26	1	-3/+5
\|
*	More simple math opts (#1414)	Alon Zakai	2018-02-14	2	-5/+10
\| \| \| \| \| \| \| \|	* optimize more simple math operations: mul of 0, or of 0, and of 0, mul of 1, mul of a power of 2, urem of a power of 2 * fix asm2wasm callImport parsing: the optimizer may get rid of the added offset to a function table * update js builds
*	wasm-reduce tweaks and improvements (#1405)	Alon Zakai	2018-02-11	1	-1/+1
\| \| \| \| \|	* wasm-reduce tweaks and improvements: better error messages, better validation, better function removal, etc.
*	'std::string &' => 'std::string& ' (#1403)	Alon Zakai	2018-02-05	3	-13/+13
\| \| \|	The & on the type is the proper convention.
*	Simplify ThreadPool::isRunning (#1391)	Alon Zakai	2018-01-30	2	-3/+2
\| \| \| \| \| \|	* simplify ThreadPool::isRunning: it doesn't need to be static and to go through the global unique_ptr * it's undefined behavior to access the threadpool from a shutting down thread, as the parent is being destroyed
*	ThreadPool refactoring (#1389)	Alon Zakai	2018-01-26	2	-33/+43
\| \| \| \| \| \| \| \|	Refactor ThreadPool code for clarity and to fix some bugs with using the pool from different threads in parallel. We have a singleton pool, and need to ensure it is created only once and used only by one thread at a time. This model is a simple way to ensure we use a number of threads equal to the number of cores, more or less (a pool per Module might lead to number of cores * number of Modules being optimized). This refactoring adds a parent pointer in the worker threads (giving them direct access to the pool makes it simpler to make sure that pool and thread creation and teardown are threadsafe). This commit also adds proper locking around pool creation and pool usage.
*	Threading fixes (#1377)	Alon Zakai	2018-01-24	2	-6/+21
\| \| \| \| \| \|	* threading fixes, be careful when creating the pool (more than one thread may try to) and don't create it just to check if its running in the thread constructor assertions * child threads will call ::get() - don't do initialize() under the lock
*	Refactor optimization defaults (#1366)	Alon Zakai	2018-01-17	1	-24/+0
\| \| \| \| \|	Followup to #1357. This moves the optimization settings into pass.h, and uses it from there in the various places. This also splits up huge lines from the tracing code, which put all block children (whose number can be arbitrarily large) on one line. This seems to have caused random errors on the bots, I suspect from overflowing a buffer. Anyhow, it's much more clear to split the lines at a reasonable length.
*	Add optimize, shrink level and debug info options to C/JS (#1357)	Daniel Wirtz	2018-01-17	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add optimize, shrink level and debug info options to C/JS * Add instantiate functionality for creating additional unique instances of the API * Use a workaround when running tests in node Tests misuse a module as a script by concatenating, so instead of catching this case in the library, catch it there * Update sieve test Seems optimized output changed due to running with optimize levels 2/1 now * Use the options with all pass runners * Update relooper-fuzz C-API test * Share defaults between tools and the C-API * Add a test for optimize levels * Unify node test support in check.by and auto_update_tests.py * Also add getters for optimize levels and test them * Also test debugInfo * Add debug info to C tests that used it as well * Fix missing NODEJS import in auto_update_tests * Detect node.js version (WASM support) * Update hello-world JS test (now also runs with node) * feature-test WebAssembly in node instead * Document that these options apply globally, and where * Make sure hello-world.js output doesn't differ between mozjs/node
*	Redundant Set Elimination pass (#1344)	Alon Zakai	2018-01-05	2	-1/+66
\| \| \| \|	This optimizes #1343. It looks for stores of a value that is already present in the local, which in particular can remove the initial set to 0 of loops starting at zero, since all locals are initialized to that already. This helps in real-world code, but is not super-common since coalescing means we tend to have assigned something else to it anyhow before we need it to be zero, so this mainly helps in small functions (and running this before coalescing would extend live ranges in potentially bad ways).