| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
| |
* optimize more simple math operations: mul of 0, or of 0, and of 0, mul of 1, mul of a power of 2, urem of a power of 2
* fix asm2wasm callImport parsing: the optimizer may get rid of the added offset to a function table
* update js builds
|
|
|
|
|
| |
* fix safe heap check for load/store of fewer bytes than the type
|
|
|
|
|
| |
* wasm-reduce tweaks and improvements: better error messages, better validation, better function removal, etc.
|
|
|
|
| |
it's useful to test infinite loops (#1404)
|
|
|
| |
This adds necessary command line options for addFunction support, and generates required jsCall imports and generates jsCall thunk functions.
|
|
|
|
|
|
|
|
| |
* Dedupe function names when reading a binary
* More robust name deduplication, use .s instead of _s
* Add name-duplicated wasm binaries
|
|
|
|
|
|
|
|
|
|
| |
* Allow wasm2asm to generate "almost asm"
If grow_memory, current_memory or export memory is
used then generate "almost asm" with memory growth
support.
* Log reason for "almost asm" to stderr
|
|
|
|
|
|
| |
* check isAtomic in comparisons
* hash isAtomic
|
|
|
| |
The & on the type is the proper convention.
|
|
|
| |
These instances may simply be caught by reference instead.
|
|
|
|
|
|
|
| |
Simplify inlining logic: don't special case the first iteration or change behavior based on when we are optimizing or not. Instead, use one simpler set of heuristics for both inlining and inlining-optimizing. We only run inlining-optimizing by default anyhow, no point to try to make inlining without optimizations useful by itself, it's not a realistic use case. (inlining is still useful for debugging, and if you will run optimizations anyhow later on everything, in which case inlining-optimizing might add some redundancy.) The simpler heuristics after this let us do a somewhat better job as we are no longer paranoid about inlining in multiple iterations.
Also raise limit on inlining things that are obviously worth it from size 1 to size 2: things of size 2 will never lead to an increase in code size after we optimize (it takes at least 3 nodes to generate something that reads two locals and reverses their order, which would require a temp local in the outside scope etc.).
Also fix infinite recursion of inlining an infinitely recursive set of calls.
|
|
|
|
| |
* rename WasmType to Type. it's in the wasm:: namespace anyhow, and without Wasm- it fits in better alongside Index, Address, Expression, Module, etc.
|
| |
|
|
|
|
|
|
| |
* simplify ThreadPool::isRunning: it doesn't need to be static and to go through the global unique_ptr
* it's undefined behavior to access the threadpool from a shutting down thread, as the parent is being destroyed
|
| |
|
|
|
|
|
|
| |
* Initial source map support for C/JS
* Also test getDebugInfoFileName
|
| |
|
|
|
|
|
|
|
|
| |
Refactor ThreadPool code for clarity and to fix some bugs with using the pool from different threads in parallel.
We have a singleton pool, and need to ensure it is created only once and used only by one thread at a time. This model is a simple way to ensure we use a number of threads equal to the number of cores, more or less (a pool per Module might lead to number of cores * number of Modules being optimized).
This refactoring adds a parent pointer in the worker threads (giving them direct access to the pool makes it simpler to make sure that pool and thread creation and teardown are threadsafe). This commit also adds proper locking around pool creation and pool usage.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* fix unused variable warnings in wasm2asm.cpp
No need to give the bad_alloc exception a name when we do nothing with it.
* add appropriate casts for shift operands
MSVC complains about implicitly converting results from 32 bits to 64
bits here, so we might as well make it clear that we intended the 64-bit
shift to happen in the first place.
* disable C4722 on MSVC
We annotated Fatal::~Fatal with WASM_NORETURN, yet MSVC still warns, so
take the next step and silence the warning completely.
* don't warn about "deprecated" POSIX functions with MSVC
Non-Windows platforms don't have the names MSVC recommends using, and
everybody understands the POSIX names, so just silence the warning.
|
|
|
|
|
|
| |
* inline 1-element functions when optimizing, as they will be smaller than the call we are replacing
* add an extra useful vacuum after inlining+optimizing
|
|
|
|
|
|
| |
* threading fixes, be careful when creating the pool (more than one thread may try to) and don't create it just to check if its running in the thread constructor assertions
* child threads will call ::get() - don't do initialize() under the lock
|
|
|
|
|
| |
This simplifies the logic there into a more standard flow operation. This is not always faster, but it is much faster on the worst cases we saw before like sqlite, and it is simpler.
The rewrite also fixes a fuzz bug.
|
|
|
|
|
|
|
|
| |
(#1379)
* show the binary bytes we can remove without each export, in --func-metrics
* check start too
|
|
|
|
|
|
|
|
| |
* fix wait and wake binary format support, they have alignments and offsets
* don't emit unreachable parts of atomic operations, for simplicity and to avoid special handling
* don't emit atomic waits by default in the fuzzer, they hang in native vm support
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Skeleton of a beginning of o2wasm, WIP and probably not going to be used
* Get building post-cherry-pick
* ast->ir, remove commented out code, include a debug module print because linking
* Read linking section, print emscripten metadata json
* WasmBinaryWriter emits user sections on Module
* Remove debugging prints, everything that isn't needed to build metadata
* Rename o2wasm to lld-metadata
* lld-metadata support for outputting to file
* Use tables index instead of function index for initializer functions
* Add lld-emscripten tool to add emscripten-runtime functions to wasm modules (built with lld)
* Handle EM_ASM in lld-emscripten
* Add a list of functions to forcibly export (for initializer functions)
* Disable incorrect initializer function reading
* Add error printing when parsing .o files in lld-metadata
* Remove ';; METADATA: ' prefix from lld-metadata, output is now standalone json
* Support em_asm consts that aren't at the start of a segment
* Initial test framework for lld-metadata tool
* Add em_asm test
* Add support for WASM_INIT_FUNCS in the linking section
* Remove reloc section parsing because it's unused
* lld-emscripten can read and write text
* Add test harness for lld-emscripten
* Export all functions for now
* Add missing lld test output
* Add support for reading object files differently
Only difference so far is in importing mutable globals being an object
file representation for symbols, but invalid wasm.
* Update help strings
* Update linking tests for stackAlloc fix
* Rename lld-emscripten,lld-metadata to wasm-emscripten-finalize,wasm-link-metadata
* Add help text to header comments
* auto& instead of auto &
* Extract LinkType to abi/wasm-object.h
* Remove special handling for wasm object file reading, allow mutable globals
* Add braces around default switch case
* Fix flake8 errors
* Handle generating dyncall thunks for imports as well
* Use explicit bool for stackPointerGlobal
* Use glob patterns for lld file iteration
* Use __wasm_call_ctors for all initializer functions
|
| |
|
| |
|
|
|
|
|
|
| |
fixes #1369
* Update binaries and kitchen-sink test
|
|
|
| |
Currently, whenever a global is used, wasm2asm fails with an assertion error here. While this PR doesn't implement anything, it allows the use of non-i64 globals that aren't lowered anyway.
|
| |
|
| |
|
| |
|
|
|
|
|
| |
Followup to #1357. This moves the optimization settings into pass.h, and uses it from there in the various places.
This also splits up huge lines from the tracing code, which put all block children (whose number can be arbitrarily large) on one line. This seems to have caused random errors on the bots, I suspect from overflowing a buffer. Anyhow, it's much more clear to split the lines at a reasonable length.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* run dfe at the very end, as it may be more effective after inlining
* optimize reorder-functions
* do a final dfe in asm2wasm after all other opts
* make inlining deterministic: std::atomic<T> values are not zero-initialized
* do global post opts at the end of asm2wasm, and don't also do them in the module builder
* fix function type removing
* don't inline+optimize when preserving debug info
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add optimize, shrink level and debug info options to C/JS
* Add instantiate functionality for creating additional unique instances of the API
* Use a workaround when running tests in node
Tests misuse a module as a script by concatenating, so instead of catching this case in the library, catch it there
* Update sieve test
Seems optimized output changed due to running with optimize levels 2/1 now
* Use the options with all pass runners
* Update relooper-fuzz C-API test
* Share defaults between tools and the C-API
* Add a test for optimize levels
* Unify node test support in check.by and auto_update_tests.py
* Also add getters for optimize levels and test them
* Also test debugInfo
* Add debug info to C tests that used it as well
* Fix missing NODEJS import in auto_update_tests
* Detect node.js version (WASM support)
* Update hello-world JS test (now also runs with node)
* feature-test WebAssembly in node instead
* Document that these options apply globally, and where
* Make sure hello-world.js output doesn't differ between mozjs/node
|
|
|
| |
Emits binary size and opcode counts for each function, which helps investigating what's taking up space in a wasm binary.
|
|
|
|
| |
arrive with them (#1354)
|
|
|
|
| |
(#1356)
|
|
|
| |
We can remove the memory/table (itself, or an import if imported) if they are not used. This is pretty minor on a large wasm file, but when reading small wasts it's very noticeable to have an unused memory and table all the time.
|
|
|
|
|
| |
Instead merge constant-offset segments if we must in order to stay under the limit.
If we can't - too many non-constant-offset segments - then issue a warning.
|
|
|
|
| |
This optimizes #1343. It looks for stores of a value that is already present in the local, which in particular can remove the initial set to 0 of loops starting at zero, since all locals are initialized to that already. This helps in real-world code, but is not super-common since coalescing means we tend to have assigned something else to it anyhow before we need it to be zero, so this mainly helps in small functions (and running this before coalescing would extend live ranges in potentially bad ways).
|
|
|
|
| |
It was returning the top of the allocated space rather than the bottom.
Fix taken from @tbfleming in kripken/emscripten#5974
|
|
|
|
|
|
|
| |
* add get_global/set_global validation
* validate get_local index
* update builds
* fix tests
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* binaryen.js and wasm.js don't need filesystem support
* newest emscripten no longer uses Runtime.*
* build fixes for binaryen.js and wasm.js also move binaryen.js to use standard emscripten MODULARIZE
* run binaryen.js in all possible engines ; update js builds
* don't emit debug build to a different name, just emit binaryen.js. makes testing easier and safer
* remove volatile things from binaryen.js info printing in tests
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an experiment to help with Boehm-style GC. It will spill things that could be pointers to the C stack, so that they can be seen by conservative garbage collection.
The spills add code size and runtime overhead, but actually less than I thought: 10% slower (smaller than the difference between VMs), 15% gzip size larger. We can do even better with more optimizations for this, like a dead store elimination pass.
This PR does the following:
* Add the new pass.
* Create an abi/ dir, with info about the pointer size and stack manipulation utilities.
* Separates out the liveness analysis from CoalesceLocals, so that other passes can use it (like SpillPointers).
* Refactor out the SortedVector class from the liveness analysis to a separate file (just seems nicer that way).
|
| |
|
|
|
|
|
|
|
|
|
| |
This optimizes the situation described in #1331. Namely, when x is copied into y, then on subsequent gets of x we could use y instead, and vice versa, as their value is equal. Specifically, this seems to get rid of the definite overlap in the live ranges of x and y, as removing it allows coalesce-locals to merge them. The pass therefore does nothing if the live range of y ends there anyhow.
The danger here is that we may extend the live range so that it causes more conflicts with other things, so this is a heuristic, but I've tested it on every codebase I can find and it always produces a net win, even on one I saw a 0.4% reduction of code size, which surprised me.
This is a fairly slow pass, because it uses LocalGraph which isn't much optimized. This PR includes a minor optimization for it, but we should rewrite it. Meanwhile this is just enabled in -O3 and -Oz.
This PR also includes some fuzzing improvements, to better test stuff like this.
|
|
|
|
|
|
| |
* Check if there is a currFunction before using it (we need it for some stacky code; a valid wasm wouldn't need a function in that location anyhow, as what can be put in a memory/table offset is very limited).
* Huge alignment led us to do a power of 2 shift that is undefined behavior.
Also adds a test facility to check we don't crash on testcases.
|
| |
|