| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
(#1696)
|
|
|
|
|
| |
SafeHeap was emitting them, but it looks like they are invalid according to the wasm-threads spec.
Fixes kripken/emscripten#7208
|
|
|
| |
While debugging to fix the waterfall regressions I noticed that wasm-reduce regressed. We need to be more careful with visitFunction which now may visit an imported function - I found a few not-well-tested passes that also regressed that way.
|
|
|
| |
Fixes the 3 regressions mentioned in a comment on #1678
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes #1649
This moves us to a single object for functions, which can be imported or nor, and likewise for globals (as a result, GetGlobals do not need to check if the global is imported or not, etc.). All imported things now inherit from Importable, which has the module and base of the import, and if they are set then it is an import.
For convenient iteration, there are a few helpers like
ModuleUtils::iterDefinedGlobals(wasm, [&](Global* global) {
.. use global ..
});
as often iteration only cares about imported or defined (non-imported) things.
|
|
|
|
|
|
|
| |
The current patch:
* Preserves the debug locations from function prolog and epilog
* Preserves the debug locations of the nested blocks
|
|
|
|
|
|
|
|
|
|
|
|
| |
* show a proper error for an empty asm2wasm input
* handle end of input in processExpressions in binary reading
* memory segment sizes should be unsigned
* validate input in wasm-ctor-eval
* update tests
|
|
|
| |
From #1665 (a fuzz bug noticed they were not handled in stack.h).
|
|
|
|
|
| |
We need to check not to read past the length of the string.
From #1665
|
|
|
|
|
|
| |
* Error if there are more locals than browsers allow (50,000). We usually just warn about stuff like this, but we do need some limit (or else we hang or OOM), and if so, why not use the agreed-upon Web limit.
* Do not generate nice string names for locals in binary parsing - the name is just $var$x instead of $x, so not much benefit, and worse as our names are interned this is actually slow (which is why the fuzz testcase here hangs instead of OOMing).
Testcases and bugreport in #1663.
|
|
|
|
|
| |
The change means that nan values will be compared bitwise when writing A == B, and so the float rule of a nan is different from itself would not apply.
I think this is a safer default. In particular this PR fixes a fuzz bug in the rse pass, which placed Literals in a hash table, and due to nan != nan, an infinite loop... Also, looks like we really want a bitwise comparison pretty much everywhere anyhow, as can be seen in the diff here. Really the single place we need a floaty comparison is in the intepreter where we implement f32.eq etc., and there the code was already using the proper code path anyhow.
|
|
|
|
|
| |
We're about to rewrite both names anyway.
This fixes LLVM's c++ invokes, which get escaped to not match as of
PR #1646
|
| |
|
|
|
|
|
|
| |
The 'dylink' user section must be emitted before all other sections, per the spec (to allow simple parsing by loaders)
This PR makes reading and writing of a dynamic library remain a valid dynamic library.
|
|
|
|
| |
spec compatible. (#1646)
|
|
|
|
| |
This ensures that 64-bit values are correctly handled on the
JS boundary.
|
|
|
|
|
| |
Allowing duplicates here was causes emscripten to generate a JS
object with duplicate keys.
|
|
|
| |
This now makes --generate-stack-ir --print-stack-ir emit a fully valid .wat wasm file, in stacky format.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a new IR, "Stack IR". This represents wasm at a very low level, as a simple stream of instructions, basically the same as wasm's binary format. This is unlike Binaryen IR which is structured and in a tree format.
This gives some small wins on binary sizes, less than 1% in most cases, usually 0.25-0.50% or so. That's not much by itself, but looking forward this prepares us for multi-value, which we really need an IR like this to be able to optimize well. Also, it's possible there is more we can do already - currently there are just a few stack IR optimizations implemented,
DCE
local2stack - check if a set_local/get_local pair can be removed, which keeps the set's value on the stack, which if the stars align it can be popped instead of the get.
Block removal - remove any blocks with no branches, as they are valid in wasm binary format.
Implementation-wise, the IR is defined in wasm-stack.h. A new StackInst is defined, representing a single instruction. Most are simple reflections of Binaryen IR (an add, a load, etc.), and just pointers to them. Control flow constructs are expanded into multiple instructions, like a block turns into a block begin and end, and we may also emit extra unreachables to handle the fact Binaryen IR has unreachable blocks/ifs/loops but wasm does not. Overall, all the Binaryen IR differences with wasm vanish on the way to stack IR.
Where this IR lives: Each Function now has a unique_ptr to stack IR, that is, a function may have stack IR alongside the main IR. If the stack IR is present, we write it out during binary writing; if not, we do the same binaryen IR => wasm binary process as before (this PR should not affect speed there). This design lets us use normal Passes on stack IR, in particular this PR defines 3 passes:
Generate stack IR
Optimize stack IR (might be worth splitting out into separate passes eventually)
Print stack IR for debugging purposes
Having these as normal passes is convenient as then they can run in parallel across functions and all the other conveniences of our current Pass system. However, a downside of keeping the second IR as an option on Functions, and using normal Passes to operate on it, means that we may get out of sync: if you generate stack IR, then modify binaryen IR, then the stack IR may no longer be valid (for example, maybe you removed locals or modified instructions in place etc.). To avoid that, Passes now define if they modify Binaryen IR or not; if they do, we throw away the stack IR.
Miscellaneous notes:
Just writing Stack IR, then writing to binary - no optimizations - is 20% slower than going directly to binary, which is one reason why we still support direct writing. This does lead to some "fun" C++ template code to make that convenient: there is a single StackWriter class, templated over the "mode", which is either Binaryen2Binary (direct writing), Binaryen2Stack, or Stack2Binary. This avoids a lot of boilerplate as the 3 modes share a lot of code in overlapping ways.
Stack IR does not support source maps / debug info. We just don't use that IR if debug info is present.
A tiny text format comment (if emitting non-minified text) indicates stack IR is present, if it is ((; has Stack IR ;)). This may help with debugging, just in case people forget. There is also a pass to print out the stack IR for debug purposes, as mentioned above.
The sieve binaryen.js test was actually not validating all along - these new opts broke it in a more noticeable manner. Fixed.
Added extra checks in pass-debug mode, to verify that if stack IR should have been thrown out, it was. This should help avoid any confusion with the IR being invalid.
Added a comment about the possible future of stack IR as the main IR, depending on optimization results, following some discussion earlier today.
|
| |
|
|
|
|
|
|
|
| |
This separates out the WasmBinaryWriter parts that do stack writing into a separate class, StackWriter. Previously the WasmBinaryWriter did both the general writing and the stack stuff, and the stack stuff has global state, which it manually cleaned up etc. - seems nicer to have it as a separate class, a class focused on just that one thing.
Should be no functional changes in this PR.
Also add a timeout to the wasm-reduce test, which happened to fail on one of the commits here. It was running slower on that commit for some reason, could have been random - I verified that general wasm writing speed is unaffected by this PR. (But I added the timeout to prevent future random timeouts.)
|
|
|
|
|
|
| |
* code cleanups in wasm-binary: remove an & param, and standardize whitespace
* add some docs for how the relooper handles blocks with no outgoing branches [ci skip]
|
|
|
|
|
|
|
| |
See #1479 (comment)
Also a one-line readme update, remove an obsolete compiler (mir2wasm) and add a new one (asterius).
Also improve warning and error reporting in binaryen.js - show a stack trace when relevant (instead of node.js process.exit), and avoid atexit warning spam in debug builds.
|
|
|
|
| |
s2wasm is no longer used my emscripten and as far as I know now
as no other users.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
* support source map input in wasm-opt, refactoring the loading code into wasm-io
* use wasm-io in wasm-as
* support output source maps in wasm-opt
* add a test for wasm-opt and source maps
|
|
|
|
| |
This check is supposed to check if rename is needed so it
need to compare to the original.
|
|
|
|
|
|
| |
We ran into an issue recently where wasm-emscripten-finalize
was being passed input without any debug names and this is not
currently supported.
|
|
|
|
| |
jsCallStartIndex (#1579)
|
|
|
|
|
|
|
| |
On the testcase from https://github.com/tweag/asterius/issues/19#issuecomment-393052653 this makes us almost 3x faster, and use 25% less memory.
The main improvement here is to simplify and optimize the data structures the validator uses to validate br targets: use unordered maps, and use one less of them. Also some speedups from using that map more effectively (use of iterators to avoid multiple lookups).
Also move the duplicate-node checks to the internal IR validation section, which makes more sense anyhow (it's not wasm validation, it's internal IR validation, which like the check for stale internal types, we do only if debugging).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
implementFunctions should use the export names, not the
internal/debug name for a function. This is especially
imported with lld where the debug names are demanagled.
implementFunctions should only contain functions that are
accessible from outside the module. i.e. those that have
been exported. There is no point in adding internal-only
functions to this list as they won't be accessible from
outside anyway.
Tesed with emscripten using: ./tests/runner.py binaryen2.test_time
|
|
|
|
|
|
|
|
|
| |
wasm-emscripten (#1539)
This allows the same functionality to be used also in
wasm-emscripten-finalize (i.e. the lld path).
|
|
|
|
|
|
|
|
|
| |
If locals are known to contain the same value, we can
* Pick which local to use for a get_local of any of them. Makes sense to prefer the most common, to increase the chance of one dropping to zero uses.
* Remove copies between a local and one that we know contains the same value.
This is a consistent win, small though, around 0.1-0.2%.
|
| |
|
| |
|
|
|
|
|
|
| |
* Move more logic to the Literal class. We now leave all the work to there, except for handling traps.
* Avoid switching on the type, then the opcode, then Literal method usually switches on the type again - instead, do one big switch for the opcodes (then the Literal method is unchanged) which is shorter and clearer, and avoids that first switching.
|
|
|
|
|
| |
Report the offset with the error.
Also fix a compiler warning about comparing signed/unsigned types in the LEB code.
|
|
|
|
|
|
|
|
|
| |
Stuff like x + 5 != 2 => x != -3.
Also some cleanups of utility functions I noticed while writing this, isTypeFloat => isFloatType.
Inspired by
https://github.com/golang/go/blob/master/src/cmd/compile/internal/ssa/gen/generic.rules
|
|
|
| |
Improve error handling, validation, and assertions for having a non-concrete type in an inappropriate place. Fixes a fuzz testcase.
|
|
|
|
|
| |
The optimization in #1495 had a bug which was found by the fuzzer: our binary format parsing will not emit unreachable code (it may be stacky, so we ignore it). However, while parsing it we note breaks that are taken there, and then we removed that code, leading to a state where a break was not taken in the code, but we thought it was.
This PR clarifies the difference between unreachable code in the wasm sense (anything from the start of a block til an unreachable is "reachable") and the literal sense (even that code at the start may not be literally reachable if the block is not reachable), and then we use literal unreachability to know what code will be ignored and therefore we should ignore breaks in.
|
|
|
|
| |
break to it - use that to avoid rescanning blocks for unreachability purposes (#1495)
|
| |
|
|
|
|
| |
found by valgrind (#1478)
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a pass that implements "function pointer cast emulation" - allows indirect calls to go through even if the number of arguments or their types is incorrect. That is undefined behavior in C/C++ but in practice somehow works in native archs. It is even relied upon in e.g. Python.
Emscripten already has such emulation for asm.js, which also worked for asm2wasm. This implements something like it in binaryen which also allows the wasm backend to use it. As a result, Python should now be portable using the wasm backend.
The mechanism used for the emulation is to make all indirect calls use a fixed number of arguments, all of type i64, and a return type of also i64. Thunks are then placed in the table which translate the arguments properly for the target, basically by reinterpreting to i64 and back. As a result, receiving an i64 when an i32 is sent will have the upper bits all zero, and the reverse would truncate the upper bits, etc. (Note that this is different than emscripten's existing emulation, which converts (as signed) to a double. That makes sense for JS where double's can contain all numeric values, but in wasm we have i64s. Also, bitwise conversion may be more like what native archs do anyhow. It is enough for Python.)
Also adds validation for a function's type matching the function's actual params and result (surprised we didn't have that before, but we didn't, and there was even a place in the test suite where that was wrong).
Also simplifies the build script by moving two cpp files into the wasm/ subdir, so they can be built once and shared between the various tools.
|
|
|
|
| |
the checks (#1461)
|
|
|
|
| |
collisions between say a global import and a function with a name from the name section that happens to match it (#1424)
|
|
|
|
|
|
|
| |
* wasm-link-metadata: Use `__data_end` symbol.
* Add --global-base param to emscripten-wasm-finalize to compute staticBump properly
* Let ModuleWriter write to a provided Output object
|
|
|
|
|
|
|
|
| |
* Dedupe function names when reading a binary
* More robust name deduplication, use .s instead of _s
* Add name-duplicated wasm binaries
|
|
|
| |
These instances may simply be caught by reference instead.
|
|
|
|
| |
* rename WasmType to Type. it's in the wasm:: namespace anyhow, and without Wasm- it fits in better alongside Index, Address, Expression, Module, etc.
|