forks/binaryen.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	[NFC] Move optimizeSubsequentStructSet() to a new pass, ↵	Alon Zakai	2024-09-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	HeapStoreOptimization (#6882) This just moves code out of OptimizeInstructions to the new pass. The existing test is renamed and now runs the new pass instead. The new pass is run right after each --optimize-instructions invocation, so it should not cause any noticeable effects whatsoever, making this NFC. The motivation here is that there is a bug in the pass, see the new testcase added at the end, which shows the bug. It is not practical to fix that bug in OptimizeInstructions since we need more than peephole optimizations to do so. This PR moves the code to a new pass so we can fix it there properly, later. The new pass is named HeapStoreOptimization since the same infrastructure we will need to fix the bug will also help dead store elimination and related things.
*	[FP16] Implement madd and nmadd. (#6878)	Brendan Dahl	2024-09-03	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Specified at https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md A few notes: - The F32x4 and F64x2 versions of madd and nmadd are missing spec tests. - For madd, the implementation was incorrectly doing `(bc)+a` where it should be `(ab)+c`. - For nmadd, the implementation was incorrectly doing `(-bc)+a` where it should be `-(ab)+c`. - There doesn't appear to be a great way to actually implement a fused nmadd, but the spec allows the double-rounded version I added.
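The corrected operand order described in the notes above can be sketched lane by lane. This is an illustrative Python sketch only, not Binaryen's implementation: the function names and the list-of-floats lane representation are assumptions made for the example, and rounding to half precision is left out.

```python
def madd(a, b, c):
    """Lane-wise (a * b) + c, the corrected operand order."""
    return [x * y + z for x, y, z in zip(a, b, c)]


def nmadd(a, b, c):
    """Lane-wise -(a * b) + c, the corrected operand order."""
    return [-(x * y) + z for x, y, z in zip(a, b, c)]


# Two hypothetical lanes per operand.
a = [2.0, 3.0]
b = [4.0, 5.0]
c = [1.0, 1.0]
madd_result = madd(a, b, c)
nmadd_result = nmadd(a, b, c)
```

With the old `(bc)+a` order the first lane would be `4.0 * 1.0 + 2.0`, which differs from `2.0 * 4.0 + 1.0`, so the two orders are observably different.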
*	Ignore fp16 in the fuzzer (#6881)	Alon Zakai	2024-08-29	2	-1/+3
\| \| \| \|	Add the feature flag in V8 invocations, but also disable the feature as it isn't quite ready yet.
*	Rename relaxed SIMD fma instructions to match spec. (#6876)	Brendan Dahl	2024-08-27	1	-4/+4
\| \| \| \| \| \| \|	The instructions relaxed_fma and relaxed_fnma have been renamed to relaxed_madd and relaxed_nmadd. https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md#binary-format
*	[FP16] Implement unary operations. (#6867)	Brendan Dahl	2024-08-27	1	-0/+7
\| \| \| \|	Specified at https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
*	Support more reference constants in wast scripts (#6865)	Thomas Lively	2024-08-26	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Spec tests use constants like `ref.array` and `ref.eq` to assert that exported functions return references of the correct types. Support more such constants in the wast parser. Also fix a bug where the interpretation of `array.new_data` for arrays of packed fields was not properly truncating the packed data. Move the function for reading fields from memory from literal.cpp to wasm-interpreter.h, where the function for truncating packed data lives. Other bugs prevent us from enabling any more spec tests as a result of this change, but we can get farther through several of them before failing. Update the comments about the failures accordingly.
*	[FP16] Implement arithmetic operations. (#6855)	Brendan Dahl	2024-08-21	1	-0/+8
\| \| \| \|	Specified at https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
*	Support `ref.extern n` in spec tests (#6858)	Thomas Lively	2024-08-21	2	-12/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Spec tests pass the value `ref.extern n`, where `n` is some integer, into exported functions that expect to receive externrefs and receive such values back out as return values. The payload serves to distinguish externrefs so the test can assert that the correct one was returned. Parse these values in wast scripts and represent them as externalized i31refs carrying the payload. We will need a different representation eventually, since some tests explicitly expect these externrefs to not be i31refs, but this suffices to get several new tests passing. To get the memory64 version of table_grow.wast passing, additionally fix the interpreter to handle growing 64-bit tables correctly. Delete the local versions of the upstream tests that can now be run successfully.
*	[NFC] Triage spec test problems (#6857)	Thomas Lively	2024-08-21	1	-81/+80
\| \| \| \|	Add comments to the spec test skip list briefly explaining why each skipped spec test must be skipped.
*	Fix encoding of heap type definitions (#6856)	Thomas Lively	2024-08-20	1	-3/+2
\| \| \| \| \| \| \| \|	The leading bytes that indicate what kind of heap type is being defined are bytes, but we were previously treating them as SLEB128-encoded values. Since we emit the smallest LEB encodings possible, we were writing the correct bytes in output files, but we were also improperly accepting binaries that used more than one byte to encode these values. This was caught by an upstream spec test.
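The difference between reading the leading byte as a raw byte and as an SLEB128 value can be sketched as follows. This is a minimal Python illustration, not Binaryen's reader: `decode_sleb128` and `read_kind_byte` are hypothetical helper names, and 0x5F is used as the example kind byte on the understanding that it is the struct type code in the GC proposal.

```python
def decode_sleb128(data, pos=0):
    """Decode one signed LEB128 value; return (value, bytes_consumed)."""
    result = 0
    shift = 0
    start = pos
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):
            # Last byte: sign-extend if its sign bit (0x40) is set.
            if byte & 0x40:
                result -= 1 << shift
            return result, pos - start


def read_kind_byte(data, pos=0):
    """Read the kind indicator as a single raw byte."""
    return data[pos], 1


minimal = bytes([0x5F])        # smallest encoding, what a writer emits
padded = bytes([0xDF, 0x7F])   # same SLEB128 value spread over two bytes

minimal_as_leb = decode_sleb128(minimal)
padded_as_leb = decode_sleb128(padded)
padded_as_byte = read_kind_byte(padded)
```

An SLEB128 reader sees the same value in both inputs, which is why the padded form was accepted; a raw-byte reader sees 0xDF for the padded input, which is not a known kind and can be rejected.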
*	Add the upstream spec testsuite as a submodule (#6853)	Thomas Lively	2024-08-20	2	-6/+99
\| \| \| \| \| \|	Run the upstream tests by default, except for a large list of them that do not successfully run. Remove the local version of those that do successfully run where the local version is entirely subsumed by the upstream version.
*	[Exceptions] Finish interpreter + optimizer support for try_table. (#6814)	Sébastien Doeraene	2024-08-20	1	-6/+7
\| \| \| \| \| \|	* Add interpreter support for exnref values. * Fix optimization passes to support try_table. * Enable the interpreter (but not in V8, see code) on exceptions.
*	Add a pass for minimizing recursion groups (#6832)	Thomas Lively	2024-08-17	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Most of our type optimization passes emit all non-public types as a single large rec group, which trivially ensures that different types remain different, even if they are optimized to have the same structure. Usually emitting a single large rec group is fine, but it also means that if the module is split, all of the types will need to be repeated in all of the split modules. To better support this use case, add a pass that can split the large rec group back into minimal rec groups, taking care to preserve separate type identities by emitting different permutations of the same group where possible or by inserting unused brand types to differentiate them.
*	Implement table.init (#6827)	Alon Zakai	2024-08-16	1	-3/+1
\| \| \| \| \|	Also use TableInit in the interpreter to initialize the module's table state, which will now handle traps properly, fixing #6431.
*	Testing: Add an env var to pick the V8 binary (#6836)	Alon Zakai	2024-08-16	1	-2/+2
\| \| \| \| \|	Also we had a mix of os.environ.get and os.getenv. Prefer the former, as the default value does actual work, so it's a little more efficient to not run it unnecessarily. That is, os.getenv('X', work()) is less efficient than os.environ.get('X') or work().
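The efficiency point above comes from Python evaluating function arguments before the call. A small sketch, where `work()` is a stand-in for the expensive default and the paths are made-up values:

```python
import os

calls = []


def work():
    """Stand-in for an expensive default computation (hypothetical)."""
    calls.append("work")
    return "/default/path/to/d8"


# Pretend the user has set the variable.
os.environ["X"] = "/custom/d8"

# The default argument is evaluated eagerly, so work() runs here
# even though its result is discarded.
eager = os.getenv("X", work())
calls_after_eager = len(calls)

# `or` short-circuits, so work() is not called when X has a non-empty value.
lazy = os.environ.get("X") or work()
calls_after_lazy = len(calls)
```

One behavioral difference to keep in mind: with the `or` form, a variable that is set to the empty string also falls through to `work()`, whereas `os.getenv('X', work())` would return the empty string.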
*	Monomorphization: Add a flag to control the required improvement (#6837)	Alon Zakai	2024-08-14	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The argument is the minimum benefit we must see for us to decide to optimize, e.g. --monomorphize --pass-arg=monomorphize-min-benefit@50 When the minimum benefit is 50% then if we reduce the cost by 50% through monomorphization then we optimize there. 95% would only optimize when we remove almost all the cost, etc. In practice I see 95% will actually tend to reduce code size overall, as while we add monomorphized versions of functions, we only do so when we remove a lot of work and size, and after inlining we gain benefits. However, 50% or even lower can lead to better benchmark results, in return for larger code size, just like with inlining. To be careful, the default is set to 95%. Previously we optimized whenever we saw any benefit at all, which is the same as requiring a minimum benefit of 0%. Old tests have the flag applied in this PR to set that value, so they do not change.
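The threshold described above can be sketched as a percentage comparison. This is an illustration under stated assumptions, not the pass's code: the function names are hypothetical, and the strict `>` comparison at the boundary is an assumption chosen so that a threshold of 0 means "any benefit at all".

```python
def percent_benefit(cost_before, cost_after):
    """Percentage of the original cost removed by monomorphization."""
    if cost_before <= 0:
        return 0.0
    return 100.0 * (cost_before - cost_after) / cost_before


def should_monomorphize(cost_before, cost_after, min_benefit):
    """Assumed rule: optimize only if the benefit exceeds the threshold."""
    return percent_benefit(cost_before, cost_after) > min_benefit


# A call whose cost drops from 100 to 40 is a 60% benefit.
at_50 = should_monomorphize(100, 40, 50)   # passes a 50% threshold
at_95 = should_monomorphize(100, 40, 95)   # fails the default 95% threshold
at_0 = should_monomorphize(100, 99, 0)     # any benefit passes at 0%
```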
*	[FP16] Implement relation operations. (#6825)	Brendan Dahl	2024-08-09	1	-0/+6
\| \| \| \|	Specified at https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
*	[FP16] Implement lane access instructions. (#6821)	Brendan Dahl	2024-08-08	1	-0/+3
\| \| \| \|	Specified at https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
*	[FP16] Disable float 16 fuzzing for now. (#6822)	Brendan Dahl	2024-08-07	1	-0/+2
\|
*	[FP16] Implement load and store instructions. (#6796)	Brendan Dahl	2024-08-06	1	-0/+2
\| \| \| \|	Specified at https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
*	Cost analysis: Remove "Unacceptable" hack (#6782)	Alon Zakai	2024-07-25	2	-0/+442
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We marked various expressions as having cost "Unacceptable", fixed at 100, to ensure we never moved them out from an If arm, etc. Giving them such a high cost avoids that problem - the cost is higher than the limit we have for moving code from conditional to unconditional execution - but it also means the total cost is unrealistic. For example, a function with one such instruction + an add (cost 1) would end up with cost 101, and removing the add would look insignificant, which causes issues for things that want to compare costs (like Monomorphization). To fix this, adjust some costs. The main change here is to give casts a cost of 5. I measured this in depth, see the attached benchmark scripts, and it looks clear that in both V8 and SpiderMonkey the cost of a cast is high enough to make it not worth turning an if with a ref.test arm into a select (which would always execute the test). Other costs adjusted here matter a lot less, because they are on operations that have side effects and so the optimizer will not move them from conditional to unconditional execution anyway, but I tried to make them a bit more realistic while I was removing "Unacceptable": * Give most atomic operations the 10 cost we've been using for atomic loads/stores. Perhaps wait and notify should be slower, but it seems like assuming fast switching might be more relevant. * Give growth operations a cost of 20, and throw operations a cost of 10. These numbers are entirely made up as I am not even sure how to measure them in a useful way (but, again, this should not matter much as they have side effects).
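The "101" example above can be worked through numerically. The sketch below assumes, for illustration only, that the expensive instruction is a cast, so its cost goes from 100 to 5; `relative_saving` is a hypothetical helper, not part of Binaryen.

```python
def relative_saving(costs, removed):
    """Fraction of the total cost saved by removing one instruction."""
    total = sum(costs.values())
    return costs[removed] / total


# Old scheme: "Unacceptable" cost of 100 plus an add of cost 1.
old_costs = {"cast": 100, "add": 1}
# New scheme (assuming the expensive instruction is a cast): cost 5.
new_costs = {"cast": 5, "add": 1}

old_saving = relative_saving(old_costs, "add")  # 1/101, under 1%
new_saving = relative_saving(new_costs, "add")  # 1/6, roughly 17%
```

Under the old costs, removing the add changes the total by less than one percent, so a cost comparison barely registers it; under the new costs the same removal is a visible fraction of the total.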
*	[threads] Update the fuzzer for shared types (#6771)	Thomas Lively	2024-07-18	1	-17/+9
\| \| \| \| \| \| \| \|	Update the fuzzer to both handle shared types in initial contents and create and use new shared types without crashing or producing invalid modules. Since V8 does not have a complete implementation of shared-everything-threads yet, disable fuzzing V8 when shared-everything is enabled. To avoid losing too much coverage of V8, disable shared-everything in the fuzzer more frequently than other features.
*	Validate features for types used in element segments (#6769)	Thomas Lively	2024-07-18	1	-0/+1
\|
*	Validate features for types used in tables (#6768)	Thomas Lively	2024-07-18	1	-0/+1
\| \| \| \|	We previously special-cased things like GC types, but switch to a more general solution of detecting what features a table's type requires.
*	[threads] ref.i31_shared requires shared-everything in validation (#6767)	Thomas Lively	2024-07-18	1	-0/+1
\|
*	[threads] Simplify and generalize reftype writing without GC (#6766)	Thomas Lively	2024-07-18	1	-1/+1
\| \| \| \| \| \|	Similar to #6765, but for types instead of heap types. Generalize the logic for transforming written reference types to types that are supported without GC so that it will automatically handle shared types and other new types correctly.
*	[threads] Simplify and generalize heap type writing without GC (#6765)	Thomas Lively	2024-07-17	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	We represent `ref.null`s as having bottom heap types, even when GC is not enabled. Bottom heap types are a feature of the GC proposal, so in that case the binary writer needs to write the corresponding top type instead. We previously had separate logic for this for each type hierarchy in the binary writer, but that did not handle shared types and would not have automatically handled other new types, either. Simplify and generalize the implementation and test that we can write `ref.null`s of shared types without GC enabled.
*	[threads] Fix shared ref.eq and disallow mixed-shareability (#6763)	Thomas Lively	2024-07-17	1	-0/+1
\| \| \| \| \| \| \|	Update the validator to reject mixed-shareability ref.eq, although this is still under discussion in https://github.com/WebAssembly/shared-everything-threads/issues/76. Fix the implementation of `Literal::operator==` to work properly with shared i31ref.
*	[threads] Validate all features required by ref.null (#6757)	Thomas Lively	2024-07-16	1	-0/+2
\| \| \| \| \| \| \|	`ref.null` of shared types should only be allowed when shared-everything is enabled, but we were previously checking only that reference types were enabled when validating `ref.null`. Update the code to check all features required by the null type and factor out shared logic for printing lists of missing feature options in error messages.
*	[NFC][threads] Ignore type-ssa-shared.wast in fuzzer (#6754)	Thomas Lively	2024-07-16	1	-0/+1
\| \| \| \|	The fuzzer does not yet properly handle initial contents containing shared types.
*	Remove non-standard `i31.new` (#6736)	Thomas Lively	2024-07-12	1	-1/+0
\| \| \| \|	The standard name for the instruction is `ref.i31`. Remove support for the non-standard name and update tests that were still using it.
*	[threads] ref.i31_shared (#6735)	Thomas Lively	2024-07-12	3	-3/+5
\| \| \| \| \| \| \|	Implement `ref.i31_shared`, the new instruction for creating references to shared i31s. Implement binary and text parsing and emitting as well as interpretation. Copy the upstream spec test for i31 and modify it so that all the heap types are shared. Comment out some parts that we do not yet support.
*	[StackIR] Allow StackIR to be disabled from the commandline (#6725)	Alon Zakai	2024-07-10	1	-0/+1
\| \| \| \| \| \| \| \| \|	Normally we use it when optimizing (above a certain level). This lets the user prevent it from being used even then. Also add optimization options to wasm-metadce so that this is possible there as well and not just in wasm-opt (this also opens the door to running more passes in metadce, which may be useful later).
*	Rename external conversion instructions (#6716)	Jérôme Vouillon	2024-07-08	2	-3/+5
\| \| \| \| \| \| \| \| \|	Rename the instructions `extern.internalize` to `any.convert_extern` and `extern.externalize` to `extern.convert_any` to follow the spec more closely. This was changed in https://github.com/WebAssembly/gc/issues/432. The legacy names are still accepted in text inputs and in the C and JS APIs.
*	[threads] Ignore shared-array.wast in fuzzer initial contents (#6706)	Thomas Lively	2024-06-26	1	-0/+1
\|
*	[threads] Validate shared-polymorphic instructions (#6702)	Thomas Lively	2024-06-25	1	-0/+1
\| \| \| \|	Such as `ref.eq`, `i31.get_{s,u}`, and `array.len`. Also validate that struct and array operations work on shared structs and arrays.
*	Re-enable spec tests requiring multivalue (#6684)	Thomas Lively	2024-06-20	1	-2/+0
\| \| \|	And delete tests that no longer pass now that multivalue is standard.
*	Validate memarg offsets (#6683)	Thomas Lively	2024-06-20	1	-2/+1
\| \| \| \| \|	For 32-bit memories, the offset value must be in the u32 range. Update the address.wast spec test to assert that a module with an overlarge offset value is invalid rather than malformed.
*	Validate that names are valid UTF-8 (#6682)	Thomas Lively	2024-06-19	1	-4/+0
\| \| \| \| \| \|	Add an `isUTF8` utility and use it in both the text and binary parsers. Add missing checks for overlong encodings and overlarge code points in our WTF8 reader, which the new utility uses. Re-enable the spec tests that test UTF-8 validation.
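The checks mentioned above (overlong encodings and overlarge code points) can be sketched as a small validator. This is a Python illustration, not Binaryen's `isUTF8`: the function name `is_utf8` is chosen for the example, and rejecting surrogate code points is an assumption based on standard UTF-8 rules rather than something the commit message states.

```python
def is_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8."""
    i, n = 0, len(data)
    while i < n:
        lead = data[i]
        if lead < 0x80:
            i += 1
            continue
        # Sequence length, initial payload bits, and the smallest code
        # point that legitimately needs this many bytes.
        if 0xC0 <= lead < 0xE0:
            length, code, minimum = 2, lead & 0x1F, 0x80
        elif 0xE0 <= lead < 0xF0:
            length, code, minimum = 3, lead & 0x0F, 0x800
        elif 0xF0 <= lead < 0xF8:
            length, code, minimum = 4, lead & 0x07, 0x10000
        else:
            return False  # stray continuation byte or invalid lead byte
        if i + length > n:
            return False  # truncated sequence
        for b in data[i + 1:i + length]:
            if (b & 0xC0) != 0x80:
                return False  # not a continuation byte
            code = (code << 6) | (b & 0x3F)
        if code < minimum:
            return False  # overlong encoding
        if code > 0x10FFFF:
            return False  # overlarge code point
        if 0xD800 <= code <= 0xDFFF:
            return False  # surrogate (assumed rejected, per standard UTF-8)
        i += length
    return True


valid = is_utf8("h\u00e9llo \u2603".encode("utf-8"))
overlong = is_utf8(b"\xC0\x80")            # two-byte encoding of U+0000
overlarge = is_utf8(b"\xF4\x90\x80\x80")   # would decode to 0x110000
```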
*	Fix validation of unused LEB128 bits (#6680)	Thomas Lively	2024-06-19	1	-1/+0
\| \| \| \| \|	The unused bits must be a sign extension of the significant value, but we were previously only validating that unsigned LEBs had their unused bytes set to zero. Re-enable the spec test that checks for proper validation.
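The rule above can be illustrated for 32-bit values, where a maximal 5-byte LEB128 carries only 4 significant bits in its last byte. This sketch is an assumption-laden illustration, not Binaryen's reader, and the function names are hypothetical: for a signed value, bits 4 to 6 of the last byte must copy the sign bit (bit 3); for an unsigned value they must be zero.

```python
def valid_final_byte_s32(byte):
    """Check the 5th byte of a signed 32-bit LEB128.

    Bits 0-3 are significant (bit 3 is the sign bit); bits 4-6 are
    unused and must be a sign extension, so bits 3-6 are all 0 or all 1.
    """
    no_continuation = (byte & 0x80) == 0
    return no_continuation and (byte & 0x78) in (0x00, 0x78)


def valid_final_byte_u32(byte):
    """Check the 5th byte of an unsigned 32-bit LEB128: bits 4-6 must be 0."""
    no_continuation = (byte & 0x80) == 0
    return no_continuation and (byte & 0x70) == 0


negative_ok = valid_final_byte_s32(0x7F)   # sign bit 1, unused bits all 1
negative_bad = valid_final_byte_s32(0x0F)  # sign bit 1, unused bits 0
positive_ok = valid_final_byte_s32(0x07)   # sign bit 0, unused bits 0
```

Checking only that the unused bits are zero, as for unsigned values, would wrongly accept `0x0F` and wrongly reject `0x7F` for a signed value.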
*	Check malformed mutability on imported globals (#6679)	Thomas Lively	2024-06-18	1	-1/+0
\| \| \|	And re-enable the globals.wast spec test, which checks this.
*	Re-enable binary.wast spec test (#6677)	Thomas Lively	2024-06-18	1	-3/+0
\| \| \| \| \| \|	Fix the wast parser to accept IDs on quoted modules, remove tests that are invalidated by the multimemory proposal, and add validation that the total number of variables in a function is less than 2^32 and that the code section is present if there is a non-empty function section.
*	[Parser] Fix bug in unreachable fallback logic (#6676)	Thomas Lively	2024-06-18	1	-3/+0
\| \| \| \| \| \| \| \| \|	When popping past an unreachable instruction would lead to popping from an empty stack or popping an incorrect type, we need to avoid popping and produce new Unreachable instructions instead to ensure we parse valid IR. The logic for this was flawed and made the synthetic Unreachable come before the popped unreachable child, which was not correct in the case that that popped unreachable was a branch or other non-trapping instruction. Fix and simplify the logic and re-enable the spec test that uncovered the bug.
*	fix(#6671): fix possible stack buffer overflow in gen-s-parser.inc (#6678)	mtb	2024-06-18	1	-0/+6
\| \| \| \| \| \|	The stack buffer overflow is occurring because memcpy(buf, op.data(), op.size()); can write up to op.size() bytes into buf, but buf is only 33 bytes long. If op.size() is greater than 33, this will result in a buffer overflow.
*	Reject invalid section IDs (#6675)	Thomas Lively	2024-06-18	1	-1/+0
\| \| \| \| \| \|	Rather than treating them as custom sections. Also fix UB where invalid `Section` enum values could be used as keys in a map. Use the raw `uint8_t` section IDs as keys instead. Re-enable a disabled spec test that was failing because of this bug and UB.
*	Enable more spec tests (#6669)	Thomas Lively	2024-06-17	2	-43/+30
\| \| \| \| \|	Re-triage all the disabled spec tests and re-enable many of them. Improve the module splitting logic to correctly handle (by skipping) quoted modules and their associated assertions.
*	[threads] Binary reading and writing of shared composite types (#6664)	Thomas Lively	2024-06-14	1	-0/+4
\| \| \| \|	Also update the parser so that implicit type uses are not matched with shared function types.
*	Remove obsolete parser code (#6607)	Thomas Lively	2024-05-29	1	-617/+598
\| \| \| \| \|	Remove `SExpressionParser`, `SExpressionWasmBuilder`, and `cashew::Parser`. Simplify gen-s-parser.py. Remove the --new-wat-parser and --deprecated-wat-parser flags.
*	Fuzzer: Stop testing with TurboFan as Turboshaft is rolling out and is ↵	Alon Zakai	2024-05-28	1	-7/+0
\| \| \| \|	faster (#6623)
*	Fix binary emitting of br_if with a refined value by emitting a cast (#6510)	Alon Zakai	2024-05-16	1	-1/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This makes us compliant with the wasm spec by adding a cast: we use the refined type for br_if fallthrough values, and the wasm spec uses the branch target. If the two differ, we add a cast after the br_if to make things match. Alternatively we could match the wasm spec's typing in our IR, but we hope the wasm spec will improve here, and so this will only be temporary in that case. Even if not, this is useful because by using the most refined type in the IR we optimize in the best way possible, and only suffer when we emit fixups in the binary, but in practice those cases are very rare: br_if is almost always dropped rather than used, in real-world code (except for fuzz cases and exploits). We check carefully when a br_if value is actually used (and not dropped) and its type actually differs, and it does not already have a cast. The last condition ensures that we do not keep adding casts over repeated roundtripping.