forks/binaryen.git -

	Commit message	Author	Age	Files	Lines
*	StringNew: Trap on OOB start index (#6438)	Alon Zakai	2024-03-25	2	-1/+54
\|
*	Generate interesting strings in fuzzer (#6430)	Thomas Lively	2024-03-23	1	-2/+38
\| \| \| \|	Instead of generating exclusively ASCII strings, also generate empty strings and strings containing various Unicode characters and unpaired surrogates.
*	Remove "minimal" JS import/export legalization (#6428)	Sam Clegg	2024-03-22	8	-152/+8
\| \| \| \| \| \| \| \| \| \| \| \|	This change removes the "minimal" mode from `LegalizeJSInterface`, which was added in #1883. The idea behind that mode was to avoid legalizing most functions, except those we know JS will be calling, because for dynamic linking we always want the non-legalized version to be shared between wasm modules. These days we solve this problem in a different way, with `legalize-js-interface-export-originals`, which exports the original functions alongside the legalized ones. Emscripten then always prefers the `$orig` functions when doing dynamic linking.
*	[Strings] Represent string values as WTF-16 internally (#6418)	Thomas Lively	2024-03-22	20	-207/+339
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	WTF-16, i.e. arbitrary sequences of 16-bit values, is the encoding of Java and JavaScript strings, and using the same encoding makes the interpretation of string operations trivial, even when accounting for non-ASCII characters. Specifically, use little-endian WTF-16. Re-encode string constants from WTF-8 to WTF-16 in the parsers, then back to WTF-8 in the writers. Update the constructor for string `Literal`s to interpret the string as WTF-16 and store a sequence of WTF-16 code units, i.e. 16-bit integers. Update `Builder::makeConstantExpression` accordingly to convert from the new `Literal` string representation back to a WTF-16 string. Update the interpreter to remove the logic for detecting non-ASCII characters and bailing out. The naive implementations of all the string operations are correct now that our string encoding matches the JS string encoding.
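The re-encoding step described above can be illustrated outside of Binaryen. This is a minimal Python sketch, not the actual C++ implementation: the function names are hypothetical, and it relies on Python's `surrogatepass` error handler to let unpaired surrogates through. As a caveat, `surrogatepass` is slightly more permissive than WTF-8 proper, since it also accepts surrogate pairs encoded as two 3-byte sequences.

```python
def wtf8_to_wtf16le(data):
    """Re-encode WTF-8 bytes as little-endian WTF-16 bytes (hypothetical helper)."""
    # "surrogatepass" keeps unpaired surrogates instead of raising an error.
    text = data.decode("utf-8", "surrogatepass")
    return text.encode("utf-16-le", "surrogatepass")


def wtf16le_to_wtf8(data):
    """Re-encode little-endian WTF-16 bytes back to WTF-8 bytes (hypothetical helper)."""
    text = data.decode("utf-16-le", "surrogatepass")
    return text.encode("utf-8", "surrogatepass")


def code_units(data):
    """Split little-endian WTF-16 bytes into a list of 16-bit code units."""
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]
```

With code units in hand, operations such as length or indexing become plain list operations, which is the simplification the commit message refers to.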
*	Remove extra conversion in string test (#6426)	Thomas Lively	2024-03-22	1	-6/+2
\| \| \| \|	`string.encode_wtf16_array` operates on stringref, not on wtf16 string views, so this conversion was causing validation errors when passed to V8 by the fuzzer.
*	Add missing conversions in string slice tests (#6424)	Thomas Lively	2024-03-22	1	-3/+9
\| \| \| \|	Our validator apparently does not catch this type issue yet. For now just fix the tests.
*	Update file name in INITIAL_CONTENTS_IGNORE (#6425)	Thomas Lively	2024-03-22	1	-1/+1
\| \| \| \|	The test file was renamed, but the fuzzer still used the old name in INITIAL_CONTENTS_IGNORE.
*	Precompute: Mark StringEncode as non-removable, just like ArrayCopy (#6423)	Alon Zakai	2024-03-22	2	-7/+68
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The only StringEncode we support is the one that writes into an array, so it has the same effects as ArrayCopy. Precompute needs to be made aware of such side effects manually (as we already do for ArrayCopy etc.): it simply tries to execute code in the interpreter, and if it succeeds it replaces the code with the result; it does not check for side effects. (Checking for side effects would prevent optimizing cases where the side effects do not happen, as we check them statically; e.g. dividing by a non-zero constant does not trap, but a division would be seen as having a potential trap effect.) I verified that no other string operation is hit by this: all the others emit or operate on immutable strings; it is just StringEncode that is basically an Array operation that appears in the Strings proposal.
*	[Strings] Handle overflow in string.encode_wtf16_array (#6422)	Alon Zakai	2024-03-22	2	-2/+43
\|
*	CodeFolding: Fix up old EH when we fold away an If (#6420)	Alon Zakai	2024-03-22	2	-1/+69
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The pass does (among other things) this:
	(if
	  condition
	  X
	  X
	)
	=>
	(block
	  (drop
	    condition
	  )
	  X ;; deduplicated
	)
	After that the condition is now nested in a block, so we may need EH fixups if it contains a pop.
*	Mark non-closed types as requiring GC (#6421)	Thomas Lively	2024-03-21	2	-1/+14
\| \| \|	This omission was able to cause a problem with text round-tripping.
*	[Strings] Implement TODOs in the fuzzer (#6416)	Alon Zakai	2024-03-21	1	-1/+6
\|
*	[Strings] Add (partial) validation for StringNew (#6417)	Alon Zakai	2024-03-21	1	-1/+34
\|
*	[Strings] Emit unreachable when a string instruction cannot be emitted properly (#6415)	Alon Zakai	2024-03-21	1	-0/+13
\| \| \| \| \|	See WebAssembly/stringref#66
*	[Strings] Fix StringSlice end computation (#6414)	Alon Zakai	2024-03-21	2	-3/+17
\| \| \| \| \|	Like JS string slicing, an out-of-bounds end index is fine: we clamp it to the end of the string. This also matches the behavior in V8 and the spec.
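The end-index clamping can be sketched as follows. This is a Python illustration over a list of code units, not the interpreter's code; the function name is hypothetical, indices are assumed to be non-negative, and the handling of an oversized start index (an empty result) is an assumption that the commit message does not state.

```python
def slice_code_units(units, start, end):
    """Return units[start:end], clamping an out-of-bounds end (hypothetical helper)."""
    length = len(units)
    # An end index past the string is fine: clamp it to the string length.
    end = min(end, length)
    # Assumption: a start index past the (clamped) end yields an empty slice.
    start = min(start, end)
    return units[start:end]
```

For example, slicing the three code units of "hi!" with `start=1` and `end=100` returns the last two code units rather than trapping.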
*	Revert "Strings: Disable precomputing for now (#6412)" (#6413)	Alon Zakai	2024-03-20	3	-74/+13
\| \| \| \| \| \| \| \|	This reverts commit 70ac213fce134840609190a5d3a18118a089ba8a. Reverts #6412. On second thought, we found a way to make fixing this less urgent, and the code size downsides of this are worrying, so let's revert it.
*	Strings: Disable precomputing for now (#6412)	Alon Zakai	2024-03-20	3	-13/+74
\| \| \| \|	Our UTF implementation still seems not fully stable, as we have reports of issues. Disable precomputing of strings for now.
*	[Strings] Avoid mishandling unicode in StringConcat (#6411)	Roberto Lublinerman	2024-03-19	2	-1/+32
\|
*	Atomics: Handle timeouts in waits in the (single-threaded) interpreter (#6408)	Alon Zakai	2024-03-19	2	-3/+30
\| \| \| \| \| \| \| \| \|	The interpreter does not run multiple threads, and it was returning 0 from atomic.wait, which means it was woken up. But it is more correct for it to return 2, which means it timed out - which is actually the case, as no other thread exists that can wake it up. However, even that is not good for fuzzing as the timeout may be infinite or large, so just emit a host limit error on any timeout for now, until we actually implement threads.
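A rough sketch of the behavior described above, in Python rather than the interpreter's C++. The result codes (0 for woken, 1 for not-equal, 2 for timed out) come from the WebAssembly threads proposal; the `HostLimit` exception, the function name, and the flat-list memory model are hypothetical stand-ins, and the sketch assumes every wait that would actually block is reported as a host limit.

```python
WAIT_OK = 0         # woken by another thread (impossible without threads)
WAIT_NOT_EQUAL = 1  # the loaded value did not match the expected value
WAIT_TIMED_OUT = 2  # the timeout expired before any wake-up


class HostLimit(Exception):
    """Hypothetical stand-in for the interpreter's host limit error."""


def atomic_wait(memory, address, expected, timeout_ns):
    """Single-threaded model of atomic.wait (hypothetical helper)."""
    if memory[address] != expected:
        # No waiting happens at all in this case.
        return WAIT_NOT_EQUAL
    # No other thread exists to wake us, so the wait could only time out.
    # The timeout may be infinite or large, so report a host limit instead
    # of returning WAIT_TIMED_OUT.
    raise HostLimit("atomic.wait would block in a single-threaded interpreter")
```

Raising a distinct error lets a fuzzer harness ignore the testcase instead of comparing results that depend on timing.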
*	[Strings] Implement stringview_wtf16.slice (#6404)	Alon Zakai	2024-03-19	4	-5/+95
\|
*	Typed continuations: suspend instructions (#6393)	Frank Emrich	2024-03-19	29	-22/+281
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This PR is part of a series that adds basic support for the [typed continuations/wasmfx proposal](https://github.com/wasmfx/specfx). This particular PR adds support for the `suspend` instruction for suspending with a given tag, documented [here](https://github.com/wasmfx/specfx/blob/main/proposals/continuations/Overview.md#instructions). These instructions are of the form `(suspend $tag)`. Assuming that `$tag` is defined with _n_ `param` types `t_1` to `t_n`, the instruction consumes _n_ arguments of types `t_1` to `t_n`. Its result type is the same as the `result` type of the tag. Thus, the folded textual representation looks like `(suspend $tag arg1 ... argn)`. Support for the instruction is implemented in both the old and the new wat parser. Note that this PR does not implement validation of the new instruction. This PR also fixes finalization of `cont.new`, `cont.bind` and `resume` nodes in those cases where any of their children are unreachable.
*	[Strings] Avoid mishandling unicode in interpreter (#6405)	Thomas Lively	2024-03-18	2	-5/+135
\| \| \| \| \| \| \|	Our interpreter implementations of `stringview_wtf16.length`, `stringview_wtf16.get_codeunit`, and `string.encode_wtf16_array` are not Unicode-aware, so they were previously incorrect in the face of multi-byte code units. As a fix, bail out of the interpretation if there is a non-ASCII code point that would make our naive implementation incorrect.
*	[NFC] Fix build error on RISC-V 64 (#6410)	moui0	2024-03-18	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar issue as: #6330
	FAILED: src/passes/CMakeFiles/passes.dir/Precompute.cpp.o
	/usr/bin/c++ -I/build/binaryen/src/binaryen-version_117/src -I/build/binaryen/src/binaryen-version_117/third_party/llvm-project/include -I/build/binaryen/src/binaryen-version_117/build -march=rv64gc -mabi=lp64d -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -Wp,-D_GLIBCXX_ASSERTIONS -g -ffile-prefix-map=/build/binaryen/src=/usr/src/debug/binaryen -DBUILD_LLVM_DWARF -Wall -Werror -Wextra -Wno-unused-parameter -Wno-dangling-pointer -fno-omit-frame-pointer -fno-rtti -Wno-implicit-int-float-conversion -Wno-unknown-warning-option -Wswitch -Wimplicit-fallthrough -Wnon-virtual-dtor -fPIC -fdiagnostics-color=always -O3 -DNDEBUG -UNDEBUG -std=c++17 -MD -MT src/passes/CMakeFiles/passes.dir/Precompute.cpp.o -MF src/passes/CMakeFiles/passes.dir/Precompute.cpp.o.d -o src/passes/CMakeFiles/passes.dir/Precompute.cpp.o -c /build/binaryen/src/binaryen-version_117/src/passes/Precompute.cpp
	In file included from /build/binaryen/src/binaryen-version_117/src/wasm-traversal.h:30,
	from /build/binaryen/src/binaryen-version_117/src/pass.h:24,
	from /build/binaryen/src/binaryen-version_117/src/ir/intrinsics.h:20,
	from /build/binaryen/src/binaryen-version_117/src/ir/effects.h:20,
	from /build/binaryen/src/binaryen-version_117/src/passes/Precompute.cpp:30:
	In copy constructor ‘wasm::SmallVector<wasm::Expression, 10>::SmallVector(const wasm::SmallVector<wasm::Expression, 10>&)’,
	inlined from ‘constexpr std::pair<_T1, _T2>::pair(const _T1&, const _T2&) [with _U1 = wasm::Select* const; _U2 = wasm::SmallVector<wasm::Expression, 10>; typename std::enable_if<(std::_PCC<true, _T1, _T2>::_ConstructiblePair<_U1, _U2>() && std::_PCC<true, _T1, _T2>::_ImplicitlyConvertiblePair<_U1, _U2>()), bool>::type <anonymous> = true; _T1 = wasm::Select const; _T2 = wasm::SmallVector<wasm::Expression, 10>]’ at /usr/include/c++/13.2.1/bits/stl_pair.h:559:21,
	inlined from ‘T& wasm::InsertOrderedMap<Key, T>::operator[](const Key&) [with Key = wasm::Select; T = wasm::SmallVector<wasm::Expression, 10>]’ at /build/binaryen/src/binaryen-version_117/src/support/insert_ordered.h:112:29:
	/build/binaryen/src/binaryen-version_117/src/support/small_vector.h:42:38: error: ‘<unnamed>.wasm::SmallVector<wasm::Expression, 10>::fixed’ is used uninitialized [-Werror=uninitialized]
	42 \| template<typename T, size_t N> class SmallVector {
	\| ^~~~~~~~~~~
	In file included from /build/binaryen/src/binaryen-version_117/src/passes/Precompute.cpp:38:
	/build/binaryen/src/binaryen-version_117/src/support/insert_ordered.h: In function ‘T& wasm::InsertOrderedMap<Key, T>::operator[](const Key&) [with Key = wasm::Select; T = wasm::SmallVector<wasm::Expression, 10>]’:
	/build/binaryen/src/binaryen-version_117/src/support/insert_ordered.h:112:29: note: ‘<anonymous>’ declared here
	112 \| std::pair<const Key, T> kv = {k, {}};
	\| ^~
*	DeadArgumentElimination/SignaturePruning: Prune params even if called with effects (#6395)	Alon Zakai	2024-03-18	7	-125/+822
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before this PR, when we saw a param was unused we sometimes could not remove it. For example, if there was one call like this:
	(call $target
	  (call $other)
	)
	That nested call has effects, so we can't just remove it from the outer call - we'd need to move it first. That motion was hard to integrate, which is why it was left out, but it turns out to be very important in some cases. E.g. in Java it is common to have such calls that send the `this` parameter as the result of another call; not being able to remove such params meant we kept those nested calls alive, creating empty structs just to have something to send there. To fix this, this builds on top of #6394, which makes it easier to move all children out of a parent, leaving only nested things that can be easily moved around and removed. In more detail, DeadArgumentElimination/SignaturePruning track whether we run into effects that prevent removing a param. If we do, then we queue an operation to move the children out, which we do using a new utility, ParamUtils::localizeCallsTo. The pass then does another iteration after that operation. Alternatively we could try to move things around immediately, but that is quite hard: those passes already track a lot of state. It is simpler to do the fixup in an entirely separate utility. That does come at the cost of the utility doing another pass on the module and the pass itself running another iteration, but this situation is not the most common.
*	[Strings] Implement string.concat in the interpreter (#6403)	Roberto Lublinerman	2024-03-15	2	-1/+40
\|
*	[Strings] Implement string.encode_wtf16_array (#6402)	Alon Zakai	2024-03-14	2	-1/+105
\|
*	[Strings] Fix precomputing of StringEq (#6401)	Alon Zakai	2024-03-14	2	-35/+46
\| \| \| \| \| \| \| \|	We incorrectly overrode the string operations in the interpreter's subclasses. But string operations can be implemented in the topmost class there (as they depend on no module state), so just implement them there, once, in a proper way. This fixes StringEq by removing its override, and moves the others to the right place.
*	Fix ASan/TSan errors by using LLVM 18 (#6396)	Alon Zakai	2024-03-14	1	-8/+16
\| \| \| \| \|	GitHub Actions CI started to fail for no obvious reason. It seems some VM change happened, and a very recent LLVM/clang is needed to keep running sanitizers. LLVM 18 is the first version that works.
*	[NFC] Refactor ChildLocalizer to handle unreachable code better (#6394)	Alon Zakai	2024-03-14	3	-24/+133
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is NFC in the current users, but is necessary functionality for a later PR. ChildLocalizer moves children into locals as needed. It used to stop when it saw the first unreachable. After this change we move such unreachable children out of the parent as well, making this more uniform: all interacting effects are moved out, and all that is left nested in the parent can be moved around and removed as desired. Also add a getReplacement helper that makes using this easier. This cannot be tested comprehensively with the current user as that user will not call this code path on an unreachable parent at all, so this just adds what can be tested. The later PR will have tests for all corner cases.
*	DCE: Fix old EH on a pop that gets moved in a catch body (#6400)	Alon Zakai	2024-03-14	2	-9/+79
\|
*	Fix a build error when assertions are disabled (#6397)	Thomas Lively	2024-03-13	1	-2/+4
\| \| \| \| \|	Add `[[maybe_unused]]` to variables that are only used in assertions. In builds without assertions enabled, these were causing compiler errors about unused variables.
*	Remove legacy GC encodings (#5874)	Thomas Lively	2024-03-12	2	-151/+31
\| \| \| \| \|	It was previously possible to opt in to using the legacy GC opcodes with a build time flag. Now that WasmGC has shipped and users have migrated to the standard opcodes, remove the option to use the legacy encodings.
*	Fix Emscripten build with -Wno-unused-command-line-argument (#6392)	Thomas Lively	2024-03-12	1	-1/+5
\| \| \| \|	Emscripten had started complaining about the repeated NODERAWFS arguments in the link command, but they would be nontrivial to deduplicate.
*	Fuzzer: Fix up null outputs in wasm2js optimized builds (#6374)	Alon Zakai	2024-03-08	1	-0/+15
\| \| \| \| \| \| \| \|	This is fallout from #6310 where we moved to use fuzz_shell.js for all fuzzing purposes. That script doesn't know wasm types, all it has on the JS side is the number of arguments to a function, and it passes in null for them all regardless of their type. That normally works fine - null is cast to the right type upon use - but in wasm2js optimized builds we can remove casts, which can make that noticeable.
*	Check for unreachable in `Select::finalize(Type)` (#6389)	Thomas Lively	2024-03-08	1	-1/+9
\| \| \| \|	Previously selects finalized with explicit types would never be marked unreachable, even when they should have been.
*	[NFC] Clean up the unreachable replacement code in Print.cpp (#6388)	Thomas Lively	2024-03-08	1	-108/+56
\| \| \| \| \| \| \|	When instructions cannot be printed because the children from which they are supposed to get their type immediates are unreachable or null, we print blocks of their dropped children followed by unreachables. But the logic for making this happen was more complicated than necessary and in fact included dead code. Clean it up.
*	Fix printing of bulk array ops (#6387)	Thomas Lively	2024-03-08	4	-66/+212
\| \| \| \| \| \| \| \| \|	When the bulk array ops had unreachable or null array types, they were replaced with blocks, but not using the correct code that also prints all their children as dropped followed by an unreachable. This meant that the text output in those cases did not parse as a valid module. Fix the bug. A follow-up PR will simplify the code to prevent similar bugs from occurring in the future.
*	Regenerate test output (#6385)	Thomas Lively	2024-03-07	3	-26/+30
\| \| \| \|	The checked in test outputs were out of sync with what the auto update script produces.
*	[IRBuilder] Validate tuple arities (#6384)	Thomas Lively	2024-03-07	1	-0/+12
\| \| \| \|	Throw errors if tuple arity immediates are less than 2 or if tuple index immediates are out of bounds.
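The two checks named above amount to a pair of range tests. The following Python sketch uses hypothetical function names and `ValueError` as the error type; the real checks live in the C++ IRBuilder and report errors through its own mechanism.

```python
def check_tuple_arity(arity):
    """Reject tuple arity immediates below 2 (hypothetical helper)."""
    if arity < 2:
        raise ValueError("tuple arity must be at least 2, got %d" % arity)


def check_tuple_index(arity, index):
    """Reject tuple index immediates outside [0, arity) (hypothetical helper)."""
    check_tuple_arity(arity)
    if not 0 <= index < arity:
        raise ValueError("tuple index %d out of bounds for arity %d" % (index, arity))
```

Validating the immediates up front means later code can assume a well-formed tuple shape instead of re-checking it.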
*	Expose features option in C API binary reading (#6380)	Surma	2024-03-07	4	-4/+41
\| \| \| \|	This allows reading a module that requires a particular feature set. The old API assumed only MVP features.
*	Do not write assertions to split.wast for spec tests (#6383)	Thomas Lively	2024-03-07	2	-3/+3
\| \| \| \| \| \| \| \| \| \|	As part of our running of spec tests, we split out each module in a test script into a separate text file for processing with wasm-opt. We previously included the test assertions corresponding to the module into that text file, where they were ignored by the legacy text parser. The new parser errors out due to the extra tokens after the module, though, so to avoid problems once we switch to the new parser, stop including the assertions in those text files. Also remove a nearby unused argument as a drive-by cleanup.
*	Handle extended const segment offsets in the fuzzer (#6382)	Thomas Lively	2024-03-07	2	-13/+14
\| \| \| \| \| \|	The fuzzer already had logic to remove all references to non-imported globals from global initializers and data segment offsets, but it was missing for element segment offsets. Add it, and also add a missing check line for the new test that uncovered this bug as initial fuzzer input.
*	Fix EH fuzz bugs (#6381)	Thomas Lively	2024-03-07	2	-2/+2
\| \| \| \| \|	Due to a typo, the fuzzer was making externrefs when it should have been making exnrefs. Fix that and also let eh-utils.cpp know that TryTable exists to avoid an assertion failure.
*	Print `(offset ...)` in data and element segments (#6379)	Thomas Lively	2024-03-06	2	-2/+23
\| \| \| \| \| \| \|	Previously we just printed the offset instruction(s) directly, which is a valid shorthand only when there is a single instruction. In the case of extended constant instructions, there can potentially be multiple instructions, in which case the explicit `offset` clause is required. Print the full clause when necessary.
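The printing rule can be sketched like this. It is a Python illustration only: it assumes the offset is available as a flat list of instruction strings, which is not how Binaryen's nested IR stores it, and the function name is hypothetical.

```python
def print_segment_offset(instrs):
    """Print a segment offset, using the shorthand only when valid (hypothetical helper).

    `instrs` is a list of plain instruction strings such as "i32.const 8".
    """
    if len(instrs) == 1:
        # A single folded instruction is a valid shorthand for the offset.
        return "(%s)" % instrs[0]
    # Multiple instructions (extended const) need the explicit clause.
    return "(offset %s)" % " ".join("(%s)" % instr for instr in instrs)
```

For instance, `["i32.const 8"]` prints as `(i32.const 8)`, while `["global.get $g", "i32.const 8", "i32.add"]` prints inside an explicit `(offset ...)` clause.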
*	Add sourcemap support to wasm-metadce and wasm-merge (#6372)	Jérôme Vouillon	2024-03-06	12	-12/+271
\|
*	[Parser] Improve parsed IR for multivalue returns (#6378)	Thomas Lively	2024-03-05	2	-11/+13
\| \| \| \| \|	Rather than reassembling a tuple from multiple pops, let the pop implementation assemble the tuple. This produces less code in cases where there is already a tuple of the proper size on top of the stack. It also simplifies the code.
*	Fuzzer: Standardize notation for exception prefixes (#6369)	Alon Zakai	2024-03-05	2	-4/+10
\| \| \| \| \| \| \| \| \|	We had `exception:` in one and `exception thrown:` in another. Making those consistent allows fuzz_shell.js to print the exception after that prefix, which sometimes makes debugging easier. Also canonicalize tag names. Like funcref names, JS VMs print out the internal name, which can change after opts, so canonicalize it.
*	[Parser] Propagate debug locations like the old parser (#6377)	Thomas Lively	2024-03-05	2	-0/+81
\| \| \| \| \| \| \| \| \|	Add a pass that propagates debug locations to unannotated child and sibling expressions after parsing. The new parser on its own only attaches debug locations to directly annotated instructions, but this pass, which we run unconditionally, emulates the behavior of the previous parser for compatibility with existing programs. It does unintuitive things to programs using the non-nested format because it runs on nested Binaryen IR, so we may want to rethink this at some point.
*	Fuzzer: Ignore fuzz testcases that make VMs run out of stack (#6376)	Alon Zakai	2024-03-04	1	-8/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the stack runs out is observable, and optimizations can change it, so we must ignore such testcases. Also add some logic to help debug stuff like this, as suggested by tlively in the past, to add some metrics on the reasons we ignored a testcase. That emits something like this: (ignored 253 iters, for reasons {'too many errors vs calls': 230, '[host limit ': 20, 'uninitialized non-defaultable local': 3}) As a drive-by, make the metrics print wasm bytes/iter rather than by second (the former is easy to compute from the latter anyhow, and the latter is more interesting I think).
*	[Parser] Support prologue and epilogue sourcemap annotations (#6370)	Thomas Lively	2024-03-04	7	-33/+91
\| \| \| \| \| \| \|	Also fix a bug with sourcemap annotations on folded `if` conditions. Update IRBuilder to apply prologue and epilogue source locations when beginning and ending a function scope. Add basic support in the parser for explicitly tracking annotations on module fields, although for now they are only acted on in the case of prologue source location annotations.