forks/binaryen.git -

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	Ensure printed tuple.extract arity is valid (#6487)	Thomas Lively	2024-04-11	1	-1/+6
\| \| \| \| \| \|	We previously printed the size of the tuple operand as the arity, but that printed `1` when the operand is unreachable. We don't allow our text input to use `1` as the arity, so don't print it, either. Instead, print the smallest valid arity, `2`, in this case.
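The arity rule described above can be sketched as a single clamp. This is only an illustration in Python; the helper name `printed_tuple_arity` is hypothetical and is not Binaryen's printer code.

```python
# Hypothetical helper, not Binaryen's actual printer code.
MIN_VALID_ARITY = 2  # the text format rejects an arity of 1


def printed_tuple_arity(operand_size: int) -> int:
    """Return the arity to print for tuple.extract.

    An unreachable operand reports a size of 1, which is not a valid
    arity in the text format, so fall back to the smallest valid one.
    """
    return max(operand_size, MIN_VALID_ARITY)
```

For a reachable tuple operand the size is already at least 2, so the clamp only changes the unreachable case.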
*	[Parser] Parse contref and nullcontref types (#6485)	Thomas Lively	2024-04-10	2	-0/+16
\|
*	Heap2Local: Fix signed struct/array reads (#6484)	Alon Zakai	2024-04-10	1	-5/+10
\| \| \| \| \| \|	In #6480 I forgot that StructGet can be signed, which means we need to emit a sign-extend. Arrays already copied the field as part of Array2Struct.
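As a rough illustration of why a signed read needs a sign-extend once a packed field lives in a full-width local, the sketch below models the read in Python. The function name and the integer model are assumptions made for the example; the pass itself emits wasm instructions.

```python
# Hypothetical model of reading a packed (8- or 16-bit) field from a local.
def read_packed_field(stored: int, bits: int, signed: bool) -> int:
    """Read the low `bits` bits of `stored`, optionally sign-extending."""
    mask = (1 << bits) - 1
    value = stored & mask  # an unsigned read just masks the low bits
    if signed and value & (1 << (bits - 1)):
        value -= 1 << bits  # top bit set: extend the sign
    return value
```

With an 8-bit field holding `0xFF`, the unsigned read yields 255 while the signed read yields -1, which is the difference the fix accounts for.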
*	Improve inlining of `return_call*` (#6477)	Jérôme Vouillon	2024-04-10	2	-28/+107
\| \| \| \| \|	Use the previous implementation when no return_call is in a try block. This avoids moving code around (as a sibling of the caller body or the inlined body), so that should allow more local optimizations after inlining.
*	Heap2Local: Optimize packed fields (#6480)	Alon Zakai	2024-04-09	1	-12/+26
\| \| \| \| \| \|	Previously we did not optimize a struct or an array with a packed field. As a result a single packed field in a struct prevented the entire struct from being localized, which this fixes. This is also useful for arrays as packed arrays are common (e.g. for string data).
*	Heap2Local: Optimize Arrays in addition to Structs (#6478)	Alon Zakai	2024-04-09	1	-21/+322
\| \| \| \| \| \| \| \| \| \| \| \|	To keep things simple, this adds an Array2Struct component to the pass. When we find a non-escaping array, we run that to turn it into a struct, and then run the existing Struct2Local to convert that to locals. This avoids refactoring Struct2Local to handle both structs and arrays (with the downside of making the optimization of arrays a little less efficient, but they are rarer, I suspect - that is certainly the case in Java output I've seen). The core EscapeAnalyzer logic is generalized to handle both arrays and structs, but the changes there are thankfully quite minor.
*	Asyncify: Fix nondeterminism in verbose logging (#6479)	Alon Zakai	2024-04-09	2	-4/+24
\| \| \| \|	#6457 added a test that exposed existing nondeterminism.
*	Handle return calls in CodeFolding (#6474)	Thomas Lively	2024-04-08	1	-1/+21
\| \| \| \|	Treat them the same as returns and test that they can be folded out of try-catch blocks because they do not have throws effects.
*	Handle return calls correctly	Thomas Lively	2024-04-08	7	-226/+476
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a combined commit covering multiple PRs fixing the handling of return calls in different areas. The PRs are all landed as a single commit to ensure internal consistency and avoid problems with bisection. Original PR descriptions follow: * Fix inlining of `return_call` (#6448) Previously we transformed return calls in inlined function bodies into normal calls followed by branches out to the caller code. Similarly, when inlining a `return_call` callsite, we simply added a `return` after the body inlined at the callsite. These transformations would have been correct if the semantics of return calls were to call and then return, but they are not correct for the actual semantics of returning and then calling. The previous implementation is observably incorrect for return calls inside try blocks, where the previous implementation would run the inlined body within the try block, but the proper semantics would be to run the inlined body outside the try block. Fix the problem by transforming inlined return calls to branches followed by calls rather than as calls followed by branches. For the case of inlined return call callsites, insert branches out of the original body of the caller and inline the body of the callee as a sibling of the original caller body. For the other case of return calls appearing in inlined bodies, translate the return calls to branches out to calls inserted as siblings of the original inlined body. In both cases, it would have been convenient to use multivalue block return to send call parameters along the branches to the calls, but unfortunately in our IR that would have required tuple-typed scratch locals to unpack the tuple of operands at the call sites. It is simpler to just use locals to propagate the operands in the first place. 
* Fix interpretation of `return_call` (#6451) We previously interpreted return calls as calls followed by returns, but that is not correct both because it grows the size of the execution stack and because it runs the called functions in the wrong context, which can be observable in the case of exception handling. Update the interpreter to handle return calls correctly by adding a new `RETURN_CALL_FLOW` that behaves like a return, but carries the arguments and reference to the return-callee rather than normal return values. `callFunctionInternal` is updated to intercept this flow and call return-called functions in a loop until a function returns with some other kind of flow. Pull in the upstream spec tests return_call.wast, return_call_indirect.wast, and return_call_ref.wast with light editing so that we parse and validate them successfully. * Handle return calls in wasm-ctor-eval (#6464) When an evaluated export ends in a return call, continue evaluating the return-called function. This requires propagating the parameters, handling the case that the return-called function might be an import, and fixing up local indices in case the final function has different parameters than the original function. * Update effects.h to handle return calls correctly (#6470) As far as their surrounding code is concerned return calls are no different from normal returns. It's only from a caller's perspective that a function containing a return call also has the effects of the return-callee. To model this more precisely in EffectAnalyzer, stash the throw effect of return-callees on the side and only merge it in at the end when analyzing the effects of a full function body.
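The interpreter change described for #6451 follows the general trampoline pattern: a return call unwinds the current frame first, and the caller's loop then invokes the callee. The Python sketch below illustrates that pattern only. `ReturnCallFlow`, `call_function`, and `count_down` are hypothetical names standing in for `RETURN_CALL_FLOW` and `callFunctionInternal`; this is not Binaryen's C++ code.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class ReturnCallFlow:
    """Stand-in for RETURN_CALL_FLOW: carries the callee and its arguments."""
    callee: Callable[..., Any]
    args: Tuple[Any, ...]


def call_function(func: Callable[..., Any], *args: Any) -> Any:
    """Call `func`, then keep calling return-callees in a loop.

    Each callee runs only after the previous frame has returned, so the
    host stack does not grow with the length of the return-call chain.
    """
    while True:
        result = func(*args)
        if not isinstance(result, ReturnCallFlow):
            return result  # any other kind of flow ends the loop
        func, args = result.callee, result.args


def count_down(n: int) -> Any:
    """Toy function whose last action is a return call to itself."""
    if n == 0:
        return "done"
    return ReturnCallFlow(count_down, (n - 1,))
```

Calling `call_function(count_down, 100000)` completes without deep recursion, which is the stack-growth half of the problem the commit describes. The exception-handling half follows from the same ordering: the callee runs after the caller's frame, and any enclosing try in it, has already been left.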
*	Asyncify-verbose: Show all reasons why a function is instrumented (#6457)	Dannii Willis	2024-04-08	3	-16/+24
\| \| \| \|	Helps emscripten-core/emscripten#17380 by logging all the reasons why we instrument a function, and not just the first as we did before.
*	[NFC] Refactor Heap2Local logic (#6473)	Alon Zakai	2024-04-06	3	-355/+402
\| \| \| \| \| \|	Separate out an EscapeAnalyzer class that does the escape analysis, and a Struct2Local one that does the optimization. Also make a few things const here to be safer.
*	[NFC] Remove unused variables (#6475)	Thomas Lively	2024-04-05	1	-2/+2
\| \| \|	These were causing build failures on the Emscripten builder.
*	Typed continuations: nocont and cont basic heap types (#6468)	Frank Emrich	2024-04-04	9	-6/+103
\| \| \| \| \| \| \| \|	This PR is part of a series that adds basic support for the typed continuations/wasmfx proposal. This particular PR adds cont and nocont as top and bottom types for continuation types, completely analogous to func and nofunc for function types (also: exn and noexn).
*	[NFC] Generalize and simplify wasm-delegations-fields.h (#6465)	Alon Zakai	2024-04-03	3	-762/+601
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This removes the hard-coded generation of a switch and cases, and allows the user to define the boilerplate at the start and end of the main output, and of what is generated for each expression. By default we still emit a switch and cases. Also standardize the output by never emitting ; unnecessarily, which we were inconsistent about. This serves two goals: First, it will make using embind on Binaryen simpler as embind needs to generate C++ template logic for each expression, and not a switch (and we cannot have extra ; in embind notation). Second, this makes the format much simpler to parse, which is a stepping stone for #6460, e.g. before we had case Expression::Id::LoopId: { DELEGATE_START(Loop); DELEGATE_FIELD_CHILD(Loop, body); DELEGATE_FIELD_SCOPE_NAME_DEF(Loop, name); DELEGATE_END(Loop); break; } and now we have DELEGATE_FIELD_CASE_START(Loop) DELEGATE_FIELD_CHILD(Loop, body) DELEGATE_FIELD_SCOPE_NAME_DEF(Loop, name) DELEGATE_FIELD_CASE_END(Loop) The main part of this diff was autogenerated by this python: for l in x.splitlines(): if l.startswith(' case'): id = l.split(':')[4][:-2] print(f'DELEGATE_FIELD_CASE_START({id})') if l.startswith(' DELEGATE_FIELD'): print(l) if l.startswith(' DELEGATE_END'): id = l[17:-2] print(f'DELEGATE_FIELD_CASE_END({id})') print()
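The script quoted in the message lost its line breaks and indentation in this log view, and its slice offsets depend on that indentation. The sketch below is a re-creation of the same transformation, not the author's script: the function name `convert` is hypothetical, it matches with a regular expression instead of fixed offsets, and it also strips trailing semicolons so its output matches the new format shown in the message.

```python
import re


def convert(old: str) -> str:
    """Convert old switch/case delegation text to the new macro form."""
    out = []
    for line in old.splitlines():
        s = line.strip()
        case = re.match(r"case Expression::Id::(\w+)Id:", s)
        if case:
            out.append(f"DELEGATE_FIELD_CASE_START({case.group(1)})")
        elif s.startswith("DELEGATE_FIELD"):
            out.append(s.rstrip(";"))  # field lines carry over, minus ';'
        elif s.startswith("DELEGATE_END("):
            ident = s[len("DELEGATE_END("):s.index(")")]
            out.append(f"DELEGATE_FIELD_CASE_END({ident})")
            out.append("")  # blank line between expressions
        # DELEGATE_START, break and closing braces are dropped
    return "\n".join(out)
```

Feeding it the `Loop` case quoted above yields the four `DELEGATE_FIELD_*` lines shown as the new format.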
*	Fix writing of data segment names in name section (#6462)	Jérôme Vouillon	2024-04-02	1	-2/+2
\| \| \| \|	- Output segment names even when no memory is declared. - Only write explicit names.
*	Add an Asyncify option to propagate the addList (#5935)	かめのこにょこにょこ	2024-04-01	1	-12/+42
\| \| \| \| \|	The new asyncify flag --pass-arg=asyncify-propagate-addlist changes the behavior of --pass-arg=asyncify-addlist : with it, callers of functions in the asyncify-addlist will be also instrumented.
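One way to picture the propagation is as a walk over the reverse call graph starting from the add-list. The sketch below is a Python illustration only: the function name, the `callers_of` mapping, and the choice to propagate transitively (callers of callers) are all assumptions, since the message only says that callers of listed functions are also instrumented.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def propagate_add_list(callers_of: Dict[str, List[str]],
                       add_list: Iterable[str]) -> Set[str]:
    """Return the add-list plus every function that can reach it by calls.

    `callers_of` maps a function name to the names of its direct callers.
    Transitive propagation is an assumption of this sketch.
    """
    instrumented = set(add_list)
    queue = deque(instrumented)
    while queue:
        func = queue.popleft()
        for caller in callers_of.get(func, ()):
            if caller not in instrumented:
                instrumented.add(caller)
                queue.append(caller)
    return instrumented
```

For example, with `sleep` on the add-list, `read` calling `sleep`, and `main` calling `read`, the result contains all three names.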
*	[Strings] string.new_wtf16_array should trap if the end index is less than the start (#6459)	Alon Zakai	2024-04-01	1	-1/+2
\|
*	GUFA: Fix hashing of GlobalInfo's type (#6455)	Alon Zakai	2024-03-29	1	-2/+6
\| \| \| \| \| \| \| \| \|	For a global we store the name and a type, and the type may be more precise than the global's type in the wasm. As a result, when hashing, it is not enough to hash only the name, so hash the type as well. Also add a random TODO as a comment.
*	GUFA: Fix nondeterminism in roots (#6456)	Alon Zakai	2024-03-29	1	-1/+4
\| \| \| \| \|	Found by the fuzzer. We already processed the work queue in a deterministic order, but the roots were unordered. The work queue's initial state is filled by the roots, so we must process the roots deterministically as well.
*	Report timeout in interpretation of AtomicWait (#6452)	Thomas Lively	2024-03-29	1	-1/+1
\| \| \| \| \| \| \|	To avoid slow-running fuzz cases, we report a host limit when interpreting atomic.wait with any non-zero timeout. However, in the allowed case where the timeout is zero, we were incorrectly interpreting the wait as returning 0, meaning that it was woken up, instead of 2, meaning that the timeout expired. Fix it to return 2.
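The resulting behavior in the single-threaded interpreter can be summarized in a few lines. This Python sketch is illustrative only. The names are hypothetical, the ordering of the checks is an assumption, and the result code 1 ("not-equal") comes from the WebAssembly threads proposal rather than from this commit message.

```python
class HostLimit(Exception):
    """Stand-in for the interpreter's host limit error."""


WOKEN, NOT_EQUAL, TIMED_OUT = 0, 1, 2  # atomic.wait result codes


def interpret_atomic_wait(loaded: int, expected: int, timeout: int) -> int:
    """Model atomic.wait when no other thread exists to wake us."""
    if loaded != expected:
        return NOT_EQUAL
    if timeout != 0:
        # A non-zero (possibly infinite) timeout would stall fuzzing.
        raise HostLimit("atomic.wait with non-zero timeout")
    return TIMED_OUT  # zero timeout: it expired, nobody woke us up
```

The point of the fix is the last line: with a zero timeout the wait reports 2 (timed out), not 0 (woken).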
*	Remove the TRAVERSE_CALLS option in the ConstantExpressionRunner (#6449)	Thomas Lively	2024-03-29	4	-44/+0
\| \| \| \| \| \| \| \|	The implementation of calls with this option was incorrect because it cleared the locals before evaluating the call arguments. The likely explanation for why this was never noticed is that there are no users of this option, especially since it is exposed in the C and JS APIs but not used internally. Rather than try to fix the implementation, just remove the option.
*	[Strings] Lower string.concat in StringLowering (#6453)	Thomas Lively	2024-03-29	1	-0/+9
\|
*	wasm-merge: Check that the types of imports and exports match (#6437)	Jérôme Vouillon	2024-03-27	1	-0/+108
\|
*	Fix parsing of table imports (#6446)	Jérôme Vouillon	2024-03-27	1	-3/+6
\| \| \|	The type was ignored and funcref was always used instead.
*	Fuzzer HeapType generator: Do not use string types if not allowed (#6447)	Alon Zakai	2024-03-27	1	-1/+1
\|
*	Fuzzer: Work around the lack of wtf8/iter support (#6445)	Alon Zakai	2024-03-27	1	-5/+6
\| \| \| \|	We only have interpreter support for wtf16, so we should not emit operations on the other types, as the interpreter will error.
*	Fix stringview subtyping (#6440)	Thomas Lively	2024-03-26	2	-8/+38
\| \| \| \| \| \|	The stringview types (`stringview_wtf8`, `stringview_wtf16`, and `stringview_iter`) are not subtypes of `any` even though they are supertypes of `none`. This breaks the type system invariant that types share a bottom type iff they share a top type, but we can work around that.
*	Add "interposition" to the fuzzer's mutate() method (#6427)	Alon Zakai	2024-03-26	1	-15/+145
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before this PR we only mutated by replacing an expression with another, which replaced all the children. With this PR we also do these two patterns: (A (B) (C) ) => ;; keep children, replace A (block (drop (B)) (drop (C)) (NEW) ) , (D (A (B) (C) ) ) => ;; keep A, replace it in the parent (D (block (drop (A (B) (C) ) ) (NEW) ) ) We also try to replace onto the new D (either A itself, or A's children).
*	[Strings] Escape strings printed by fuzz-exec (#6441)	Thomas Lively	2024-03-26	1	-6/+5
\| \| \| \| \| \| \| \| \| \| \|	Previously we printed strings as WTF-8 in the output of fuzz-exec, but this could produce invalid unicode output and did not make unprintable characters visible. Fix both these problems by escaping the output, using the JSON string escape procedure since the string to be escaped is WTF-16. Reimplement the same escaping procedure in fuzz_shell.js so that the way we print strings when running on a real JS engine matches the way we print them in our own fuzz-exec interpreter. Fixes #6435.
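A minimal sketch of JSON-style escaping over WTF-16 code units is shown below. It is an assumption-laden illustration in Python, not the code in fuzz-exec or fuzz_shell.js: in particular, this version escapes every code unit outside printable ASCII as `\uXXXX`, whereas the real output may print valid non-ASCII characters directly.

```python
# Short escapes defined by JSON for common characters.
SHORT_ESCAPES = {
    0x22: '\\"', 0x5C: "\\\\", 0x08: "\\b", 0x0C: "\\f",
    0x0A: "\\n", 0x0D: "\\r", 0x09: "\\t",
}


def escape_wtf16(units):
    """Escape a sequence of 16-bit code units into printable ASCII."""
    out = []
    for unit in units:
        if unit in SHORT_ESCAPES:
            out.append(SHORT_ESCAPES[unit])
        elif 0x20 <= unit < 0x7F:
            out.append(chr(unit))  # printable ASCII passes through
        else:
            # Control characters, non-ASCII and unpaired surrogates all
            # become visible \uXXXX escapes.
            out.append(f"\\u{unit:04x}")
    return "".join(out)
```

Because the escaping works per code unit, an unpaired surrogate such as `0xD800` is printed as `\ud800` instead of producing invalid Unicode output.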
*	Fuzzer: Implement a few more String TODOs (#6439)	Alon Zakai	2024-03-25	1	-1/+3
\|
*	StringNew: Trap on OOB start index (#6438)	Alon Zakai	2024-03-25	1	-1/+1
\|
*	Generate interesting strings in fuzzer (#6430)	Thomas Lively	2024-03-23	1	-2/+38
\| \| \| \|	Instead of generating exclusively ascii strings, generate empty strings and strings containing various unicode characters and unpaired surrogates as well.
*	Remove "minimal" JS import/export legalization (#6428)	Sam Clegg	2024-03-22	5	-56/+8
\| \| \| \| \| \| \| \| \| \| \| \|	This change removes the "minimal" mode from `LegalizeJSInterface`, which was added in #1883. The idea behind that mode was to avoid legalizing most functions except those we know JS will be calling, since for dynamic linking we always want the non-legalized version to be shared between wasm modules. These days we solve this problem in a different way with the `legalize-js-interface-export-originals` option, which exports the original functions alongside the legalized ones. Emscripten then always prefers the `$orig` functions when doing dynamic linking.
*	[Strings] Represent string values as WTF-16 internally (#6418)	Thomas Lively	2024-03-22	15	-142/+294
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	WTF-16, i.e. arbitrary sequences of 16-bit values, is the encoding of Java and JavaScript strings, and using the same encoding makes the interpretation of string operations trivial, even when accounting for non-ascii characters. Specifically, use little-endian WTF-16. Re-encode string constants from WTF-8 to WTF-16 in the parsers, then back to WTF-8 in the writers. Update the constructor for string `Literal`s to interpret the string as WTF-16 and store a sequence of WTF-16 code units, i.e. 16-bit integers. Update `Builder::makeConstantExpression` accordingly to convert from the new `Literal` string representation back to a WTF-16 string. Update the interpreter to remove the logic for detecting non-ascii characters and bailing out. The naive implementations of all the string operations are correct now that our string encoding matches the JS string encoding.
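The re-encoding between WTF-8 and little-endian WTF-16 can be approximated with Python's `surrogatepass` error handler, which lets unpaired surrogates through the UTF-8 and UTF-16 codecs. This is a sketch for illustration with hypothetical function names, not Binaryen's C++ implementation, and it does not reject surrogate pairs that are encoded as two 3-byte sequences, which strict WTF-8 disallows.

```python
def wtf8_to_wtf16le(data: bytes) -> bytes:
    """Re-encode WTF-8 bytes as little-endian WTF-16 bytes."""
    text = data.decode("utf-8", errors="surrogatepass")
    return text.encode("utf-16-le", errors="surrogatepass")


def wtf16le_to_wtf8(data: bytes) -> bytes:
    """Re-encode little-endian WTF-16 bytes as WTF-8 bytes."""
    text = data.decode("utf-16-le", errors="surrogatepass")
    return text.encode("utf-8", errors="surrogatepass")
```

A lone high surrogate such as U+D800 (`ED A0 80` in WTF-8) survives the round trip, which is the case plain UTF-8 or UTF-16 handling would reject.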
*	Precompute: Mark StringEncode as non-removable, just like ArrayCopy (#6423)	Alon Zakai	2024-03-22	1	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The only StringEncode we support is the one that writes into an array, so it has the same effects as ArrayCopy. Precompute needs to be made aware of such side effects manually (as we already do for ArrayCopy etc.): it simply tries to execute code in the interpreter, and if it succeeds it replaces; it does not check for side effects (checking for side effects would prevent optimizing cases where the side effects do not happen, as we check them statically, e.g. dividing by a non-zero constant does not trap but a division would be seen as having a potential trap effect). I verified no other string operation is hit by this: all the others emit or operate on immutable strings; it is just StringEncode that is basically an Array operation that appears in the Strings proposal.
*	[Strings] Handle overflow in string.encode_wtf16_array (#6422)	Alon Zakai	2024-03-22	1	-2/+5
\|
*	CodeFolding: Fix up old EH when we fold away an If (#6420)	Alon Zakai	2024-03-22	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The pass does (among other things) this: (if condition X X ) => (block (drop condition ) X ;; deduplicated ) After that the condition is now nested in a block, so we may need EH fixups if it contains a pop.
*	Mark non-closed types as requiring GC (#6421)	Thomas Lively	2024-03-21	1	-1/+1
\| \| \|	This omission was able to cause a problem with text round-tripping.
*	[Strings] Implement TODOs in the fuzzer (#6416)	Alon Zakai	2024-03-21	1	-1/+6
\|
*	[Strings] Add (partial) validation for StringNew (#6417)	Alon Zakai	2024-03-21	1	-1/+34
\|
*	[Strings] Emit unreachable when a string instruction cannot be emitted properly (#6415)	Alon Zakai	2024-03-21	1	-0/+13
\| \| \| \| \|	See WebAssembly/stringref#66
*	[Strings] Fix StringSlice end computation (#6414)	Alon Zakai	2024-03-21	1	-3/+2
\| \| \| \| \|	Like JS string slicing, if the end index is out of bounds that is fine, we clamp to the end. This also matches the behavior in V8 and the spec.
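The clamping rule can be shown on a plain list of code units. The Python sketch below is illustrative: the function name is hypothetical, indices are assumed to be non-negative (they are unsigned in wasm), and returning an empty result when the start is at or past the clamped end is an assumption modeled on JS slicing.

```python
def slice_wtf16(units, start: int, end: int):
    """Slice code units, clamping an out-of-bounds end to the length."""
    end = min(end, len(units))  # an out-of-bounds end is not an error
    if start >= end:
        return []  # assumed: empty slice, as in JS string slicing
    return units[start:end]
```

So slicing a four-unit string with `start=1, end=100` returns the last three units without trapping.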
*	Revert "Strings: Disable precomputing for now (#6412)" (#6413)	Alon Zakai	2024-03-20	1	-30/+0
\| \| \| \| \| \| \| \|	This reverts commit 70ac213fce134840609190a5d3a18118a089ba8a. Reverts #6412. On second thought, we found a way to make fixing this less urgent, and the code size downsides of this are worrying, so let's revert it.
*	Strings: Disable precomputing for now (#6412)	Alon Zakai	2024-03-20	1	-0/+30
\| \| \| \|	Our UTF implementation still seems not to be fully stable, as we have reports of issues. Disable it for now.
*	[Strings] Avoid mishandling unicode in StringConcat (#6411)	Roberto Lublinerman	2024-03-19	1	-0/+5
\|
*	Atomics: Handle timeouts in waits in the (single-threaded) interpreter (#6408)	Alon Zakai	2024-03-19	1	-3/+9
\| \| \| \| \| \| \| \| \|	The interpreter does not run multiple threads, and it was returning 0 from atomic.wait, which means it was woken up. But it is more correct for it to return 2, which means it timed out - which is actually the case, as no other thread exists that can wake it up. However, even that is not good for fuzzing as the timeout may be infinite or large, so just emit a host limit error on any timeout for now, until we actually implement threads.
*	[Strings] Implement stringview_wtf16.slice (#6404)	Alon Zakai	2024-03-19	1	-4/+49
\|
*	Typed continuations: suspend instructions (#6393)	Frank Emrich	2024-03-19	25	-6/+202
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This PR is part of a series that adds basic support for the [typed continuations/wasmfx proposal](https://github.com/wasmfx/specfx). This particular PR adds support for the `suspend` instruction for suspending with a given tag, documented [here](https://github.com/wasmfx/specfx/blob/main/proposals/continuations/Overview.md#instructions). These instructions are of the form `(suspend $tag)`. Assuming that `$tag` is defined with _n_ `param` types `t_1` to `t_n`, the instruction consumes _n_ arguments of types `t_1` to `t_n`. Its result type is the same as the `result` type of the tag. Thus, the folded textual representation looks like `(suspend $tag arg1 ... argn)`. Support for the instruction is implemented in both the old and the new wat parser. Note that this PR does not implement validation of the new instruction. This PR also fixes finalization of `cont.new`, `cont.bind` and `resume` nodes in those cases where any of their children are unreachable.
*	[Strings] Avoid mishandling unicode in interpreter (#6405)	Thomas Lively	2024-03-18	1	-0/+34
\| \| \| \| \| \| \|	Our interpreter implementations of `stringview_wtf16.length`, `stringview_wtf16.get_codeunit`, and `string.encode_wtf16_array` are not unicode-aware, so they were previously incorrect in the face of multi-byte code units. As a fix, bail out of the interpretation if there is a non-ascii code point that would make our naive implementation incorrect.
*	[NFC] Fix build error on RISC-V 64 (#6410)	moui0	2024-03-18	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Similar issue as: #6330 FAILED: src/passes/CMakeFiles/passes.dir/Precompute.cpp.o /usr/bin/c++ -I/build/binaryen/src/binaryen-version_117/src -I/build/binaryen/src/binaryen-version_117/third_party/llvm-project/include -I/build/binaryen/src/binaryen-version_117/build -march=rv64gc -mabi=lp64d -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security -fstack-clash-protection -Wp,-D_GLIBCXX_ASSERTIONS -g -ffile-prefix-map=/build/binaryen/src=/usr/src/debug/binaryen -DBUILD_LLVM_DWARF -Wall -Werror -Wextra -Wno-unused-parameter -Wno-dangling-pointer -fno-omit-frame-pointer -fno-rtti -Wno-implicit-int-float-conversion -Wno-unknown-warning-option -Wswitch -Wimplicit-fallthrough -Wnon-virtual-dtor -fPIC -fdiagnostics-color=always -O3 -DNDEBUG -UNDEBUG -std=c++17 -MD -MT src/passes/CMakeFiles/passes.dir/Precompute.cpp.o -MF src/passes/CMakeFiles/passes.dir/Precompute.cpp.o.d -o src/passes/CMakeFiles/passes.dir/Precompute.cpp.o -c /build/binaryen/src/binaryen-version_117/src/passes/Precompute.cpp In file included from /build/binaryen/src/binaryen-version_117/src/wasm-traversal.h:30, from /build/binaryen/src/binaryen-version_117/src/pass.h:24, from /build/binaryen/src/binaryen-version_117/src/ir/intrinsics.h:20, from /build/binaryen/src/binaryen-version_117/src/ir/effects.h:20, from /build/binaryen/src/binaryen-version_117/src/passes/Precompute.cpp:30: In copy constructor ‘wasm::SmallVector<wasm::Expression, 10>::SmallVector(const wasm::SmallVector<wasm::Expression, 10>&)’, inlined from ‘constexpr std::pair<_T1, _T2>::pair(const _T1&, const _T2&) [with _U1 = wasm::Select* const; _U2 = wasm::SmallVector<wasm::Expression, 10>; typename std::enable_if<(std::_PCC<true, _T1, _T2>::_ConstructiblePair<_U1, _U2>() && std::_PCC<true, _T1, _T2>::_ImplicitlyConvertiblePair<_U1, _U2>()), bool>::type <anonymous> = true; _T1 = wasm::Select const; _T2 = wasm::SmallVector<wasm::Expression, 10>]’ 
at /usr/include/c++/13.2.1/bits/stl_pair.h:559:21, inlined from ‘T& wasm::InsertOrderedMap<Key, T>::operator[](const Key&) [with Key = wasm::Select; T = wasm::SmallVector<wasm::Expression, 10>]’ at /build/binaryen/src/binaryen-version_117/src/support/insert_ordered.h:112:29: /build/binaryen/src/binaryen-version_117/src/support/small_vector.h:42:38: error: ‘<unnamed>.wasm::SmallVector<wasm::Expression, 10>::fixed’ is used uninitialized [-Werror=uninitialized] 42 \| template<typename T, size_t N> class SmallVector { \| ^~~~~~~~~~~ In file included from /build/binaryen/src/binaryen-version_117/src/passes/Precompute.cpp:38: /build/binaryen/src/binaryen-version_117/src/support/insert_ordered.h: In function ‘T& wasm::InsertOrderedMap<Key, T>::operator[](const Key&) [with Key = wasm::Select; T = wasm::SmallVector<wasm::Expression, 10>]’: /build/binaryen/src/binaryen-version_117/src/support/insert_ordered.h:112:29: note: ‘<anonymous>’ declared here 112 \| std::pair<const Key, T> kv = {k, {}}; \| ^~