forks/binaryen.git -

	Commit message (Collapse)	Author	Age	Files	Lines
*	[Wasm GC] Fix TypeRefining on fallthrough values via tee (#4900)	Alon Zakai	2022-08-18	1	-0/+178
\|	A rather tricky corner case: we normally look at fallthrough values for copies of fields, so when we try to refine a field, we ignore stuff like this:
\|	    a.x = b.x;
\|	That copies the same field on the same type to itself, so refining is not limited by it. But if we have something else in the middle, and that thing cannot change type, then it is a problem, like this:
\|	    (struct.set (..ref..) (local.tee $temp (struct.get)))
\|	The tee has the type of the local, which does not change in this pass. So we can't look at just the fallthrough here and skip the tee: after refining the field, the tee's old type might not fit in the field's new type.
\|	We could perhaps add casts to fix things up, but those may have too big a cost. For now, just ignore the fallthrough.
*	Fix DeadArgumentElimination + TrapsNeverHappen to not leave stale types (#4910)	Alon Zakai	2022-08-18	2	-4/+60
\|	DAE will normally not remove an unreachable parameter, because it checks for effects there. But in TrapsNeverHappen mode, we assume that an unreachable is an effect we can remove, so we are willing to remove it:
\|	    (func $foo (param $unused i32)
\|	      ;; never use $unused
\|	    )
\|	    (func $bar
\|	      (call $foo
\|	        (unreachable)))
\|	    ;; => dae+tnh
\|	    (func $foo
\|	    )
\|	    (func $bar
\|	      (call $foo))
\|	But that transformation is invalid: the call's type was unreachable before but no longer is. What went wrong here is that, yes, it is valid to remove an unreachable, but we may need to update types while doing so, which we were not doing.
\|	This wasn't noticed before due to a combination of unfortunate factors:
\|	The main reason is that this only happens in TrapsNeverHappen mode. We don't fuzz that, because it's difficult: that mode can assume a trap never happens, so a trap is really undefined behavior. On real-world code this is great, but in the fuzzer it means that the output can seem to change after optimizations.
\|	The validator happened to be missing an error for a call that has type unreachable but shouldn't (see "Validator: Validate unreachable calls more carefully", #4909). Without that, we'd only get an error if the bad type influenced a subsequent pass in a confusing way, which is possible but difficult to achieve (what ended up happening in practice is that SignatureRefining on J2Wasm relied on the unreachable and refined a type too much).
\|	Even with that fix, for the problem to be detected we'd need the validation error to happen in the final output, after running all the passes. In practice, though, that's not likely, since other passes tend to remove unreachables etc. Pass-debug mode is very useful for finding stuff like this, as it validates after every individual pass. Sadly it turns out that global validation was off there (see "Validator: Validate globally by default", #4906), so it was catching the 99% of validation errors that are local, but this particular error was in the remaining 1%.
\|	As a fix, simply ignore this case. It's not really worth the effort to optimize it, since DCE will just remove unreachables like that anyhow. So if we run again after a DCE we'd get a chance to optimize.
\|	This updates some existing tests to avoid (unreachable). That was used as an example of something with effects, but after this change it is treated more carefully. Replace those things with something else that has effects (a call).
*	Avoid emitting a block in the binary format when it has no name (#4912)	Alon Zakai	2022-08-18	1	-13/+6
\| \| \| \| \| \| \| \| \| \|	We already did this if the block was a child of a control flow structure, which is the common case (see the newly added comment around that code, which clarifies why). This does the same for all other blocks. This is simple to do and a minor optimization, but the main benefit from this is just to make our handling of blocks uniform: after this, we never emit a block with no name. This will make 1a non-nullable locals easier to handle (since they will be able to assume that property; and not emitting such blocks avoids some work to handle non-nullable locals in them).
*	Restore the `extern` heap type (#4898)	Thomas Lively	2022-08-17	25	-508/+511
\| \| \| \| \| \| \|	The GC proposal has split `any` and `extern` back into two separate types, so reintroduce `HeapType::ext` to represent `extern`. Before it was originally removed in #4633, externref was a subtype of anyref, but now it is not. Now that we have separate heap type hierarchies, make `HeapType::getLeastUpperBound` fallible as well.
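Why a least upper bound becomes fallible once the hierarchies are separate can be sketched in a few lines of Python. This is only an illustrative model, not Binaryen's implementation: the `PARENT` table and the function names are hypothetical, and the hierarchy is deliberately tiny.

```python
# Hypothetical, simplified heap type hierarchies: each type maps to its
# direct supertype, and a root maps to None. "extern" and "func" are
# roots of their own hierarchies, separate from "any".
PARENT = {
    "any": None,
    "eq": "any",
    "i31": "eq",
    "extern": None,
    "func": None,
}


def ancestors(heap_type):
    """Return the chain from heap_type up to the root of its hierarchy."""
    chain = []
    while heap_type is not None:
        chain.append(heap_type)
        heap_type = PARENT[heap_type]
    return chain


def least_upper_bound(a, b):
    """Return the most specific common supertype, or None if there is none."""
    seen = set(ancestors(a))
    for candidate in ancestors(b):
        if candidate in seen:
            return candidate
    # The two types live in different hierarchies, so no LUB exists.
    return None
```

Returning `None` (rather than always returning some top type) is the point of the sketch: callers have to handle the case where two types share no supertype.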
*	Multi-Memories Support in IR (#4811)	Ashley Nelson	2022-08-17	49	-591/+3103
\| \| \| \| \| \| \|	This PR removes the single memory restriction in IR, adding support for a single module to reference multiple memories. To support this change, a new memory name field was added to 13 memory instructions in order to identify the memory for the instruction. It is a goal of this PR to maintain backwards compatibility with existing text and binary wasm modules, so memory indexes remain optional for memory instructions. Similarly, the JS API makes assumptions about which memory is intended when only one memory is present in the module. Another goal of this PR is that the behavior of existing tests be unaffected. That said, tests must now explicitly define a memory before invoking memory instructions or exporting a memory, and memory names are now printed for each memory instruction in the text format. There remain quite a few places where hardcoded references to the first memory persist (memory flattening, for example, will return early if more than one memory is present in the module). Many of these call sites, particularly within passes, will require us to rethink how the optimization works in a multi-memories world. Other call sites may necessitate more invasive code restructuring to fully convert away from relying on a globally available, single memory pointer.
*	[Strings] Fix up strings.wast (#4899)	Alon Zakai	2022-08-16	1	-12/+14
\|
*	Validator: Validate intrinsics (#4880)	Alon Zakai	2022-08-16	1	-0/+23
\|	call.without.effects has a specific form, where the last parameter is a function reference, and that function reference must have the right type for the other parameters if called with them:
\|	    (call $call.without.effects
\|	      (..i32..)
\|	      (..f64..)
\|	      (..function reference, which takes params i32 and f64..)
\|	    )
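The shape of that check can be sketched as follows. This is a minimal model under stated assumptions, not the validator's actual code: `FuncType` and `validate_call_without_effects` are hypothetical names, types are plain strings, and parameters are compared by exact equality where a real validator would presumably use subtyping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuncType:
    """Hypothetical stand-in for a function reference's signature."""
    params: tuple
    results: tuple = ()


def validate_call_without_effects(operand_types):
    """Return an error string, or None if the operands have a valid form.

    operand_types lists the type of each operand of the intrinsic call;
    the last one must be a function reference (modeled as a FuncType).
    """
    if not operand_types:
        return "missing function reference operand"
    *args, target = operand_types
    if not isinstance(target, FuncType):
        return "last operand must be a function reference"
    if tuple(args) != target.params:
        return "operands do not match the target's parameters"
    return None
```

For example, operands `["i32", "f64", FuncType(("i32", "f64"))]` pass, while swapping the first two operands is reported as a mismatch.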
*	LegalizeJSInterface: Look for get/setTempRet0 as exports (#4881)	Sam Clegg	2022-08-15	1	-0/+73
\| \| \| \| \| \|	This allows emscripten to move these helper functions from JS library imports to native wasm exports. See https://github.com/emscripten-core/emscripten/issues/7273
*	Fix name of port_passes_tests_to_lit.py script. NFC (#4902)	Sam Clegg	2022-08-12	68	-68/+68
\| \| \|	I was reading these tests and failing to find the named script.
*	Port test/passes/legalize-js-interface* to lit (#4903)	Sam Clegg	2022-08-12	9	-291/+322
\| \| \| \| \|	Also, add support for the `--binaryen-bin` flag to `scripts/port_passes_tests_to_lit.py`. This is needed for folks who don't do in-tree builds.
*	[EH] Pop should be supertype of tag type (#4901)	Heejin Ahn	2022-08-11	4	-13/+13
\| \| \| \|	`pop`'s type should be a supertype, not a subtype, of the tag's type within `catch`.
*	[Strings] Fix string.new_wtf16_array (#4894)	Alon Zakai	2022-08-10	1	-0/+2
\| \| \|	Like the 8-bit array variants, it takes 3 parameters.
*	[Strings] Linear memory string operations should emit a memory index (#4893)	Alon Zakai	2022-08-10	1	-2/+6
\| \| \| \| \| \| \|	For now this index is always 0, but we must emit it. Also clean up the wat test a little - we don't have validation yet, but we should not validate without a memory in that file.
*	[Strings] string.new.array methods have start:end arguments (#4888)	Alon Zakai	2022-08-09	1	-0/+14
\|
*	RedundantSetElimination: ReFinalize when needed (#4877)	Alon Zakai	2022-08-09	1	-0/+34
\|
*	SimplifyLocals: ReFinalize when needed (#4878)	Alon Zakai	2022-08-09	1	-0/+38
\|
*	[Wasm GC] Fix SignaturePruning on CallWithoutEffects (#4882)	Alon Zakai	2022-08-08	1	-0/+38
\|	call.without.effects will turn into a normal call of the last parameter later:
\|	    (call $call.without.effects
\|	      A
\|	      B
\|	      (ref.func $foo)
\|	    )
\|	    ;; => intrinsic lowering
\|	    (call $foo
\|	      A
\|	      B
\|	    )
\|	SignaturePruning needs to be aware of that: we can't remove a parameter from $foo without also updating relevant calls to $call.without.effects. Rather than handle that, just skip such cases, and leave them to be optimized after intrinsics are lowered away.
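A toy model of the lowering, and of the conservative skip, might look like this. The tuple-based expression encoding and both function names are assumptions made for illustration; Binaryen's IR and pass code look nothing like this.

```python
INTRINSIC = "$call.without.effects"


def lower_intrinsic_call(call):
    """Lower ("call", INTRINSIC, [A, B, ("ref.func", f)]) to ("call", f, [A, B]).

    Calls to anything other than the intrinsic are returned unchanged.
    """
    kind, target, operands = call
    if kind != "call" or target != INTRINSIC:
        return call
    *args, func_ref = operands
    assert func_ref[0] == "ref.func", "last operand must be a ref.func"
    return ("call", func_ref[1], args)


def may_prune_signature(func_name, calls):
    """Skip functions that are reachable through the intrinsic.

    Pruning a parameter of such a function would also require rewriting
    the intrinsic call's operands, so the sketch simply declines.
    """
    for kind, target, operands in calls:
        if target == INTRINSIC and operands and operands[-1] == ("ref.func", func_name):
            return False
    return True
```

The skip loses nothing in the long run: once the intrinsic has been lowered, the call is an ordinary direct call and can be optimized as usual.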
*	[GUFA] Fix readFromData on a function literal (#4883)	Alon Zakai	2022-08-08	1	-0/+54
\| \| \| \| \| \| \|	A function literal (ref.func) should never reach a struct or array get, but if there is a cast then it can look like they might arrive. We filter in ref.cast which avoids that (since casting a function to a data type will trap), but there is also br_on_cast which is not yet optimized. This PR adds code to avoid an assert in readFromData in that case.
*	[Optimize Instructions] Fold eqz(eqz(x)) to not-equal of zero (#4855)	Max Graey	2022-08-08	1	-8/+26
\|	eqz(eqz(i32(x))) -> i32(x) != 0
\|	eqz(eqz(i64(x))) -> i64(x) != 0
\|	Only when shrinkLevel == 0 (prefer speed over binary size).
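The identity behind this fold is easy to check on a few values. The following Python sketch models wasm's `eqz` and `ne` as functions returning 0 or 1; the helper names and sample list are illustrative only.

```python
MASK32 = 0xFFFFFFFF


def eqz(x):
    """Model of wasm eqz: 1 if the operand is zero, otherwise 0."""
    return 1 if x == 0 else 0


def ne_zero(x):
    """Model of wasm `x != 0`, which also yields 0 or 1."""
    return 1 if x != 0 else 0


def fold_is_sound(samples):
    """Check eqz(eqz(x)) == (x != 0) on every sample value."""
    return all(eqz(eqz(x)) == ne_zero(x) for x in samples)


# A few i32 bit patterns, including the boundary values.
SAMPLES = [0, 1, 2, 0x7FFFFFFF, 0x80000000, MASK32]
```

Both forms produce a canonical boolean, so the rewrite preserves the result while replacing two operations with one comparison against a constant.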
*	Remove metadata generation from wasm-emscripten-finalize (#4863)	Sam Clegg	2022-08-07	41	-1082/+1
\| \| \| \|	This is no longer needed by emscripten as of: https://github.com/emscripten-core/emscripten/pull/16529
*	wasm-emscripten-finalize: Remove em_js/em_asm start/stop symbols when stripping data segments. (#4876)	Sam Clegg	2022-08-05	1	-2/+0
\| \| \| \| \| \| \| \|	This avoids a fatal crash in `--post-emscripten` where it tries to remove data that is no longer part of the file. This fixes a bug introduced by #4871 that causes emscripten tests to fail.
*	Remove RTTs (#4848)	Thomas Lively	2022-08-05	44	-2957/+798
\| \| \| \| \| \| \|	RTTs were removed from the GC spec, and if they are added back in the future, they will be heap types rather than value types as in our implementation. Updating our implementation to have RTTs be heap types would have been more work than deleting them, for questionable benefit, since we don't know how long it will be before they are specced again.
*	Cleanup em_asm/em_js strings as part of PostEmscripten (#4871)	Sam Clegg	2022-08-04	13	-72/+93
\| \| \| \|	Rather than doing it as a side effect of dumping the metadata in wasm-emscripten-finalize.
*	[C-API] Add type builder C-API (#4803)	dcode	2022-08-04	2	-0/+193
\| \| \|	Introduces the necessary APIs to use the type builder from C. Enables construction of compound heap types (arrays, structs and signatures) that may be recursive, including assigning concrete names to the built types and, in case of structs, their fields.
*	Remove support for parsing `let` (#4864)	Thomas Lively	2022-08-03	3	-122/+0
\| \| \| \| \|	It has been removed from the typed function references proposal, so we no longer need to support it. Maintaining the test for `let` was difficult because Binaryen could not emit either text or binary that actually used it.
*	Re-run scripts/test/generate_lld_tests.py. NFC (#4861)	Sam Clegg	2022-08-02	18	-104/+132
\|
*	[Optimize Instructions] Refactor squared rules (#4840)	Max Graey	2022-08-02	1	-17/+242
\|	+ Move these rules to separate function;
\|	+ Refactor them to use matches;
\|	+ Add comments;
\|	+ Handle rotational shifts as well;
\|	+ Handle overflows for `<<`, `>>`, `>>>` shifts;
\|	+ Add mixed rotate rules:
\|	```rust
\|	rotl(rotr(x, C1), C2) => rotr(x, C1 - C2)
\|	rotr(rotl(x, C1), C2) => rotl(x, C1 - C2)
\|	```
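The mixed rotate rules can be sanity-checked with a small 32-bit model. This sketch assumes rotate counts are taken modulo the bit width (so a negative `C1 - C2` wraps around), which is how the helpers below are written; the names are illustrative.

```python
BITS = 32
MASK = (1 << BITS) - 1


def rotl(x, c):
    """Rotate a 32-bit value left; the count is taken modulo 32."""
    c %= BITS
    return ((x << c) | (x >> (BITS - c))) & MASK


def rotr(x, c):
    """Rotate a 32-bit value right; the count is taken modulo 32."""
    c %= BITS
    return ((x >> c) | (x << (BITS - c))) & MASK


def mixed_rotate_rules_hold(values, counts):
    """Check both mixed rotate rewrites for every (x, C1, C2) combination."""
    for x in values:
        for c1 in counts:
            for c2 in counts:
                if rotl(rotr(x, c1), c2) != rotr(x, c1 - c2):
                    return False
                if rotr(rotl(x, c1), c2) != rotl(x, c1 - c2):
                    return False
    return True
```

Because rotation never discards bits, opposite rotations simply cancel, which is why the two constants can be merged into a single count.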
*	Update reference type Literal constructors to use HeapType (#4857)	Thomas Lively	2022-08-01	1	-1/+1
\| \| \| \| \| \|	We already require non-null literals to have non-null types, but with this change we can enforce that constraint by construction. Also remove the default behavior of creating a function reference literal with heap type `func`, since there is always a more specific function type to use.
*	Add interpreter support for intrinsics (#4851)	Alon Zakai	2022-08-01	1	-0/+24
\| \| \|	This can give us some chance to catch bugs like #4839 in the fuzzer.
*	[GUFA] Handle GUFA + Intrinsics (#4839)	Alon Zakai	2022-08-01	1	-0/+146
\| \| \| \| \| \| \| \|	Like RemoveUnusedModuleElements, places that build graphs of function reachability must special-case the call-without-effects intrinsic. Without that, it looks like a call to an import. Normally a call to an import is fine - it makes us be super-pessimistic, as we think things escape all the way out - but in GC for now we are assuming a closed world, and so we end up broken. To fix that, properly handle the intrinsic case.
*	[C/JS API] Add string reference types (#4810)	dcode	2022-07-27	4	-0/+56
\|
*	[Strings] Add interpreter stubs for string instructions (#4835)	Alon Zakai	2022-07-26	1	-1/+4
\| \| \| \| \| \| \| \| \|	The stubs let precompute skip over them without erroring. With this PR we can run the optimizer on strings code. We still can't run --fuzz-exec though, so we can't run the fuzzer. Also simplify the error strings in the earlier part of the file. All other code just has "unimp" so we might as well do the same and not mention full names there.
*	Make `GlobalTypeRewriter` work for isorecursive types (#4829)	Thomas Lively	2022-07-26	2	-0/+348
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are two new potential problems that `GlobalTypeRewriter` can run into when working with isorecursive types instead of nominal types. First, the refined types may have replaced generic references with references to specific other types, potentially creating new recursions and making the existing recursion groups insufficient. Second, distinct types may be refined to structurally identical types and those distinct input types may map to the same output type, potentially changing cast behavior. Both of these problems are solved by putting all the new types in a single large recursion group. We do not currently account for the fact that types may be used in the external interface of the module, but when we do, externalized types will be excluded from optimizations and will not be affected by the creation of this single large rec group. Fixes #4816.
*	[C/JS API] Expose string reference feature (#4831)	Max Graey	2022-07-26	2	-0/+2
\|
*	[OptimizeInstructions] Add folding for mixed left shift and mul with constants on RHS (#4808)	Max Graey	2022-07-26	1	-0/+60
\|	(x * C1) << C2 -> x * (C1 << C2)
\|	(x << C1) * C2 -> x * (C2 << C1)
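Both rewrites rely on wrapping 32-bit arithmetic, where a left shift by C is a multiplication by 2^C modulo 2^32. A minimal Python check, with illustrative helper names and shift counts masked to 0..31 as in wasm's i32 shifts:

```python
MASK = 0xFFFFFFFF


def mul(a, b):
    """Wrapping i32 multiplication."""
    return (a * b) & MASK


def shl(x, c):
    """Wrapping i32 left shift; the count is masked to 0..31."""
    return (x << (c & 31)) & MASK


def shift_mul_rules_hold(values, consts, shifts):
    """Check both folds for every combination of x, constant, and shift."""
    for x in values:
        for k in consts:
            for s in shifts:
                # (x * C1) << C2  ->  x * (C1 << C2)
                if shl(mul(x, k), s) != mul(x, shl(k, s)):
                    return False
                # (x << C1) * C2  ->  x * (C2 << C1)
                if mul(shl(x, s), k) != mul(x, shl(k, s)):
                    return False
    return True
```

After the fold, `C1 << C2` is a compile-time constant, so one runtime operation disappears.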
*	[Wasm GC] i31get can trap (#4825)	Alon Zakai	2022-07-25	1	-0/+21
\|
*	[wasm-split] Add --print-profile option (#4771)	sps-gold	2022-07-25	1	-0/+6
\|	There are several reasons why a function may not be trained in deterministically. To perform quick validation we need to inspect profile.data (the alternative requires a split to be performed). However, profile.data is a binary file and is not self-sufficient, so we cannot currently use it to perform such validation. Therefore, to allow a quick check on whether a particular function has been trained in, we need to dump profile.data in a more readable format.
\|	This PR allows us to output to the console, in a readable format, the list of functions to be kept (in the main wasm) and the split functions (to be moved to deferred.wasm).
\|	Added a new option `--print-profile`:
\|	- input path to orig.wasm (the original wasm file that will be used later during split)
\|	- input path to profile.data that we need to output
\|	Optionally pass `--unescape` to unescape the function names.
\|	Usage:
\|	```
\|	binaryen\build>bin\wasm-split.exe test\profile_data\MY.orig.wasm --print-profile=test\profile_data\profile.data > test\profile_data\out.log
\|	```
\|	Note: meaning of prefixes:
\|	`+` => fn to be kept in main wasm
\|	`-` => fn to be split and moved to deferred wasm
*	[Wasm GC] Properly represent nulls in i31 (#4819)	Alon Zakai	2022-07-25	2	-1/+107
\| \| \| \| \|	The encoding here is simple: we store i31 values in the literal.i32 field. The top bit says if a value exists, which means literal.i32 == 0 is the same as null.
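One way to picture this encoding is the sketch below. It assumes that "top bit" means bit 31 of the 32-bit field and that the low 31 bits hold the payload; the function names are hypothetical and not taken from Binaryen.

```python
PRESENT_BIT = 0x80000000  # assumed: bit 31 marks "a value exists"
PAYLOAD_MASK = 0x7FFFFFFF  # assumed: the low 31 bits hold the i31 payload


def encode_i31(value):
    """Encode an i31 payload, or None for null, into a 32-bit field."""
    if value is None:
        return 0  # all-zero bits represent null
    return (value & PAYLOAD_MASK) | PRESENT_BIT


def is_null(bits):
    """A field of exactly zero is the null reference."""
    return bits == 0


def decode_i31_u(bits):
    """Read the payload as an unsigned 31-bit integer."""
    if is_null(bits):
        raise ValueError("null i31 reference")
    return bits & PAYLOAD_MASK


def decode_i31_s(bits):
    """Read the payload as a signed 31-bit integer (sign bit is bit 30)."""
    value = decode_i31_u(bits)
    if value & 0x40000000:
        value -= 0x80000000
    return value
```

The useful property is that the payload 0 and null stay distinct: `encode_i31(0)` has the top bit set, while null is the all-zero field.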
*	[C API] Fix printf-related warnings when compiling C tests (NFC) (#4821)	dcode	2022-07-25	2	-15/+15
\|
*	Grand Unified Flow Analysis (GUFA) (#4598)	Alon Zakai	2022-07-22	8	-0/+7486
\| \| \| \| \| \| \| \| \| \| \| \| \|	This tracks the possible contents in the entire program all at once using a single IR. That is in contrast to, say, DeadArgumentElimination or LocalRefining etc., all of which look at one particular aspect of the program (function params and returns in DAE, locals in LocalRefining). The cost is to build up an entire new IR, which takes a lot of new code (mostly in the already-landed PossibleContents). Another cost is that this new IR is very big and requires a lot of time and memory to process. The benefit is that this can find opportunities that are only obvious when looking at the entire program, and also it can track information that is more specialized than the normal type system in the IR - in particular, this can track an ExactType, which is the case where we know the value is of a particular type exactly and not a subtype.
*	[Test] Refactor C API kitchen sink test's Memory usage (#4815)	Alon Zakai	2022-07-21	2	-19/+19
\|
*	[Strings] GC variants for string.encode (#4817)	Alon Zakai	2022-07-21	1	-2/+55
\|
*	Remove basic reference types (#4802)	Thomas Lively	2022-07-20	17	-240/+205
\| \| \| \| \| \| \| \| \|	Basic reference types like `Type::funcref`, `Type::anyref`, etc. made it easy to accidentally forget to handle reference types with the same basic HeapTypes but the opposite nullability. In principle there is nothing special about the types with shorthands except in the binary and text formats. Removing these shorthands from the internal type representation by removing all basic reference types makes some code more complicated locally, but simplifies code globally and encourages properly handling both nullable and non-nullable reference types.
*	[Strings] Add string.new GC variants (#4813)	Alon Zakai	2022-07-19	1	-1/+51
\|
*	[Strings] stringview_wtf16.length (#4809)	Alon Zakai	2022-07-18	1	-0/+21
\| \| \| \|	This measures the length of a view, so it seems simplest to make it a sub-operation of the existing measure instruction.
*	[Strings] stringview_*.slice (#4805)	Alon Zakai	2022-07-15	1	-2/+50
\| \| \| \| \| \| \|	Unfortunately one slice is the same as Python's [start:end], using 2 params, and the other slice takes one param, [CURR:CURR+num] (where CURR is implied by the current state in the iter). So we can't use a single class here. Perhaps a different name would be good, like slice vs substring (like JS does), but I picked names to match the current spec.
*	[Wasm GC] Check if ref.eq inputs can possibly be the same (#4780)	Alon Zakai	2022-07-14	1	-4/+290
\| \| \| \| \|	For them to be the same we must have a value that can appear on both sides. If the heap types disallow that, then only null is possible, and if that is impossible as well then the result must be 0.
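The reasoning can be sketched with a toy type tree. Everything here is an assumption made for illustration: the `PARENT` table, the type names, and the idea that a non-null value can appear on both sides only when one heap type is a subtype of the other (true for a single-inheritance hierarchy like this one).

```python
# Hypothetical single-inheritance hierarchy: type -> direct supertype.
PARENT = {"Base": None, "A": "Base", "A1": "A", "B": "Base"}


def is_subtype(sub, sup):
    """Walk up the hierarchy from sub, looking for sup."""
    while sub is not None:
        if sub == sup:
            return True
        sub = PARENT[sub]
    return False


def ref_eq_outcome(left, right):
    """Classify ref.eq on two references given as (heap_type, nullable)."""
    (left_heap, left_nullable), (right_heap, right_nullable) = left, right
    if is_subtype(left_heap, right_heap) or is_subtype(right_heap, left_heap):
        # Some non-null value may have both types, so nothing is known.
        return "unknown"
    if left_nullable and right_nullable:
        # No shared non-null value, but both sides could still be null.
        return "equal only if both null"
    # No shared non-null value and at least one side is never null.
    return "always 0"
```

Only the last outcome lets the comparison be replaced by a constant; the middle one still depends on the runtime values.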
*	[C-API] Add utility to go between types and heap types (#4792)	dcode	2022-07-14	1	-0/+9
\|
*	[Wasm GC] GTO should not reorder trapping of removed sets (#4801)	Alon Zakai	2022-07-13	1	-18/+124
\| \| \| \| \|	Minor fuzz bug. When we replace a struct.set with its children we also add a ref.as_non_null on the reference, but that must not occur before effects in the other child.
*	[Strings] stringview access operations (#4798)	Alon Zakai	2022-07-13	1	-2/+73
\|