forks/binaryen.git -

	Commit message	Author	Age	Files	Lines
...
*	Fix a fuzz issue with scanning heap read types (#5184)	Alon Zakai	2022-11-01	1	-0/+45
\| \| \| \| \| \| \| \| \|	If a heap type only ever appears as the result of a read, we must include it in the analysis in ModuleUtils, even though it isn't written in the binary format. Otherwise analyses using ModuleUtils can error on not finding all types in the list of types. Fixes #5180
*	Fix br_if fallthrough value (#5200)	Alon Zakai	2022-10-31	1	-0/+47
\| \| \| \| \| \| \|	The fallthrough there is trickier because the value is evaluated before the condition. Unlike other fallthroughs, the value is not last, so we need to check if the condition (which is after it) interferes with it.
*	Move removable code in CodePushing (#5187)	Alon Zakai	2022-10-25	1	-0/+48
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is safe since we "partially remove" it: we don't move it to a place it might execute more, but make it possibly execute less. See the new comment for more details. Motivated by wasm GC but this can help wasm MVP as well. In both cases loads from memory can trap, which limits what the VM can do to optimize them past conditions, but in trapsNeverHappen mode we can do that at the toolchain level:
	    x = read();
	    if () { .. }
	    use(x);
	=>
	    if () { .. }
	    x = read(); // moved to here, and might not execute if the if did a break/return
	    use(x);
*	[Wasm GC] Support BrOn* in CodePushing (#5177)	Alon Zakai	2022-10-21	1	-0/+80
*	Fix fuzzer to ignore externalize/internalize (#5176)	Alon Zakai	2022-10-21	2	-54/+59
\| \| \| \| \|	The fuzzer started to fail on the recent externalize/internalize test that was added in #5175 as we lack interpreter support. Move that to a separate file and ignore it in the fuzzer for now.
*	[Wasm GC] Externalize/Internalize allow nulls (#5175)	Alon Zakai	2022-10-21	1	-0/+54
\| \| \| \| \|	These are encoded as RefAs operations, and we have optimizations that assume those trap on null, but Externalize/Internalize do not. Skip them there to avoid an error on the type being incorrect later.
*	[Wasm GC] Use Cones in GUFA data reads and writes (#5157)	Alon Zakai	2022-10-19	1	-14/+4
\| \| \| \| \| \| \| \| \| \| \|	When we read from a struct/array using a cone type, read from the types in the cone and nothing else. Previously we used the declared type in the wasm, which might be larger (both in the base type and the depth). Likewise, in a write. To do this, this extends ConeReadLocation with a depth (previously the depth there was assumed to be infinite, and now it is to a potentially limited depth). After this we are fully utilizing cone types in GUFA, as the test changes show (or at least I can't think of any other uses of cones).
*	[Wasm GC] Filter GUFA expression locations by their type (#5149)	Alon Zakai	2022-10-18	1	-0/+78
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that we have a cone type, we are able to represent in PossibleContents the natural content of a wasm location: a type or any of its subtypes. This allows us to enforce the wasm typing rules, that is, to filter the data arriving at a location by the wasm type of the location. Technically this could be unnecessary if we had full implementations of flowFoo and so forth, that is, tailored code for each wasm expression that makes sure we only contain and flow content that fits in the wasm type. Atm we don't have that, and until the wasm spec stabilizes it's probably not worth the effort. Instead, simply filter based on the type, which gives the same result (though it does take a little more work; I measured it at 3% or so of runtime). While doing so normalize cones to their actual maximum depth, which simplifies things and will help more later as well.
*	Parse and emit `array.len` without a type annotation (#5151)	Thomas Lively	2022-10-18	2	-3/+3
\| \| \|	Test that we can still parse the old annotated form as well.
*	[GUFA] Add some tests for #5142 (#5146)	Alon Zakai	2022-10-17	1	-0/+130
*	[Wasm GC][GUFA] Avoid Many in roots (#5142)	Alon Zakai	2022-10-13	1	-13/+13
\| \| \|	Instead of Many, use a proper Cone Type for the data, as appropriate.
*	[Wasm GC] Add GUFA tests for struct reads from cones (#5135)	Alon Zakai	2022-10-13	1	-0/+631
\| \| \|	We had cone tests for ref.eq and ref.cast etc. but not for struct.get.
*	[Wasm GC] [GUFA] Add initial ConeType support (#5116)	Alon Zakai	2022-10-11	1	-69/+306
\| \| \| \| \| \| \| \| \| \| \|	A cone type is a PossibleContents that has a base type and a depth, and it contains all subtypes up to that depth. So depth 0 is an exact type from before, etc. This only adds cone type computations when combining types, that is, when we combine two exact types we might get a cone, etc. This does not yet use the cone info in all places (like struct gets and sets), and it does not yet define roots of cone types, all of which is left for later. IOW this is the MVP of cone types that is just enough to add them + pass tests + test the new functionality.
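The cone definition above can be made concrete with a minimal Python sketch over a toy type hierarchy. The names `PARENT`, `Cone`, `depth_below`, and `combine_exact` are hypothetical and exist only for illustration; they are not Binaryen's API, and the real logic lives in PossibleContents.

```python
from dataclasses import dataclass

# Hypothetical subtype hierarchy: child -> parent (None marks the root).
PARENT = {"A": None, "B": "A", "C": "B", "D": "A"}

def depth_below(base, t):
    """Subtyping steps from t up to base, or None if t is not a subtype of base."""
    steps = 0
    while t is not None:
        if t == base:
            return steps
        t = PARENT[t]
        steps += 1
    return None

def ancestors(t):
    """t followed by all of its supertypes, nearest first."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain

@dataclass(frozen=True)
class Cone:
    base: str
    depth: int  # depth 0 means an exact type

    def contains(self, t):
        d = depth_below(self.base, t)
        return d is not None and d <= self.depth

def combine_exact(t1, t2):
    """Smallest cone in this toy model holding both exact types.

    Assumes the two types share a root, which is true of PARENT above.
    """
    chain2 = ancestors(t2)
    base = next(a for a in ancestors(t1) if a in chain2)  # least upper bound
    return Cone(base, max(depth_below(base, t1), depth_below(base, t2)))
```

In this model, combining the exact types `C` and `D` gives a cone based at `A` with depth 2, while combining a type with itself stays at depth 0.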
*	GUFA: Use SSA-style information (#5121)	Alon Zakai	2022-10-07	4	-51/+147
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously we treated each local index as a location, and every local.set to that index could be read by every local.get. With this we connect only relevant sets to gets. Practically speaking, this removes LocalLocation which is what was just described, and instead there is ParamLocation for incoming parameter values. And local.get/set use normal ExpressionLocations to connect a set to a get. I was worried this would be slow, since computing LocalGraph takes time, but it actually more than makes up for itself on J2Wasm and we are actually faster, I guess since we do less updating after local.sets. This makes a noticeable change on the J2Wasm binary, and perhaps will help with benchmarks.
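The idea of connecting only the relevant sets to each get can be sketched for straight-line code as follows. This is a simplified illustration, not the LocalGraph implementation: it ignores control flow, the function name `connect_gets_to_sets` is hypothetical, and every local is assumed to be a parameter so that a get with no earlier set reads the incoming value.

```python
def connect_gets_to_sets(ops):
    """Map each local.get position to the position of the local.set it reads.

    `ops` is straight-line code: a list of ("set", index) or ("get", index).
    A get with no earlier set reads the incoming value, marked "param" here.
    """
    last_set = {}  # local index -> position of the most recent set
    links = {}     # get position -> set position, or "param"
    for pos, (kind, index) in enumerate(ops):
        if kind == "set":
            last_set[index] = pos
        else:
            links[pos] = last_set.get(index, "param")
    return links
```

With one location per local index, all three gets in `[get 0, set 0, get 0, set 0, get 0]` would see both sets; here each get is linked to exactly one source.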
*	Implement bottom heap types (#5115)	Thomas Lively	2022-10-07	40	-1164/+1328
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These types, `none`, `nofunc`, and `noextern` are uninhabited, so references to them can only possibly be null. To simplify the IR and increase type precision, introduce new invariants that all `ref.null` instructions must be typed with one of these new bottom types and that `Literals` have a bottom type iff they represent null values. These new invariants require several additional changes. First, it is now possible that the `ref` or `target` child of a `StructGet`, `StructSet`, `ArrayGet`, `ArraySet`, or `CallRef` instruction has a bottom reference type, so it is not possible to determine what heap type annotation to emit in the binary or text formats. (The bottom types are not valid type annotations since they do not have indices in the type section.) To fix that problem, update the printer and binary emitter to emit unreachables instead of the instruction with undetermined type annotation. This is a valid transformation because the only possible value that could flow into those instructions in that case is null, and all of those instructions trap on nulls. That fix uncovered a latent bug in the binary parser in which new unreachables within unreachable code were handled incorrectly. This bug was not previously found by the fuzzer because we generally stop emitting code once we encounter an instruction with type `unreachable`. Now, however, it is possible to emit an `unreachable` for instructions that do not have type `unreachable` (but are known to trap at runtime), so we will continue emitting code. See the new test/lit/parse-double-unreachable.wast for details. Update other miscellaneous code that creates `RefNull` expressions and null `Literals` to maintain the new invariants as well.
*	Strip em_js_deps exports (#5109)	Sam Clegg	2022-10-04	1	-12/+28
\| \| \|	These are only needed for the metadata extraction in emcc.
*	Simplify and fix heap type counting (#5110)	Thomas Lively	2022-10-04	7	-24/+30
\| \| \| \| \|	Annotations on array.get and array.set were not being counted and the code could generally be simplified since `count` already ignores types that don't need to be counted.
*	Make Asyncify work with wasm64 (#5105)	Sam Clegg	2022-10-04	1	-0/+1335
\| \| \| \| \|	The emscripten side is a little tricky but I've got some tests passing. Currently blocked on: https://github.com/emscripten-core/emscripten/issues/17969
*	Fix handling of unreachable selects in Directize (#5098)	Alon Zakai	2022-09-30	1	-0/+36
\| \| \| \|	We ignored only unreachable conditions, but we must ignore the arms as well, or else we could error.
*	[GUFA] Add some tests (#5090)	Alon Zakai	2022-09-28	1	-2/+197
\| \| \| \| \| \| \|	These just add more test coverage of situations I've seen during some recent debugging, that I realized we lack coverage for. The first function here would have detected the bug fixed in #5089
*	[GUFA] Fix haveIntersection on comparing nullable with non-nullable (#5089)	Alon Zakai	2022-09-28	1	-0/+11
\| \| \| \| \|	We compared types and not heap types, so a difference in nullability confused us. But at that point in the code, we've ruled out nulls, so we should focus on heap types only.
*	[GUFA] Optimize functions not taken by reference better (#5085)	Alon Zakai	2022-09-26	1	-2/+34
\| \| \| \| \| \| \| \| \|	This moves the logic to add connections from signatures to functions from the top level into the RefFunc logic. That way we only add those connections to functions that actually have a RefFunc, which avoids us thinking that a function without one can be reached by a call_ref of its type. Has a small but non-zero benefit on j2wasm.
*	[GUFA] Infer a RefEq value of 0 when possible (#5081)	Alon Zakai	2022-09-26	1	-0/+291
\| \| \| \|	If the PossibleContents for the two sides have no possible intersection then the result must be 0.
*	Emit call_ref with a type annotation (#5079)	Thomas Lively	2022-09-23	13	-43/+45
\| \| \| \| \| \| \|	Emit call_ref instructions with type annotations and a temporary opcode. Also implement support for parsing optional type annotations on call_ref in the text and binary formats. This is part of a multi-part graceful update to switch Binaryen and all of its users over to using the type-annotated version of call_ref without there being any breakage.
*	Add a type annotation to return_call_ref (#5068)	Thomas Lively	2022-09-22	3	-15/+27
\| \| \| \| \| \|	The GC spec has been updated to have heap type annotations on call_ref and return_call_ref. To avoid breaking users, we will have a graceful, multi-step upgrade to the annotated version of call_ref, but since return_call_ref has no users yet, update it in a single step.
*	Correctly handle escapes in string constants (#5070)	Thomas Lively	2022-09-22	1	-1/+1
\| \| \| \| \| \| \|	Previously when we parsed `string.const` payloads in the text format we were using the text strings directly instead of un-escaping them. Fix that parsing, and while we're editing the code, also add support for the `\r` escape allowed by the spec. Remove a spurious nested anonymous namespace and spurious `static`s in Print.cpp as well.
*	[GUFA] Optimize ref.test (#5067)	Alon Zakai	2022-09-22	1	-2/+166
\| \| \| \| \| \|	Similar to ref.cast slightly, but simpler. Also update some TODO text.
*	[OptimizeInstruction] Prevent reordering for rule in #5034 (#5066)	Max Graey	2022-09-21	1	-0/+12
*	Add wasm64 support in OptimizeAddedConstants (#5043)	Axis	2022-09-21	1	-0/+57
\| \| \|	This lets that pass optimize 64-bit offsets on memory64 loads and stores.
*	[OptimizeInstructions] Simplify add / sub with negative on LHS or RHS for floating points (#5034)	Max Graey	2022-09-20	1	-0/+445
\| \| \| \| \| \| \| \| \|	```
	(-x) + y -> y - x
	x + (-y) -> x - y
	x - (-y) -> x + y
	```
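These three rewrites can be spot-checked outside Binaryen. The sketch below uses Python floats (IEEE 754 binary64) as a stand-in for wasm `f64` and compares raw bit patterns so that `+0.0` and `-0.0` count as different. The helper names `bits` and `check_rules` and the sample list are illustrative only, and NaN results are only checked for being NaN, not for their payload.

```python
import math
import struct

def bits(x):
    """Raw binary64 bit pattern, so +0.0 and -0.0 compare as different."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

SAMPLES = [0.0, -0.0, 1.5, -2.25, 1e308, -1e308, math.inf, -math.inf, 5e-324]

def check_rules():
    """Return True if all three rewrites agree on every sample pair."""
    for x in SAMPLES:
        for y in SAMPLES:
            pairs = [
                ((-x) + y, y - x),
                (x + (-y), x - y),
                (x - (-y), x + y),
            ]
            for before, after in pairs:
                if math.isnan(before) or math.isnan(after):
                    # NaN payloads are not compared in this sketch.
                    if not (math.isnan(before) and math.isnan(after)):
                        return False
                elif bits(before) != bits(after):
                    return False
    return True
```

This is a sanity check over a handful of values, not a proof; it does cover the signed-zero and infinity cases that usually break floating-point rewrites.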
*	[Strings] Add missing String effects + tests (#5057)	Alon Zakai	2022-09-19	1	-0/+549
\| \| \|	Also fix some formatting issues in the file.
*	Vacuum: Ignore effects at the entire function scope when possible (#5053)	Alon Zakai	2022-09-19	4	-11/+115
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recently we added logic to ignore effects that don't "escape" past the function call. That is, e.g. local.set only affects the current function scope, and once the call stack is unwound it no longer matters as an effect. This moves that logic to a shared place, and uses it in the core Vacuum logic. The new constructor in EffectAnalyzer receives a function and then scans it as a whole. This works just like e.g. scanning a Block as a whole (if we see a break in the block, that has an effect only inside it, and the Block + children doesn't have a branch effect). Various tests are updated so they don't optimize away trivially, by adding new return values for them.
*	Effects: Clarify trap effect meaning, and consider infinite loops to trap due to timeout (#5039)	Alon Zakai	2022-09-16	1	-31/+181
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I think this simplifies the logic behind what we consider to trap. Before we had kind of a hack in visitLoop that now has a more clear reasoning behind it: we consider as trapping things that trap in all VMs all the time, or will eventually. So a single allocation doesn't trap, but an unbounded amount can, and an infinite loop is considered to trap as well (a timeout in a VM will be hit eventually, somehow). This means we cannot optimize away a trivial infinite loop with no effects in it,
	    while (1) {}
	But we can optimize it out in trapsNeverHappen mode. In any event, such a loop is not a realistic situation; an infinite loop with some other effect in it, like a call to an import, will not be optimized out, of course. Also clarify some other things regarding traps and trapsNeverHappen following recent discussions in https://github.com/emscripten-core/emscripten/issues/17732 Specifically, TNH will never be allowed to remove calls to imports.
*	JSPI - Support re-entering a suspended Wasm module. (#5044)	Brendan Dahl	2022-09-16	1	-3/+45
\| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes: https://github.com/emscripten-core/emscripten/issues/17846
	More detailed explanation of the issue from Thibaud:
	- A promising export is entered, generating a suspender s1, which is stored in the global
	- The wasm code calls a wrapped import, passing it the value in the global (s1) and suspends
	- Another export is entered, generating suspender s2, which is stored in the global
	- We call another wrapped import, which suspends s2 (so far so good)
	- We return to the event loop and s1 is resumed
	And now we are in an inconsistent state: the active suspender is "s1", but the object in the global is "s2". So the next time we call a wrapped import, there is a mismatch, which is what this runtime error reports.
*	Vacuum trivial trys (#5046)	Alon Zakai	2022-09-16	1	-0/+39
\| \| \| \|	A try whose body throws, and does nothing else, and the try catches that exception, can be removed.
*	Allow optimizing with global function effects (#5040)	Alon Zakai	2022-09-16	1	-0/+323
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a map of function name => the effects of that function to the PassOptions structure. That lets us compute those effects once and then use them in multiple passes afterwards. For example, that lets us optimize away a call to a function that has no effects:
	    (drop (call $nothing))
	    [..]
	    (func $nothing
	      ;; .. lots of stuff but no effects, only a returned value ..
	    )
	Vacuum will remove that dropped call if we tell it that the called function has no effects. Note that a nice result of adding this to the PassOptions struct is that all passes will use the extra info automatically. This is not enabled by default as the benefits seem rather minor, though it does help in a small but noticeable way on J2Wasm code, where we use call.without.effects and have situations like this:
	    (func $foo
	      (call $bar)
	    )
	    (func $bar
	      (call.without.effects ..)
	    )
	The call to bar looks like it has effects, normally, but with global effect info we know it actually doesn't. To use this, one would do
	    --generate-global-effects [.. some passes that use the effects ..] --discard-global-effects
	Discarding is not necessary, but if there is a pass later that adds effects, then not discarding could lead to bugs, since we'd think there are fewer effects than there are. (However, normal optimization passes never add effects, only remove them.) It's also possible to call this multiple times:
	    --generate-global-effects -O3 --generate-global-effects -O3
	That computes effects after the first -O3, and may find fewer effects than earlier. This doesn't compute the full transitive closure of the effects across functions. That is, when computing a function's effects, we don't look into its own calls. The simple case so far is enough to handle the call.without.effects example from before (though it may take multiple optimization cycles).
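The generate/optimize/regenerate cycle described above can be modelled in a few lines of Python. This is a toy model under stated assumptions, not Binaryen's pass code: a module is a dict of function name to a list of statement tuples whose results are dropped, only `store` and `call` can have effects, and the names `has_effects`, `generate_global_effects`, and `vacuum` are hypothetical.

```python
def has_effects(op, known):
    """Whether one statement has effects, given known per-function effects."""
    kind = op[0]
    if kind == "store":
        return True
    if kind == "call":
        # A call to a function with unknown effects is assumed to have them.
        return known.get(op[1], True)
    return False  # e.g. ("call.without.effects",) or ("const",)

def generate_global_effects(module):
    """Non-transitive scan: every call counts as an effect while scanning."""
    return {name: any(has_effects(op, {}) for op in body)
            for name, body in module.items()}

def vacuum(module, effects):
    """Remove dropped statements that are known to have no effects."""
    return {name: [op for op in body if has_effects(op, effects)]
            for name, body in module.items()}

MODULE = {
    "main": [("call", "foo"), ("call", "baz")],
    "foo": [("call", "bar")],
    "bar": [("call.without.effects",)],
    "baz": [("store",)],
}
```

On the first cycle only the call to `bar` inside `foo` can be removed. Regenerating the effects afterwards shows `foo` as effect-free, so a second cycle removes the call to `foo` from `main`, which matches the note that several optimization cycles may be needed.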
*	[OptimizeInstructions] More canonizations for floating points (#5033)	Max Graey	2022-09-15	1	-6/+103
\| \| \| \| \| \| \| \|	x - C -> x + (-C)
	min(C, x) -> min(x, C)
	max(C, x) -> max(x, C)
	And remove redundant rules
*	[Exceptions] Optimize in CodePushing even with exceptions thrown (#5028)	Alon Zakai	2022-09-13	1	-1/+70
\| \| \| \| \| \| \| \| \| \|	We had some concerns about this not working in the past, but thinking about it now, I believe it is safe to do. Specifically, a throw is either like a break or a return - either it jumps out to an outer scope (like a break) or it jumps out of the function (like a return), and both breaks and returns have already been handled here. This change has some nice effects on J2Wasm output, where there are quite a lot of throws, which we can now optimize around.
*	OptimizeInstructions: Use min/max bits in comparisons (#5035)	Alon Zakai	2022-09-13	1	-53/+443
\| \| \| \| \| \| \|	When we see e.g. x < y and x has fewer bits set, we can infer a result. Helps #5010. As mentioned there, this is one of the top superoptimizer findings. On j2wasm it ends up removing a few hundred binary operations for example.
*	[OptimizeInstructions] Simplify floating point ops with NaN on right side (#4985)	Max Graey	2022-09-12	1	-16/+198
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	x + nan -> nan'
	x - nan -> nan'
	x * nan -> nan'
	x / nan -> nan'
	min(x, nan) -> nan'
	max(x, nan) -> nan'
	where nan' is canonicalized nan of rhs
	x != nan -> 1
	x == nan -> 0
	x >= nan -> 0
	x <= nan -> 0
	x > nan -> 0
	x < nan -> 0
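The arithmetic and comparison rules above follow from IEEE 754 semantics and can be spot-checked with Python floats as a stand-in for wasm floats. The names below are illustrative only. The `min`/`max` rules are left out on purpose, because Python's built-in `min` and `max` do not propagate NaN the way the wasm instructions do, and NaN canonicalization (the `nan'` part) is not modelled.

```python
import math
import operator

NAN = float("nan")
SAMPLES = [0.0, -1.5, 3.0, math.inf, -math.inf, NAN]

ARITHMETIC = [operator.add, operator.sub, operator.mul, operator.truediv]

# Expected constant result of `x <op> nan` for every x.
COMPARISONS = {
    operator.ne: 1,
    operator.eq: 0,
    operator.ge: 0,
    operator.le: 0,
    operator.gt: 0,
    operator.lt: 0,
}

def arithmetic_always_nan():
    """x + nan, x - nan, x * nan, x / nan are NaN for every sample x."""
    return all(math.isnan(op(x, NAN)) for op in ARITHMETIC for x in SAMPLES)

def comparisons_are_constant():
    """Each comparison against NaN gives the same result for every sample x."""
    return all(int(op(x, NAN)) == expected
               for op, expected in COMPARISONS.items()
               for x in SAMPLES)
```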
*	Remove typed-function-references feature (#5030)	Thomas Lively	2022-09-09	3	-19/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In practice typed function references will not ship before GC and are not independently useful, so it's not necessary to have a separate feature for it. Roll the functionality previously enabled by --enable-typed-function-references into --enable-gc instead. This also avoids a problem with the ongoing implementation of the new GC bottom heap types. That change will make all ref.null instructions in Binaryen IR refer to one of the bottom heap types. But since those bottom types are introduced in GC, it's not valid to emit them in binaries unless GC is enabled. The fix if only reference types is enabled is to emit (ref.null func) instead of (ref.null nofunc), but that doesn't always work if typed function references are enabled because a function type more specific than func may be required. Getting rid of typed function references as a separate feature makes this a nonissue.
*	OptimizeInstructions: Optimize comparisons with an added offset (#5025)	Alon Zakai	2022-09-09	1	-0/+357
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	E.g.
	    x + C1 > C2  ==>  x > (C2-C1)
	We do need to be careful of overflows in either the add on the left or the proposed subtract on the right. In the latter case, we can at least do
	    x + C1 > C2  ==>  x + (C1-C2) > 0
	Helps #5008 (but more patterns remain). Found by the superoptimizer #4994. This was the top suggestion for Java and Dart.
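The overflow caveat can be seen in a small simulation of signed 32-bit arithmetic. This is a sketch only; `wrap_i32`, `original`, and `rewritten` are hypothetical helpers that model `x + C1 > C2` and `x > (C2-C1)` with wrapping adds and signed comparison.

```python
INT32_MAX = 2**31 - 1

def wrap_i32(v):
    """Wrap a Python int to a signed 32-bit value, as i32 arithmetic would."""
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v >= (1 << 31) else v

def original(x, c1, c2):
    """x + C1 > C2 with a wrapping add and a signed comparison."""
    return wrap_i32(x + c1) > c2

def rewritten(x, c1, c2):
    """x > (C2 - C1), the proposed replacement."""
    return x > wrap_i32(c2 - c1)
```

For small values the two forms agree, but with `x = INT32_MAX`, `C1 = 1`, `C2 = 0` the add wraps to a negative number, so `original` is false while `rewritten` is true. That is why the rewrite has to be guarded against overflow.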
*	[Effects] Fix hasAnything on mutable global state (#5026)	Alon Zakai	2022-09-08	1	-0/+52
\| \| \| \| \|	We explicitly wrote out memory, table, and globals, but did not add structs. This switches us to use readsMutableGlobalState which has the full list of all relevant global state, including the memory, table, and globals as well as structs.
*	Switch to i32 operations when heading to a wrap anyhow (#5022)	Alon Zakai	2022-09-07	1	-48/+173
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	E.g. if we just do addition etc., then any higher bits will be wrapped out anyhow:
	    int32_t(int64_t(x) + int64_t(10))  =>  x + int32_t(10)
	Found by the superoptimizer #4994. This is by far the most promising suggestion it had. Interestingly, it mainly helps Go, where it removes 20% of all Unary operations (the extends and wraps), and Rust, where it removes 3%. Helps #5004. This handles the common cases I see in the superoptimizer output, but there are more that could be handled.
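The reason the 64-bit add can be narrowed is that the low 32 bits of a sum depend only on the low 32 bits of the operands. A minimal Python check, using unsigned bit patterns and hypothetical helper names (`extend_s`, `before`, `after`), might look like this:

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def extend_s(x32):
    """Sign-extend a 32-bit pattern to a 64-bit pattern."""
    return x32 | 0xFFFFFFFF00000000 if x32 & 0x80000000 else x32

def before(x32):
    """int32_t(int64_t(x) + int64_t(10)): extend, 64-bit add, then wrap."""
    return ((extend_s(x32) + 10) & MASK64) & MASK32

def after(x32):
    """x + int32_t(10): a plain wrapping 32-bit add."""
    return (x32 + 10) & MASK32
```

The sketch only covers addition of a constant; the commit message notes that other operations are handled as well.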
*	[Wasm GC] Fix GlobalTypeOptimization fuzz bug on replacing unreachable struct.set (#5021)	Alon Zakai	2022-09-06	1	-2/+31
\| \| \| \| \| \|	We replaced an unreachable struct.set with something reachable, which can break validation in corner cases.
*	[OptimizeInstructions] Simplify two binary expressions with asymmetric shifts and same constant (#4996)	Max Graey	2022-09-06	2	-36/+270
\| \| \| \| \| \| \| \| \| \| \| \|	(x >> C) << C   ->  x & -(1 << C)
	(x >>> C) << C  ->  x & -(1 << C)
	(x << C) >>> C  ->  x & (-1 >>> C)
	// (x << C) >> C is not supported
	Found by the superoptimizer #4994
	Fixes #5012
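The shift identities can be checked on 32-bit patterns with a short Python sketch. The helpers `shl`, `shr_u`, `shr_s`, and `rules_hold` are hypothetical names for this illustration; `shr_s` models the arithmetic shift written `>>` in the rules and `shr_u` models the logical shift written `>>>`.

```python
MASK = 0xFFFFFFFF

def to_signed(x):
    return x - (1 << 32) if x & 0x80000000 else x

def shl(x, c):
    return (x << c) & MASK

def shr_u(x, c):
    return x >> c

def shr_s(x, c):
    return (to_signed(x) >> c) & MASK

def rules_hold(x, c):
    """Check the three supported rewrites for one 32-bit pattern and shift."""
    low_cleared = x & (-(1 << c) & MASK)
    return (shl(shr_s(x, c), c) == low_cleared
            and shl(shr_u(x, c), c) == low_cleared
            and shr_u(shl(x, c), c) == x & (MASK >> c))
```

The fourth shape, `(x << C) >> C`, sign-extends from the new top bit rather than clearing bits, so it is not a plain mask. For example, with `x = 0x40000000` and `C = 1` the result is `0xC0000000`, which no `x & mask` can produce.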
*	Add JavaScript promise integration (JSPI) pass. (#4961)	Brendan Dahl	2022-09-02	1	-0/+124
\| \| \| \| \| \| \|	Add a pass that wraps all imports and exports with functions that handle storing and passing along the suspender externref needed for JSPI. https://github.com/WebAssembly/js-promise-integration/blob/main/proposals/js-promise-integration/Overview.md
*	OptimizeInstructions: Select => and/or in more cases (#4154)	Alon Zakai	2022-09-01	3	-16/+125
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	    x ? 0 : y  ==>  z & y   where z = !x
	    x ? y : 1  ==>  z | y   where z = !x
	Only do this when we have z = !x, that is, we can invert x without adding an actual eqz (which would add work). To do this, canonicalize selects to prefer to flip the arms, when possible, if it would move a constant to a location that the existing optimizations already turn into an and/or. That is,
	    x >= 5 ? 0 : y != 42
	would be canonicalized into
	    x < 5 ? y != 42 : 0
	and existing opts turn that into
	    (x < 5) & (y != 42)
	The canonicalization does not always help this optimization, as we need the values to be boolean to do this, but canonicalizing is still nice to get more regular code which might compress slightly better.
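The two select rewrites, and the requirement that the values be boolean, can be illustrated with a small Python model. The functions `select` and `eqz` are hypothetical stand-ins for the wasm instructions of the same name, operating on plain integers.

```python
def select(cond, if_true, if_false):
    """Model of `cond ? if_true : if_false` with an integer condition."""
    return if_true if cond != 0 else if_false

def eqz(x):
    """Model of i32.eqz: 1 when x is zero, else 0."""
    return 1 if x == 0 else 0

def rewrites_agree(x, y):
    z = eqz(x)  # z = !x
    return (select(x, 0, y) == (z & y)
            and select(x, y, 1) == (z | y))
```

With `y` restricted to 0 or 1 the rewritten forms match for any `x`. With a non-boolean `y` such as 2 and `x = 0`, `select(x, 0, y)` is 2 but `z & y` is 0, which is why the optimization needs boolean values.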
*	[Wasm GC] Support non-nullable locals in the "1a" form (#4959)	Alon Zakai	2022-08-31	20	-227/+619
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	An overview of this is in the README in the diff here (conveniently, it is near the top of the diff). Basically, we fix up nn locals after each pass, by default. This keeps things easy to reason about - what validates is what is valid wasm - but there are some minor nuances as mentioned there, in particular, we ignore nameless blocks (which are commonly added by various passes; ignoring them means we can keep more locals non-nullable).
	The key addition here is LocalStructuralDominance which checks which local indexes have the "structural dominance" property of 1a, that is, that each get has a set in its block or an outer block that precedes it. I optimized that function quite a lot to reduce the overhead of running that logic after each pass. The overhead is something like 2% on J2Wasm and 0% on Dart (0%, because in this mode we shrink code size, so there is less work actually, and it balances out).
	Since we run fixups after each pass, this PR removes logic to manually call the fixup code from various places we used to call it (like eh-utils and various passes).
	Various passes are now marked as requiresNonNullableLocalFixups => false. That lets us skip running the fixups after them, which we normally do automatically. This helps avoid overhead. Most passes still need the fixups, though - any pass that adds a local, or a named block, or moves code around, likely does.
	This removes a hack in SimplifyLocals that is no longer needed. Before we worked to avoid moving a set into a try, as it might not validate. Now, we just do it and let fixups happen automatically if they need to: in the common code they probably don't, so the extra complexity seems not worth it.
	Also removes a hack from StackIR. That hack tried to avoid roundtrip adding a nondefaultable local. But we have the logic to fix that up now, and opts will likely keep it non-nullable as well.
	Various tests end up updated here because now a local can be non-nullable - previous fixups are no longer needed.
	Note that this doesn't remove the gc-nn-locals feature. That has been useful for testing, and may still be useful in the future - it basically just allows nn locals in all positions (that can't read the null default value at the entry). We can consider removing it separately.
	Fixes #4824
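The "structural dominance" property described above (each get has a set that precedes it in its own block or an enclosing block) can be sketched as a recursive walk. This is a toy model and not LocalStructuralDominance itself: code is a nested Python list where an inner list is a block, items are `("set", index)` or `("get", index)`, every block is treated as named, and the function name `non_dominated_gets` is hypothetical.

```python
def non_dominated_gets(body, already_set=frozenset()):
    """Return the local indexes that have a get without a dominating set.

    A set is visible to later items in the same block and to nested blocks,
    but it stops being visible once its own block ends.
    """
    visible = set(already_set)  # copy, so inner sets do not leak outwards
    bad = set()
    for item in body:
        if isinstance(item, list):
            bad |= non_dominated_gets(item, visible)
        elif item[0] == "set":
            visible.add(item[1])
        elif item[0] == "get" and item[1] not in visible:
            bad.add(item[1])
    return bad
```

In this model a local whose index is returned would have to stay nullable (or be fixed up), while the others could remain non-nullable. The real check also ignores nameless blocks, which this sketch does not attempt.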
*	Fix test comments (#4992)	Max Graey	2022-08-30	1	-2/+2
\| \| \|	Followup to #4282