forks/binaryen.git -

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[WasmGC] Heap2Local: Optimize RefIs and RefTest (#6705)	Alon Zakai	2024-07-11	1	-13/+376
\|
*	[StackIR] Allow StackIR to be disabled from the commandline (#6725)	Alon Zakai	2024-07-10	1	-0/+66
\| \| \| \| \| \| \| \| \|	Normally we use it when optimizing (above a certain level). This lets the user prevent it from being used even then. Also add optimization options to wasm-metadce so that this is possible there as well and not just in wasm-opt (this also opens the door to running more passes in metadce, which may be useful later).
*	StackIR: Optimize away a drop before an unreachable (#6719)	Alon Zakai	2024-07-08	1	-0/+118
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Anything else right before an unreachable is removed by the main DCE pass anyhow, but because of the structured form of BinaryenIR we can't remove a drop. That is, this is the difference between (i32.eqz (i32.const 42) (unreachable) ) and (drop (call $foo) ) (unreachable) In both cases the unreachable is preceded by something we don't need, but in the latter case it must remain in BinaryenIR for validation. To optimize this, add a rule in StackIR. Fixes #6715
*	Rename external conversion instructions (#6716)	Jérôme Vouillon	2024-07-08	5	-36/+36
\| \| \| \| \| \| \| \| \|	Rename instructions `extern.internalize` into `any.convert_extern` and `extern.externalize` into `extern.convert_any` to follow more closely the spec. This was changed in https://github.com/WebAssembly/gc/issues/432. The legacy name is still accepted in text inputs and in the C and JS APIs.
*	[DebugInfo] Add debug info to the values emitted in GlobalStructInference ↵	Alon Zakai	2024-07-02	1	-0/+73
\| \| \| \| \| \| \| \| \|	(#6709) Previously the replacement select got the debug info, but we should also copy it to the values, as often optimizations lead to one of those values remaining by itself. Similar to #6652 in general form.
*	ConstantFieldPropagation: Add a variation that picks between 2 values using ↵	Alon Zakai	2024-06-27	2	-0/+1462
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	RefTest (#6692) CFP focuses on finding when a field always contains a constant, and then replaces a struct.get with that constant. If we find there are two constant values, then in some cases we can still optimize, if we have a way to pick between them. All we have is the struct.get and its reference, so we must use a ref.test: (struct.get $T x (..ref..)) => (select (..constant1..) (..constant2..) (ref.test $U (..ref..)) ) This is valid if, of all the subtypes of $T, those that pass the test have constant1 in that field, and those that fail the test have constant2. For example, a simple case is where $T has two subtypes, $T is never created itself, and each of the two subtypes has a different constant value. This is a somewhat risky operation, as ref.test is not necessarily cheap. To mitigate that, this is a new pass, --cfp-reftest that is not run by default, and also we only optimize when we can use a ref.test on what we think will be a final type (because ref.test on a final type can be faster in VMs).
*	[WasmGC] Heap2Local: Optimize RefEq (#6703)	Alon Zakai	2024-06-26	1	-8/+310
\| \| \| \| \|	If an allocation does not escape, then we can compute ref.eq for it: when compared to itself the result is 1, and when compared to anything else it is 0 (since it did not escape, anything else must be different).
*	[threads] Validate shared-to-unshared edges in heap types (#6698)	Thomas Lively	2024-06-25	1	-4/+4
\| \| \|	Add spec tests checking validation for structs and arrays.
*	[WasmGC] Add missing ArrayNew variants to Properties::isGenerative (#6691)	Alon Zakai	2024-06-24	1	-5/+87
\| \| \|	Fixes #6690
*	Add TraceCalls pass (#6619)	Marcin Kolny	2024-06-21	2	-0/+165
\| \| \| \| \| \| \|	This pass receives a list of functions to trace, and then wraps them in calls to imports. This can be useful for tracing malloc/free calls, for example, but is generic. Fixes #6548
*	GlobalStructInference: Un-nest struct.news in globals when that is helpful ↵	Alon Zakai	2024-06-20	1	-5/+412
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(#6688) If we have (global $g (struct.new $S (i32.const 1) (struct.new $T ..) (ref.func $f) )) then before this PR if we wanted to read the middle field we'd stop, as it is non-constant. However, we can un-nest it, making it constant: (global $g.unnested (struct.new $T ..)) (global $g (struct.new $S (i32.const 1) (global.get $g.unnested) (ref.func $f) )) Now the field is a global.get of an immutable global, which is constant. Using this technique we can handle anything in a struct field, constant or not. The cost of adding a global is likely offset by the benefit of being able to refer to it directly, as that opens up more opportunities later. Concretely, this replaces the constant values we look for in GSI with a variant over constants or expressions (we do still want to group constants, as multiple globals with the same constant field can be treated as a whole). And we note cases where we need to un-nest, and handle those at the end.
*	[threads] Shared basic heap types (#6667)	Thomas Lively	2024-06-19	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \|	Implement binary and text parsing and printing of shared basic heap types and incorporate them into the type hierarchy. To avoid the massive amount of code duplication that would be necessary if we were to add separate enum variants for each of the shared basic heap types, use bit 0 to indicate whether the type is shared and replace `getBasic()` with `getBasic(Unshared)`, which clears that bit. Update all the use sites to record whether the original type was shared and produce shared or unshared output without code duplication.
*	GlobalStructInference: Optimize globals too (#6674)	Alon Zakai	2024-06-17	1	-0/+51
\| \| \| \|	This is achieved by simply replacing the Literal with PossibleConstantValues, which supports both Literals and Globals.
*	[Parser] Update requirements for implicit type uses (#6665)	Thomas Lively	2024-06-14	2	-74/+76
\| \| \| \| \| \| \|	As an abbreviation, a `typeuse` can be given as just a list of parameters and results, in which case it corresponds to the index of the first function type with the same parameters and results. That function type must also be an MVP function type, i.e. it cannot have a nontrivial rec group, be non-final, or have a declared supertype. The parser did not previously implement all of these rules.
*	[threads] Binary reading and writing of shared composite types (#6664)	Thomas Lively	2024-06-14	1	-3/+6
\| \| \| \|	Also update the parser so that implicit type uses are not matched with shared function types.
*	[threads] Parse, build, and print shared composite types (#6654)	Thomas Lively	2024-06-12	1	-0/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Parse the text format for shared composite types as described in the shared-everything thread proposal. Update the parser to use 'comptype' instead of 'strtype' to match the final GC spec and add the new syntactic class 'sharecomptype'. Update the type canonicalization logic to take sharedness into account to avoid merging shared and unshared types. Make the same change in the TypeMerging pass. Ensure that shared and unshared types cannot be in a subtype relationship with each other. Follow-up PRs will add shared abstract heap types, binary parsing and emitting for shared types, and fuzzer support for shared types.
*	[DebugInfo] Copy debug info in call-utils.h (#6652)	Alon Zakai	2024-06-12	1	-2/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We automatically copy debuginfo in replaceCurrent(), but there are a few places that do other operations than simple replacements. call-utils.h will turn a call_ref with a select target into two direct calls, and we were missing the logic to copy debuginfo from the call_ref to the calls. To make this work, refactor out the copying logic from wasm-traversal, into debuginfo.h, and use it in call-utils.h. debuginfo.h itself is renamed from debug.h (as now this needs to be included from wasm-traversal, which nearly everything does, and it turns out some files have internal stuff like a debug() helper that ends up conflicing with the old debug namespace). Also rename the old copyDebugInfo function to copyDebugInfoBetweenFunctions which is more explicit. That is also moved from the header to a cpp file because it depends on wasm-traversal (so we'd end up with recursive headers otherwise). That is fine, as that method is called after copying a function, which is not that frequent. The new copyDebugInfoToReplacement (which was refactored out of wasm-traversal) is in the header because it can be called very frequently (every single instruction we optimize) and we want it to get inlined.
*	[Strings] Keep public and private types separate in StringLowering (#6642)	Alon Zakai	2024-06-10	2	-28/+102
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need StringLowering to modify even public types, as it must replace every single stringref with externref, even if that modifies the ABI. To achieve that we told it that all string-using types were private, which let TypeUpdater update them, but the problem is that it moves all private types to a new single rec group, which meant public and private types ended up in the same group. As a result, a single public type would make it all public, preventing optimizations and breaking things as in #6630 #6640. Ideally TypeUpdater would modify public types while keeping them in the same rec groups, but this may be a very specific issue for StringLowering, and that might be a lot of work. Instead, just make StringLowering handle public types of functions in a manual way, which is simple and should handle all cases that matter in practice, at least in J2Wasm.
*	Effects: Add missing combining logic for MayNotReturn (#6635)	Alon Zakai	2024-06-03	1	-81/+295
\| \| \| \| \| \| \| \| \| \|	Without that logic we could end up dropping that particular effect. This actually made a test pass when it should not: the modified test here has a function with effects that are ok to remove, but it had a loop which adds MayNotReturn which we should actually not remove, so it was removed erroneously. To fix the test, add other effects there (local ones) that we can see are removable. Also add a function with a loop to test that we do not remove an infinite loop, which adds coverage for the fix here.
*	Optimize ReorderGlobals ordering with a new algorithm (#6625)	Alon Zakai	2024-05-31	3	-29/+1437
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The old ordering in that pass did a topological sort while sorting by uses both within topological groups and between them. That could be unoptimal in some cases, however, and actually on J2CL output this pass made the binary larger, which is how we noticed this. The problem is that such a toplogical sort keeps topological groups in place, but it can be useful to interleave them sometimes. Imagine this: $c - $a / $e \ $d - $b Here $e depends on $c, etc. The optimal order may interleave the two arms here, e.g. $a, $b, $d, $c, $e. That is because the dependencies define a partial order, and so the arms here are actually independent. Sorting by toplogical depth first might help in some cases, but also is not optimal in general, as we may want to mix toplogical depths: $a, $c, $b, $d, $e does so, and it may be the best ordering. This PR implements a natural greedy algorithm that picks the global with the highest use count at each step, out of the set of possible globals, which is the set of globals that have no unresolved dependencies. So we start by picking the first global with no dependencies and add at at the front; then that unlocks anything that depended on it and we pick from that set, and so forth. This may also not be optimal, but it is easy to make it more flexible by customizing the counts, and we consider 4 sorts here: * Set all counts to 0. This means we only take into account dependencies, and we break ties by the original order, so this is as close to the original order as we can be. * Use the actual use counts. This is the simple greedy algorithm. * Set the count of each global to also contain the counts of its children, so the count is the total that might be unlocked. This gives more weight to globals that can unlock more later, so it is less greedy. * As last, but weight children's counts lower in an exponential way, which makes sense as they may depend on other globals too. In practice it is simple to generate cases where 1, 2, or 3 is optimal (see new tests), but on real-world J2CL I see that 4 (with a particular exponential coefficient) is best, so the pass computes all 4 and picks the best. As a result it will never worsen the size and it has a good chance of improving. The differences between these are small, so in theory we could pick any of them, but given they are all modifications of a single algorithm it is very easy to compute them all with little code complexity. The benefits are rather small here, but this can save a few hundred bytes on a multi-MB Java file. This comes at a tiny compile time cost, but seems worth it for the new guarantee to never regress size.
*	LogExecution: Optionally take a module name for the logger function (#6629)	YAMAMOTO Takashi	2024-05-31	1	-0/+24
\| \| \| \| \| \| \| \| \|	--log-execution=NAME will use NAME as the module for the logger function import, rather than infer it. If the name is not provided (--log-execution as before this PR) then we will try to automatically decide which to use ("env", unless we see another module name is used, which can be the case in optimized modules).
*	Avoid duplicate type names (#6633)	Alon Zakai	2024-05-30	1	-1/+2
\| \| \| \| \|	If we replace a type with another, use the original name for the new type, and give the old a unique name (for the rare cases in which it has uses).
*	[NFC] Port LogExecution test to lit (#6634)	Alon Zakai	2024-05-30	1	-0/+146
\|
*	Fix Vacuuming of code leading up to an infinite loop (#6632)	Alon Zakai	2024-05-29	1	-22/+41
\| \| \| \| \|	We had that logic right in other places, but the specific part of Vacuum that looks at code that leads up to an unreachable did not check for infinite loops, so it could remove them.
*	SignaturePruning: Properly handle public types (#6630)	Alon Zakai	2024-05-29	1	-0/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The SignaturePruning pass optimizes away parameters that it proves are safe to remove. It turns out that that does not always match the definition of private types, which is more restrictive. Specifically, if say all the types are in one big rec group and one of them is used on an exported function then all of them are considered public (as the rec group is). However, in closed world, it would be ok to leave that rec group unchanged but to create a pruned version of that type and use it, in cases where we see it is safe to remove a parameter. (See the testcase for a concrete example.) To put it another way, SignaturePruning already proves that a parameter is safe to remove in all the ways that matter. Before this PR, however, the testcase in this PR would error - so this PR is not an optimization but a bugfix, really - because SignaturePruning would see that a parameter is safe to remove but then TypeUpdating would see the type is public and so it would leave it alone, leading to a broken module. This situation is in fact not that rare, and happens on real-world Java code. The reason we did not notice it before is that typically there are no remaining SignaturePruning opportunities late in the process (when other closed world optimizations have typically led to a single big rec group). The concrete fix here is to add additionalPrivateTypes to a few more places in TypeUpdating. We already supported that for cases where a pass knew better than the general logic what can be modified, and this adds that ability to the signature-rewriting logic there. Then SignaturePruning can send in all the types it has proven are safe to modify. * Also necessary here is to only add from additionalPrivateTypes if the type is not already in our list (or we'd end up with duplicates in the final rec group). * Also move newSignatures in SignaturePruning out of the top level, which was confusing (the pass has multiple iterations, and we want each to have a fresh instance).
*	Remove obsolete parser code (#6607)	Thomas Lively	2024-05-29	1	-1/+0
\| \| \| \| \|	Remove `SExpressionParser`, `SExpressionWasmBuilder`, and `cashew::Parser`. Simplify gen-s-parser.py. Remove the --new-wat-parser and --deprecated-wat-parser flags.
*	OptimizeInstructions: Push StructNew down to help it fold away StructSets ↵	Roberto Lublinerman	2024-05-28	1	-12/+54
\| \| \| \| \| \| \| \| \|	(#6584) Heap stores (struct.set) are optimized into the struct.new when they are adjacent in a statement list. Pushing struct.new down past irrelevant instructions increases the likelihood that it ends up adjacent to sets.
*	[EH] Rename old EH tests from -old to -legacy (#6627)	Heejin Ahn	2024-05-28	17	-0/+0
\| \| \| \|	This renames old EH tests in the form of `-eh-old.wast` to `-eh-legacy.wast`, to be clearer in names.
*	SimplifyGlobals: Do not switch a get to use a global of another type (#6605)	Alon Zakai	2024-05-20	1	-0/+24
\| \| \| \|	If we wanted to switch types in such cases we'd need to refinalize (which is likely worth doing, though other passes should refine globals anyhow).
*	Fix generate-dyncalls and directize passed under table64 (#6604)	Sam Clegg	2024-05-18	2	-0/+85
\|
*	Fix GlobalRefining's handling of gets in module code and add missing ↵	Alon Zakai	2024-05-17	1	-0/+28
\| \| \| \| \| \| \| \| \| \| \|	validation (#6603) GlobalRefining did not traverse module code, so it did not update global.gets in other globals. Add missing validation that actually errors on that: We did not check global.get types. These could be separate PRs but it would be difficult to test them separately.
*	[Table64Lowering] Don't assume that all segments are from 64-bit tables (#6599)	Sam Clegg	2024-05-16	1	-11/+17
\| \| \| \| \| \| \| \| \|	This allows modules to contains both 32-bit and 64-bit segment. In order to check the table/memory state when visiting segments we need to ensure that memories/tables are visited only after their segments. The comments in visitTable/visitMemory already assumed this but it wasn't true in practice.
*	Fix FlatTable for table64 (#6598)	Sam Clegg	2024-05-15	1	-4/+12
\|
*	Add table64 lowering pass (#6595)	Sam Clegg	2024-05-15	1	-0/+64
\| \| \| \| \|	Changes to wasm-validator.cpp here are mostly for consistency between elem and data segment validation.
*	OptimizeInstructions: Add missing invalidation check in consecutive equality ↵	Alon Zakai	2024-05-15	1	-17/+245
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	test (#6596) This existed before #6495 but became noticeable there. We only looked at the fallthrough values in the later part of areConsecutiveInputsEqual, but there can be invalidation due to the non-fallthrough part: (i32.add (local.get $x) (block (local.set $x ..) (local.get $x) ) ) The set can cause the local.get to differ the second time. To fix this, check if the non-fallthrough part invalidates the fallthrough (but only on the right hand side). Fixes #6593
*	[EH] Rename option/pass names for new EH (exnref) (#6592)	Heejin Ahn	2024-05-15	3	-30/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We settled on the name `WASM_EXNREF` for the new setting in Emscripten for the name for the new EH option. https://github.com/emscripten-core/emscripten/blob/2bc5e3156f07e603bc4f3580cf84c038ea99b2df/src/settings.js#L782-L786 "New EH" sounds vague and I'm not sure if "experimental" is really necessary anyway, given that the potential users of this option is aware that this is a new spec that has been adopted recently. To make the option names consistent, this renames `--translate-to-eh` (the option that only runs the translator) to `--translate-to-exnref`, and `--experimental-new-eh` to `--emit-exnref` (the option that runs the translator at the end of the whole pipeline), and renames the pass and variable names in the code accordingly as well. In case anyone is using the old option names (and also to make the Chromium CI pass), this does not delete the old options.
*	[Strings] Remove operations not included in imported strings (#6589)	Thomas Lively	2024-05-15	1	-378/+26
\| \| \| \| \| \|	The stringref proposal has been superseded by the imported JS strings proposal, but the former has many more operations than the latter. To reduce complexity, remove all operations that are part of stringref but not part of imported strings.
*	[Strings] Remove stringview types and instructions (#6579)	Thomas Lively	2024-05-15	5	-264/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The stringview types from the stringref proposal have three irregularities that break common invariants and require pervasive special casing to handle properly: they are supertypes of `none` but not subtypes of `any`, they cannot be the targets of casts, and they cannot be used to construct nullable references. At the same time, the stringref proposal has been superseded by the imported strings proposal, which does not have these irregularities. The cost of maintaing and improving our support for stringview types is no longer worth the benefit of supporting them. Simplify the code base by entirely removing the stringview types and related instructions that do not have analogues in the imported strings proposal and do not make sense in the absense of stringviews. Three remaining instructions, `stringview_wtf16.get_codeunit`, `stringview_wtf16.slice`, and `stringview_wtf16.length` take stringview operands in the stringref proposal but cannot be removed because they lower to operations from the imported strings proposal. These instructions are changed to take stringref operands in Binaryen IR, and to allow a graceful upgrade path for users of these instructions, the text and binary parsers still accept but ignore `string.as_wtf16`, which is the instruction used to convert stringrefs to stringviews. The binary writer emits code sequences that use scratch locals and `string.as_wtf16` to keep the output valid. Future PRs will further align binaryen with the imported strings proposal instead of the stringref proposal, for example by making `string` a subtype of `extern` instead of a subtype of `any` and by removing additional instructions that do not have analogues in the imported strings proposal.
*	LocalCSE: Fix regression from #6587 by accumulating generativity (#6591)	Alon Zakai	2024-05-15	2	-5/+69
\| \| \| \| \| \| \| \| \| \| \| \| \|	#6587 was incorrect: It checked generativity early in an incremental manner, but it did not accumulate that information as we do with hashes. As a result we could end up optimizing something with a generative child, and sadly we lacked testing for that case. This adds incremental generativity computation alongside hashes. It also splits out this check from isRelevant. Also add a test for nested effects (as opposed to generativity), but that already worked before this PR (as we compute effects and invalidation as we go, already).
*	LocalCSE: Ignore traps of code in between (#6588)	Alon Zakai	2024-05-14	1	-0/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Given: (ORIGINAL) (in between) (COPY) We want to change that to (local.tee $temp (ORIGINAL)) (in between) (local.get $temp) It is fine if "in between" traps: then we never reach the new local.get. This is a safer situation than most optimizations because we are not reordering anything, only replacing known-equivalent code.
*	LocalCSE: Check effects/generativity early (#6587)	Alon Zakai	2024-05-14	1	-1/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously we checked late, and as a result might end up failing to optimize when a sub-pattern could have worked. E.g. (call (A) ) (call (A) ) The call cannot be optimized, but the A pattern repeats. Before this PR we'd greedily focus on the entire call and then fail. After this PR we skip the call before we commit to which patterns to try to optimize, so we succeed. Add a isShallowlyGenerative helper here as we compute this step by step as we go. Also remove a parameter to the generativity code (it did not use the features it was passed).
*	[StackIR] Run StackIR during binary writing and not as a pass (#6568)	Alon Zakai	2024-05-09	14	-72/+72
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously we had passes --generate-stack-ir, --optimize-stack-ir, --print-stack-ir that could be run like any other passes. After generating StackIR it was stashed on the function and invalidated if we modified BinaryenIR. If it wasn't invalidated then it was used during binary writing. This PR switches things so that we optionally generate, optimize, and print StackIR only during binary writing. It also removes all traces of StackIR from wasm.h - after this, StackIR is a feature of binary writing (and printing) logic only. This is almost NFC, but there are some minor noticeable differences: 1. We no longer print has StackIR in the text format when we see it is there. It will not be there during normal printing, as it is only present during binary writing. (but --print-stack-ir still works as before; as mentioned above it runs during writing). 2. --generate/optimize/print-stack-ir change from being passes to being flags that control that behavior instead. As passes, their order on the commandline mattered, while now it does not, and they only "globally" affect things during writing. 3. The C API changes slightly, as there is no need to pass it an option "optimize" to the StackIR APIs. Whether we optimize is handled by --optimize-stack-ir which is set like other optimization flags on the PassOptions object, so we don't need the old option to those C APIs. The main benefit here is simplifying the code, so we don't need to think about StackIR in more places than just binary writing. That may also allow future improvements to our usage of StackIR.
*	[J2Cl] Make J2clOpts more effective with transitive deps in constant ↵	Roberto Lublinerman	2024-05-09	2	-22/+176
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	intialization (#6571) Constants that need to be hoisted sometimes are initialized by calling getters of other constants that need to be hoisted. These getters are non-trivial, e.g. (func $getConst1_<once>_@X (result (ref null $A)) (block (result (ref null $A)) (if (i32.eqz (ref.is_null (global.get $$const1@X))) (then (return (global.get $$const1@X)) ) ) (global.set $$const1@X (struct.new $A (i32.const 2))) (global.get $$const1@X) ) (func $getConst2_<once>_@X (result (ref null $A)) (block (result (ref null $A)) (if (i32.eqz (ref.is_null (global.get $$const2@X))) (then (return (global.get $$const2@X)) ) ) (global.set $$const2@X .... expression with (call $getConst1_<once>_@X) ....)) (global.get $$const2@X) ) and can only be simplified after the constants they initialize are hoisted. After the constant is hoisted the getter can be inlined and constants that depend on it for their initialization can now be hoisted. Before this pass, inlining would happen after the pass was run by a subsequent run of the inliner (likely as part of -O3), requiring as many runs of this pass, interleaved with the inliner, as the depth in the call sequence. By having a simpler inliner run as part of the loop in this pass, the pass becomes more effective and more independent of the call depths.
*	J2CLOpts: Add "precompute" and "remove-unused-brs" as additional cleanup	Roberto Lublinerman	2024-04-30	1	-0/+52
\| \| \| \|	This makes the cleanup of bodies of functions that have had constants hoisted from them more effective.
*	[Parser] Re-use blocks instead of wrapping where possible (#6552)	Thomas Lively	2024-04-29	2	-24/+20
\| \| \| \| \| \| \|	When the input has branches to block scope, IR builder generally has to add a wrapper block with a label name for the branch to target. To reduce the parsed IR size, add a special case for when the wrapped expression is already an unnamed block. In that case we can simply add the label to the existing block instead of creating a new wrapper block.
*	[Strings] Work around ref.cast not working on string views, and add fuzzing ↵	Alon Zakai	2024-04-29	1	-0/+66
\| \| \| \| \| \| \| \| \| \| \| \| \|	(#6549) As suggested in #6434 (comment) , lower ref.cast of string views to ref.as_non_null in binary writing. It is a simple hack that avoids the problem of V8 not allowing them to be cast. Add fuzzing support for the last three core string operations, after which that problem becomes very frequent. Also add yet another makeTrappingRefUse that was missing in that fuzzer code.
*	[Strings] Fix effects of string.compare and add fuzzing (#6547)	Alon Zakai	2024-04-25	1	-0/+84
\| \| \| \| \| \| \| \|	We added string.compare late in the spec process, and forgot to add effects for it. Unlike string.eq, it can trap. Also use makeTrappingRefUse in recent fuzzer string generation places that I forgot, which should reduce the amount of traps in fuzzer output.
*	[Parser] Enable the new text parser by default (#6371)	Thomas Lively	2024-04-25	43	-679/+287
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new text parser is faster and more standards compliant than the old text parser. Enable it by default in wasm-opt and update the tests to reflect the slightly different results it produces. Besides following the spec, the new parser differs from the old parser in that it: - Does not synthesize `loop` and `try` labels unnecessarily - Synthesizes different block names in some cases - Parses exports in a different order - Parses `nop`s instead of empty blocks for empty control flow arms - Does not support parsing Poppy IR - Produces different error messages - Cannot parse `pop` except as the first instruction inside a `catch`
*	GUFA: Handle bottom types in filterDataContents() (#6545)	Alon Zakai	2024-04-25	1	-0/+38
\| \| \| \| \| \|	Normally a bottom type cannot reach there, as we ignore unreachable GC operations early on. However, we can infer a bottom type later during the flow, so we need to handle that (just not error on it, and for clarity during debugging we also clear the contents).
*	[Strings] Do not reuse mutable globals in StringGathering (#6531)	Alon Zakai	2024-04-24	1	-0/+67
\| \| \| \| \|	We were reusing mutable globals in StringGathering, which meant that we'd use a global to represent a particular string but if it was mutated then it could contain a different string during execution.