forks/binaryen.git -

	Commit message	Author	Age	Files	Lines
...
*	SubtypingDiscoverer: Differentiate non-flow subtyping constraints (#6344)	Alon Zakai	2024-02-27	1	-0/+103
	When we do a local.set of a value into a local then we have both a subtyping constraint - for the value to be valid to put in that local - and also a flow of a value, which can then reach more places. Such flow then interacts with casts in Unsubtyping, since it needs to know what can flow where in order to know how casts force us to keep subtyping relations. That regressed in the not-actually-NFC #6323 in which I added the innocuous lines to add subtyping constraints in ref.eq. It seems fine to require that the arms of a RefEq must be of type eqref, but Unsubtyping then assumed those arms flowed into a location of type eqref, which means casts might force us to not optimize some things. To fix this, differentiate the rare case of non-flowing subtyping constraints, which is basically only RefEq. There are perhaps a few more cases (like i31 operations) but they do not matter in practice for Unsubtyping anyhow; I suggest we land this first to undo the regression and then at our leisure investigate the other instructions.
*	[StringLowering] Lower `stringview_wtf16.get_codeunit` to `charCodeAt` (#6353)	Thomas Lively	2024-02-26	2	-4/+4
	Previously we lowered this to `getCodePointAt`, which has different semantics around surrogate pairs.
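	The surrogate-pair difference can be illustrated outside of Binaryen. The following is a minimal sketch that emulates, in Python, what the JS string methods `charCodeAt` and `codePointAt` return for a given index; the helper names are hypothetical and are not part of Binaryen or of the commit.

```python
import struct

def utf16_units(s):
    """Return the UTF-16 code units of s, as a JS string would store them."""
    data = s.encode("utf-16-le", "surrogatepass")
    return list(struct.unpack(f"<{len(data) // 2}H", data))

def char_code_at(s, i):
    """Emulate JS charCodeAt: always a single 16-bit code unit."""
    return utf16_units(s)[i]

def code_point_at(s, i):
    """Emulate JS codePointAt: combine a valid surrogate pair starting at i."""
    units = utf16_units(s)
    hi = units[i]
    if 0xD800 <= hi <= 0xDBFF and i + 1 < len(units):
        lo = units[i + 1]
        if 0xDC00 <= lo <= 0xDFFF:
            return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)
    return hi
```

	For a character outside the Basic Multilingual Plane the two functions disagree at the index of the high surrogate, which is why reading a single code unit has to map to `charCodeAt`.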
*	[Parser] Parse annotations, including source map comments (#6345)	Thomas Lively	2024-02-26	1	-0/+66
	Parse annotations using the standards-track `(@annotation ...)` format as well as the `;;@ source-map:0:1` format. Have the lexer implicitly collect annotations while it skips whitespace and add lexer APIs to access the annotations since the last token was parsed. Collect annotations before parsing each instruction and pass the annotations explicitly to the parser and parser context functions for instructions. Add an API to `IRBuilder` to set a debug location to be attached to the next visited or created instruction and use it from the parser.
*	Implement dropping of active Element Segments (#6343)	Alon Zakai	2024-02-23	1	-0/+104
	Also rename the existing droppedSegments to droppedDataSegments for clarity.
*	Typed continuations: cont.new instructions (#6308)	Frank Emrich	2024-02-22	2	-10/+91
	This PR is part of a series that adds basic support for the [typed continuations/wasmfx proposal](https://github.com/wasmfx/specfx). This particular PR adds support for the `cont.new` instruction for creating continuations, documented [here](https://github.com/wasmfx/specfx/blob/main/proposals/continuations/Overview.md#instructions). In short, these instructions are of the form `(cont.new $ct)` where `$ct` must be a continuation type. The instruction takes a single (nullable) function reference as its argument, which means that the folded representation of the instruction is of the form `(cont.new $ct (foo ...))`. Support for the instruction is implemented in both the old and the new wat parser. Note that this PR does not implement validation of the new instruction.
*	Fuzzer: Match the logging of i31ref between JS and C++ (#6335)	Alon Zakai	2024-02-22	1	-0/+29
	JS engines print i31ref as just a number, so we need a small regex to standardize the representation (similar to what we do for funcrefs in the code above). On the C++ side, make it actually print the i31ref rather than treat it like a generic reference (for which we only print "object"). To do that we must unwrap an externalized i31 as necessary, and add a case for i31 in the printing logic. Also move that printing logic to its own function, as it was starting to get quite long.
*	Validator: ArrayNew\|InitData require Bulk Memory (#6331)	Alon Zakai	2024-02-21	2	-0/+52
	Those instructions refer to a data segment, which means the DataCount section must be emitted before them (so that, per the spec, they can be validated by looking only at previous sections), which implies bulk-memory is needed.
*	Improve JSON string encoding (#6328)	Thomas Lively	2024-02-21	1	-5/+16
	Catch and report all kinds of WTF-8 encoding errors in the source strings, including invalid leading bytes, invalid trailing bytes, unexpected ends of strings, and invalid surrogate sequences. Insert replacement characters into the output as necessary. Add a TODO about minimizing size by escaping only those code points mandated to be escaped by the JSON spec. Generally improve readability of the code.
*	[NFC] DeNaN: Avoid calls on constants (#6326)	Alon Zakai	2024-02-21	2	-4/+37
	A constant is either fixed up immediately, or does not need a call. This makes us slightly faster in the fuzzer, but does not change behavior, as previously those calls all ended up doing nothing (the numbers were not NaNs).
*	Fuzzer: Add a pass to prune illegal imports and exports for JS (#6312)	Alon Zakai	2024-02-20	3	-0/+218
	We already have passes to legalize i64 imports and exports, which the fuzzer will run so that we can run wasm files in JS VMs. SIMD and multivalue also pose a problem as they trap on the boundary. In principle we could legalize them as well, but that is substantial effort, so instead just prune them: given a wasm module, remove any imports or exports that use SIMD or multivalue (or anything else that is not legal for JS). Running this in the fuzzer will allow us to not skip running v8 on any testcase we enable SIMD and multivalue for. (Multivalue is allowed in newer VMs, so that part of this PR could be removed eventually.) Also remove the limitation on running v8 with multimemory (v8 now supports that).
*	Fuzzer: Add SIMD support to DeNaN (#6318)	Alon Zakai	2024-02-20	1	-0/+131
*	[NFC] Use SubtypingDiscoverer in StringLowering (#6325)	Alon Zakai	2024-02-20	1	-0/+4
	This replaces horrible hacks to find which nulls need to switch (from none to noext) with general code using SubtypingDiscoverer. That helper is aware of where each expression is written, so we can find those nulls trivially. This is NFC on existing usage but should fix any remaining bugs with null constants.
*	Fuzzer: Remove --emit-js-shell logic and reuse fuzz_shell.js instead (#6310)	Alon Zakai	2024-02-20	8	-211/+7
	We had two JS files that could run a wasm file for fuzzing purposes:
	* --emit-js-shell, which emitted a custom JS file that runs the wasm.
	* scripts/fuzz_shell.js, which was a generic file that did the same.
	Both of those load the wasm, call the exports in order, and log their return values (if any), exceptions, etc. as they go. Then the fuzzer compares that output to running the same wasm in another VM, etc. The difference is that one was custom for the wasm file, and one was generic. Aside from that they are similar and duplicated a bunch of code.
	This PR improves things by removing the first and using the second in all places, that is, we now use the generic file everywhere.
	I believe we added the first because we thought a generic file can't do all the things we need, like know the order of exports and the types of return values, but in practice there are ways to do those things: The exports are in fact in the proper order (JS order of iteration is deterministic, thankfully), and for the type we don't want to print type internals anyhow since that would limit fuzzing --closed-world. We do need to be careful with types in JS (see notes in the PR about the type of null) but it's not too bad. As for the types of params, it's fine to pass in null for them all anyhow (null converts to a number or a reference without error).
*	Validate function imports (#6315)	Alon Zakai	2024-02-20	1	-0/+9
	We validate functions in parallel, but function-parallel passes do not run on imports, so we did not issue a validation error on an import using a disallowed type, for example. All the changes in visitFunction are just to group all the parts using body at the end, and put them behind a check for body.
*	StringLowering: Escape the JSON in the custom section (#6316)	Alon Zakai	2024-02-20	2	-4/+36
	Also add an end-to-end test using node to verify we can parse the escaped content properly using TextDecoder+JSON.parse.
*	[Parser] Simplify the lexer interface (#6319)	Thomas Lively	2024-02-20	1	-1432/+831
	The lexer was previously an iterator over tokens, but that expressivity is not actually used in the parser. Instead, we have `input.h` that adapts the token iterator interface into an interface that is actually useful. As a first step toward simplifying the lexer implementation to no longer be an iterator over tokens, update its interface by moving the adaptation from input.h to the lexer itself. This requires extensive changes to the lexer unit tests, which will not have to change further when we actually simplify the lexer implementation.
*	StringLowering: Lower nulls in call params (#6317)	Alon Zakai	2024-02-20	1	-30/+44
*	StringLowering: Properly handle nullable inputs to StringAs (#6307)	Alon Zakai	2024-02-14	1	-5/+36
	StringAs's output must be non-nullable, so add a cast.
*	StringLowering: Fix up nulls written to struct.new fields (#6306)	Alon Zakai	2024-02-14	1	-30/+75
*	Strings: Add some interpreter support (#6304)	Alon Zakai	2024-02-14	1	-0/+31
	This adds just enough support to be able to --fuzz-exec a small but realistic fuzz testcase from Java. To that end, just implement the minimal ops we need, which are all related to JS-style strings.
*	StringLowering: Use an array16 type in its own rec group (#6302)	Alon Zakai	2024-02-13	1	-43/+118
	The input module might use an array of 16-bit elements type that is somewhere in a giant rec group, but that is not valid for imported strings: that array type is now on an import and must match the expected ABI, which is to be in its own personal rec group. The old array16 type remains in the module after this transformation, but all uses of it are replaced with uses of the new array16 type. Also move makeImports to after updateTypes: there are no types to update in the new imports. That does not matter but it can make debugging less pleasant, so improve it.
*	Fix --spill-pointers for the stack growing down (#6294)	YAMAMOTO Takashi	2024-02-13	1	-190/+266
	The LLVM wasm backend grows the stack downwards, and this pass did not fully account for that before.
*	StringLowering: Hack around if issue with bottom types (#6303)	Alon Zakai	2024-02-13	1	-28/+77
	Replacing the string heap type with extern is dangerous as they do not share top/bottom types. In practice this works out almost everywhere except for a few ifs, which we can fix up as a hack for now.
*	StringLowering: Modify string=>extern also in public types (#6301)	Alon Zakai	2024-02-13	1	-31/+73
	We want to actually remove all stringref appearances, in both public and private types.
*	Precompute: Optimize array.len (#6299)	Alon Zakai	2024-02-12	3	-21/+103
	Arrays have immutable length, so we can optimize them like immutable fields.
*	[Parser] Parse `resume` (#6295)	Thomas Lively	2024-02-09	1	-142/+202
*	[Parser] Support references to struct fields by name (#6293)	Thomas Lively	2024-02-08	1	-88/+114
	Construct a mapping from heap type and field name to field index, then use it while parsing instructions.
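	The mapping described above can be sketched as a dictionary keyed on a (heap type, field name) pair. This is only an illustration of the idea in Python, not the parser's actual C++ data structure; the function name and the shape of the `types` input are assumptions made for this example.

```python
def build_field_index(types):
    """Map (type name, field name) -> field index.

    `types` is a hypothetical model of the module's struct types: a dict
    from type name to an ordered list of field names, where None stands
    for a field that has no symbolic name.
    """
    index = {}
    for type_name, fields in types.items():
        for i, field in enumerate(fields):
            if field is not None:
                index[(type_name, field)] = i
    return index

# Example lookup, as a parser might do for a named field operand.
types = {"$pair": ["$first", "$second"], "$node": [None, "$next"]}
field_index = build_field_index(types)
```

	Keying on the pair rather than on the field name alone matters because the same field name can appear at different indices in different struct types.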
*	Update lit tests to parse with the new parser (#6290)	Thomas Lively	2024-02-08	38	-84/+128
	Get as many of the lit tests as possible to parse with the new parser, mostly by moving declared module items to be after imports. Also fix a bug in the new parser's pop validation to allow supertypes of the expected type. The two big issues that still prevent some lit tests from working correctly under the new parser are missing support for symbolic field names and missing support for source map annotations.
*	Remove support for legacy stringref text syntax (#6289)	Thomas Lively	2024-02-08	1	-189/+9
	Removing support for the legacy syntax will allow us to avoid implementing support for it in the new text parser.
*	Add package.json for unit tests (#6245)	Alon Zakai	2024-02-08	2	-0/+4
	The JS there is not an ES6 module, so declare it so (otherwise a package.json in a parent, perhaps in folders outside of our own project that we are pasted in, can cause an error, as require does not work in ES6 modules and we might be forced to be seen as one). Fixes #6240
*	Add a pass to propagate global constants to other globals (#6287)	Alon Zakai	2024-02-08	4	-2/+119
	SimplifyGlobals already does this, so this is a subset of that pass, and does not add anything new. It is useful for testing, however. In particular it allows testing that we propagate subsequent globals in a single pass, that is if one global reads from another and becomes constant, then it can be propagated as well. SimplifyGlobals runs multiple passes so this always worked, but with this pass we can test that we do it efficiently in one pass. This will also be useful for comparing stringref to imported strings, as it allows gathered strings to be propagated to other globals (possible with stringref, but not imported strings) but not anywhere else (which might have downsides as it could lead to more allocations). Also add an additional test for simplify-globals that we do not get confused by an unoptimizable global.get in the middle (see last part).
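	The single-pass property described above (a global that becomes constant can immediately feed later globals) can be sketched on a toy model. The Python below is an illustration under stated assumptions, not Binaryen's implementation: globals are modeled as an ordered list of `(name, init)` pairs, every global is assumed immutable, and `init` is either `("const", value)` or `("global.get", other_name)`.

```python
def propagate_global_constants(globals_):
    """Propagate constants between globals in one forward pass.

    Hypothetical model: all globals are immutable and appear in
    declaration order, so a global can only read from earlier ones.
    """
    known = {}   # name -> constant value discovered so far
    result = []
    for name, init in globals_:
        kind, payload = init
        if kind == "global.get" and payload in known:
            # The global we read from is already known to be constant.
            init = ("const", known[payload])
        if init[0] == "const":
            # Record it so later globals can use it within this same pass.
            known[name] = init[1]
        result.append((name, init))
    return result
```

	Because `known` is updated as soon as a global becomes constant, a chain such as `$c` reading `$b` reading `$a` resolves in one iteration over the list, while a read of something unknown (for example an import) is left untouched.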
*	StringLowering: Lower all remaining important string operations (#6283)	Alon Zakai	2024-02-08	2	-31/+183
	All those in the list from #6271 (comment)
*	[Parser] Do not involve IRBuilder for imported functions (#6286)	Thomas Lively	2024-02-07	1	-2/+2
	We previously had a bug where we would begin and end an IRBuilder context for imported functions even though they don't have bodies. For functions that return results, ending this empty scope should have produced an error except that we had another bug where we only produced that error for multivalue functions. We did not previously have imported multivalue functions in wat-kitchen-sink.wast, so both of these bugs went undetected. Fix both bugs and update the test to include an imported multivalue function so that it would have failed without this fix.
*	SimplifyGlobals: Propagate constant globals into nested gets in other globals (#6285)	Alon Zakai	2024-02-07	1	-0/+27
	Before we propagated to the top level, but not to anything interior.
*	Get more tests working with the new text parser (#6284)	Thomas Lively	2024-02-07	27	-57/+61
	The new parser enforces the rule that imports must come before declarations (except for type declarations). The old parser does not enforce this rule, so many of our tests did not follow it. Fix them to follow that rule and fix other invalid syntax. Also add missing finalization of Load expressions in wasm-builder.h that was causing a test to fail under the new parser and guard against an error case in wasm-ir-builder.cpp that used to cause a segfault.
*	StringLowering: Start to lower instructions (#6281)	Alon Zakai	2024-02-06	2	-0/+115
*	Properly stringify names in tests (#6279)	Thomas Lively	2024-02-06	12	-183/+213
	Update identifiers used in tests to use a format supported by the new text parser, i.e. either the standard format with its limited set of allowed characters or the non-standard `$"..."` format. Notably, any name containing square or curly braces now uses the string format. Input automatically updated with this script: https://gist.github.com/tlively/4e22311736661849e641d02e521a0748 The printer is updated to properly escape names in more places as well. The logic for escaping names is moved to a common location so that the type printing logic in wasm-type.cpp can use it as well.
*	[Parser] Support string-style identifiers (#6278)	Thomas Lively	2024-02-06	2	-3/+35
	In addition to normal identifiers, support parsing identifiers of the format `$"..."`. This format is not yet allowed by the standard, but it is a popular proposed extension (see https://github.com/WebAssembly/spec/issues/617 and https://github.com/WebAssembly/annotations/issues/21). Binaryen has historically allowed a similar format and has supported arbitrary non-standard identifier characters, so it's much easier to support this extended syntax than to fix everything to use the restricted standard syntax.
*	Make `array.new_fixed` length annotations mandatory (#6277)	Thomas Lively	2024-02-06	1	-36/+0
	They were previously optional to ease the transition to the standard text format, but now we can make them mandatory to match the spec. This will simplify the new text parser as well.
*	[EH] Add --experimental-new-eh option to wasm-opt (#6270)	Heejin Ahn	2024-02-06	2	-0/+37
	This adds `--experimental-new-eh` option to `wasm-opt`. The difference between this and `--translate-to-new-eh` is, `--translate-to-new-eh` just runs `TranslateToNewEH` pass, while `--experimental-new-eh` attaches `TranslateToNewEH` pass at the end of the whole optimization pipeline. So if no other passes or optimization options (`-On`) are specified, it is equivalent to `--translate-to-new-eh`. If other optimization passes are specified, it runs them and at the end runs the translator to ensure the new EH instructions are emitted. The reason we are doing this this way is that the optimization pipeline as a whole does not support the new EH instructions yet, but we would like to provide an option to emit reasonably OK code with the new EH instructions.
	This also means when the optimization level > 3, it will also run the StackIR + local2stack optimization after the translation.
	Not sure how to test the output of this option, given that there is not much point in testing the default optimization passes, and it is also not clear how to print the stack IR if the stack ir generation and optimization runs as a part of the pipeline and not the explicit command line options.
	This is created in favor of #6267, which added the option to `optimization-options.h`. It had a problem of running the translator multiple times when `-On` was given multiple times in the command line, which I learned was rather a common usage. This adds the option directly to `wasm-opt.cpp`, which avoids the problem. With this, it is still possible to create and optimize Stack IR unnecessarily, but that feels a better alternative.
*	StringLowering pass (#6271)	Alon Zakai	2024-02-05	4	-0/+75
	This extends StringGathering by replacing the gathered string globals with imported globals. It adds a custom section with the strings that the imports are expected to provide. It also replaces the string type with extern. This is a complete lowering of strings, except for string operations that are a TODO. After running this, no strings remain in the wasm, and the outside JS is expected to provide the proper imports, which it can do by processing the JSON of the strings in the custom section "string.consts", which looks like
	    ["foo", "bar", ..]
	That is, an array of strings, which are imported as
	    (import "string.const" "0" (global $string.const_foo (ref extern))) ;; foo
	    (import "string.const" "1" (global $string.const_bar (ref extern))) ;; bar
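	How the outside code might turn that JSON into imports can be sketched as follows. This is a hedged illustration in Python rather than the JS a real embedder would use; the function name is hypothetical, and it assumes the custom section payload has already been extracted as UTF-8 bytes. Only the module name "string.const" and the index-based field names come from the description above.

```python
import json

def build_string_imports(section_payload):
    """Build an imports mapping from the "string.consts" section payload.

    Assumption: `section_payload` is the raw bytes of the custom section
    body, containing a JSON array of strings encoded as UTF-8.
    """
    strings = json.loads(section_payload.decode("utf-8"))
    # Each string is imported under module "string.const", with its
    # array index (as a decimal string) as the field name.
    return {"string.const": {str(i): s for i, s in enumerate(strings)}}

imports = build_string_imports(b'["foo", "bar"]')
```

	Here `imports["string.const"]["1"]` is the value that would back the global shown as `$string.const_bar`.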
*	wasm-ctor-eval: Properly eval strings (#6276)	Alon Zakai	2024-02-05	1	-4/+3
	#6244 tried to do this but was not quite right. It treated a string like an array or a struct, which means create a global for it. But just creating a global isn't enough, as it needs to also be sorted in the right place etc. which requires changes in other places. But there is a much simpler solution here: string constants are just constants, which we can emit in-line, so do that.
*	[Parser] Parse v128.const (#6275)	Thomas Lively	2024-02-05	1	-2/+37
*	[Parser] Templatize lexing of integers (#6272)	Thomas Lively	2024-02-05	1	-166/+166
	Have a single implementation for lexing each of unsigned, signed, and uninterpreted integers, each generic over the bit width of the integer. This reduces duplication in the existing code and it will make it much easier to support lexing more 8- and 16-bit integers.
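	The idea of one routine per signedness, parameterized by bit width, can be sketched as below. This is a simplified Python illustration, not the C++ templates from the commit: the function name and the `kind` strings are assumptions, and underscore handling is deliberately looser than a real lexer's (all underscores are simply dropped).

```python
import string

def lex_int(text, bits, kind):
    """Parse an integer literal for a given bit width.

    kind is "unsigned", "signed", or "uninterpreted" (hypothetical names).
    Returns the value, or None if the text is malformed or out of range.
    """
    body = text
    has_sign = body[:1] in ("+", "-")
    negative = body[:1] == "-"
    if has_sign:
        body = body[1:]
    if body[:2] == "0x":
        base, allowed, body = 16, string.hexdigits, body[2:]
    else:
        base, allowed = 10, string.digits
    body = body.replace("_", "")  # simplification: ignore digit separators
    if not body or any(c not in allowed for c in body):
        return None
    value = -int(body, base) if negative else int(body, base)
    if kind == "unsigned":
        ok = not has_sign and 0 <= value < 2**bits
    elif kind == "signed":
        ok = -(2 ** (bits - 1)) <= value < 2 ** (bits - 1)
    elif kind == "uninterpreted":
        # Accept either the signed or the unsigned range, then wrap.
        ok = -(2 ** (bits - 1)) <= value < 2**bits
        value %= 2**bits
    else:
        raise ValueError(f"unknown kind: {kind}")
    return value if ok else None
```

	With the width as a parameter, supporting 8- and 16-bit literals needs no new code, for example `lex_int("-1", 8, "uninterpreted")` and `lex_int("0xFFFF", 16, "unsigned")` go through the same path as 32- and 64-bit literals.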
*	MemoryPacking: Handle non-empty trapping segments (#6261)	Alon Zakai	2024-02-01	1	-0/+85
	Followup to #6243 which handled empty ones.
*	JSON: Add simple printing and creation (#6265)	Alon Zakai	2024-02-01	2	-0/+17
*	C API: Use segment names (#6254)	ericvergnaud	2024-02-01	16	-52/+81
	Move from segment indexes to names. This is a breaking change to make the API more capable and consistent. An effort has been made to reduce the burden on C API users where possible (specifically, you can avoid providing names and let Binaryen make them for you, which will basically be numbers that match the indexes from before). Fixes #6247
*	GUFA: Propagate string literals (#6262)	Alon Zakai	2024-02-01	1	-0/+59
	We only noted the type but not the literal value.
*	[EH] Test StackIR's local2stack on translator output (#6264)	Heejin Ahn	2024-01-31	1	-0/+649
	This adds `STACKIR-OPT` filecheck lines to `translate-to-new-eh.wast` to see if StackIR's `local2stack` optimization successfully removes some of the unnecessary `local.set`/`local.get`s. While supporting the whole Binaryen optimization pipeline for the new EH instructions is not the goal for the very near-term future, StackIR's `local2stack` optimization can help with a very common pattern generated by this translator, which is:
	```wast
	(try $l
	  (do
	    ...
	  )
	  (catch_all
	    (call $destructor)
	    (rethrow $l)
	  )
	)
	```
	is translated to
	```wast
	(block $outer
	  (local.set $exn ;; can be optimized away
	    (block $catch_all (result exnref)
	      (try_table (catch_all_ref $catch_all)
	        ...
	      )
	      (br $outer)
	    )
	  )
	  (call $destructor)
	  (throw_ref
	    (local.get $exn) ;; can be optimized away
	  )
	)
	```
	Here we don't really need `local.set $exn` and `local.get $exn`, and these can be optimized away using StackIR's local2stack. After optimizing them away in Stack IR, the code can be like
	```wast
	block $outer
	  block $catch_all (result exnref)
	    try_table (catch_all_ref $catch_all)
	      ...
	    end
	    br $outer
	  end
	  call $destructor
	  throw_ref
	end
	```
	This optimization alone significantly reduces the code size increase caused by the translation. For Adobe Photoshop, the code size increase goes down from 4.2% to 2.8%, and for Binaryen, it goes down from 3.8% to 2.0%.
*	Revert "Stop propagating/inlining string constants (#6234)" (#6258)	Alon Zakai	2024-01-31	1	-5/+3
	This reverts commit 9090ce56fcc67e15005aeedc59c6bc6773220f11. This has the effect of once more propagating string constants from globals to other places (and from non-globals too), which is useful for various optimizations even if it isn't useful in the final output. To fix the final output problem, #6257 added a pass that is run at the end to collect string.const to globals, which allows us to once more propagate strings in the optimizer, now without a downside.