forks/binaryen.git -

	Commit message	Author	Age	Files	Lines
...
*	[Emscripten port] Fix core count logic for Emscripten+pthreads (#6350)	Alon Zakai	2024-02-26	1	-3/+5
\| \| \| \|	Before this all Emscripten builds would use 1 core, but it is important to allow pthreads builds there to use more.
*	Implement dropping of active Element Segments (#6343)	Alon Zakai	2024-02-23	1	-10/+17
\| \| \| \|	Also rename the existing droppedSegments to droppedDataSegments for clarity.
*	[Parser] Condense redundant pop values (#6339)	Ashley Nelson	2024-02-22	1	-13/+1
\| \| \|	A bit of clean-up, changes getBranchValue to use pop().
*	Typed continuations: cont.new instructions (#6308)	Frank Emrich	2024-02-22	26	-28/+187
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This PR is part of a series that adds basic support for the [typed continuations/wasmfx proposal](https://github.com/wasmfx/specfx). This particular PR adds support for the `cont.new` instruction for creating continuations, documented [here](https://github.com/wasmfx/specfx/blob/main/proposals/continuations/Overview.md#instructions). In short, the instruction has the form `(cont.new $ct)`, where `$ct` must be a continuation type. It takes a single (nullable) function reference as its argument, so the folded representation of the instruction is of the form `(cont.new $ct (foo ...))`. Support for the instruction is implemented in both the old and the new wat parsers. Note that this PR does not implement validation of the new instruction.
*	Fuzzer: Allow using initial content with V8 (#6327)	Alon Zakai	2024-02-22	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	One problem was that spec testcases had exports with names that are not valid to write in JS as `exports.name`. For example, an export with a `-` in its name would end up as `exports.foo-bar`. Since #6310 that is fixed, as we no longer emit such JS (we use the generic fuzz_shell.js script, which iterates over the keys in exports with `exports[name]`). Also fix a few trivial fuzzer issues that initial content uncovered: ignore a wat file with invalid utf-8; print string literals in the same way from JS as from C++; enable the stringref flag in V8; remove tag imports (the same as we do for global, function, and other imports).
*	Fuzzer: Match the logging of i31ref between JS and C++ (#6335)	Alon Zakai	2024-02-22	1	-21/+35
\| \| \| \| \| \| \| \| \| \| \| \| \|	JS engines print i31ref as just a number, so we need a small regex to standardize the representation (similar to what we do for funcrefs in the code above). On the C++ side, make it actually print the i31ref rather than treat it like a generic reference (for which we only print "object"). To do that we must unwrap an externalized i31 as necessary, and add a case for i31 in the printing logic. Also move that printing logic to its own function, as it was starting to get quite long.
*	[Parser][NFC] Remove `Token` from lexer interface (#6333)	Thomas Lively	2024-02-22	2	-44/+46
\| \| \| \| \| \|	Replace the general `peek` method that returned a `Token` with specific peek methods that look for (but do not consume) specific kinds of tokens. This change is a prerequisite for simplifying the lexer implementation by removing `Token` entirely.
*	[Parser][NFC] Remove parser/input.h (#6332)	Thomas Lively	2024-02-22	6	-111/+31
\| \| \| \|	Remove the layer of abstraction sitting between the parser and the lexer now that the lexer has an interface the parser can use directly.
*	Validator: ArrayNew\|InitData require Bulk Memory (#6331)	Alon Zakai	2024-02-21	1	-0/+8
\| \| \| \| \|	Those instructions refer to a data segment, which means the DataCount section must be emitted before them (so that, per the spec, they can be validated by looking only at previous sections), which implies bulk-memory is needed.
*	Fix build error on aarch64 [NFC] (#6330)	Darren Worrall	2024-02-21	1	-0/+14
\|
*	Improve JSON string encoding (#6328)	Thomas Lively	2024-02-21	1	-69/+103
\| \| \| \| \| \| \| \|	Catch and report all kinds of WTF-8 encoding errors in the source strings, including invalid leading bytes, invalid trailing bytes, unexpected ends of strings, and invalid surrogate sequences. Insert replacement characters into the output as necessary. Add a TODO about minimizing size by escaping only those code points mandated to be escaped by the JSON spec. Generally improve readability of the code.
*	[EH] Add noexn's opcode (#6329)	Heejin Ahn	2024-02-21	1	-2/+2
\| \| \| \|	We had a temporary value 0xff there, but now it is added: https://github.com/WebAssembly/exception-handling/pull/298
*	[NFC] DeNaN: Avoid calls on constants (#6326)	Alon Zakai	2024-02-21	1	-3/+3
\| \| \| \| \|	A constant is either fixed up immediately, or does not need a call. This makes us slightly faster in the fuzzer, but does not change behavior as before those calls all ended up doing nothing (as the numbers were not nans).
*	Fuzzer: Add a pass to prune illegal imports and exports for JS (#6312)	Alon Zakai	2024-02-20	3	-0/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We already have passes to legalize i64 imports and exports, which the fuzzer will run so that we can run wasm files in JS VMs. SIMD and multivalue also pose a problem as they trap on the boundary. In principle we could legalize them as well, but that is substantial effort, so instead just prune them: given a wasm module, remove any imports or exports that use SIMD or multivalue (or anything else that is not legal for JS). Running this in the fuzzer will allow us to not skip running v8 on any testcase we enable SIMD and multivalue for. (Multivalue is allowed in newer VMs, so that part of this PR could be removed eventually.) Also remove the limitation on running v8 with multimemory (v8 now supports that).
*	Fuzzer: Add SIMD support to DeNaN (#6318)	Alon Zakai	2024-02-20	1	-22/+90
\|
*	[NFC] Use SubtypingDiscoverer in StringLowering (#6325)	Alon Zakai	2024-02-20	2	-55/+66
\| \| \| \| \| \| \| \|	This replaces horrible hacks to find which nulls need to switch (from none to noext) with general code using SubtypingDiscoverer. That helper is aware of where each expression is written, so we can find those nulls trivially. This is NFC on existing usage but should fix any remaining bugs with null constants.
*	Fuzzer: Remove --emit-js-shell logic and reuse fuzz_shell.js instead (#6310)	Alon Zakai	2024-02-20	3	-160/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We had two JS files that could run a wasm file for fuzzing purposes: (1) --emit-js-shell, which emitted a custom JS file that runs the wasm, and (2) scripts/fuzz_shell.js, which was a generic file that did the same. Both of those load the wasm, call the exports in order, and log their return values (if any), exceptions, etc. as they go. The fuzzer then compares that output to running the same wasm in another VM, etc. The difference is that one was custom for the wasm file and the other was generic. Aside from that they are similar and duplicated a bunch of code. This PR improves things by removing (1) and using (2) in all places, that is, we now use the generic file everywhere. I believe we added (1) because we thought a generic file can't do all the things we need, like know the order of exports and the types of return values, but in practice there are ways to do those things: the exports are in fact in the proper order (JS order of iteration is deterministic, thankfully), and for the type we don't want to print type internals anyhow, since that would limit fuzzing --closed-world. We do need to be careful with types in JS (see notes in the PR about the type of null) but it's not too bad. As for the types of params, it's fine to pass in null for them all anyhow (null converts to a number or a reference without error).
*	Validate function imports (#6315)	Alon Zakai	2024-02-20	1	-40/+60
\| \| \| \| \| \| \|	We validate functions in parallel, but function-parallel passes do not run on imports, so we did not issue a validation error on an import using a disallowed type, for example. All the changes in visitFunction just group the parts that use body at the end and put them behind a check for body.
*	subtype-exprs.h additions [NFC] (#6323)	Alon Zakai	2024-02-20	1	-8/+31
\| \| \| \| \| \|	This pulls out the subtype-exprs.h parts of #6108. These are NFC in the current codebase, but are fixes for that unlanded PR, and another unrelated PR that will be opened shortly.
*	StringLowering: Escape the JSON in the custom section (#6316)	Alon Zakai	2024-02-20	4	-12/+104
\| \| \| \|	Also add an end-to-end test using node to verify we can parse the escaped content properly using TextDecoder+JSON.parse.
*	JS Bindings: Use stringToUTF8OnStack instead of deprecated allocateUTF8OnStack (#6324)	Alon Zakai	2024-02-20	1	-1/+1
\| \| \| \| \|	This avoids a warning on recent Emscripten.
*	[Parser] Simplify the lexer interface (#6319)	Thomas Lively	2024-02-20	3	-318/+252
\| \| \| \| \| \| \| \| \| \| \|	The lexer was previously an iterator over tokens, but that expressivity is not actually used in the parser. Instead, we have `input.h` that adapts the token iterator interface into an interface that is actually useful. As a first step toward simplifying the lexer implementation to no longer be an iterator over tokens, update its interface by moving the adaptation from input.h to the lexer itself. This requires extensive changes to the lexer unit tests, which will not have to change further when we actually simplify the lexer implementation.
*	SetGlobals: Fix segfault on invalid input (#6321)	Nikolay Khitrin	2024-02-20	1	-1/+1
\|
*	StringLowering: Lower nulls in call params (#6317)	Alon Zakai	2024-02-20	1	-0/+10
\|
*	StringLowering: Properly handle nullable inputs to StringAs (#6307)	Alon Zakai	2024-02-14	1	-1/+11
\| \| \|	StringAs's output must be non-nullable, so add a cast.
*	StringLowering: Fix up nulls written to struct.new fields (#6306)	Alon Zakai	2024-02-14	1	-16/+36
\|
*	Strings: Add some interpreter support (#6304)	Alon Zakai	2024-02-14	2	-4/+57
\| \| \| \| \| \| \|	This adds just enough support to be able to --fuzz-exec a small but realistic fuzz testcase from Java. To that end, just implement the minimal ops we need, which are all related to JS-style strings.
*	[NFC] Avoid a warning on an unused var (#6300)	Alon Zakai	2024-02-14	1	-1/+2
\|
*	StringLowering: Use an array16 type in its own rec group (#6302)	Alon Zakai	2024-02-13	1	-9/+25
\| \| \| \| \| \| \| \| \| \| \| \|	The input module might use an array of 16-bit elements type that is somewhere in a giant rec group, but that is not valid for imported strings: that array type is now on an import and must match the expected ABI, which is to be in its own personal rec group. The old array16 type remains in the module after this transformation, but all uses of it are replaced with uses of the new array16 type. Also move makeImports to after updateTypes: there are no types to update in the new imports. That does not matter but it can make debugging less pleasant, so improve it.
*	Fix --spill-pointers for the stack growing down (#6294)	YAMAMOTO Takashi	2024-02-13	1	-11/+11
\| \| \| \|	The LLVM wasm backend grows the stack downwards, and this pass did not fully account for that before.
*	StringLowering: Hack around if issue with bottom types (#6303)	Alon Zakai	2024-02-13	1	-0/+21
\| \| \| \| \|	Replacing the string heap type with extern is dangerous as they do not share top/bottom types. In practice this works out almost everywhere except for a few ifs, which we can fix up as a hack for now.
*	StringLowering: Modify string=>extern also in public types (#6301)	Alon Zakai	2024-02-13	3	-5/+31
\| \| \| \|	We want to actually remove all stringref appearances, in both public and private types.
*	Precompute: Optimize array.len (#6299)	Alon Zakai	2024-02-12	1	-1/+1
\| \| \|	Arrays have immutable length, so we can optimize them like immutable fields.
*	Fuzzer: Do not emit huge and possibly non-validating tables (#6288)	Alon Zakai	2024-02-12	1	-0/+17
\|
*	[Parser] Parse `resume` (#6295)	Thomas Lively	2024-02-09	4	-11/+97
\|
*	[Parser] Support references to struct fields by name (#6293)	Thomas Lively	2024-02-08	2	-11/+28
\| \| \| \|	Construct a mapping from heap type and field name to field index, then use it while parsing instructions.
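As a rough illustration of the mapping described in #6293 (this is not Binaryen's C++ code), the Python sketch below builds a lookup from (heap type, field name) to field index and uses it to resolve a field reference. The type names, the field names, and the helpers `build_field_index` and `resolve_field` are all hypothetical.

```python
# Hypothetical sketch; names and data layout are assumptions for illustration.
from typing import Dict, List, Optional, Tuple


def build_field_index(
    struct_fields: Dict[str, List[Optional[str]]]
) -> Dict[Tuple[str, str], int]:
    """Map (heap type name, field name) -> field index. Unnamed fields are skipped."""
    index: Dict[Tuple[str, str], int] = {}
    for heap_type, names in struct_fields.items():
        for i, name in enumerate(names):
            if name is not None:
                index[(heap_type, name)] = i
    return index


# Made-up struct definitions: each list holds the field names in order,
# with None standing for a field that has no symbolic name.
types = {"$point": ["$x", "$y"], "$pair": [None, "$second"]}
field_index = build_field_index(types)


def resolve_field(heap_type: str, field: str) -> int:
    """Resolve a field given either as a numeric index or as a symbolic name."""
    if field.isdigit():
        return int(field)
    return field_index[(heap_type, field)]
```

Building the map once, before instructions are parsed, keeps each field lookup to a single dictionary access.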
*	Update lit tests to parse with the new parser (#6290)	Thomas Lively	2024-02-08	1	-1/+1
\| \| \| \| \| \| \| \| \|	Get as many of the lit tests as possible to parse with the new parser, mostly by moving declared module items to be after imports. Also fix a bug in the new parser's pop validation to allow supertypes of the expected type. The two big issues that still prevent some lit tests from working correctly under the new parser are missing support for symbolic field names and missing support for source map annotations.
*	Remove support for legacy stringref text syntax (#6289)	Thomas Lively	2024-02-08	1	-85/+16
\| \| \| \|	Removing support for the legacy syntax will allow us to avoid implementing support for it in the new text parser.
*	[NFC] Add links to specs in StringLowering (#6292)	Alon Zakai	2024-02-08	1	-0/+4
\|
*	Add a pass to propagate global constants to other globals (#6287)	Alon Zakai	2024-02-08	3	-2/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SimplifyGlobals already does this, so this is a subset of that pass, and does not add anything new. It is useful for testing, however. In particular it allows testing that we propagate subsequent globals in a single pass, that is, if one global reads from another and becomes constant, then it can be propagated as well. SimplifyGlobals runs multiple passes so this always worked, but with this pass we can test that we do it efficiently in one pass. This will also be useful for comparing stringref to imported strings, as it allows gathered strings to be propagated to other globals (possible with stringref, but not imported strings) but not anywhere else (which might have downsides as it could lead to more allocations). Also add an additional test for simplify-globals that we do not get confused by an unoptimizable global.get in the middle (see last part).
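The single-pass propagation described in #6287 can be sketched roughly as follows. This is a simplified Python model, not the Binaryen pass: it assumes every global is immutable and that an initializer only reads globals declared before it. The tuple encoding of initializers and the function name `propagate_global_constants` are hypothetical.

```python
# Hypothetical sketch; the data model is an assumption for illustration.
from typing import Dict, List, Tuple

# An initializer is ("const", value), ("get", other_global_name), or any other
# tagged tuple that this sketch treats as unoptimizable (e.g. ("import", None)).
Init = Tuple[str, object]


def propagate_global_constants(
    globals_: List[Tuple[str, Init]]
) -> List[Tuple[str, Init]]:
    """Replace reads of constant globals with the constant, in one forward pass."""
    known: Dict[object, object] = {}  # global name -> constant value
    result: List[Tuple[str, Init]] = []
    for name, init in globals_:
        kind, payload = init
        if kind == "get" and payload in known:
            # The global we read from is already known to be constant.
            init = ("const", known[payload])
        if init[0] == "const":
            # Record the value so later globals can use it in this same pass.
            known[name] = init[1]
        result.append((name, init))
    return result
```

Because each newly constant global is recorded before the next one is visited, a chain such as `a -> b -> c` resolves in one pass, while a read of an unoptimizable global is left as-is.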
*	StringLowering: Lower all remaining important string operations (#6283)	Alon Zakai	2024-02-08	1	-0/+84
\| \| \|	All those in the list from #6271 (comment)
*	[Parser] Do not involve IRBuilder for imported functions (#6286)	Thomas Lively	2024-02-07	4	-13/+14
\| \| \| \| \| \| \| \| \| \|	We previously had a bug where we would begin and end an IRBuilder context for imported functions even though they don't have bodies. For functions that return results, ending this empty scope should have produced an error except that we had another bug where we only produced that error for multivalue functions. We did not previously have imported multivalue functions in wat-kitchen-sink.wast, so both of these bugs went undetected. Fix both bugs and update the test to include an imported multivalue function so that it would have failed without this fix.
*	SimplifyGlobals: Propagate constant globals into nested gets in other globals (#6285)	Alon Zakai	2024-02-07	1	-2/+4
\| \| \| \| \|	Before, we propagated to the top level, but not to anything interior.
*	Get more tests working with the new text parser (#6284)	Thomas Lively	2024-02-07	2	-0/+4
\| \| \| \| \| \| \| \|	The new parser enforces the rule that imports must come before declarations (except for type declarations). The old parser does not enforce this rule, so many of our tests did not follow it. Fix them to follow that rule and fix other invalid syntax. Also add missing finalization of Load expressions in wasm-builder.h that was causing a test to fail under the new parser and guard against an error case in wasm-ir-builder.cpp that used to cause a segfault.
*	[NFC] Move code to string.cpp (#6282)	Thomas Lively	2024-02-06	2	-84/+92
\| \| \| \|	Now that we have a .cpp file, none of the code that was in string.h needs to be in a header any more.
*	StringLowering: Start to lower instructions (#6281)	Alon Zakai	2024-02-06	1	-0/+82
\|
*	Properly stringify names in tests (#6279)	Thomas Lively	2024-02-06	7	-130/+199
\| \| \| \| \| \| \| \| \| \| \| \| \|	Update identifiers used in tests to use a format supported by the new text parser, i.e. either the standard format with its limited set of allowed characters or the non-standard `$"..."` format. Notably, any name containing square or curly braces now uses the string format. Input automatically updated with this script: https://gist.github.com/tlively/4e22311736661849e641d02e521a0748 The printer is updated to properly escape names in more places as well. The logic for escaping names is moved to a common location so that the type printing logic in wasm-type.cpp can use it as well.
*	[Parser] Support string-style identifiers (#6278)	Thomas Lively	2024-02-06	2	-29/+68
\| \| \| \| \| \| \| \| \| \|	In addition to normal identifiers, support parsing identifiers of the format `$"..."`. This format is not yet allowed by the standard, but it is a popular proposed extension (see https://github.com/WebAssembly/spec/issues/617 and https://github.com/WebAssembly/annotations/issues/21). Binaryen has historically allowed a similar format and has supported arbitrary non-standard identifier characters, so it's much easier to support this extended syntax than to fix everything to use the restricted standard syntax.
*	Make `array.new_fixed` length annotations mandatory (#6277)	Thomas Lively	2024-02-06	1	-11/+5
\| \| \| \| \|	They were previously optional to ease the transition to the standard text format, but now we can make them mandatory to match the spec. This will simplify the new text parser as well.
*	[EH] Add --experimental-new-eh option to wasm-opt (#6270)	Heejin Ahn	2024-02-06	1	-2/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds an `--experimental-new-eh` option to `wasm-opt`. The difference between this and `--translate-to-new-eh` is that `--translate-to-new-eh` just runs the `TranslateToNewEH` pass, while `--experimental-new-eh` attaches the `TranslateToNewEH` pass at the end of the whole optimization pipeline. So if no other passes or optimization options (`-On`) are specified, it is equivalent to `--translate-to-new-eh`. If other optimization passes are specified, it runs them and then runs the translator at the end to ensure the new EH instructions are emitted. The reason we are doing it this way is that the optimization pipeline as a whole does not support the new EH instructions yet, but we would like to provide an option to emit reasonably OK code with the new EH instructions. This also means when the optimization level > 3, it will also run the StackIR + local2stack optimization after the translation. Not sure how to test the output of this option, given that there is not much point in testing the default optimization passes, and it is also not clear how to print the stack IR if the stack IR generation and optimization run as part of the pipeline and not via explicit command line options. This is created in favor of #6267, which added the option to `optimization-options.h`. It had the problem of running the translator multiple times when `-On` was given multiple times on the command line, which I learned was a rather common usage. This adds the option directly to `wasm-opt.cpp`, which avoids the problem. With this, it is still possible to create and optimize Stack IR unnecessarily, but that feels like a better alternative.
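The ordering difference between the two flags in #6270 can be modeled with a small Python sketch. This is not `wasm-opt`'s real argument handling: the `build_pipeline` function and the `default-optimizations(...)` placeholder are hypothetical, and only the flag names come from the commit message.

```python
# Hypothetical sketch; this only models the ordering of scheduled passes.
from typing import List


def build_pipeline(args: List[str]) -> List[str]:
    """Return the ordered list of passes a command line would schedule."""
    passes: List[str] = []
    experimental_new_eh = False
    for arg in args:
        if arg == "--experimental-new-eh":
            # Only remember the flag; nothing is scheduled at this position.
            experimental_new_eh = True
        elif arg.startswith("-O"):
            # Placeholder standing in for the default optimization passes.
            passes.append(f"default-optimizations({arg})")
        elif arg.startswith("--"):
            # An explicit pass runs where it appears on the command line.
            passes.append(arg[2:])
    if experimental_new_eh:
        # The translator is attached exactly once, after everything else.
        passes.append("translate-to-new-eh")
    return passes
```

In this model, `build_pipeline(["-O2", "--experimental-new-eh", "-O2"])` schedules the translator once at the end, no matter how many times `-On` appears, whereas `--translate-to-new-eh` runs at its own position in the list.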