path: root/test
Commit message | Author | Age | Files | Lines
* Prototype SIMD instructions implemented in LLVM (#3440) | Thomas Lively | 2020-12-11 | 5 | -180/+440
  - i64x2.eq (https://github.com/WebAssembly/simd/pull/381)
  - i64x2 widens (https://github.com/WebAssembly/simd/pull/290)
  - i64x2.bitmask (https://github.com/WebAssembly/simd/pull/368)
  - signselect ops (https://github.com/WebAssembly/simd/pull/124)
* [GC] Fix Array optimization issues (#3438) | Alon Zakai | 2020-12-11 | 4 | -89/+179
  Precompute still tried to precompute a reference because the check was not in the topmost place.
  Also we truncated i8/i16 values, but did not extend them properly. That was also an issue with structs.
  The new test replaces the old one by moving from -O1 to -Oz (which runs more opts, and would have noticed this earlier), and adds array operations too, including sign extends.
* [GC] Add ref.test and ref.cast (#3439) | Alon Zakai | 2020-12-11 | 4 | -0/+53
  This adds enough to read and write them and test that, but leaves interpreter support for later.
* TypeBuilder (#3418) | Thomas Lively | 2020-12-10 | 2 | -0/+166
  Introduce TypeBuilder, a utility for constructing heap types in terms of other heap types that may not yet have been defined. Internally, it works by creating HeapTypes backed by mutable HeapTypeInfos owned by the TypeBuilder. That lets the TypeBuilder create temporary Types that can refer to the TypeBuilder-managed HeapTypes. Those temporary Types can in turn be used to initialize the very HeapTypes they refer to. Since the TypeBuilder-managed HeapTypes are only valid for the lifetime of their TypeBuilder, there is a canonicalization step that converts them into globally interned canonical HeapTypes.
  This PR allows HeapTypes to be built in terms of as-yet-undefined HeapTypes, but it currently errors out in the presence of recursive types. Supporting recursive types will require further work to canonicalize them into finite, acyclic representations. Currently any attempt to compare, print, or otherwise manipulate recursive types would infinitely recurse.
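The two-phase idea described above (define types against placeholders, then canonicalize) can be sketched in a few lines. This is only an illustration in Python, not Binaryen's C++ API: the class name is reused for readability, and `temp_ref`, `set_struct`, and `build` are hypothetical method names.

```python
class TypeBuilder:
    """Toy builder: slots may refer to other slots before they are defined."""

    def __init__(self, size):
        self.defs = [None] * size  # one mutable definition per slot

    def temp_ref(self, index):
        # Temporary reference to a slot that may not be defined yet.
        return ("ref", index)

    def set_struct(self, index, fields):
        self.defs[index] = ("struct", tuple(fields))

    def build(self):
        """Resolve temporary references into plain nested tuples."""
        done = {}
        in_progress = set()

        def canon(index):
            if index in done:
                return done[index]
            if index in in_progress:
                # Mirrors the limitation in the commit message.
                raise ValueError("recursive types are not supported")
            in_progress.add(index)
            kind, fields = self.defs[index]
            resolved = tuple(
                ("ref", canon(f[1])) if isinstance(f, tuple) and f[0] == "ref" else f
                for f in fields
            )
            in_progress.discard(index)
            done[index] = (kind, resolved)
            return done[index]

        return [canon(i) for i in range(len(self.defs))]


builder = TypeBuilder(2)
builder.set_struct(0, ["i32", builder.temp_ref(1)])  # slot 1 is not defined yet
builder.set_struct(1, ["i64"])
types = builder.build()
```

After `build()`, slot 0 no longer mentions an index: it contains the resolved definition of slot 1. A self-referential slot raises instead of recursing forever.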
* [GC] Add Array operations (#3436) | Alon Zakai | 2020-12-10 | 4 | -30/+254
  array.new/get/set/len - pretty straightforward after structs and all the infrastructure for them.
  Also fixes validation of the unnecessary heapType param in the text and binary formats in structs as well as arrays.
  Fixes printing of packed types in type names, which emitted i32 for them. That broke when we emitted the same name for an array of i8 and i32 as in the new testing here.
  Also fix a bug in Field::operator< which was wrong for packed types; again, this was easy to notice with the new testing.
* Read and write data segments names in names section (#3435) | Sam Clegg | 2020-12-09 | 4 | -0/+26
* Improve lit support (#3426) | Sam Clegg | 2020-12-09 | 2 | -7/+7
  This uses the same technique used in llvm-lit to enable running in-tree tests with out-of-tree builds.
  So you can run something like this:
      ../binaryen-out/bin/binaryen-lit test/lit/
* [GC] Add struct.new and start to test interesting execution (#3433) | Alon Zakai | 2020-12-09 | 6 | -0/+141
  With struct.new read/write support, we can start to do interesting things! This adds a test of creating a struct and seeing that references behave like references, that is, if we write to the value X refers to, and if Y refers to the same thing, when reading from Y's value we see the change as well.
  The test is run through all of -O1, which uncovered a minor issue in Precompute: We can't try to precompute a reference type, as we can't replace a reference with a value.
  Note btw that the test shows the optimizer properly running CoalesceLocals on reference types, merging two locals.
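The property that test checks, references aliasing one underlying struct, looks like this in miniature. The sketch is Python rather than wasm, and `Struct` is a hypothetical stand-in for a GC struct instance, not a Binaryen class.

```python
class Struct:
    """Hypothetical stand-in for a GC struct: a fixed list of field values."""

    def __init__(self, *fields):
        self.fields = list(fields)


x = Struct(10, 20)      # like struct.new
y = x                   # a second reference to the same struct, not a copy
x.fields[0] = 42        # like struct.set through x
seen_through_y = y.fields[0]  # like struct.get through y

copy = Struct(*x.fields)      # a by-value copy, for contrast
x.fields[1] = 99
```

A write through `x` is visible through `y`, while `copy` keeps the old value. That is why an optimizer cannot replace a reference with a precomputed value.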
* [GC] Add basic RTT support (#3432) | Alon Zakai | 2020-12-08 | 6 | -26/+67
  This adds rtt.canon and rtt.sub together with RTT type support that is necessary for them. Together this lets us test roundtripping the instructions and types.
  Also fixes a missing traversal over globals in collectHeapTypes, which the example from the GC docs requires, as the RTTs are in globals there.
  This does not yet add full interpreter support and other things. It disables initial contents on GC in the fuzzer, to avoid the fuzzer breaking.
  Renames the binary ID for exnref, which is being removed from the spec, and which overlaps with the binary ID for rtt.
* Intern HeapTypes and clean up types code (#3428) | Thomas Lively | 2020-12-07 | 20 | -71/+71
  Interns HeapTypes using the same patterns and utilities already used to intern Types. This allows HeapTypes to efficiently be compared for equality and hashed, which may be important for very large struct types in the future. This change also has the benefit of increasing symmetry between the APIs of Type and HeapType, which will make the developer experience more consistent. Finally, this change will make TypeBuilder (#3418) much simpler because it will no longer have to introduce TypeInfo variants to refer to HeapTypes indirectly.
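Interning in general means keeping one canonical object per distinct structure, so that equality and hashing reduce to identity. A minimal Python sketch of the technique follows. `InternTable` and `make_struct` are hypothetical names, and this is not how Binaryen's C++ implementation is written.

```python
class InternTable:
    """Maps every structurally equal value to one canonical object."""

    def __init__(self):
        self._canonical = {}

    def intern(self, structure):
        # setdefault returns the stored object if an equal one already exists.
        return self._canonical.setdefault(structure, structure)


def make_struct():
    # Built at runtime so that each call yields a distinct tuple object.
    return ("struct", tuple(["i32", "i64"]))


table = InternTable()
a = table.intern(make_struct())
b = table.intern(make_struct())
same_object = a is b  # identity check stands in for deep comparison
```

Once values are interned, comparing two large struct types costs one pointer comparison, whatever their size.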
* [GC] Add struct.set (#3430) | Alon Zakai | 2020-12-07 | 4 | -0/+16
  Mostly straightforward after struct.get.
  This renames the value field in struct.get to ref. I think this makes more sense because struct.set has both a reference to a thing, and a value to set onto that thing. So calling the former ref seems more consistent, giving us ref, value. This mirrors load/store for example where we use ptr, value, and ref is playing the role of ptr here basically.
* [GC] Add struct.get instruction parsing and execution (#3429) | Alon Zakai | 2020-12-07 | 14 | -38/+320
  This is the first instruction that uses a GC Struct or Array, so it's where we start to actually need support in the interpreter for those values, which is added here.
  GC data is modeled as a gcData field on a Literal, which is just a Literals. That is, both a struct and an array are represented as an array of values. The type which is alongside would indicate if it's a struct or an array. Note that the data is referred to using a shared_ptr so it should "just work", but we'll only be able to really test that once we add struct.new and so can verify that references are by reference and not value, etc.
  As the first instruction to care about i8/16 types (which are only possible in a Struct or Array) this adds support for parsing and emitting them.
  This PR includes fuzz fixes for some minor things the fuzzer found, including some bad printing of not having ResultTypeName in necessary places (found by the text format roundtripping fuzzer).
* Make NUM_PARAMS in FuncCastEmulation a configuration option (#3411) | Dexter Chua | 2020-12-07 | 2 | -0/+57
  Compiling scipy requires a `NUM_PARAMS` of at least 61 (!). https://github.com/iodide-project/pyodide patches emsdk in order to compile it, a patch this PR makes unnecessary.
* [GC] Support reading and writing of Struct and Array types (#3423) | Alon Zakai | 2020-12-05 | 8 | -22/+103
  This adds support in the text and binary format handling, which allows us to have a full test of reading and writing the types.
  This also adds a "name" field to struct fields, which the text format supports.
* wasm-emscripten-finalize: Support PIC + threads (#3427) | Sam Clegg | 2020-12-04 | 1 | -0/+38
  With PIC + threads the offset of passive segments is not constant but relative to `__memory_base`. When trying to find passive segment offset based on the target of the `memory.init` instruction we need to consider this possibility as well as the regular constant one.
  For the llvm side of this that generates the calls to memory.init see: https://reviews.llvm.org/D92620
* Remove legacy DYNAMICTOP_PTR support from SafeHeap (#3425) | Sam Clegg | 2020-12-04 | 15 | -2199/+261
* Don't apply SafeHeap to wasm start function (#3424) | Sam Clegg | 2020-12-04 | 3 | -0/+1944
  In relocatable code (MAIN/SIDE modules) we use the start function to run `__wasm_init_memory`, which loads the data segments into place. We can't call the sbrk pointer getter during that function, because the sbrk pointer itself lives in a static data segment.
* Fix a typo and accidental script change (#3414) | Thomas Lively | 2020-12-02 | 1 | -1/+1
* [OptimizeInstructions] Fix fuzz bug with shifts (#3376) | Alon Zakai | 2020-12-02 | 2 | -5/+57
  The code there looks for a "sign-extend":
      (x << a) >> b
  where the right shift is signed. If a = b = 24 for example then that is a sign extend of an 8-bit value (it works by shifting the 8-bit value's sign bit to the position of the 32-bit value's sign bit, then shifting all the way back, which fills everything above 8 bits with the sign bit).
  The tricky thing is that in some cases we can handle a != b - but we forgot a place to check that. Specifically, a repeated sign-extend is not necessary, but if the outer one has extra shifts, we can't do it.
  This is annoyingly complex code, but for purposes of reviewing this PR, you can see (unless I messed up) that the only change is to ensure that when we look for a repeated sign extend, then we only optimize that case when there are no extra shifts. And a repeated sign-extend is obviously ok to remove,
      (((x << a) >> a) << a) >> a  =>  (x << a) >> a
  This is an ancient bug, showing how hard it can be to find certain patterns either by fuzzing or in the real world...
  Fixes #3362
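To make the shift pattern concrete, here is a small Python model of 32-bit wasm shifts. It is a sketch for illustration only: the helper names are made up, and Python integers are wrapped by hand to mimic i32 arithmetic.

```python
def to_signed32(v):
    """Wrap a Python int to a signed 32-bit value."""
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v & 0x80000000 else v


def shl32(x, a):
    """i32.shl: shift left, discarding bits above bit 31."""
    return to_signed32(x << (a & 31))


def shr_s32(x, b):
    """i32.shr_s: arithmetic (sign-filling) right shift."""
    return to_signed32(x) >> (b & 31)


def sign_extend(x, bits):
    """(x << a) >> a with a = 32 - bits."""
    a = 32 - bits
    return shr_s32(shl32(x, a), a)


once = sign_extend(0x80, 8)                       # 0x80 as a signed byte
twice = sign_extend(sign_extend(0x80, 8), 8)      # the repeat changes nothing
extra = shr_s32(shl32(sign_extend(0x80, 8), 24), 25)  # outer has one extra shift
```

`once` and `twice` are both -128, so the repeated sign-extend is removable. `extra` is -64, which shows why the outer pair cannot simply be dropped when its shift amounts differ.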
* [OptimizeInstructions] Fix a fuzz bug with getting the shifts of an unreachable (#3413) | Alon Zakai | 2020-12-02 | 2 | -0/+26
* [wasm-split] Record checksums in profiles (#3412) | Thomas Lively | 2020-12-02 | 2 | -1/+24
  Calculate a checksum of the original uninstrumented module and emit it as part of the profile data. When reading the profile, compare the checksum it contains to the checksum of the module that is being split. Error out if the module being split is not the same as the module that was originally instrumented.
  Also fixes a bug in how the profile data was being read. When `char` is signed, bytes read from the profile were being incorrectly sign extended. We had not noticed this before because the profiles we have tested have contained only small-valued counts.
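The signed-`char` bug is easy to reproduce outside C++. The Python sketch below assumes a little-endian 32-bit field purely for illustration (the actual profile layout is not given here) and uses `struct` format `b` to imitate a signed `char`.

```python
import struct


def read_u32_signed_bytes(data):
    """Buggy: assembles the value from signed bytes, as with a signed char."""
    value = 0
    for i, byte in enumerate(struct.unpack("4b", data)):
        value |= byte << (8 * i)  # a negative byte sets all higher bits
    return value & 0xFFFFFFFF


def read_u32_unsigned_bytes(data):
    """Fixed: treats every byte as unsigned before shifting."""
    value = 0
    for i, byte in enumerate(struct.unpack("4B", data)):
        value |= byte << (8 * i)
    return value


small = bytes([0x05, 0x00, 0x00, 0x00])  # all bytes below 0x80
large = bytes([0x10, 0x80, 0x00, 0x00])  # contains a byte >= 0x80
```

Both readers agree on `small`, which matches the note that small counts hid the problem. On `large` the buggy reader returns 0xFFFF8010 instead of 0x8010.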
* [module-splitting] Allow splitting with non-const table offsets (#3408) | Thomas Lively | 2020-12-01 | 2 | -0/+405
  Extend the splitting logic to handle splitting modules with a single table segment with a non-const offset. In this situation the placeholder function names are interpreted as offsets from the table base global rather than absolute indices into the table. Since addition is not allowed in segment offset expressions, the secondary module's segment must start at the same place as the first table's segment. That means that some primary functions must be duplicated in the secondary segment to fill any gaps. They are exported and imported as necessary.
* [Printing] Print type names where possible. (#3410) | Alon Zakai | 2020-12-01 | 4 | -30/+30
  For a nested type, we used to print e.g.
      (param $x (ref (func (param i32))))
  Instead of expanding the full type inline, which can get long for a deeply nested type, print a name when running the Print pass. In this example that would be something like
      (param $x (ref $i32_=>_none))
* [OptimizeInstructions] Fix a fuzz bug with comparing signed and unsigned values (#3399) | Alon Zakai | 2020-12-01 | 3 | -48/+176
* [TypedFunctionReferences] Enable call_ref in fuzzer, and fix minor misc fuzz bugs (#3401) | Alon Zakai | 2020-11-25 | 12 | -9/+282
  * Count signatures in tuple locals.
  * Count nested signature types (confirming @aheejin was right, that was missing).
  * Inlining was using the wrong type.
  * OptimizeInstructions should return -1 for unhandled types, not error.
  * The fuzzer should check for ref types as well, not just typed function references, similar to what GC does.
  * The fuzzer now creates a function if it has no other option for creating a constant expression of a function type, then does a ref.func of that.
  * Handle unreachability in call_ref binary reading.
  * S-expression parsing fixes in more places, and add a tiny fuzzer for it.
  * Switch fuzzer test to just have the metrics, and not print all the fuzz output which changes a lot. Also fix noprint handling which only worked on binaries before.
  * Fix Properties::getLiteral() to use the specific function type properly, and make Literal's function constructor require that, to prevent future bugs.
  * Turn all input types into nullable types, for now.
* [wasm-split] Read and use profiles (#3400) | Thomas Lively | 2020-11-24 | 2 | -0/+92
  Read the profiles produced by wasm-split's instrumentation to guide splitting. In this initial implementation, all functions that the profile shows to have been called are kept in the initial module. In the future, users may be able to tune this so that functions that are run later will still be split out.
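The policy described above amounts to a simple partition. In the Python sketch below the profile is modeled as a dict from function name to first-call timestamp, with 0 meaning "never called". Both the dict shape and the zero convention are assumptions for illustration, not the tool's actual profile format.

```python
def partition_functions(profile):
    """Keep every function that was ever called in the primary module."""
    primary = {name for name, timestamp in profile.items() if timestamp != 0}
    secondary = set(profile) - primary
    return primary, secondary


# Hypothetical profile: timestamps are counter snapshots, not wall-clock time.
profile = {"main": 1, "init": 2, "rare_error_path": 0, "late_feature": 7}
primary, secondary = partition_functions(profile)
```

A tunable version could compare the timestamp against a threshold instead of zero, which is the kind of future refinement the message mentions.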
* [TypedFunctionReferences] Implement call_ref (#3396) | Alon Zakai | 2020-11-24 | 15 | -739/+1089
  Includes minimal support in various passes. Also includes actual optimization work in Directize, which was easy to add.
  Almost has fuzzer support, but the actual makeCallRef is just a stub so far.
  Includes s-parser support for parsing typed function references types.
* Fix comment in invalid-options.wast (#3398) | Heejin Ahn | 2020-11-24 | 1 | -1/+1
* [TypedFunctionReferences] Add Typed Function References feature and use the types (#3388) | Alon Zakai | 2020-11-23 | 12 | -903/+863
  This adds the new feature and starts to use the new types where relevant. We use them even without the feature being enabled, as we don't know the features during wasm loading - but the hope is that given the type is a subtype, it should all work out. In practice, if you print out the internal type you may see a typed function reference-specific type for a ref.func for example, instead of a generic funcref, but it should not affect anything else.
  This PR does not support non-nullable types, that is, everything is nullable for now. As suggested by @tlively this is simpler for now and leaves nullability for later work (which will apparently require let or something else, and many passes may need to be changed).
  To allow this PR to work, we need to provide a type on creating a RefFunc. The wasm-builder.h internal API is updated for this, as are the C and JS APIs, which are breaking changes. cc @dcodeIO
  We must also write and read function types properly. This PR improves collectSignatures to find all the types, and also to sort them by the dependencies between them (as we can't emit X in the binary if it depends on Y, and Y has not been emitted - we need to give Y's index). This sorting ends up changing a few test outputs.
  InstrumentLocals support for printing function types that are not funcref is disabled for now, until we figure out how to make that work and/or decide if it's important enough to work on.
  The fuzzer has various fixes to emit valid types for things (mostly whitespace there). Also two drive-by fixes to call makeTrivial where it should be (when we fail to create a specific node, we can't just try to make another node, in theory it could infinitely recurse).
  Binary writing changes here to replace calls to a standalone function to write out a type with one that is called on the binary writer object itself, which maintains a mapping of type indexes (getFunctionSignatureByIndex).
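The dependency ordering mentioned above (emit Y before any X that refers to it) is a topological sort. Here is a generic Python sketch of that technique. It is not the code in collectSignatures, and it assumes the dependency graph has no cycles.

```python
def sort_by_dependencies(deps):
    """Order types so that each type appears after everything it depends on.

    `deps` maps a type name to the names it refers to. Acyclic input assumed.
    """
    order = []
    visited = set()

    def visit(name):
        if name in visited:
            return
        visited.add(name)
        for dependency in deps.get(name, ()):
            visit(dependency)
        order.append(name)  # emitted only after all of its dependencies

    for name in deps:
        visit(name)
    return order


# X refers to Y, so Y must be assigned an index first.
order = sort_by_dependencies({"X": ["Y"], "Y": [], "Z": ["X", "Y"]})
index_of = {name: i for i, name in enumerate(order)}
```

With this order, every reference written into the binary points at an index that has already been assigned.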
* [wasm-split] Initial instrumentation (#3389) | Thomas Lively | 2020-11-20 | 3 | -0/+93
  Implement an instrumentation pass that records the timestamp at which each defined function is first called. Timestamps are not actual time, but rather snapshots of a monotonically increasing counter. The instrumentation exports a function that the embedder can call to dump the profile data into a memory buffer at a given offset and size. The function returns the total size of the profile data so the embedder can know how much to read out of the buffer or how much it needs to grow the buffer.
  Parsing and using the profile is left as future work, as is recording a hash of the input file that will be used to guard against accidentally instrumenting one module and trying to use the resulting profile to split a different module.
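A rough Python model of that instrumentation follows. Everything here is hypothetical (the class, the method names, the 4-byte little-endian record format, and 0 meaning "never called"). It only illustrates the counter-as-timestamp idea and the "return the total size" protocol.

```python
import struct


class Profiler:
    """Records, per function, the counter value at its first call."""

    def __init__(self, function_names):
        self.names = list(function_names)
        self.counter = 0
        self.first_call = {name: 0 for name in self.names}  # 0 = never called

    def on_entry(self, name):
        # Inserted at the top of every instrumented function.
        if self.first_call[name] == 0:
            self.counter += 1
            self.first_call[name] = self.counter

    def dump(self, buffer, offset, size):
        """Write the profile if it fits; always return the total size needed."""
        data = b"".join(struct.pack("<I", self.first_call[n]) for n in self.names)
        if len(data) <= size and offset + len(data) <= len(buffer):
            buffer[offset:offset + len(data)] = data
        return len(data)


profiler = Profiler(["a", "b", "c"])
for called in ["b", "a", "b"]:
    profiler.on_entry(called)

too_small = bytearray(4)
needed = profiler.dump(too_small, 0, len(too_small))  # learn the required size
buffer = bytearray(needed)
profiler.dump(buffer, 0, needed)
```

The embedder first calls `dump` with whatever buffer it has, then grows the buffer to the returned size and calls it again.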
* Initial wasm-split tool (#3359) | Thomas Lively | 2020-11-19 | 3 | -0/+219
  Implement an initial version of the wasm-split tool, which splits modules into a primary module and a secondary module that can be instantiated after the primary module. Eventually, this tool will be able to not only split modules, but also instrument modules to collect profiles that will be able to guide later splitting. In this initial version, however, wasm-split can neither perform instrumentation nor consume any kind of profile data.
  Despite those shortcomings, this initial version of the tool is already able to perform module splitting according to function lists manually provided by the user via the command line. Follow-up PRs will implement the stubbed out instrumentation and profile consumption functionality.
* [effects.h] Add a trap effect for unreachable (#3387) | Alon Zakai | 2020-11-18 | 1 | -0/+14
  We did not really model the effects of unreachable properly before. It always traps, so it's not an implicit trap, but we didn't do anything but mark it as "branches out", which is not really enough, as while yes it does branch inside the current function, it also traps which is noticeable outside.
  To fix that, add a trap effect to track this. implicitTrap will set trap as well, automatically, if we do not ignore implicit traps, so it is enough to check just that (unless one cares about the difference between implicit and explicit ones).
* [DeadArgumentElimination] Don't DAE a ref.func-ed class (#3380) | Alon Zakai | 2020-11-18 | 2 | -0/+23
  If we take a reference of a function, it is dangerous to change the function's type (which removing dead arguments does), as that would be an observable difference from the outside - the type changes, and some params are now ignored, and others are reordered.
  In theory we could find out if the reference does not escape, but that's not trivial.
  Related to #3378 but not quite the same.
* Introduce lit/FileCheck tests (#3367) | Thomas Lively | 2020-11-18 | 9 | -91/+72
  lit and FileCheck are the tools used to run the majority of tests in LLVM. Each lit test file contains the commands to be run for that test, so lit tests are much more flexible and can be more precise than our current ad hoc testing system. FileCheck reads expected test output from comments, so it allows test output to be written alongside and interspersed with test input, making tests more readable and precise than in our current system.
  This PR adds a new suite to check.py that runs lit tests in the test/lit directory. A few tests have been ported to demonstrate the features of the new test runner.
  This change is motivated by a need for greater flexibility in testing wasm-split. See #3359.
* Rename atomic.notify and *.atomic.wait (#3353) | Heejin Ahn | 2020-11-13 | 27 | -92/+92
  - atomic.notify -> memory.atomic.notify
  - i32.atomic.wait -> memory.atomic.wait32
  - i64.atomic.wait -> memory.atomic.wait64
  See WebAssembly/threads#149.
  This renames instruction name printing but not the internal data structure names, such as `AtomicNotify`, which are not always the same as printed instruction names anyway. This also does not modify C API. But this fixes interface functions in binaryen.js because it seems binaryen.js's interface functions all follow the corresponding instruction names.
* Fix a hashing regression from #3332 (#3349) | Alon Zakai | 2020-11-13 | 3 | -0/+27
  We used to check if a load's sign matters before hashing it. If the load does not extend, then the sign doesn't matter, and we ignored the value there. It turns out that value could be garbage, as we didn't assign it in the binary reader, if it wasn't relevant. In the rewrite this was missed, and actually it's not really possible to do, since we have just a macro for the field, but not the object it is on - which there may be more than one.
  To fix this, just always assign the field. This is simpler anyhow, and avoids confusion not just here but probably when debugging.
  The testcase here is reduced from the fuzzer, and is not a 100% guarantee to catch a read of uninitialized memory, but it can't hurt, and with ASan it may be pretty consistent.
* Update tests following conflicting landings (#3345) | Alon Zakai | 2020-11-12 | 1 | -1/+1
* Module splitting (#3317) | Thomas Lively | 2020-11-12 | 2 | -0/+930
  Adds the capability to programmatically split a module into a primary and secondary module such that the primary module can be compiled and run before the secondary module has been instantiated. All calls to secondary functions (i.e. functions that have been split out into the secondary module) in the primary module are rewritten to be indirect calls through the table. Initially, the table slots of all secondary functions contain references to imported placeholder functions. When the secondary module is instantiated, it will automatically patch the table to insert references to the original functions.
  The process of module splitting involves these steps:
  1. Create the new secondary module.
  2. Export globals, events, tables, and memories from the primary module and import them in the secondary module.
  3. Move the deferred functions from the primary to the secondary module.
  4. For any secondary function exported from the primary module, export in its place a trampoline function that makes an indirect call to its placeholder function (and eventually to the original secondary function), allocating a new table slot for the placeholder if necessary.
  5. Rewrite direct calls from primary functions to secondary functions to be indirect calls to their placeholder functions (and eventually to their original secondary functions), allocating new table slots for the placeholders if necessary.
  6. For each primary function directly called from a secondary function, export the primary function if it is not already exported and import it into the secondary module.
  7. Replace all references to secondary functions in the primary module's table segments with references to imported placeholder functions.
  8. Create new active table segments in the secondary module that will replace all the placeholder function references in the table with references to their corresponding secondary functions upon instantiation.
  Functions can be used or referenced three ways in a WebAssembly module: they can be exported, called, or placed in a table. The above procedure introduces a layer of indirection to each of those mechanisms that removes all references to secondary functions from the primary module but restores the original program's semantics once the secondary module is instantiated. As more mechanisms that reference functions are added in the future, such as ref.func instructions, they will have to be modified to use a similar layer of indirection.
  The code as currently written makes a few assumptions about the module that is being split:
  1. It assumes that mutable-globals is allowed. This could be worked around by introducing wrapper functions for globals and rewriting secondary code that accesses them, but now that mutable-globals is shipped on all browsers, hopefully that extra complexity won't be necessary.
  2. It assumes that all table segment offsets are constants. This simplifies the generation of segments to actively patch in the secondary functions without overwriting any other table slots. This assumption could be relaxed by 1) having secondary segments re-write primary function slots as well, 2) allowing addition in segment offsets, or 3) synthesizing a start function to modify the table instead of using segments.
  3. It assumes that each function appears in the table at most once. This isn't necessarily true in general or even for LLVM output after function deduplication. Relaxing this assumption would just require slightly more complex code, so it is a good candidate for a follow up PR.
  Future Binaryen work for this feature includes providing a command line tool exposing this functionality as well as C API, JS API, and fuzzer support. We will also want to provide a simple instrumentation pass for finding dead or late-executing functions that would be good candidates for splitting out. It would also be good to integrate that instrumentation with future function outlining work so that dead or exceptional basic blocks could be split out into a separate module.
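The placeholder-and-patch mechanism in steps 5, 7, and 8 can be modeled in a few lines of Python. This is a conceptual sketch only: `Table`, `make_placeholder`, `primary_main`, and `instantiate_secondary` are hypothetical names, and a Python list stands in for the wasm table.

```python
class Table:
    """Stand-in for a wasm function table."""

    def __init__(self):
        self.slots = []

    def allocate(self, func):
        self.slots.append(func)
        return len(self.slots) - 1

    def call_indirect(self, index, *args):
        return self.slots[index](*args)


def make_placeholder(name):
    # Imported placeholder: the embedder decides what happens on an early call.
    def placeholder(*args):
        raise RuntimeError(f"secondary function {name!r} is not loaded yet")
    return placeholder


table = Table()


def secondary_square(x):
    """A function that was moved to the secondary module."""
    return x * x


# Step 7: the primary table initially holds a placeholder in this slot.
SQUARE_SLOT = table.allocate(make_placeholder("square"))


def primary_main(x):
    # Step 5: what used to be a direct call is now an indirect call.
    return table.call_indirect(SQUARE_SLOT, x) + 1


def instantiate_secondary(shared_table):
    # Step 8: the secondary module's active segment overwrites the placeholder.
    shared_table.slots[SQUARE_SLOT] = secondary_square
```

Calling `primary_main` before `instantiate_secondary(table)` reaches the placeholder. Afterwards the same indirect call reaches the real function, so the primary module never needs a direct reference to it.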
* [Fuzzer] Add a chance to pick particularly important initial contents (#3343) | Alon Zakai | 2020-11-12 | 2 | -246/+279
  OptimizeInstructions is seeing the most work these days, so it's good for the fuzzer to focus on that some more.
  Also move some code around in the main test wast: it's useful to put each feature in its own module to maximize the chance of getting them to be used. That is, if a module has a single use of atomics, then if atomics are disabled in the current run, we can't use any of the module and we skip initial contents entirely. Moving each feature to its own module reduces that risk. (We do pick randomly between the modules, and atm a small module has the same chance as a big one, but this still seems worth it.)
* Some refactorings in addition to #3338 (#3336) | Max Graey | 2020-11-12 | 2 | -0/+104
  See discussion in #3303
* OptimizeInstructions: Fix regression from #3303 / #3275 (#3338) | Alon Zakai | 2020-11-12 | 3 | -9/+102
      X - Y <= 0  =>  X <= Y
  That is true mathematically, but not in the case of an overflow, e.g. X=10, Y=0x8000000000000000. X - Y is a negative number, so X - Y <= 0 is true. But it is not true that X <= Y (as Y is negative, but X is not).
  See discussion in #3303 (comment)
  The actual regression was in #3275, but the fuzzer had an easier time finding it due to #3303
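The counterexample can be checked by hand with wrapped 64-bit arithmetic. The Python sketch below emulates i64 wrapping explicitly, and `wrap_i64` is a helper made up for this illustration.

```python
def wrap_i64(v):
    """Wrap a Python int to a signed 64-bit value, as i64 arithmetic does."""
    v &= (1 << 64) - 1
    return v - (1 << 64) if v >> 63 else v


x = 10
y = wrap_i64(0x8000000000000000)  # the most negative i64

difference = wrap_i64(x - y)      # i64.sub overflows and wraps around

lhs = difference <= 0             # X - Y <= 0
rhs = x <= y                      # X <= Y
```

Here `lhs` is true while `rhs` is false, so rewriting one into the other changes the program's result.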
* Fix BinaryenFunctionOptimize. (#3339) | Alon Zakai | 2020-11-11 | 2 | -0/+43
  We mistakenly tried to run all passes there, but should run only the function ones.
  Fixes #3333
* wasm2js: Declare data segments before calling asmFunc (#3337) | Sam Clegg | 2020-11-11 | 29 | -380/+383
  This is because we may need to reference the segments during the start function. For example in the case of pthreads we conditionally load passive segments during start.
  Tested in emscripten with: tests/runner.py wasm2js1
* Avoid boilerplate in ExpressionAnalyzer comparing & hashing (#3332) | Alon Zakai | 2020-11-11 | 2 | -0/+143
  Expands on #3294:
  * Scope names must be distinguished as either defs or uses.
  * Error when a core #define is missing, which is less error-prone, as suggested by @tlively
  * Add DELEGATE_GET_FIELD which lets one define "get the field" once and then all the loops can use it. This helps avoid boilerplate for loops at least in some cases (when there is a single object on which to get the field).
  With those, it is possible to replace boilerplate in comparisons and hashing logic. This also fixes a bug where BrOnExn::sent was not scanned there.
  Add some unit tests for hashing. We didn't have any, and hashing can be subtly wrong without observable external effects (just more collisions).
* wasm2js: Support for exported memory (#3323) | Sam Clegg | 2020-11-10 | 29 | -206/+197
  The asmFunc now sets the outer scope's `bufferView` variable as well as its own internal views.
* Optimize i32(x) % C_pot in boolean context (#3307) | Max Graey | 2020-11-10 | 2 | -0/+38
      bool(i32(x) % C_pot)  ->  bool(i32(x) & (C_pot - 1))
      bool(i32(x) % min_s)  ->  bool(i32(x) & max_s)
  For all other situations we already do this for (i32|i64).rem_s
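Both rewrites can be sanity-checked by brute force over a sample of values. In this Python sketch `rem_s32` is a hand-written model of `i32.rem_s` (truncated remainder whose sign follows the dividend). It is an illustration, not Binaryen code.

```python
MIN_S = -(1 << 31)      # smallest signed 32-bit value
MAX_S = (1 << 31) - 1   # largest signed 32-bit value


def rem_s32(x, c):
    """Model of i32.rem_s: the result takes the sign of the dividend."""
    remainder = abs(x) % abs(c)
    return -remainder if x < 0 else remainder


def agrees_for_power_of_two(x, c_pot):
    # bool(x % C_pot) == bool(x & (C_pot - 1))
    return bool(rem_s32(x, c_pot)) == bool(x & (c_pot - 1))


def agrees_for_min_s(x):
    # bool(x % min_s) == bool(x & max_s)
    return bool(rem_s32(x, MIN_S)) == bool(x & MAX_S)


samples = [MIN_S, MIN_S + 1, -17, -8, -1, 0, 1, 7, 8, 1024, MAX_S]
powers = [1 << k for k in range(31)]
all_agree = all(agrees_for_power_of_two(x, c) for x in samples for c in powers)
min_s_agrees = all(agrees_for_min_s(x) for x in samples)
```

The actual remainder and the masked value can differ for negative `x`, but they are zero in exactly the same cases, which is all a boolean context observes.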
* [wasm2js] Use native JavaScript Math.trunc (#3329) | Max Graey | 2020-11-10 | 117 | -48/+153
* Canonicalize subtraction with constant on the right to addition (#3321) | Max Graey | 2020-11-10 | 23 | -217/+217
  Using addition in more places is better for gzip, and helps simplify the optimizer as well.
  Add a FinalOptimizer phase to do optimizations like our signed LEB tweaks, to reduce binary size in the rare case when we do want a subtraction.
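Two facts sit behind this change: `x - C` equals `x + (-C)` under wrapping arithmetic, and the signed LEB128 encoding of a constant can be shorter for its negation. The Python sketch below checks both. The helper names are hypothetical, and the choice of 64 as the example is an assumption about which constants the "signed LEB tweaks" target.

```python
MASK32 = 0xFFFFFFFF


def sub32(x, c):
    return (x - c) & MASK32


def add32(x, c):
    return (x + c) & MASK32


def negate32(c):
    """The constant to add in place of subtracting c."""
    return (-c) & MASK32


def sleb128_size(value):
    """Number of bytes in the signed LEB128 encoding of a Python int."""
    size = 0
    while True:
        byte = value & 0x7F
        value >>= 7  # arithmetic shift keeps the sign
        size += 1
        sign_bit_set = bool(byte & 0x40)
        if (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set):
            return size


equivalent = all(
    sub32(x, c) == add32(x, negate32(c))
    for x in (0, 1, 5, 0x7FFFFFFF, 0x80000000, MASK32)
    for c in (0, 1, 64, 0x80000000, MASK32)
)
```

Since `sleb128_size(64)` is 2 and `sleb128_size(-64)` is 1, emitting `x - (-64)` instead of `x + 64` would save a byte. A late phase that reintroduces such a subtraction keeps the canonical form from costing binary size.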
* Remove dead code and unused includes. NFC. (#3328) | Sam Clegg | 2020-11-08 | 1 | -7/+7
  Specifically try to cleanup use of asm_v_wasm.h and asmjs constants.
* Remove OptimizeCalls from PostEmscripten. NFC. (#3326) | Sam Clegg | 2020-11-06 | 2 | -149/+12
  We no longer build modules that import `global.Math`.