path: root/third_party
Commit message | Author | Age | Files | Lines
* Fix the fp16 header include. (#6871) | Brendan Dahl | 2024-08-26 | 1 | -5/+1
|
* [FP16] Implement load and store instructions. (#6796) | Brendan Dahl | 2024-08-06 | 6 | -0/+684
| Specified at https://github.com/WebAssembly/half-precision/blob/main/proposals/half-precision/Overview.md
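The load and store behaviour can be illustrated with a minimal Python sketch. It assumes the instructions read or write a 16-bit IEEE 754 half-precision value in little-endian linear memory and convert to or from a wider float. The helper names `store_f16` and `load_f16` are hypothetical and are not Binaryen or proposal identifiers.

```python
import struct


def store_f16(memory: bytearray, addr: int, value: float) -> None:
    """Demote `value` to IEEE 754 binary16 and write it as 2 little-endian bytes."""
    memory[addr:addr + 2] = struct.pack("<e", value)


def load_f16(memory: bytearray, addr: int) -> float:
    """Read 2 little-endian bytes as binary16 and promote them to a Python float."""
    return struct.unpack_from("<e", memory, addr)[0]


# A tiny stand-in for linear memory (an assumption of this sketch).
memory = bytearray(8)
store_f16(memory, 2, 1.5)
```

Values that binary16 cannot represent exactly, such as 0.1, come back slightly rounded after a store followed by a load.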
* Update two files from upstream LLVM, ConvertUTF.h,cpp (#5954) | Alon Zakai | 2023-09-18 | 2 | -32/+100
| Almost no actual change in the files except for a license update. It turns out that the new license is a proper FOSS one; see #5947.
|
| Fixes #5947
* [DWARF] Warn on unsupported DWARF versions and content (#5120) | Alon Zakai | 2022-10-07 | 2 | -2/+4
| Unfortunately there isn't a single place where an error may occur. I tested on several files with different flags and added sufficient warnings so that we warn on them all.
* Do not build gtest libs unless BUILD_TESTS is set (#4552) | Thomas Lively | 2022-03-29 | 1 | -7/+4
| This was missed in #4536.
* Introduce gtest (#4466) | Thomas Lively | 2022-01-20 | 2 | -0/+13
| Add gtest as a git submodule in third_party and integrate it into the build the same way WABT does. Adds a new executable, `binaryen-unittests`, to execute `gtest_main`. As a nontrivial example test, port one of the `TypeBuilder` tests from example/ to gtest/.
|
| Using gtest has a number of advantages over the current example tests:
|
| - Tests are compiled and linked at build time rather than runtime, surfacing errors earlier and speeding up test execution.
| - Tests are all built into a single binary, reducing overall link time and further reducing test overhead.
| - Tests are built from the same CMake project as the rest of Binaryen, so compiler settings (e.g. sanitizers) are applied uniformly rather than having to be separately set via the COMPILER_FLAGS environment variable.
| - Using the industry-standard gtest rather than our own script reduces our maintenance burden.
|
| Using gtest will lower the barrier to writing C++ tests and will hopefully lead to us having more proper unit tests.
* Fix assert on access of empty vector (#4045) | Wouter van Oortmerssen | 2021-08-02 | 1 | -4/+4
| (on VS2019)
|
| Triggered by:
| wasm-emscripten-finalize --minimize-wasm-changes -g --bigint --no-dyncalls --dwarf test_asan_api.wasm -o test_asan_api.wasm --detect-features
* Ignore missing CUs in DWARF rewriting (#3700) | Alon Zakai | 2021-03-18 | 2 | -0/+3
| A recent change in LLVM causes it to sometimes end up with a thing with no parent. That is, a debug_line or a debug_loc that has no CU that refers to it. This is perhaps LLVM DCEing CUs, or something else that changed - not sure. But it seems like valid DWARF we should handle.
|
| This PR handles that in our code. Two things broke here. First, locs must be simply ignored when there is no CU. Second, lines are trickier as we used to compute their position by scanning them, and that list contained only ones with a CU. So we missed some and ended up with wrong offsets. To make things simpler and more robust, just track the position of each line table on the table itself.
|
| Fixes #3697
* Fixed .debug_loc parsing for wasm64 files (#3660) | Wouter van Oortmerssen | 2021-03-08 | 2 | -5/+11
| The address size was hard-coded to 4; it is now read from .debug_info. This required changing the parsing order.
|
| Also made failure to parse .debug_loc fail the program, as before this error was easy to ignore.
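As a rough illustration of why .debug_info has to be parsed first, the sketch below reads `address_size` from a 32-bit DWARF unit header and then uses it to decode one start/end pair from .debug_loc. The header layouts follow the DWARF v2-v4 and v5 standards. The function names are hypothetical, and 64-bit DWARF is deliberately not handled.

```python
import struct


def cu_address_size(debug_info: bytes, offset: int = 0) -> int:
    """Return address_size from the 32-bit DWARF unit header at `offset`."""
    unit_length, version = struct.unpack_from("<IH", debug_info, offset)
    if unit_length == 0xFFFFFFFF:
        raise ValueError("64-bit DWARF is not handled in this sketch")
    if version >= 5:
        # v5: unit_length(4) version(2) unit_type(1) address_size(1) abbrev_offset(4)
        return debug_info[offset + 7]
    # v2-v4: unit_length(4) version(2) abbrev_offset(4) address_size(1)
    return debug_info[offset + 10]


def read_loc_pair(debug_loc: bytes, offset: int, address_size: int):
    """Decode one (start, end) pair whose width depends on address_size."""
    fmt = "<II" if address_size == 4 else "<QQ"
    return struct.unpack_from(fmt, debug_loc, offset)
```

With a hard-coded size of 4, a wasm64 file whose units declare 8-byte addresses would be decoded at the wrong width and every later entry would be misaligned.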
* Remove assertions that prevent non-assertion builds (#3576) | Alon Zakai | 2021-02-17 | 1 | -2/+0
| And fix errors from such a build.
* Remove exnref and br_on_exn (#3505) | Heejin Ahn | 2021-01-22 | 1 | -2/+0
| This removes the `exnref` type and the `br_on_exn` instruction.
* [wasm64] fix for Memory64Lowering affecting DWARF data (#3348) | Wouter van Oortmerssen | 2020-11-13 | 2 | -1/+4
| We change the AddrSize, which causes all DW_FORM_addr to be written differently.
|
| Depends on https://reviews.llvm.org/D91395
* DWARF: Fix abbreviation lookups, they are relative to 1 (#3158) | Alon Zakai | 2020-09-22 | 1 | -6/+6
| Apparently I misunderstood the DWARF spec on this. Abbreviation offsets are all relative to 1 (as 0 is not a valid number). For some reason I thought the first DIE's index was the "base", as in practice LLVM always emits the lowest index there, and that's what the LLVM YAML code suggested to me.
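The corrected lookup can be sketched in a few lines. This is an illustration only: it assumes one compile unit's declarations are stored in a list and numbered consecutively from 1, and `lookup_abbrev` is a hypothetical name rather than a Binaryen or LLVM function.

```python
def lookup_abbrev(declarations, code):
    """Return the declaration for `code`, or None for the null code 0.

    `declarations` holds one compile unit's abbreviations in emitted order,
    assumed (for this sketch) to be numbered consecutively from 1.
    """
    if code == 0:
        return None  # 0 marks a null entry and is never a valid code
    # The base is always 1, not the code used by the first DIE.
    return declarations[code - 1]


decls = ["DW_TAG_compile_unit", "DW_TAG_subprogram", "DW_TAG_base_type"]
```

Using the first DIE's code as the base gives the same answer only when that DIE happens to use code 1, which is the case the message says LLVM produces in practice.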
* Add mozjs, V8 and WABT setup script (#3053) | Daniel Wirtz | 2020-09-14 | 8 | -0/+705
| Adds a new script `./third_party/setup.py` to conveniently install necessary dependencies for testing and fuzzing, including the SpiderMonkey JS shell (mozjs), the V8 JS shell and WABT. Other scripts now automatically pick these up when installed and fall back to looking for the tools in PATH like before.
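The "prefer the installed copy, otherwise fall back to PATH" behaviour might look roughly like the sketch below. The directory layout (`<install_root>/<name>/bin/<name>`) and the function name `find_tool` are assumptions made for illustration; the real script's layout is not described in this log.

```python
import os
import shutil


def find_tool(name, install_root):
    """Prefer a locally installed tool, else fall back to a PATH lookup."""
    # Hypothetical layout: <install_root>/<name>/bin/<name>
    candidate = os.path.join(install_root, name, "bin", name)
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    # shutil.which returns None when the tool is not on PATH either.
    return shutil.which(name)
```

Returning `None` rather than raising lets callers decide whether a missing tool should skip a test or abort the run.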
* DWARF: Optimize abbreviation index/offset computation (#3033) | Alon Zakai | 2020-08-18 | 1 | -8/+32
|
* DWARF: Fix debug_info references to the abbreviations section (#2997) | Alon Zakai | 2020-08-07 | 3 | -7/+28
| The previous code assumed that each compile unit had its own abbreviation section, and they are all in order. That's normally how LLVM emits things, but in #2992 there is a testcase in which linking of object files with IR files somehow ends up with a different order.
|
| The proper fix is to track the binary offsets of abbreviations in the abbreviation section. That section consists of null-terminated lists, and each CU holds an offset to the beginning of its list. With those offsets, we can match things properly.
|
| Add a testcase that crashes without this, to prevent regressions.
|
| Fixes #2992
| Fixes #3007
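The offset-tracking idea described above can be sketched as follows: walk .debug_abbrev once and record the byte offset at which each null-terminated list begins, so that a CU's abbreviation offset can be matched to its list regardless of order. This is a simplified illustration with hypothetical function names. It ignores `DW_FORM_implicit_const`, which carries an extra value in DWARF v5.

```python
def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value; return (value, next position)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def abbrev_lists_by_offset(debug_abbrev: bytes) -> dict:
    """Map the byte offset of each abbreviation list to its declaration codes."""
    lists = {}
    pos = 0
    while pos < len(debug_abbrev):
        start = pos
        codes = []
        while True:
            code, pos = read_uleb128(debug_abbrev, pos)
            if code == 0:  # a null entry terminates this list
                break
            codes.append(code)
            _tag, pos = read_uleb128(debug_abbrev, pos)
            pos += 1  # has_children flag, one byte
            while True:  # attribute specs end with a (0, 0) pair
                name, pos = read_uleb128(debug_abbrev, pos)
                form, pos = read_uleb128(debug_abbrev, pos)
                if name == 0 and form == 0:
                    break
        lists[start] = codes
    return lists
```

A CU whose header stores abbreviation offset 8 would then use `lists[8]`, even if it is not the second CU in .debug_info.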
* Fix DWARF location list updating with nonzero compilation unit base addr (#2862) | Paolo Severini | 2020-05-27 | 2 | -0/+4
| In the .debug_loc section the Start/End address offsets in a location list are relative to the address of the compilation unit that refers to that location list.
|
| There is a problem in function wasm::Debug::updateLoc(), which compares these offsets with the actual module addresses of expressions and functions, causing the generation of invalid location lists.
|
| The fix is not trivial, because the DWARF debug_loc section does not specify which compilation unit is associated with each location list entry.
|
| A simple workaround is to store, in LocationUpdater, a map of location list offsets to the base address of the compilation units referencing them, and that can be easily calculated in updateDIE().
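The workaround described above can be sketched as a small map from location list offset to compilation unit base address. The class and method names here are hypothetical stand-ins for illustration and do not mirror the fields of `LocationUpdater`.

```python
class LocBaseTracker:
    """Remember which CU base address applies to each location list."""

    def __init__(self):
        # location list offset in .debug_loc -> base address of the referring CU
        self.base_by_offset = {}

    def note_reference(self, loc_list_offset, cu_base):
        """Record a reference seen while walking a DIE of the given CU."""
        self.base_by_offset[loc_list_offset] = cu_base

    def to_module_addresses(self, loc_list_offset, entries):
        """Turn CU-relative (start, end) pairs into module addresses."""
        base = self.base_by_offset[loc_list_offset]
        return [(base + start, base + end) for start, end in entries]
```

Only after adding the base can the pairs be compared against the module addresses of expressions and functions.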
* DWARF: Ignore a compile unit with no abbreviations (#2678) | Alon Zakai | 2020-03-04 | 1 | -0/+5
| Such a module can't have valid DIEs, since we have no way to interpret them.
|
| Also check if DWARF sections from LLVM have contents - when they are empty the section may exist but have a null for its data.
|
| Fixes #2673
* On OpenBSD (6.6) libc++ fileno is defined as a MACRO which doesn't work with :: (#2669) | osen | 2020-02-27 | 1 | -1/+1
|
* DWARF: Fix debug_abbrev section (#2630) | Alon Zakai | 2020-01-28 | 3 | -10/+31
| Each compilation unit's abbreviations must be terminated by a zero, so that we use the right abbreviations. This adds that support to the YAML layer, both adding the zeros and parsing them to look in the right abbreviation section at the right time.
|
| Also add two large testcases, zlib and cubescript, which crash without this and the last PR.
* DWARF: Update DW_AT_stmt_list which are offsets into the debug_line section (#2628) | Alon Zakai | 2020-01-28 | 2 | -2/+26
| The debug_line section is the only one in which we change sizes and so must update offsets. It turns out that there are such offsets, DW_AT_stmt_list, so without updating them we can't handle multi-unit DWARF files.
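The offset update can be pictured as follows: once each line table has been re-emitted at a new size, every old table offset maps to the running total of the new sizes before it, and each DW_AT_stmt_list value is rewritten through that map. This is a sketch under that assumption, and `remap_stmt_lists` is a hypothetical name.

```python
def remap_stmt_lists(old_offsets, new_sizes):
    """Map each old line table offset to its offset after re-emitting.

    old_offsets: offsets of the line tables in their original order.
    new_sizes:   size in bytes of each re-emitted table, in the same order.
    """
    mapping = {}
    new_offset = 0
    for old, size in zip(old_offsets, new_sizes):
        mapping[old] = new_offset
        new_offset += size
    return mapping


# Three tables that originally started at 0, 100 and 250.
mapping = remap_stmt_lists([0, 100, 250], [80, 120, 60])
```

With a single compile unit the only offset is 0, which never moves, which is why the problem only shows up in multi-unit files.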
* DWARF: Fix emitting of DW_FORM_sdata (#2627) | Alon Zakai | 2020-01-27 | 2 | -1/+19
|
* DWARF: Update .debug_loc (#2616) | Alon Zakai | 2020-01-23 | 4 | -0/+62
| Add support for that section to the YAML layer, and add code to update it.
|
| The updating is slightly tricky - unlike .debug_ranges, the size of entries is not fixed. So we can't just skip entries, as the end marker is smaller than a normal entry. Instead, replace now-invalid segments with (1, 1) which is of size 0 and so should be ignored by the debugger (we can't use (0, 0) as that would be an end marker, and (-1, *) is the special base marker). In the future we probably do want to do this in a more sophisticated manner, completely rewriting the indexes into the section as well. For now though this should be enough for when binaryen does not optimize (as we don't move/reorder anything).
|
| Note that this doesn't update the location description (like where on the wasm expression stack the value is). Again, that is correct for when binaryen doesn't optimize, but for fully optimized builds we would need to track things (which would be hard!).
|
| Also clean up some code that uses "Extra" instead of "Delimiter" that was missed before, and shorten some unnecessarily long names.
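The (1, 1) replacement rule can be sketched like this. The example models entries as bare (start, end) pairs and leaves out the location expression each real entry carries. `update_loc_list` and the `remap` callback are hypothetical names, with `remap` returning `None` for an address that no longer exists.

```python
END_MARKER = (0, 0)


def update_loc_list(entries, remap, address_size=4):
    """Rewrite (start, end) pairs, neutralising segments that became invalid."""
    base_marker = (1 << (8 * address_size)) - 1  # the "-1" start value
    updated = []
    for start, end in entries:
        if (start, end) == END_MARKER or start == base_marker:
            updated.append((start, end))  # markers are kept as they are
            continue
        new_start, new_end = remap(start), remap(end)
        if new_start is None or new_end is None:
            # Empty range: same entry size, so later offsets stay valid,
            # and it is neither an end marker nor a base marker.
            updated.append((1, 1))
        else:
            updated.append((new_start, new_end))
    return updated
```

Because the invalid entry is overwritten rather than dropped, the section keeps its length and existing offsets into it remain correct.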
* Handle an invalid AbbrCode in DWARF handling (#2607) | Alon Zakai | 2020-01-21 | 1 | -0/+7
| Fixes the testcase in #2343 (comment)
|
| Looks like that's from Rust. Not sure why it would have an invalid abbreviation code, but perhaps the LLVM there emits dwarf differently than we've tested on so far. May be worth investigating further, but for now emit a warning, skip that element, and don't crash.
|
| Also fix valgrind warnings about Span values not being initialized, which was invalid and bad as well (wasted memory in our maps, and might have overlapped with real values), and interfered with figuring this out.
* Update LLVM to support WASM_location (#2596) | Alon Zakai | 2020-01-16 | 3 | -1/+6
| From llvm/llvm-project@adf7a0a
* Fix emitting of .debug_abbrev (#2582) | Alon Zakai | 2020-01-10 | 1 | -0/+4
| gimli-rs and perhaps also the spec require a final 0 to terminate the list. LLVM itself is fine without that.
* DWARF support for multiple line tables (#2557) | Alon Zakai | 2020-01-09 | 6 | -8/+35
| Multiple tables appear to be emitted when linking files together. This fixes our support for that, which did not update their size properly. This required patching the YAML emitting code from LLVM in order to measure the size and then emit it, as that code is apparently not designed to handle changes in line table contents.
|
| Other minor fixes:
|
| * Set the flags for our dwarfdump command to emit the same as llvm-dwarfdump does with -v -all.
| * Add support for a few more opcodes, set_discriminator, set_basic_block, fixed_advance_pc, set_isa.
| * Handle a compile unit without abbreviations in the YAML code (again, apparently not something this LLVM code was intended to do).
| * Handle a compile unit with zero entries in the YAML code (ditto).
| * Properly set the AddressSize - we use the DWARFContext in a different way than LLVM expects, apparently.
|
| With this the emscripten test suite passes with -gforce_dwarf without crashing.
|
| My overall impression so far from the YAML code is that it probably isn't a long-term solution for us. Perhaps it may end up being scaffolding, that is, we can replace it with our own code eventually that is based on it, and remove most of the LLVM code. Before deciding that we should get everything working first, and this seems like the quickest path there.
* Fix debug build (#2561) | Alon Zakai | 2020-01-06 | 1 | -46/+1
| This requires removing some more LLVM code, which was only pulled in for a debug build.
* DWARF debug line updating (#2545) | Alon Zakai | 2019-12-20 | 1 | -1/+1
| With this, we can update DWARF debug line info properly as we write a new binary.
|
| To do that we track binary locations as we write. Each instruction is mapped to the location it is written to. We must also adjust them as we move code around because of LEB optimization (we emit a function or a section with a 5-byte LEB placeholder, the maximal size; later we shrink it which is almost always possible).
|
| writeDWARFSections() now takes a second param, the new locations of instructions. It then maps debug line info from the original offsets in the binary to the new offsets in the binary being written.
|
| The core logic for updating the debug line section is in wasm-debug.cpp. It basically tracks state machine logic both to read the existing debug lines and to emit the new ones. I couldn't find a way to reuse LLVM code for this, but reading LLVM's code was very useful here.
|
| A final tricky thing we need to do is to update the DWARF section's internal size annotation. The LLVM YAML writing code doesn't do that for us. Luckily it's pretty easy, in fixEmittedSection we just update the first 4 bytes in place to have the section size, after we've emitted it and know the size.
|
| This ignores debug lines with a 0 in the line, col, or addr, see WebAssembly/debugging#9 (comment)
|
| This ignores debug line offsets into the middle of instructions, which LLVM sometimes emits for some reason, see WebAssembly/debugging#9 (comment) Handling that would likely at least double our memory usage, which is unfortunate - we are run in an LTO manner, where the entire app's DWARF is present, and it may be massive. I think we should see if such odd offsets are a bug in LLVM, and if we can fix or prevent that.
|
| This does not emit "special" opcodes for debug lines. Those are purely an optimization, which I wanted to leave for later. (Even without them we decrease the size quite a lot, btw, as many lines have 0s in them...)
|
| This adds some testing that shows we can load and save fib2.c and fannkuch.cpp properly. The latter includes more than one function and has nontrivial code.
|
| To actually emit correct offsets a few minor fixes are done here:
|
| * Fix the code section location tracking during reading - the correct offset we care about is the body of the code section, not including the section declaration and size.
| * Fix wasm-stack debug line emitting. We need to update in BinaryInstWriter::visit(), that is, right before writing bytes for the instruction. That differs from BinaryenIRWriter::visit which is a recursive function that also calls the children - so the offset there would be of the first child. For some reason that is correct with source maps, I don't understand why, but it's wrong for DWARF...
| * Print code section offsets in hex, to match other tools.
|
| Remove DWARFUpdate pass, which was useful for testing temporarily, but doesn't make sense now (it just updates without writing a binary).
|
| cc @yurydelendik
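The 5-byte LEB placeholder and the later shrink mentioned in the message can be sketched as below. The encoder pads an unsigned LEB128 value to a fixed width with continuation bytes. The shrink step swaps the placeholder for the minimal encoding and reports how many bytes disappeared, which is the amount by which tracked instruction locations after that point must move. All names are hypothetical, and this is not Binaryen's writer code.

```python
def encode_uleb128(value, pad_to=0):
    """Encode an unsigned LEB128 value, optionally padded to `pad_to` bytes."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value or len(out) + 1 < pad_to:
            out.append(byte | 0x80)  # more bytes follow (data or padding)
        else:
            out.append(byte)
            return bytes(out)


def shrink_size_field(buf: bytearray, field_pos: int, body_size: int) -> int:
    """Replace a 5-byte placeholder with the minimal LEB; return bytes removed."""
    minimal = encode_uleb128(body_size)
    buf[field_pos:field_pos + 5] = minimal
    return 5 - len(minimal)


def adjust_locations(locations, field_pos, removed):
    """Shift every tracked offset that lies after the shrunk field."""
    return {k: (v - removed if v > field_pos else v) for k, v in locations.items()}
```

For example, a 3-byte body written after a placeholder at offset 0 loses 4 bytes of size field, so an instruction tracked at offset 6 ends up at offset 2.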
* DWARF parsing and writing support using LLVM (#2520) | Alon Zakai | 2019-12-19 | 287 | -0/+90576
| This imports LLVM code for DWARF handling. That code has the Apache 2 license like us. It's also the same code used to emit DWARF in the common toolchain, so it seems like a safe choice.
|
| This adds two passes: --dwarfdump which runs the same code LLVM runs for llvm-dwarfdump. This shows we can parse it ok, and will be useful for debugging. And --dwarfupdate writes out the DWARF sections (unchanged from what we read, so it just roundtrips - for updating we need #2515).
|
| This puts LLVM in third_party which is added here. All the LLVM code is behind USE_LLVM_DWARF, which is on by default, but off in JS for now, as it increases code size by 20%.
|
| This current approach imports the LLVM files directly. This is not how they are intended to be used, so it required a bunch of local changes - more than I expected actually, for the platform-specific stuff. For now this seems to work, so it may be good enough, but in the long term we may want to switch to linking against libllvm. A downside to doing that is that binaryen users would need to have an LLVM build, and even in the waterfall builds we'd have a problem - while we ship LLVM there anyhow, we constantly update it, which means that binaryen would need to be on latest llvm all the time too (which otherwise, given DWARF is quite stable, we might not need to constantly update).
|
| An even larger issue is that as I did this work I learned about how DWARF works in LLVM, and while the reading code is easy to reuse, the writing code is trickier. The main code path is heavily integrated with the MC layer, which we don't have - we might want to create a "fake MC layer" for that, but it sounds hard. Instead, there is the YAML path which is used mostly for testing, and which can convert DWARF to and from YAML and from binary. Using the non-YAML parts there, we can convert binary DWARF to the YAML layer's nice Info data, then convert that to binary. This works, however, this is not the path LLVM uses normally, and it supports only some basic DWARF sections - I had to add ranges support, in fact. So if we need more complex things, we may end up needing to use the MC layer approach, or consider some other DWARF library. However, hopefully that should not affect the core binaryen code which just calls a library for DWARF stuff.
|
| Helps #2400