author | Alon Zakai <alonzakai@gmail.com> | 2015-11-12 12:45:34 -0800 |
---|---|---|
committer | Alon Zakai <alonzakai@gmail.com> | 2015-11-12 12:45:34 -0800 |
commit | 3a2768127856e7113317e5d907ead6cc41f60299 (patch) | |
tree | 6aa85f7676dc23c7086c729a9b1049c7a9187a9a | |
parent | 7725a4b87feead9416e419b3c95228c4515813da (diff) | |
parent | afce1de1495cc5782ec55b016d4e864b316d9920 (diff) | |
Merge branch 'binaryen'
-rw-r--r-- | .gitignore | 2 |
-rw-r--r-- | README.md | 79 |
-rwxr-xr-x | build.sh | 8 |
-rwxr-xr-x | check.py | 16 |
-rw-r--r-- | src/asm2wasm.h | 2 |
-rw-r--r-- | src/binaryen-shell.cpp (renamed from src/wasm-shell.cpp) | 45 |
-rw-r--r-- | src/pretty_printing.h | 2 |
-rw-r--r-- | src/wasm.h | 23 |
-rw-r--r-- | test/example/find_div0s.cpp | 56 |
-rw-r--r-- | test/example/find_div0s.txt | 10 |
10 files changed, 188 insertions, 55 deletions
diff --git a/.gitignore b/.gitignore
index 9ecac78d7..30ae4229f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,6 +1,6 @@
+bin/binaryen-shell
 bin/asm2wasm
 bin/wasm.js
-bin/wasm-shell
 *~
 *.diff
 a.*
@@ -1,28 +1,49 @@
-# wasm-emscripten
+# Binaryen
 
-This repository contains tools to compile C/C++ to WebAssembly s-expressions, using [Emscripten](http://emscripten.org/), by translating Emscripten's asm.js output into WebAssembly, as well as a WebAssembly interpreter that can run the translated code.
+Binaryen is a C++ library for WebAssembly. It can:
 
-More specifically, this repository contains:
+ * **Interpret** WebAssembly. It passes 100% of the spec test suite.
+ * **Compile** asm.js to WebAssembly, which together with [Emscripten](http://emscripten.org), gives you a complete compiler toolchain from C and C++ to WebAssembly (Emscripten compiles C and C++ to asm.js, Binaryen compile that to WebAssembly).
+ * **Polyfill** WebAssembly, by running it in the interpreter compiled to JavaScript, if the browser does not yet have native support.
 
- * **asm2wasm**: An asm.js-to-WebAssembly compiler, built on Emscripten's asm optimizer infrastructure. That can directly compile asm.js to WebAssembly. You can use Emscripten to build C++ into asm.js, and together the two tools let you compile C/C++ to WebAssembly.
- * **wasm.js**: A polyfill for WebAssembly support in browsers. It receives an asm.js module, parses it using `asm2wasm`, and runs the resulting WebAssembly in a WebAssembly interpreter. It provides what looks like an asm.js module, while running WebAssembly inside.
- * **wasm-shell**: A WebAssembly interpreter that can parse S-Expression format and run the spec tests.
+To provide those capabilities, Binaryen has a simple and flexible API for **representing and processing** WebAssembly modules. The interpreter, validator, pretty-printer, etc. are built on that foundation. The core of this is in [wasm.h](https://github.com/WebAssembly/binaryen/blob/master/src/wasm.h), which contains classes that define a WebAssembly module, and tools to process those. For a simple example of how to use Binaryen, see [test/example/find_div0s.cpp](https://github.com/WebAssembly/binaryen/blob/master/test/example/find_div0s.cpp), which creates a module and then searches it for a specific pattern.
 
-## Building asm2wasm
+## Tools
+
+This repository contains code that builds the following tools in `bin/`:
+
+ * **binaryen-shell**: A shell that can load and interpret WebAssembly code in S-Expression format, and can run the spec test suite.
+ * **asm2wasm**: An asm.js-to-WebAssembly compiler, built on Emscripten's asm optimizer infrastructure. That can directly compile asm.js to WebAssembly.
+ * **wasm.js**: A polyfill for WebAssembly support in browsers. It receives an asm.js module, parses it using an internal build of `asm2wasm`, and runs the resulting WebAssembly in a WebAssembly interpreter. It provides what looks like an asm.js module, while running WebAssembly inside.
+
+Usage instructions for each are below.
+
+## Building
 
 ```
 $ ./build.sh
 ```
 
-* `asm2wasm` and `wasm-shell` require a C++11 compiler. If you also want to compile C/C++ to asm.js and then to WebAssembly (and not just asm.js to WebAssembly), you'll need Emscripten (the [stable SDK (or normal manual install)](http://kripken.github.io/emscripten-site/docs/getting_started/downloads.html) is fine).
-* `wasm.js` requires Emscripten, using the `incoming` branch in the `emscripten`, `emscripten-fastcomp` and `emscripten-fastcomp-clang` repos (the stable SDK, which is enough for `asm2wasm`, is not enough for `wasm.js`).
-* Older versions of gcc hit [this bug](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51048). You probably need gcc 5.0 or later, or a recent version of clang. If you have Emscripten, you can just use the native clang++ from that, it is recent enough.
+* `binaryen-shell` and `asm2wasm` require a C++11 compiler.
+* If you also want to compile C/C++ to WebAssembly (and not just asm.js to WebAssembly), you'll need Emscripten. You'll need the `incoming` branch there (which you can get via [the SDK](http://kripken.github.io/emscripten-site/docs/getting_started/downloads.html)).
+* `wasm.js` also requires Emscripten.
 
 ## Running
 
+### binaryen-shell
+
+Run
+
+````
+bin/binaryen-shell [.wast file] [--print-before]
+````
+
+ * `--print-before` will print the module before running it.
+ * Setting `BINARYEN_DEBUG=1` in the env will emit a lot of debugging info.
+
 ### asm2wasm
 
-Just run
+run
 
 ```
 bin/asm2wasm [input.asm.js file]
@@ -59,7 +80,7 @@ Set `ASM2WASM_DEBUG=1` in the env to see debug info, about asm.js functions as t
 Run
 
 ```
-./emcc_to_wasm.js.sh [filename.c ; whatever other emcc flags you want]
+./emcc_to_wasm.js.sh [.c or .cpp file] [whatever other emcc flags you want]
 ```
 
 That will call `emcc` and then emit `a.normal.js`, a normal asm.js build for comparison purposes, and `a.wasm.js`, which contains the entire polyfill (`asm2wasm` translator + `wasm.js` interpreter).
@@ -74,28 +95,18 @@ emcc src.cpp -o a.html --separate-asm
 
 That will emit `a.html`, `a.js`, and `a.asm.js`. That last file is the asm.js module, which you can pass into `asm2wasm`.
 
-For basic tests, that command should work, but in general you need a few more arguments to emcc, see emcc's usage in `emcc_to_wasm.js.sh`, specifically
+For basic tests, that command should work, but in general you need a few more arguments to emcc, see emcc's usage in `emcc_to_wasm.js.sh`, specifically:
 
  * `ALIASING_FUNCTION_POINTERS=0` because WebAssembly does not allow aliased function pointers (there is a single table).
  * `GLOBAL_BASE=1000` because WebAssembly lacks global variables, so `asm2wasm` maps them onto addresses in memory. This requires that you have some reserved space for those variables. With that argument, we reserve the area up to `1000`.
 
-### wasm-shell
-
-Run
-
-````
-bin/wasm-shell file.wast [print AST before running]
-````
-
- * Setting `WASM_SHELL_DEBUG=1` in the env will emit a lot of debugging info.
-
 ## Testing
 
 ```
 ./check.py
 ```
 
-(or `python check.py`) will run `asm2wasm` and `wasm.js` on the testcases in `test/`, and verify their outputs.
+(or `python check.py`) will run `binaryen-shell`, `asm2wasm`, and `wasm.js` on the testcases in `test/`, and verify their outputs.
 
 The `check.py` script supports some options:
 
@@ -107,20 +118,26 @@ The `check.py` script supports some options:
  * If tests are provided, we run exactly those. If none are provided, we run them all.
  * `asm2wasm` tests require no dependencies. `wasm.js` tests require `emcc` and `nodejs` in the path.
 
-## FAQ
-
- * How does this relate to the new WebAssembly backend which is being developed in upstream LLVM?
-   * This is separate from that. This project focuses on compiling asm.js to WebAssembly, as emitted by Emscripten's asm.js backend. This is useful because while in the long term Emscripten hopes to use the new WebAssembly backend, the `asm2wasm` route is a very quick and easy way to generate WebAssembly output. It will also be useful for benchmarking the new backend as it progresses.
-
 ## License & Contributing
 
 Same as Emscripten: MIT license.
 
-(parts of `src/` are synced with `tools/optimizer/` in the main emscripten repo, for convenience)
+(`src/emscripten-optimizer` is synced with `tools/optimizer/` in the main emscripten repo, for convenience)
 
 ## TODO
 
- * Waiting for switch to stablize on the spec repo; switches are Nop'ed.
  * Reference interpreter lacks module importing support; imports are Nop'ed in native builds, but enabled in emcc builds (so wasm.js works).
  * Memory section needs the right size.
 
+## FAQ
+
+* How does `asm2wasm` relate to the new WebAssembly backend which is being developed in upstream LLVM?
+
+This is separate from that. `asm2wasm` focuses on compiling asm.js to WebAssembly, as emitted by Emscripten's asm.js backend. This is useful because while in the long term Emscripten hopes to use the new WebAssembly backend, the `asm2wasm` route is a very quick and easy way to generate WebAssembly output. It will also be useful for benchmarking the new backend as it progresses.
+
+* Why the weird name for the project?
+
+"Binaryen" is a combination of **binary** - since WebAssembly is a binary format for the web - and **Emscripten** - with which it can integrate in order to compile C and C++ all the way to WebAssembly, via asm.js. Binaryen began as Emscripten's WebAssembly processing library (`wasm-emscripten`).
+
+"Binaryen" is pronounced [in the same manner](http://www.makinggameofthrones.com/production-diary/2011/2/11/official-pronunciation-guide-for-game-of-thrones.html) as "[Targaryen](https://en.wikipedia.org/wiki/List_of_A_Song_of_Ice_and_Fire_characters#House_Targaryen)": *bi-NAIR-ee-in*. Valar Morcodeis.
+
@@ -1,8 +1,8 @@
+echo "building binaryen shell"
+g++ -O2 -std=c++11 src/binaryen-shell.cpp -g -o bin/binaryen-shell
 echo "building asm2wasm"
-g++ -O2 -std=c++11 src/asm2wasm-main.cpp src/emscripten-optimizer/parser.cpp src/emscripten-optimizer/simple_ast.cpp src/emscripten-optimizer/optimizer-shared.cpp -g -o bin/asm2wasm -Isrc/emscripten-optimizer
+g++ -O2 -std=c++11 src/asm2wasm-main.cpp src/emscripten-optimizer/parser.cpp src/emscripten-optimizer/simple_ast.cpp src/emscripten-optimizer/optimizer-shared.cpp -g -o bin/asm2wasm
 echo "building interpreter/js"
-em++ -std=c++11 src/wasm-js.cpp src/emscripten-optimizer/parser.cpp src/emscripten-optimizer/simple_ast.cpp src/emscripten-optimizer/optimizer-shared.cpp -o bin/wasm.js -s MODULARIZE=1 -s 'EXPORT_NAME="WasmJS"' --memory-init-file 0 -s DEMANGLE_SUPPORT=1 -O3 -profiling -s TOTAL_MEMORY=67108864 -s SAFE_HEAP=1 -s ASSERTIONS=1 -Isrc/emscripten-optimizer #-DWASM_JS_DEBUG #-DWASM_INTERPRETER_DEBUG
+em++ -std=c++11 src/wasm-js.cpp src/emscripten-optimizer/parser.cpp src/emscripten-optimizer/simple_ast.cpp src/emscripten-optimizer/optimizer-shared.cpp -o bin/wasm.js -s MODULARIZE=1 -s 'EXPORT_NAME="WasmJS"' --memory-init-file 0 -s DEMANGLE_SUPPORT=1 -O3 -profiling -s TOTAL_MEMORY=67108864 -s SAFE_HEAP=1 -s ASSERTIONS=1 #-DWASM_JS_DEBUG #-DWASM_INTERPRETER_DEBUG
 cat src/js/post.js >> bin/wasm.js
-echo "building wasm shell"
-g++ -O2 -std=c++11 src/wasm-shell.cpp -g -o bin/wasm-shell -Isrc/emscripten-optimizer
@@ -64,20 +64,28 @@ for asm in tests:
         raise Exception('wasm interpreter error: ' + err) # failed to pretty-print
       raise Exception('wasm interpreter error')
 
-print '\n[ checking wasm-shell testcases... ]\n'
+print '\n[ checking binaryen-shell testcases... ]\n'
 
 for t in tests:
   if t.endswith('.wast') and not t.startswith('spec'):
     print '..', t
     t = os.path.join('test', t)
-    actual, err = subprocess.Popen([os.path.join('bin', 'wasm-shell'), t, 'print-wasm'], stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()
+    actual, err = subprocess.Popen([os.path.join('bin', 'binaryen-shell'), t, '--print-before'], stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()
     assert err == '', 'bad err:' + err
     expected = open(t).read()
     if actual != expected:
       fail(actual, expected)
 
-print '\n[ checking wasm-shell spec testcases... ]\n'
+print '\n[ checking example testcases... ]\n'
+
+subprocess.check_call(['g++', '-std=c++11', os.path.join('test', 'example', 'find_div0s.cpp'), '-Isrc', '-g'])
+actual = subprocess.Popen(['./a.out'], stdout=subprocess.PIPE).communicate()[0]
+expected = open(os.path.join('test', 'example', 'find_div0s.txt')).read()
+if actual != expected:
+  fail(actual, expected)
+
+print '\n[ checking binaryen-shell spec testcases... ]\n'
 
 if len(requested) == 0:
   BLACKLIST = []
@@ -89,7 +97,7 @@ for t in spec_tests:
   if t.startswith('spec') and t.endswith('.wast'):
     print '..', t
     wast = os.path.join('test', t)
-    proc = subprocess.Popen([os.path.join('bin', 'wasm-shell'), wast], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+    proc = subprocess.Popen([os.path.join('bin', 'binaryen-shell'), wast], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
     actual, err = proc.communicate()
     assert proc.returncode == 0, err
 
diff --git a/src/asm2wasm.h b/src/asm2wasm.h
index 61c67c5d8..ce5b669d1 100644
--- a/src/asm2wasm.h
+++ b/src/asm2wasm.h
@@ -5,7 +5,7 @@
 //
 
 #include "wasm.h"
-#include "optimizer.h"
+#include "emscripten-optimizer/optimizer.h"
 #include "mixed_arena.h"
 
 namespace wasm {
diff --git a/src/wasm-shell.cpp b/src/binaryen-shell.cpp
index 701b6b6c2..e0f524122 100644
--- a/src/wasm-shell.cpp
+++ b/src/binaryen-shell.cpp
@@ -144,14 +144,43 @@ struct Invocation {
 //
 
 int main(int argc, char **argv) {
-  debug = getenv("WASM_SHELL_DEBUG") ? getenv("WASM_SHELL_DEBUG")[0] - '0' : 0;
+  debug = getenv("BINARYEN_DEBUG") ? getenv("BINARYEN_DEBUG")[0] - '0' : 0;
 
-  char *infile = argv[1];
-  bool print_wasm = argc >= 3; // second arg means print it out
+  char *infile = nullptr;
+  bool print_before = false;
+
+  for (size_t i = 1; i < argc; i++) {
+    char* curr = argv[i];
+    if (curr[0] == '-') {
+      std::string arg = curr;
+      if (arg == "--print-before") {
+        print_before = true;
+      } else {
+        if (infile) {
+          printf("error: unrecognized argument: %s\n", curr);
+          exit(1);
+        }
+      }
+    } else {
+      if (infile) {
+        printf("error: too many input files provided.\n");
+        exit(1);
+      }
+      infile = curr;
+    }
+  }
+
+  if (!infile) {
+    printf("error: no input file provided.\n");
+    exit(1);
+  }
 
   if (debug) std::cerr << "loading '" << infile << "'...\n";
   FILE *f = fopen(argv[1], "r");
-  assert(f);
+  if (!f) {
+    printf("error: could not open input file: %s\n", infile);
+    exit(1);
+  }
   fseek(f, 0, SEEK_END);
   int size = ftell(f);
   char *input = new char[size+1];
@@ -170,6 +199,7 @@ int main(int argc, char **argv) {
   if (debug) std::cout << root << '\n';
 
   // A .wast may have multiple modules, with some asserts after them
+  bool checked = false;
   size_t i = 0;
   while (i < root.size()) {
     if (debug) std::cerr << "parsing s-expressions to wasm...\n";
@@ -180,7 +210,7 @@ int main(int argc, char **argv) {
     auto interface = new ShellExternalInterface();
     auto instance = new ModuleInstance(wasm, interface);
 
-    if (print_wasm) {
+    if (print_before) {
       if (debug) std::cerr << "printing...\n";
       std::cout << wasm;
     }
@@ -190,6 +220,7 @@ int main(int argc, char **argv) {
       Element& curr = *root[i];
       IString id = curr[0]->str();
       if (id == MODULE) break;
+      checked = true;
       Colors::red(std::cerr);
       std::cerr << i << '/' << (root.size()-1);
       Colors::green(std::cerr);
@@ -240,10 +271,10 @@ int main(int argc, char **argv) {
     }
   }
 
-  if (debug) {
+  if (checked) {
     Colors::green(std::cerr);
     Colors::bold(std::cerr);
-    std::cerr << "\ndone.\n";
+    std::cerr << "all checks passed.\n";
     Colors::normal(std::cerr);
   }
 }
diff --git a/src/pretty_printing.h b/src/pretty_printing.h
index cc2d88272..88028fdec 100644
--- a/src/pretty_printing.h
+++ b/src/pretty_printing.h
@@ -5,7 +5,7 @@
 
 #include <ostream>
 
-#include "colors.h"
+#include "emscripten-optimizer/colors.h"
 
 std::ostream &doIndent(std::ostream &o, unsigned indent) {
   for (unsigned i = 0; i < indent; i++) {
diff --git a/src/wasm.h b/src/wasm.h
index fcee2915c..6fc4ca5aa 100644
--- a/src/wasm.h
+++ b/src/wasm.h
@@ -24,7 +24,7 @@
 #include <map>
 #include <vector>
 
-#include "simple_ast.h"
+#include "emscripten-optimizer/simple_ast.h"
 #include "pretty_printing.h"
 
 namespace wasm {
@@ -263,12 +263,10 @@ public:
   };
   Id _id;
 
-  Expression() : _id(InvalidId) {}
-  Expression(Id id) : _id(id) {}
-
   WasmType type; // the type of the expression: its *output*, not necessarily its input(s)
 
-  std::ostream& print(std::ostream &o, unsigned indent); // avoid virtual here, for performance
+  Expression() : _id(InvalidId), type(none) {}
+  Expression(Id id) : _id(id), type(none) {}
 
   template<class T>
   bool is() {
@@ -280,6 +278,12 @@ public:
     return _id == T()._id ? (T*)this : nullptr;
   }
 
+  std::ostream& print(std::ostream &o, unsigned indent); // avoid virtual here, for performance
+
+  friend std::ostream& operator<<(std::ostream &o, Expression* expression) {
+    return expression->print(o, 0);
+  }
+
   static std::ostream& printFullLine(std::ostream &o, unsigned indent, Expression *expression) {
     doIndent(o, indent);
     expression->print(o, indent);
@@ -704,6 +708,11 @@ public:
     printFullLine(o, indent, right);
     return decIndent(o, indent);
   }
+
+  // the type is always the type of the operands
+  void finalize() {
+    type = left->type;
+  }
 };
 
 class Compare : public Expression {
@@ -849,6 +858,8 @@ public:
   Name type; // if null, it is implicit in params and result
   Expression *body;
 
+  Function() : result(none) {}
+
   std::ostream& print(std::ostream &o, unsigned indent) {
     printOpening(o, "func ", true) << name;
     if (params.size() > 0) {
@@ -921,7 +932,7 @@ public:
   size_t initial, max;
   std::vector<Segment> segments;
 
-  Memory() : initial(0), max(-1) {}
+  Memory() : initial(0), max((uint32_t)-1) {}
 };
 
 class Module {
diff --git a/test/example/find_div0s.cpp b/test/example/find_div0s.cpp
new file mode 100644
index 000000000..60eed5f62
--- /dev/null
+++ b/test/example/find_div0s.cpp
@@ -0,0 +1,56 @@
+
+//
+// Tiny example, using Binaryen to walk a WebAssembly module in search
+// for direct integer divisions by zero. To do so, we inherit from
+// WasmWalker, and implement visitBinary, which is called on every
+// Binary node in the module's functions.
+//
+
+#include <ostream>
+#include <wasm.h>
+#include <wasm-s-parser.h>
+
+using namespace wasm;
+
+int main() {
+  // A simple WebAssembly module in S-Expression format.
+  char input[] =
+    "(module"
+    "  (func $has_div_zero"
+    "    (i32.div_s"
+    "      (i32.const 5)"
+    "      (i32.const 0)"
+    "    )"
+    "  )"
+    ")";
+
+  // Parse the S-Expression text, and prepare to build a WebAssembly module.
+  SExpressionParser parser(input);
+  Element& root = *parser.root;
+  Module module;
+
+  // The parsed code has just one element, the module. Build the module
+  // from that (and abort on any errors, but there won't be one here).
+  SExpressionWasmBuilder builder(module, *root[0], [&]() { abort(); });
+
+  // Print it out
+  std::cout << module;
+
+  // Search it for divisions by zero: Walk the module, looking for
+  // that operation.
+  struct DivZeroSeeker : public WasmWalker {
+    void visitBinary(Binary* curr) {
+      // In every Binary, look for integer divisions
+      if (curr->op == BinaryOp::DivS || curr->op == BinaryOp::DivU) {
+        // Check if the right operand is a constant, and if it is 0
+        auto right = curr->right->dyn_cast<Const>();
+        if (right && right->value.getInteger() == 0) {
+          std::cout << "We found that " << curr->left << " is divided by zero\n";
+        }
+      }
+    }
+  };
+  DivZeroSeeker seeker;
+  seeker.startWalk(&module);
+}
+
diff --git a/test/example/find_div0s.txt b/test/example/find_div0s.txt
new file mode 100644
index 000000000..554790493
--- /dev/null
+++ b/test/example/find_div0s.txt
@@ -0,0 +1,10 @@
+(module
+  (memory 0 4294967295)
+  (func $has_div_zero
+    (i32.div_s
+      (i32.const 5)
+      (i32.const 0)
+    )
+  )
+)
+We found that (i32.const 5) is divided by zero
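
A note on the `src/binaryen-shell.cpp` hunk: the new loop selects `infile`, but the file is still opened with `fopen(argv[1], "r")`, so an invocation such as `bin/binaryen-shell --print-before file.wast` would appear to open the flag rather than the file. An unrecognized flag is also only reported when an input file has already been seen. The following is a minimal sketch of one way the parsing could be tightened. It is not the commit's code: `ShellOptions` and `parseShellArgs` are hypothetical names introduced here for illustration, and rejecting every unknown flag is an assumption about the intended behavior.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type (not part of Binaryen): parsed options or an error.
struct ShellOptions {
  std::string infile;        // path of the .wast file; empty if none was given
  bool printBefore = false;  // true if --print-before was passed
  std::string error;         // non-empty if the arguments were invalid
};

// Hypothetical helper mirroring the loop in the hunk, with two changes:
// every unrecognized flag is an error, and the caller is expected to open
// `infile` instead of argv[1].
ShellOptions parseShellArgs(int argc, const char* const* argv) {
  assert(argv != nullptr);
  ShellOptions opts;
  for (int i = 1; i < argc; i++) {
    std::string arg = argv[i];
    if (!arg.empty() && arg[0] == '-') {
      if (arg == "--print-before") {
        opts.printBefore = true;
      } else {
        opts.error = "unrecognized argument: " + arg;
        return opts;
      }
    } else {
      if (!opts.infile.empty()) {
        opts.error = "too many input files provided.";
        return opts;
      }
      opts.infile = arg;
    }
  }
  if (opts.infile.empty()) {
    opts.error = "no input file provided.";
  }
  return opts;
}
```

A caller would check `error` first, print it and exit on failure, and otherwise pass `opts.infile.c_str()` to `fopen`, which keeps the flag order on the command line from mattering.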