From b0e999a2b8841d8be21cbcdc84cbc1d6469e36d7 Mon Sep 17 00:00:00 2001
From: Alon Zakai <azakai@google.com>
Date: Tue, 19 Nov 2024 09:28:01 -0800
Subject: Fuzzing: ClusterFuzz integration (#7079)

The main addition here is a bundle_clusterfuzz.py script which will package up
the exact files that should be uploaded to ClusterFuzz. It also documents the
process and bundling and testing. You can do

bundle.py OUTPUT_FILE.tgz

That bundles wasm-opt from ./bin., which is enough for local testing. For
actually uploading to ClusterFuzz, we need a portable build, and @dschuff
had the idea to reuse the emsdk build, which works nicely. Doing

bundle.py OUTPUT_FILE.tgz --build-dir=/path/to/emsdk/upstream/

will bundle wasm-opt (+libs) from the emsdk. I verified that those builds
work on ClusterFuzz.

I added several forms of testing here. First, our main fuzzer fuzz_opt.py now
has a ClusterFuzz testcase handler, which simulates a ClusterFuzz environment.
Second, there are smoke tests that run in the unit test suite, and can also be
run separately:

python -m unittest test/unit/test_cluster_fuzz.py

Those unit tests can also run on a given bundle, e.g. one created from an
emsdk build, for testing right before upload:

BINARYEN_CLUSTER_FUZZ_BUNDLE=/path/to/bundle.tgz python -m unittest test/unit/test_cluster_fuzz.py

A third piece of testing is to add a --fuzz-passes test. That is a mode for
-ttf (translate random data into a valid wasm fuzz testcase) that uses random
data to pick and run a set of passes, to further shape the wasm. (--fuzz-passes
had no previous testing, and this PR fixes it and tidies it up a little, adding some
newer passes too).

Otherwise this PR includes the key run.py script that is bundled and then
executed by ClusterFuzz, basically a python script that runs wasm-opt -ttf [..]
to generate testcases, sets up their JS, and emits them.

fuzz_shell.js, which is the JS to execute testcases, will now check if it is
provided binary data of a wasm file. If so, it does not read a wasm file from
argv[1]. (This is needed because ClusterFuzz expects a single file for the
testcase, so we make a JS file with bundled wasm inside it.)
---
 scripts/fuzz_shell.js | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

(limited to 'scripts/fuzz_shell.js')

diff --git a/scripts/fuzz_shell.js b/scripts/fuzz_shell.js
index d9a994896..ce817646e 100644
--- a/scripts/fuzz_shell.js
+++ b/scripts/fuzz_shell.js
@@ -25,14 +25,18 @@ if (typeof process === 'object' && typeof require === 'function') {
   };
 }
 
-// We are given the binary to run as a parameter.
-var binary = readBinary(argv[0]);
+// The binary to be run. This may be set already (by code that runs before this
+// script), and if not, we get the filename from argv.
+var binary;
+if (!binary) {
+  binary = readBinary(argv[0]);
+}
 
 // Normally we call all the exports of the given wasm file. But, if we are
 // passed a final parameter in the form of "exports:X,Y,Z" then we call
 // specifically the exports X, Y, and Z.
 var exportsToCall;
-if (argv[argv.length - 1].startsWith('exports:')) {
+if (argv.length > 0 && argv[argv.length - 1].startsWith('exports:')) {
   exportsToCall = argv[argv.length - 1].substr('exports:'.length).split(',');
   argv.pop();
 }
-- 
cgit v1.2.3