Fuzzing Julia functions (for security hardening)

This came up when I was asking whether Meta.parse is safe to use. While the function itself is supposed to be safe in principle, it has not been hardened (audited for security bugs). Fuzzing seems like one of the standard way to start doing that, so I was hopeful there might be people interested in setting up the appropriate fuzzer for this goal. Here I have a naive implementation of a fuzzer target, and comments from more competent people would be very much appreciated.

This is the fuzzer target I am currently running:

#include <julia.h>

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    jl_init();

    jl_module_t* meta = (jl_module_t*)jl_eval_string("Meta");
    jl_function_t *jparse = jl_get_function(meta, "parse");
    jl_value_t *getstr = jl_eval_string("getstr(ptr, len)  = transcode(String, unsafe_wrap(Array, Ptr{UInt8}(ptr), len))");
    jl_value_t *str = jl_call2(getstr, jl_box_voidpointer((void *)Data), jl_box_uint64(Size));
    jl_value_t *parsed = jl_call1(jparse, str);

    jl_atexit_hook(0);
    return 0;
}

It takes some random-ish Data of length Size, it uses Julia’s transcode and unsafe_wrap to turn it into a Julia String and then calls Meta.parse on it. Am I missing something?

Any suggestions for anything else that should be high priority for fuzzing?

To run libFuzzer on this, first compile the target:

clang -g -O1 -fsanitize=fuzzer -fPIC -I$JULIA_DIR/include/julia -L$JULIA_DIR/lib -Wl,-rpath,$JULIA_DIR/lib fuzz_target.cc -ljulia -o fuzz_target

create an initial fuzzing corpus:

mkdir corpus; cd corpus
cp MY_FAVORITE_JULIA_LIBRARY.jl .
split -l 1 MY_FAVORITE_JULIA_LIBRARY.jl
cd ..

and run the fuzzer for a few days(?)

./fuzz_target CORPUS/ -jobs=10

All of this comes with the caveat that I read the fuzzer documentation and the Julia embedding documentation barely an hour ago and might have terribly misunderstood how things work.

4 Likes

This increased the speed of fuzzing by a couple of orders of magnitude (the getstr function is not compiled from scratch each time):

#include <julia.h>

jl_value_t * DoInitialization() {
    jl_init();
    jl_module_t* meta = (jl_module_t*)jl_eval_string("Meta");
    jl_function_t *jparse = jl_get_function(meta, "parse");
    jl_value_t *getstr = jl_eval_string("getstr(ptr, len)  = transcode(String, unsafe_wrap(Array, Ptr{UInt8}(ptr), len))");
    return getstr;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    static jl_value_t *getstr = DoInitialization();

    jl_module_t* meta = (jl_module_t*)jl_eval_string("Meta");
    jl_function_t *jparse = jl_get_function(meta, "parse");
    //jl_value_t *getstr = jl_eval_string("getstr(ptr, len)  = transcode(String, unsafe_wrap(Array, Ptr{UInt8}(ptr), len))");
    jl_value_t *str = jl_call2(getstr, jl_box_voidpointer((void *)Data), jl_box_uint64(Size));
    jl_value_t *parsed = jl_call1(jparse, str);

    //jl_atexit_hook(0);
    return 0;
}

1 Like

So, I don’t want to curb your enthusiasm too much, but let me give you a very short outline of fuzzing theory after AFL (american fuzzy lop) revolutionized the field ca 2014. As a disclaimer, I’m out of the industry for some years, so I’m not up to current state of the art, nor do I know a lot about libfuzzer.

The general problem that fuzzers try to solve is: Given a program P, explore what the program can do when fed with different inputs. Take the dual viewpoint: Your input is the program running on the Weird machine - Wikipedia defined by P, and you’re trying to explore what you can do with this machine.

It is clear that just throwing random (independently distributed) bytes at your machine won’t explore a lot of interesting behavior: The vast majority leads to uninteresting parse-errors. The way modern fuzzers work is that you don’t just consider the visible output behavior; instead you instrument the code, and consider internal states of the program. Precisely what internal states you take into account is a little bit of an art (that the fuzzer authors engage in); code coverage is almost always part of it. Given “instrumentation”, it is clear why llvm – a compiler library! – is in a good position to help with this.

Let’s take a very simple example:

check(input::Vector{UInt8}) = input[1]==0x68 && input[2] ==0x65 && input[3]==0x6c && input[4]==0x6c && input[5]==0x6c

Let N=5 be the number of comparisons. If you want to discover an input where check returns true, you would need 2^(8*N) many random attempts. This is clearly infeasible.

Now you take into account internal program state, namely code coverage. Imagine the following (super naive!) fuzzer: You keep a library of (program output, set of executed instructions in the program) → Set of samples that produced the pair.

In order to keep memory consumption acceptable, you use e.g. reservoir sampling on the sample sets.

At each step, you choose a uniformly random (output, coverage) pair, and then a random sample that produced it. Then, you mutate a single random byte in this input, and record the result.

How long until you find an input that passes the check, when searching length-L inputs? Suppose you already passed a length N check. Then for the next one, you need N * L * 256 many steps (256 because your mutation needs to get the correct character, L because you need to select the right character for mutation, and N because you need to select a sample that passes N checks). Sum this up over N, and you need subexponential time to discover the magic sample that passes the checks.

It is hard to overstate what a big deal this is: Subexponential! AFL famously managed to invent valid jpeg images when fuzzing a jpeg parser in less than a day, without any initial corpus cf lcamtuf's blog: Pulling JPEGs out of thin air

You would never observe monkeys on a typewriter creating valid jpeg or SQL or julia. But if the fuzzing target is not a black box, if it is instrumented, then you don’t need proper understanding of your target program.


So, now you’re fuzzing the julia parser. Super cool. However, the julia parser is written in femtolisp; and the femtolisp interpreter is written in C. The problem is that the crucial program trace data doesn’t know about abstractions like “lisp”. Libfuzzer and AFL are made for fuzzing C programs (or more generally: For compiled programs, not for interpreted programs).

And without good program traces, fancy fuzzers are no better than monkeys hammering on a typewriter.

What I’m saying here is: Look at the test cases that the fuzzer stores. If these look like interesting julia, then whatever traces come out are probably good enough to guide the exploration of state space. Otherwise, you may need to add extra instrumentation to the lisp code (to the parser code written in femtolisp – libfuzzer can probably instrument the femtolisp interpreter, but that is primarily useful for fuzzing the femtolisp parser/interpreter, and that’s not your goal).

Simpler said: Unless you do something with the femtolisp interpreter, the fuzzer will use coverage info of the “femtolisp interpreter”, instead of the interpreted program. You care about the interpreted program, so you probably need a way to inform your fuzzer about which parts of the interpreted program (i.e. the julia parser) are executed.

13 Likes

Thank you, this was extremely informative and well written!

The verdict is that libFuzzer definitely does not work here, at least in the way I am using it. I checked by using the merge command on a corpus made out of the complete code of the Julia standard library (chopped in many different ways, not just as complete files). The “reduced” corpus that libFuzzer created based on the available traces was a single one-line file with a boring variable definition statement.

Is there a trivial trick where I just compile Julia with “debug symbols” (I am cargo-culting here, I barely know what debug symbols means) that takes care of the instrumentation automagically?

1 Like