Code review of the Jackalope fuzzer
Introduction
After the past month of mostly looking at CVEs, it’s time to go back to basics.
This post will be an analysis and code review of the fuzzing tool - Jackalope, designed by ProjectZero as a customizable and scalable blackbox fuzzer.
Let’s be honest, WinAFL and other AFL forks are quite nasty to read and modify.
Jackalope written in C++ offers a much cleaner codebase and greater extensibility.
Another reason is because WinAFL uses Dynamorio, which is akin to a heavy rocket launcher for program instrumentation.
For fuzzing that’s kind of an overkill, so Jackalope is shipped with ProjectZero’s own lightweight instrumentation library, TinyInst.
In this post we’ll go through the core components of a modern mutation based coverage guided fuzzer and how Jackalope implements them:
- Architecture and Extensibility
- Mutation Engine
- Delivery and Persistence
- Coverage and Detection
Architecture and Extensibility
In the main function, a Fuzzer
object is expected to be instantiated with its Run()
method invoked.
1 | Fuzzer* fuzzer; |
CustomFuzzer
is an arbitrary user defined child class of Fuzzer
.
1 | class CustomFuzzer : public Fuzzer { |
Fuzzer
exposes several virtual methods for the child to override, such as CreateMutator()
which allows custom mutator configuration.
As a generic fuzzer we can instantiate a probabilistic mutator that allows specifying the probability at which each specific mutator is triggered:
1 | PSelectMutator *pselect = new PSelectMutator(); |
The above PSelectMutator
is in fact a child class of Mutator
, which we can again override to enable custom mutation.
For example in order to achieve j00ru’s idea of having a dedicated mutation probability for each section of a file, we can easily override CreateMutator()
to create a PSelectMutator
for each section, and override Mutate()
to fragmentize each sample into the relevant sections before delegating the call to the respective PSelectMutator->Mutate()
.
Similar principal is applied to other components like Sample Delivery and Instrumentation(yes most instrumentation methods are virtual so you can extend to use custom instrumentation).
1 | class Instrumentation { |
Jackalope supports two modes: server/client mode for synchronisation across machines, as well as local multithreaded mode.
For the purpose of this article I’ll focus on the latter.
The main fuzzer thread starts n
number of worker threads:
1 | for (int i = 1; i <= num_threads; i++) { |
Each worker thread waits for a job, which can either be to process a sample or run a sample:
1 | void Fuzzer::RunFuzzerThread(ThreadContext *tc) { |
ProcessSample()
attempts to run the input corpus once to see if it behaves badly without mutation.
Doesn’t seem too useful.
FuzzJob()
tries to mutate each testcase fully and run it through the target.
1 | while (1) { |
It will also update the stats table that’s be displayed by the main thread.
Details regarding the run is covered below under Delivery and Persistence.
Mutation Engine
Jackalope comes with a probabilistic mutator and several common mutators like byte flip, block duplicate, append and splice.
Interestingly it doesn’t come with a bit flip mutator, maybe that’s less worthy to deploy empirically.
PSelectMutator
has an interesting algorithm:
1 | double p = prng->RandReal() * psum; |
It converts each user specified weight into an amalgamated absolute percentage, then generates a random percentage to compare with.
Apart from this there’s nothing much to talk about.
Mutation is meant to be customized anyways.
Delivery and Persistence
The functionality required to run the target process with a testcase is fully implemented by TinyInst, hence the fuzzer just needs to call the API:
Fuzzer::FuzzJob()
1 | RunResult result = RunSample(tc, &mutated_sample, &has_new_coverage, true, true, init_timeout, timeout, entry->sample); |
Fuzzer::RunSample()
1 | RunResult result = RunSampleAndGetCoverage(tc, sample, &initialCoverage, init_timeout, timeout); |
Fuzzer::RunSampleAndGetCoverage()
1 | RunResult result = tc->instrumentation->Run(tc->target_argc, tc->target_argv, init_timeout, timeout); |
The above functions are just delegating the call to the lowest layer Run()
method implemented by each instrumentation engine.
TinyInstInstrumentation::Run()
1 | RunResult TinyInstInstrumentation::Run(int argc, char **argv, uint32_t init_timeout, uint32_t timeout) { |
This is a manager to call into the actual Run() function residing in TinyInst.
In persistent mode, it first checks if the process needs to be restarted because the maximum number of iterations has been hit.
Otherwise, it tries to continue the process if it is already running, or start it if it isn’t.
This works because TinyInst adds breakpoints to the target function in persistent mode.
1 | // called when a module gets loaded |
Upon hitting the target function, it saves the arguments and overwrites the return address to generate an exception when the function ends.
1 | // called when the target method is reached |
This exception will be caught, notifying TinyInst that the target function has ended.
1 | case EXCEPTION_ACCESS_VIOLATION: { |
In persistent(loop) mode, it restores the arguments and sets the instruction pointer back to the start of target function
1 | // called every time the target method returns |
This explains why the run manager in Jackalope can repeatedly attempt to Continue()
the target.
Coverage and Detection
Coverage data is also kindly provided by TinyInst APIs.
On initialization, TinyInst marks all code segments of the instrumented module as unexecutable and copies the code elsewhere in memory.
Therefore whenever code in the module is exercised, an exception is raised and TinyInst catches that.
TinyInst then tries to instrument all basic blocks reachable via direct jumps from the initially exercised code by modifying the copied code recursively, then redirecting control flow to the copied code.
This is to prevent further exceptions in the current code path, since exception handling is expensive.
Edge Coverage
Edge instrumentation is achieved by adding a mov
statement that sets an index in the coverage bytemap.
First a coverage code unique for each edge is generated:
1 | uint64_t LiteCov::GetEdgeCode(ModuleInfo *module, size_t edge_address1, |
The hash table buf_to_coverage is used to map the bytemap index to coverage code:
1 | data->buf_to_coverage[data->coverage_buffer_next] = coverage_code; |
Finally the assembly instruction is written to the copied code.
1 | void LiteCov::EmitCoverageInstrumentation(ModuleInfo *module, |
When the bytemap is retrieved, it’s decoded using the buf_to_coverage table, and the resulting coverage codes are written to disk as coverage.
On Jackalope’s end it simply calls the API to run/get coverage:
1 | RunResult result = tc->instrumentation->Run(tc->target_argc, tc->target_argv, init_timeout, timeout); |
If the result is a crash or hang, it’s logged immediately to disk.
Otherwise, a check is performed to see if new coverage was generated.
1 | CoverageDifference(tc->thread_coverage, initialCoverage, new_thread_coverage); |
Mutated samples that result in new coverage is saved to disk and added to a priority queue
, prioritizing samples that have most recently contributed to new coverage.
1 | std::priority_queue<SampleQueueEntry *, std::vector<SampleQueueEntry *>, CmpEntryPtrs> sample_queue; |
Conclusion
A mutation based coverage guided fuzzer is really straightforward.
All you need is an instrumentation framework, a mutator and a detector.
The instrumentation approach can either be static(compile time like afl-as or binary level like pe-afl) or dynamic(Dynamorio, PIN, TinyInst), and it will be the most challenging to implement.
The remaining is just a dumb fuzzer.
If you want to implement a fuzzer from scratch, start with a dumb one first!