WinAFL fuzzing in action
Introduction
Armed with some understanding of AFL and WinAFL’s theory, we can proceed to actually use it to fuzz some toy and production binaries.
Build
Building WinAFL is easy if you have Visual Studio. Just follow the instructions in the git repo.
First build DynamoRIO:
git clone https://github.com/DynamoRIO/dynamorio.git
Then:
git clone https://github.com/googleprojectzero/winafl.git
You should find the built afl-fuzz binary in <winafl git path>/build64/bin/Release
Note: It is recommended to comment out line 238 in winafl.c as shown below
if(options.debug_mode) {
This function kills the drrun process when it detects a crash, which (1) is redundant, since the process will already “crash”, and (2) masks Windows error messages, which makes debugging DynamoRIO a pain.
Test1
Just like in the Symbolic Execution series, let’s start off with a simple toy binary to familiarise ourselves with the arguments and intricacies of the tool.
stringhelper.c:
1 |
|
This is a super simple DLL that converts the spaces in a string to commas.
Compile with
cl /LD stringhelper.c
I initially intended to model CVE-2021-3156 for a subtle heap overflow, but AFL doesn’t do well with ASAN turned on (https://github.com/googleprojectzero/winafl/blob/master/afl_docs/notes_for_asan.txt). This required the scenario design to be slightly more contrived, which kind of defeats the purpose of a toy binary.
Now we write a small harness for it
1 |
|
Remember how we discussed shared memory delivery in the previous post? There’s no reason not to enable this feature, so I adapted https://github.com/googleprojectzero/Jackalope/blob/6d92931b2cf614699e2a023254d5ee7e20f6e34b/test.cpp for our use case.
We copy the sample data into a malloced buffer, then null-terminate it to conform with our test DLL’s requirements. Of course the data might contain nulls, but that won’t jeopardise execution flow or raise false positives.
Compile…
cl harness.c
WinAFL is kind of buggy with dynamic symbol resolution, so I didn’t build it with that feature (-DUSE_DRSYMS=1). target_offset can easily be resolved with a debugger or disassembler.
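As a rough illustration of what resolving target_offset amounts to: it is the address of the target function minus the base address of the module that contains it. A minimal Python sketch, assuming you have read both addresses off your debugger. The addresses below are made up for illustration; only the resulting 0x1090 matches the offset used later in this post.

```python
def target_offset(module_base: int, function_address: int) -> str:
    """Return the function's offset from its module base as a hex string."""
    if function_address < module_base:
        raise ValueError("function address lies below the module base")
    return hex(function_address - module_base)


# Hypothetical addresses, as a debugger might report them for harness.exe.
print(target_offset(0x7FF6A0000000, 0x7FF6A0001090))  # 0x1090
```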
Pre Fuzz
Before we fuzz, it’s nice to do a test run with drrun just to make sure the harness is working.
"<install path>\drrun.exe" -c "<install path>\winafl.dll" -debug -target_module harness.exe -target_offset 0x1090 -fuzz_iterations 10 -nargs 3 -- harness.exe -f a.txt
Here we specify the already built winafl.dll as a client library to drrun. We will be performing 10 dry runs against harness.exe:0x1090, which is the offset of our fuzz function from the module base. a.txt is just a text file containing the word hello.
NOTE: target_module here isn’t referring to the path of the module, but the ACTUAL NAME of the module. The same can be said for coverage_module. I spent at least an hour trying to debug this issue… until
else if (strcmp(token, "-target_module") == 0) {
Another day of hating third party things
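To avoid repeating my mistake when scripting the command line, a small helper can strip the directory part before the value is passed to -target_module or -coverage_module. This is only a sketch: module_name is a hypothetical helper of mine (not part of WinAFL), and the path below is made up.

```python
import ntpath


def module_name(path_or_name: str) -> str:
    """Strip any directory part so only the module's file name is left."""
    return ntpath.basename(path_or_name)


# Hypothetical path; a bare name passes through unchanged.
print(module_name(r"C:\fuzz\test1\harness.exe"))  # harness.exe
print(module_name("stringhelper.dll"))            # stringhelper.dll
```

ntpath is used instead of os.path so that backslash-separated paths are handled the same way regardless of where the script runs.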
If all is well, a log file should be generated with the content
Everything appears to be running normally.
amidst other output. That’s our green light to commence fuzzing.
Fuzzing
"<install path>\afl-fuzz.exe" -s -i in -o out -D "<install path>\dynamorio\build\bin64" -t 5000 -- -coverage_module harness.exe -coverage_module stringhelper.dll -fuzz_iterations 5000 -target_module harness.exe -target_offset 0x1090 -nargs 3 -- harness.exe -m @@
-s instructs WinAFL to use shared memory delivery, in is a directory containing our a.txt, and out is an empty output directory for the results. -D points to the DynamoRIO install path and -t sets a 5000 ms timeout.
We instrument both harness.exe and stringhelper.dll for coverage, and only restart the process every 5000 loops (persistent mode). WinAFL will replace @@ with the actual shared memory/file name when it runs.
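Conceptually, the @@ handling is just a placeholder substitution over the target’s argument list. The sketch below only illustrates that idea and is not WinAFL’s actual code; substitute_input and both input names are hypothetical.

```python
def substitute_input(argv, input_name, placeholder="@@"):
    """Return a copy of argv with every placeholder replaced by input_name."""
    return [arg.replace(placeholder, input_name) for arg in argv]


# Hypothetical names; the real ones are chosen by the fuzzer at run time.
print(substitute_input(["harness.exe", "-m", "@@"], "shm_example_name"))
print(substitute_input(["harness.exe", "-f", "@@"], r"out\cur_input_example"))
```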
Running it for a few seconds finds us almost a dozen crashes.
WinAFL 1.16b based on AFL 2.43b (harness.exe)
NOTE: Sometimes crashes may be perceived by AFL as hangs. This can be due to a number of reasons:
- There’s a post mortem debugger present
- WER is dropping crash dumps and that’s taking time
- Program is crashing but default timeout too short, usually comes with warnings like nudge operation failed because the program has exited just as WinAFL instructs drconfig to kill it
- Edge case bugs in WinAFL
I suggest disabling all system-wide error reporting software, as well as any JIT/post mortem debuggers.
If you see too many hangs, consider increasing the timeout value.
Crash Analysis
AFL has two minimization tools available. afl-cmin minimizes the corpus by grouping inputs that exercise similar tuples and keeping only as many as are needed (see previous post for more details). afl-tmin minimizes an individual testcase by removing blocks of data or replacing them with 0s (again, see previous post).
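To make the afl-tmin idea concrete, here is a heavily simplified Python sketch of testcase minimization: delete blocks, then overwrite what is left with ASCII '0', keeping only the changes after which the input still crashes. It is an illustration under my own assumptions, not afl-tmin’s real algorithm; minimize, the block size and the toy still_crashes oracle are all made up.

```python
def minimize(data: bytes, still_crashes, block: int = 4) -> bytes:
    """Greedy block removal, then zeroing, keeping only changes that still crash."""
    assert still_crashes(data), "the starting testcase must trigger the crash"
    # Pass 1: try deleting one block at a time.
    pos = 0
    while pos < len(data):
        candidate = data[:pos] + data[pos + block:]
        if still_crashes(candidate):
            data = candidate          # block was irrelevant, drop it
        else:
            pos += block              # block matters, keep it and move on
    # Pass 2: replace remaining bytes with ASCII '0' where that is harmless.
    out = bytearray(data)
    for i in range(len(out)):
        if out[i] != ord("0"):
            saved, out[i] = out[i], ord("0")
            if not still_crashes(bytes(out)):
                out[i] = saved        # this byte is needed as-is
    return bytes(out)


# Toy oracle standing in for "run the harness and check for a crash":
# here a testcase "crashes" when it holds at least three spaces.
def crashes(d: bytes) -> bool:
    return d.count(b" ") >= 3


print(minimize(b"AAAA    BBBB", crashes, block=1))  # b'   '
```

In a real setup the oracle would launch the instrumented target, which is why every extra candidate costs an execution and minimization can take a while on slow targets.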
afl-cmin
is usually used in pre fuzzing steps. Before fuzzing an image parser for example, you may scrape the web for thousands of images as corpuses. Running afl-cmin
on the corpuses will return a minimized set of corpuses that allows you to still exercise all tuples that the thousands would have. You can also use it to minimize testcases in out\queue
to prepare for post fuzz coverage analysis.
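Corpus minimization can be thought of as a set cover problem: pick a small subset of files whose combined tuples equal those of the whole corpus. The sketch below uses a plain greedy heuristic to show the idea. It is not the exact selection logic of afl-cmin or winafl-cmin.py, and the file names and tuple IDs are invented.

```python
def minimize_corpus(coverage: dict) -> list:
    """Pick a small subset of inputs that still covers every tuple seen.

    coverage maps an input name to the set of tuples that input exercises.
    """
    remaining = set().union(*coverage.values()) if coverage else set()
    chosen = []
    while remaining:
        # Take the input that covers the most not-yet-covered tuples.
        best = max(sorted(coverage), key=lambda name: len(coverage[name] & remaining))
        chosen.append(best)
        remaining -= coverage[best]
    return chosen


# Hypothetical coverage data: tuple IDs are just small integers here.
corpus = {
    "a.png": {1, 2, 3},
    "b.png": {2, 3},
    "c.png": {3, 4},
    "d.png": {1, 2, 3, 4},
}
print(minimize_corpus(corpus))  # ['d.png']
```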
In our case, however, we’ll use afl-tmin.
"<afl path>\afl-tmin.exe" -D <dynamorio build path>\bin64 -i "out\crashes\id_000000_00_EXCEPTION_ACCESS_VIOLATION" -o minimized -- -coverage_module harness.exe -coverage_module stringhelper.dll -target_module harness.exe -target_offset 0x1090 -nargs 3 -- harness.exe -f @@
The arguments are essentially identical; we just change the delivery method from shared memory to file.
$ xxd out/crashes/id_000000_00_EXCEPTION_ACCESS_VIOLATION
For the scope of this toy, afl-tmin worked like a charm, reducing the initial convoluted testcase into a concise and straightforward trigger.
Coverage Analysis
What’s fuzzing if we don’t check coverage?
If you use IDA or Binary Ninja, the Lighthouse plugin (https://github.com/gaasedelen/lighthouse) is well tested and works amazingly.
In the unfortunate case where you have access to neither, you can use Ghidra with the dragondance plugin (https://github.com/0ffffffffh/dragondance).
First we use afl-cmin to compress the testcases in the out\queue directory. Together, these testcases exercise all the paths that AFL has explored so far.
python "<winafl install path>\winafl-cmin.py" -i out -o minset --crash-dir minset\ --hang-dir minset\ -D <dynamorio install path>\build\bin64 -t 100000 -coverage_module harness.exe -coverage_module stringhelper.dll -target_module harness.exe -target_offset 0x1090 -nargs 3 -- harness.exe -f @@
This will compress all testcases into a new directory, minset, including crashes and hangs since we’re doing post-fuzz analysis.
We’ll run drrun on every testcase, generating multiple log files.
To run on a specific testcase:
<dynamorio path>\bin64\drrun.exe -t drcov -logdir logs -- harness.exe -f minset\id_00_id_000000
This will produce as many log files as the number of testcases you have.
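As a sketch of what running this over the whole minimized set could look like, the helper below builds one command line per testcase using the same drrun flags as the command above. drcov_commands and the testcase names are hypothetical, and the drrun path is left as the same placeholder used in this post.

```python
import ntpath


def drcov_commands(drrun, testcases, logdir="logs"):
    """Build one drrun/drcov command line (as an argv list) per testcase."""
    return [
        [drrun, "-t", "drcov", "-logdir", logdir, "--", "harness.exe", "-f", case]
        for case in testcases
    ]


# Hypothetical testcase names; in practice, list the files in minset\.
cases = [ntpath.join("minset", name) for name in ("id_000000", "id_000001")]
for argv in drcov_commands(r"<dynamorio path>\bin64\drrun.exe", cases):
    print(" ".join(argv))
```

Each argv list can then be handed to something like subprocess.run on the Windows machine where DynamoRIO is installed.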
At time of writing, dragondance does not allow you to import a directory of log files, so we have to either add every file manually using the GUI, or merge the log files into one.
I adapted vanhauser-thc’s (https://github.com/vanhauser-thc) drcov-merge script, originally written for Linux, so that it works on Windows.
1 |
|
We can also script the whole workflow using Python.
import os, sys
The final merged log file will be called merged.log, which we can conveniently import into dragondance.
The flow graph on the left side is our small StringSpaceToComma function. The darker the colour, the more times the basic block was executed.
Almost all our blocks were covered, apart from the one involved in the check for a null pointer, which our harness made sure to not pass in. Neat!
It is important to note that unlike Symbolic Execution, the goal of Fuzz Testing is not to attain maximum coverage after execution. Instead, we should be going for maximum coverage in our testcases BEFORE execution, so our mutated values can taint a larger area of code.
Test2
Enough with toy binaries. Let’s move on to production software, KeePassXC (https://github.com/keepassxreboot/keepassxc).
KeePassXC is an open source, community driven fork of KeePass, a password manager.
I chose KeePassXC as a target because it’s easy to fuzz (open source), widely used (15k stars) and quite complicated.
In this post we will be fuzzing the database unlock function, which includes parsing the kdbx database.
The better way to target such a program is to compile it on Linux and use AFL++, given its open source, cross platform nature. However I’m just here to have fun so WinAFL dumb mode it is.
Building
Building KeePassXC from source(on Windows) is honestly painful. The docs are terrible and shit keeps happening.
You can try to follow the official wiki:
https://github.com/keepassxreboot/keepassxc/blob/develop/INSTALL.md
and come back here if you struggle.
I assume you have Visual Studio installed.
1. Install ruby and asciidoctor
Install ruby and asciidoctor following instructions on https://github.com/keepassxreboot/keepassxc/wiki/Set-up-Build-Environment-on-Windows
You might face an error when running gem that reports something like UndefinedConversionError.
Comment out the line
LOCALE = Encoding.find(Encoding.locale_charmap)
in <install path>\lib\ruby\3.2.0\win32
and replace it with
LOCALE = Encoding::UTF_8
2. Download vcpkg
Download and unzip the pre-built vcpkg export by following the link on https://github.com/keepassxreboot/keepassxc/wiki/Set-up-Build-Environment-on-Windows
3. Download KeePassXC source
https://keepassxc.org/download/#source
4. Configure cmake
Open up x64 Native Tools Command Prompt
mkdir build && cd build
We turn all extra features off since we’re not targeting them.
Harnessing
The initial plan was to load up the dll responsible for unlocking the database and extract the functions out to fuzz. Unfortunately for KeePassXC, the processing functions are compiled into the exe itself, so we can’t do that unless we fix relocations first.
The good news is that KeePassXC comes with a command line version, which we can easily modify to make it a harness.
keepassxc-cli database open flow
cli\keepassxc-cli.cpp
int main(int argc, char** argv)
The cli main function initializes some Qt library functions and IO streams. It then calls into enterInteractiveMode if we choose to open a database.
int enterInteractiveMode(const QStringList& arguments)
This function instantiates an Open object and passes our arguments to its execute method.
Open.cpp
int Open::execute(const QStringList& arguments)
Just forwards the arguments to DatabaseCommand::execute.
DatabaseCommand.cpp
int DatabaseCommand::execute(const QStringList& arguments)
Checks whether we already have a database open. If not, it passes the arguments to unlockDatabase.
Utils.cpp
QSharedPointer<Database> unlockDatabase(const QString& databaseFilename,
This prompts us to input a password or a key file, then invokes the core database open function.
At this stage, it’s quite obvious that we can make a fuzz function that simply constructs a call to db->open, skipping all the abstraction.
modified keepassxc-cli.cpp:
__declspec(noinline) int fuzz(const QString& databaseFilename, QSharedPointer<CompositeKey>& compositeKey, QTextStream& err)
Build with:
cmake --build . --config Release
Now we can begin fuzzing.
Fuzzing KeePassXC
Create 2 seed databases with the KeePassXC GUI, one version 4.0 and one version 3.1.
Choose the shortest decryption time for both and set the password to hello.
Now we should have 2 kdbx files as our corpus. Each of them is 2 KB in size, which is alright for WinAFL.
Unfortunately, running drrun shows an access violation in QtCore.
00007ff8`82d5f759 8b0d99422700 mov ecx,dword ptr [Qt5Core!QAbstractDeclarativeData::setWidgetParent+0x580 (00007ff8`82fd39f8)]
The crash happens when the program tries to access TLS. As per the discussion in https://groups.google.com/g/DynamoRIO-Users/c/cPv56eXe3t4 , DynamoRIO isn’t tested to support Windows 11.
The issue is also mentioned by Christopher in his research https://www.signal-labs.com/blog/fuzzing-wechats-wxam-parser#:~:text=I%20see%2C%20this%20DLL%20uses%20CRT%20(also%20thread%2Dlocal%20storage)%20%E2%80%94%20this%20causes%20issues%20with%20DynamoRIO%20(which%20I%20was%20using%20with%20WinAFL). , which points out that TLS operations mess DynamoRIO up.
At this point, I switched to a Windows 10 VM to give it another try.
Actual Fuzzing
In pre_fuzz_handler
This time drrun doesn’t complain.
Our initial testcases also successfully exercise part of the code.
Now begin actual fuzzing.
"C:\Users\IEUser\Desktop\winafl\build64\bin\Release\afl-fuzz.exe" -i in -o out -D C:\Users\IEUser\Desktop\dynamorio\build64\bin64 -t 50000 -- -coverage_module keepassxc-cli.exe -fuzz_iterations 5000 -target_module keepassxc-cli.exe -target_offset 0x14450 -nargs 2 -- keepassxc-cli.exe @@
Fast forward 2000000 days
Conclusion
Looking back at the second fuzzing exercise on KeePassXC, it evidently wasn’t successful.
For some reason I tried to fuzz a target that performs a decryption routine… which is definitely going to be super slow.
I should have dug deeper into the source code through manual analysis and set persistent mode on the functions that parse the DB after decryption.
In particular, the xmlReader.readDatabase method:
KeePass2RandomStream randomStream;
Since my main goal is to practise the motion of setting up fuzzing using different frameworks, I shall be forgiven this time :P
(Lorem ipsum dolor sit homework for the reader blah)
WinAFL is also not the most appropriate tool to use against open source, cross-platform code. CodeQL with AFL++ should lead to more promising results in a shorter time.
As for WinAFL itself… I have mixed feelings for the tool.
On one hand it’s pretty easy to use and has proven results. On the other hand… support is lacking for newer binaries and OS versions. DynamoRIO is buggy on Windows 11, and syzygy is not maintained anymore. Even the original creator released (and maintains) a newer fuzzer that is a superset of WinAFL, called Jackalope (https://github.com/googleprojectzero/Jackalope).
Our exploration with WinAFL will end here, and future posts could be AFL++ reading, Jackalope review or snapshot fuzzing things.