Fuzzing the MSXML6 library with WinAFL

Introduction

In this blog post, I’ll write about how I tried to fuzz the MSXML library using the WinAFL fuzzer.

If you haven’t played around with WinAFL, it’s a massive fuzzer created by Ivan Fratric based on the lcumtuf’s AFL which uses DynamoRIO to measure code coverage and the Windows API for memory and process creation. Axel Souchet has been actively contributing features such as corpus minimization, latest afl stable builds, persistent execution mode which will cover on the next blog post and the finally the afl-tmin tool.

We will start by creating a test harness which will allow us to fuzz some parsing functionality within the library, calculate the coverage, minimise the test cases and finish by kicking off the fuzzer and triage the findings. Lastly, thanks to Mitja Kolsek from 0patch for providing the patch which will see how one can use the 0patch to patch this issue!

Using the above steps, I’ve managed to find a NULL pointer dereference on the msxml6!DTD::findEntityGeneral function, which I reported to Microsoft but got rejected as this is not a security issue. Fair enough, indeed the crash is crap, yet hopefully somebody might find interesting the techniques I followed!

The Harness

While doing some research I ended up on this page which Microsoft has kindly provided a sample C++ code which allows us to feed some XML files and validate its structure. I am going to use Visual Studio 2015 to build the following program but before I do that, I am slightly going to modify it and use Ivan’s charToWChar method so as to accept an argument as a file:

// xmlvalidate_fuzz.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <stdio.h>
#include <tchar.h>
#include <windows.h>
#import <msxml6.dll>
extern "C" __declspec(dllexport)  int main(int argc, char** argv);

// Macro that calls a COM method returning HRESULT value.
#define CHK_HR(stmt)        do { hr=(stmt); if (FAILED(hr)) goto CleanUp; } while(0)

void dump_com_error(_com_error &e)
{
    _bstr_t bstrSource(e.Source());
    _bstr_t bstrDescription(e.Description());

    printf("Error\n");
    printf("\a\tCode = %08lx\n", e.Error());
    printf("\a\tCode meaning = %s", e.ErrorMessage());
    printf("\a\tSource = %s\n", (LPCSTR)bstrSource);
    printf("\a\tDescription = %s\n", (LPCSTR)bstrDescription);
}

_bstr_t validateFile(_bstr_t bstrFile)
{
    // Initialize objects and variables.
    MSXML2::IXMLDOMDocument2Ptr pXMLDoc;
    MSXML2::IXMLDOMParseErrorPtr pError;
    _bstr_t bstrResult = L"";
    HRESULT hr = S_OK;

    // Create a DOMDocument and set its properties.
    CHK_HR(pXMLDoc.CreateInstance(__uuidof(MSXML2::DOMDocument60), NULL, CLSCTX_INPROC_SERVER));

    pXMLDoc->async = VARIANT_FALSE;
    pXMLDoc->validateOnParse = VARIANT_TRUE;
    pXMLDoc->resolveExternals = VARIANT_TRUE;

    // Load and validate the specified file into the DOM.
    // And return validation results in message to the user.
    if (pXMLDoc->load(bstrFile) != VARIANT_TRUE)
    {
        pError = pXMLDoc->parseError;

        bstrResult = _bstr_t(L"Validation failed on ") + bstrFile +
            _bstr_t(L"\n=====================") +
            _bstr_t(L"\nReason: ") + _bstr_t(pError->Getreason()) +
            _bstr_t(L"\nSource: ") + _bstr_t(pError->GetsrcText()) +
            _bstr_t(L"\nLine: ") + _bstr_t(pError->Getline()) +
            _bstr_t(L"\n");
    }
    else
    {
        bstrResult = _bstr_t(L"Validation succeeded for ") + bstrFile +
            _bstr_t(L"\n======================\n") +
            _bstr_t(pXMLDoc->xml) + _bstr_t(L"\n");
    }

CleanUp:
    return bstrResult;
}

wchar_t* charToWChar(const char* text)
{
    size_t size = strlen(text) + 1;
    wchar_t* wa = new wchar_t[size];
    mbstowcs(wa, text, size);
    return wa;
}

int main(int argc, char** argv)
{
    if (argc < 2) {
        printf("Usage: %s <xml file>\n", argv[0]);
        return 0;
    }

    HRESULT hr = CoInitialize(NULL);
    if (SUCCEEDED(hr))
    {
        try
        {
            _bstr_t bstrOutput = validateFile(charToWChar(argv[1]));
            MessageBoxW(NULL, bstrOutput, L"noNamespace", MB_OK);
        }
        catch (_com_error &e)
        {
            dump_com_error(e);
        }
        CoUninitialize();
    }

    return 0;

}

Notice also the following snippet: extern "C" __declspec(dllexport) int main(int argc, char** argv);

Essentially, this allows us to use target_method argument which DynamoRIO will try to retrieve the address for a given symbol name as seen here.

I could use the offsets method as per README, but due to ASLR and all that stuff, we want to scale a bit the fuzzing and spread the binary to many Virtual Machines and use the same commands to fuzz it. The extern "C" directive will unmangle the function name and will make it look prettier.

To confirm that indeed DynamoRIO can use this method the following command can be used:

dumpbin /EXPORTS xmlvalidate_fuzz.exe

Viewing the exported functions.

Now let’s quickly run the binary and observe the output. You should get the following output:

Output from the xmlvlidation binary.

Code Coverage

WinAFL

Since the library is closed source, we will be using DynamoRIO’s code coverage library feature via the WinAFL:

C:\DRIO\bin32\drrun.exe -c winafl.dll -debug -coverage_module msxml6.dll -target_module xmlvalidate.exe -target_method main -fuzz_iterations 10 -nargs 2 -- C:\xml_fuzz_initial\xmlvalidate.exe C:\xml_fuzz_initial\nn-valid.xml

WinAFL will start executing the binary ten times. Once this is done, navigate back to the winafl folder and check the log file:

Checking the coverage within WinAFL.

From the output we can see that everything appears to be running normally! On the right side of the file, the dots depict the coverage of the DLL, if you scroll down you’ll see that we did hit many function as we are getting more dots throughout the whole file. That’s a very good indication that we are hiting a lot of code and we properly targeting the MSXML6 library.

Lighthouse - Code Coverage Explorer for IDA Pro

This plugin will help us understand better which function we are hitting and give a nice overview of the coverage using IDA. It’s an excellent plugin with very good documentation and has been developed by Markus Gaasedelen (@gaasedelen) Make sure to download the latest DynamoRIO version 7, and install it as per instrcutions here. Luckily, we do have two sample test cases from the documentation, one valid and one invalid. Let’s feed the valid one and observe the coverage. To do that, run the following command:

C:\DRIO7\bin64\drrun.exe -t drcov -- xmlvalidate.exe nn-valid.xml

Next step fire up IDA, drag the msxml6.dll and make sure to fetch the symbols! Now, check if a .log file has been created and open it on IDA from the File -> Load File -> Code Coverage File(s) menu. Once the coverage file is loaded it will highlight all the functions that your test case hit.

Case minimisation

Now it’s time to grab some XML files (as small as possible). I’ve used a slightly hacked version of joxean’s find_samples.py script. Once you get a few test cases let’s minimise our initial seed files. This can be done using the following command:

python winafl-cmin.py --working-dir C:\winafl\bin32 -D C:\DRIO\bin32 -t 100000 -i C:\xml_fuzz\samples -o C:\minset_xml -coverage_module msxml6.dll -target_module xmlvalidate.exe -target_method fuzzme -nargs 1 -- C:\xml_fuzz\xmlvalidate.exe @@

You might see the following output:

corpus minimization tool for WinAFL by <0vercl0k@tuxfamily.org>
Based on WinAFL by <ifratric@google.com>
Based on AFL by <lcamtuf@google.com>
[+] CWD changed to C:\winafl\bin32.
[*] Testing the target binary...
[!] Dry-run failed, 2 executions resulted differently:
Tuples matching? False
Return codes matching? True

I am not quite sure but I think that the winafl-cmin.py script expects that the initial seed files lead to the same code path, that is we have to run the script one time for the valid cases and one for the invalid ones. I might be wrong though and maybe there’s a bug which in that case I need to ping Axel.

Let’s identify the ‘good’ and the ‘bad’ XML test cases using this bash script:

$ for file in *; do printf "==== FILE: $file =====\n"; /cygdrive/c/xml_fuzz/xmlvalidate.exe $file ;sleep 1; done

The following screenshot depicts my results:

Looping through the test cases with Cygwin

Feel free to expirement a bit, and see which files are causing this issue - your mileage may vary. Once you are set, run again the above command and hopefully you’ll get the following result:

Minimising our initial seed files.

So look at that! The initial campaign included 76 cases which after the minimisation it was narrowed down to 26.
Thank you Axel!

With the minimised test cases let’s code a python script that will automate all the code coverage:

import sys
import os

testcases = []
for root, dirs, files in os.walk(".", topdown=False):
    for name in files:
        if name.endswith(".xml"):
            testcase =  os.path.abspath(os.path.join(root, name))
            testcases.append(testcase)

for testcase in testcases:
    print "[*] Running DynamoRIO for testcase: ", testcase
    os.system("C:\\DRIO7\\bin32\\drrun.exe -t drcov -- C:\\xml_fuzz\\xmlvalidate.exe %s" % testcase)

The above script produced the following output for my case:

Coverage files produced by the Lighthouse plugin.

As previously, using IDA open all those .log files under File -> Load File -> Code Coverage File(s) menu.

Code coverage using the Lighthouse plugin and IDA Pro.

Interestingly enough, notice how many parse functions do exist, and if you navigate around the coverage you’ll see that we’ve managed to hit a decent amount of interesting code.

Since we do have some decent coverage, let’s move on and finally fuzz it!

All I do is fuzz, fuzz, fuzz

Let’s kick off the fuzzer:

afl-fuzz.exe -i C:\minset_xml -o C:\xml_results -D C:\DRIO\bin32\ -t 20000 -- -coverage_module MSXML6.dll -target_module xmlvalidate.exe -target_method main -nargs 2 -- C:\xml_fuzz\xmlvalidate.exe @@

Running the above yields the following output:

WinAFL running with a slow speed.

As you can see, the initial code does that job - however the speed is very slow. Three executions per second will take long to give some proper results. Interestingly enough, I’ve had luck in the past and with that speed (using python and radamsa prior the afl/winafl era) had success in finding bugs and within three days of fuzzing!

Let’s try our best though and get rid of the part that slows down the fuzzing. If you’ve done some Windows programming you know that the following line initialises a COM object which could be the bottleneck of the slow speed:

HRESULT hr = CoInitialize(NULL);

This line probably is a major issue so in fact, let’s refactor the code, we are going to create a fuzzme method which is going to receive the filename as an argument outside the COM initialisation call. The refactored code should look like this:

--- cut ---

extern "C" __declspec(dllexport) _bstr_t fuzzme(wchar_t* filename);

_bstr_t fuzzme(wchar_t* filename)
{
    _bstr_t bstrOutput = validateFile(filename);
    //bstrOutput += validateFile(L"nn-notValid.xml");
    //MessageBoxW(NULL, bstrOutput, L"noNamespace", MB_OK);
    return bstrOutput;

}
int main(int argc, char** argv)
{
    if (argc < 2) {
        printf("Usage: %s <xml file>\n", argv[0]);
        return 0;
    }

    HRESULT hr = CoInitialize(NULL);
    if (SUCCEEDED(hr))
    {
        try
        {
            _bstr_t bstrOutput = fuzzme(charToWChar(argv[1]));
        }
        catch (_com_error &e)
        {
            dump_com_error(e);
        }
        CoUninitialize();
    }
    return 0;
}
--- cut ---

You can grab the refactored version here. With the refactored binary let’s run one more time the fuzzer and see if we were right. This time, we will pass the fuzzme target_method instead of main, and use only one argument which is the filename. While we are here, let’s use the lcamtuf’s xml.dic from here.

afl-fuzz.exe -i C:\minset_xml -o C:\xml_results -D C:\DRIO\bin32\ -t 20000 -x xml.dict -- -coverage_module MSXML6.dll -target_module xmlvalidate.exe -target_method fuzzme -nargs 1 -- C:\xml_fuzz\xmlvalidate.exe @@

Once you’ve run that, here’s the output within a few seconds of fuzzing on a VMWare instance:

WinAFL running with a massive speed.

Brilliant! That’s much much better, now let it run and wait for crashes!

The findings - Crash triage/analysis

Generally, I’ve tried to fuzz this binary with different test cases, however unfortunately I kept getting the NULL pointer dereference bug. The following screenshot depicts the findings after a ~ 12 days fuzzing campaign:

Fuzzing results after 12 days.

Notice that a total of 33 million executions were performed and 26 unique crashes were discovered!

In order to triage these findings, I’ve used the BugId tool from SkyLined, it’s an excellent tool which will give you a detailed report regarding the crash and the exploitability of the crash.

Here’s my python code for that:

import sys
import os


sys.path.append("C:\\BugId")

testcases = []
for root, dirs, files in os.walk(".\\fuzzer01\\crashes", topdown=False):
    for name in files:
        if name.endswith("00"):
            testcase =  os.path.abspath(os.path.join(root, name))
            testcases.append(testcase)

for testcase in testcases:
    print "[*] Gonna run: ", testcase
    os.system("C:\\python27\\python.exe C:\\BugId\\BugId.py C:\\Users\\IEUser\\Desktop\\xml_validate_results\\xmlvalidate.exe -- %s" % testcase)

The above script gives the following output:

Running cBugId to triage the crashes..

Once I ran that for all my crashes, it clearly showed that we’re hitting the same bug. To confirm, let’s fire up windbg:

0:000> g
(a6c.5c0): Access violation - code c0000005 (!!! second chance !!!)
eax=03727aa0 ebx=0012fc3c ecx=00000000 edx=00000000 esi=030f4f1c edi=00000002
eip=6f95025a esp=0012fbcc ebp=0012fbcc iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010246
msxml6!DTD::findEntityGeneral+0x5:
6f95025a 8b4918          mov     ecx,dword ptr [ecx+18h] ds:0023:00000018=????????
0:000> kv
ChildEBP RetAddr  Args to Child              
0012fbcc 6f9de300 03727aa0 00000002 030f4f1c msxml6!DTD::findEntityGeneral+0x5 (FPO: [Non-Fpo]) (CONV: thiscall) [d:\w7rtm\sql\xml\msxml6\xml\dtd\dtd.hxx @ 236]
0012fbe8 6f999db3 03727aa0 00000003 030c5fb0 msxml6!DTD::checkAttrEntityRef+0x14 (FPO: [Non-Fpo]) (CONV: thiscall) [d:\w7rtm\sql\xml\msxml6\xml\dtd\dtd.cxx @ 1470]
0012fc10 6f90508f 030f4f18 0012fc3c 00000000 msxml6!GetAttributeValueCollapsing+0x43 (FPO: [Non-Fpo]) (CONV: stdcall) [d:\w7rtm\sql\xml\msxml6\xml\parse\nodefactory.cxx @ 771]
0012fc28 6f902d87 00000003 030f4f14 6f9051f4 msxml6!NodeFactory::FindAttributeValue+0x3c (FPO: [Non-Fpo]) (CONV: thiscall) [d:\w7rtm\sql\xml\msxml6\xml\parse\nodefactory.cxx @ 743]
0012fc8c 6f8f7f0d 030c5fb0 030c3f20 01570040 msxml6!NodeFactory::CreateNode+0x124 (FPO: [Non-Fpo]) (CONV: stdcall) [d:\w7rtm\sql\xml\msxml6\xml\parse\nodefactory.cxx @ 444]
0012fd1c 6f8f5042 010c3f20 ffffffff c4fd70d3 msxml6!XMLParser::Run+0x740 (FPO: [Non-Fpo]) (CONV: stdcall) [d:\w7rtm\sql\xml\msxml6\xml\tokenizer\parser\xmlparser.cxx @ 1165]
0012fd58 6f8f4f93 030c3f20 c4fd7017 00000000 msxml6!Document::run+0x89 (FPO: [Non-Fpo]) (CONV: thiscall) [d:\w7rtm\sql\xml\msxml6\xml\om\document.cxx @ 1494]
0012fd9c 6f90a95b 030ddf58 00000000 00000000 msxml6!Document::_load+0x1f1 (FPO: [Non-Fpo]) (CONV: thiscall) [d:\w7rtm\sql\xml\msxml6\xml\om\document.cxx @ 1012]
0012fdc8 6f8f6c75 037278f0 00000000 c4fd73b3 msxml6!Document::load+0xa5 (FPO: [Non-Fpo]) (CONV: thiscall) [d:\w7rtm\sql\xml\msxml6\xml\om\document.cxx @ 754]
0012fe38 00401d36 00000000 00000008 00000000 msxml6!DOMDocumentWrapper::load+0x1ff (FPO: [Non-Fpo]) (CONV: stdcall) [d:\w7rtm\sql\xml\msxml6\xml\om\xmldom.cxx @ 1111]
-- cut --

Running cBugId to triage the crashes..

Let’s take a look at one of the crasher:

C:\Users\IEUser\Desktop\xml_validate_results\fuzzer01\crashes>type id_000000_00
<?xml version="&a;1.0"?>
<book xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="nn.xsd"
      id="bk101">
   <author>Gambardella, Matthew</author>
   <title>XML Developer's Guide</title>
   <genre>Computer</genre>
   <price>44.95</price>
   <publish_date>2000-10-01</publish_date>
   <description>An in-depth look at creating applications with
   XML.</description>

As you can see, if we provide some garbage either on the xml version or the encoding, we will get the above crash. Mitja also minimised the case as seen below:

<?xml version='1.0' encoding='&aaa;'?>

The whole idea of fuzzing this library was based on finding a vulnerability within Internet Explorer’s context and somehow trigger it. After a bit of googling, let’s use the following PoC (crashme.html) and see if it will crash IE11:

<!DOCTYPE html>
<html>
<head>
</head>
<body>
<script>

var xmlDoc = new ActiveXObject("Msxml2.DOMDocument.6.0");
xmlDoc.async = false;
xmlDoc.load("crashme.xml");
if (xmlDoc.parseError.errorCode != 0) {
   var myErr = xmlDoc.parseError;
   console.log("You have error " + myErr.reason);
} else {
   console.log(xmlDoc.xml);
}

</script>
</body>
</html>

Running that under Python’s SimpleHTTPServer gives the following:

Running cBugId to triage the crashes..

Bingo! As expected, at least with PageHeap enabled we are able to trigger exactly the same crash as with our harness. Be careful not to include that xml on Microsoft Outlook, because it will also crash it as well! Also, since it’s on the library itself, had it been a more sexy crash would increase the attack surface!

Patching

After exchanging a few emails with Mitja, he kindly provided me the following patch which can be applied on a fully updated x64 system:

;target platform: Windows 7 x64
;
RUN_CMD C:\Users\symeon\Desktop\xmlvalidate_64bit\xmlvalidate.exe C:\Users\symeon\Desktop\xmlvalidate_64bit\poc2.xml
MODULE_PATH "C:\Windows\System32\msxml6.dll"
PATCH_ID 200000
PATCH_FORMAT_VER 2
VULN_ID 9999999
PLATFORM win64


patchlet_start
 PATCHLET_ID 1
 PATCHLET_TYPE 2
 
 PATCHLET_OFFSET 0xD093D 
 PIT msxml6.dll!0xD097D
  
 code_start

  test rbp, rbp ;is rbp (this) NULL?
  jnz continue
  jmp PIT_0xD097D
  continue:
 code_end
patchlet_end

Let’s debug and test that patch, I’ve created an account and installed the 0patch agent for developers, and continued by right clicking on the above .0pp file:

Running the crasher with the 0patch console

Once I’ve executed my harness with the xml crasher, I immediately hit the breakpoint:

Hitting the breakpoint under Windbg

From the code above, indeed rbp is null which would lead to the null pointer dereference. Since we have deployed the 0patch agent though, in fact it’s going to jump to msxml6.dll!0xD097D and avoid the crash:

Bug fully patched!

Fantastic! My next step was to fire up winafl again with the patched version which unfortunately failed. Due to the nature of 0patch (function hooking?) it does not play nice with WinAFL and it crashes it.

Nevertheless, this is a sort of “DoS 0day” and as I mentioned earlier I reported it to Microsoft back in June 2017 and after twenty days I got the following email:

MSRC Response!

I totally agree with that decision, however I was mostly interested in patching the annoying bug so I can move on with my fuzzing :o)
After spending a few hours on the debugger, the only “controllable” user input would be the length of the encoding string:

eax=03052660 ebx=0012fc3c ecx=00000011 edx=00000020 esi=03054f24 edi=00000002
eip=6f80e616 esp=0012fbd4 ebp=0012fbe4 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
msxml6!Name::create+0xf:
6f80e616 e8e7e6f9ff      call    msxml6!Name::create (6f7acd02)
0:000> dds esp L3
0012fbd4  00000000
0012fbd8  03064ff8
0012fbdc  00000003

0:000> dc 03064ff8 L4
03064ff8  00610061 00000061 ???????? ????????  a.a.a...????????

The above unicode string is in fact our entity from the test case, where the number 3 is the length aparently (and the signature of the function: Name *__stdcall Name::create(String *pS, const wchar_t *pch, int iLen, Atom *pAtomURN))

Conclusion

As you can see, spending some time on Microsoft’s APIs/documentation can be gold! Moreover, refactoring some basic functions and pinpointing the issues that affect the performance can also lead to massive improvements!

On that note I can’t thank enough Ivan for porting the afl to Windows and creating this amazing project. Moreover thanks to Axel as well who’s been actively contributing and adding amazing features.

Shouts to my colleague Javier (we all have one of those heap junkie friends, right?) for motivating me to write this blog, Richard who’s been answering my silly questions and helping me all this time, Mitja from the 0patch team for building this patch and finally Patroklo for teaching me a few tricks about fuzzing a few years ago!

References

Evolutionary Kernel Fuzzing-BH2017-rjohnson-FINAL.pdf
Super Awesome Fuzzing, Part One