The purpose of this little post is to create a piece of code able to monitor exceptions raised in a process (a bit like gynvael’s ExcpHook but in userland), and to generate a report with information related to the exception. The other purpose is to have a look at the internals of course.
That’s why I divided this post into two big parts:
- the first one will talk about the Windows internals background required to understand how things work under the hood,
- the last one will talk about Detours and how to hook ntdll!KiUserExceptionDispatcher for our purpose. Basically, the library gives programmers a set of APIs to easily hook procedures. It also has clean and readable documentation, so you should use it! It is usually used for this kind of thing:
- Hot-patching bugs (no need to reboot),
- Tracing API calls (API Monitor like),
- Monitoring (a bit like our example),
- Pseudo-sandboxing (prevent API calls).
Lights on ntdll!KiUserExceptionDispatcher
The purpose of this part is to understand how exceptions are given back to userland in order to be handled (or not) by the SEH/UEF mechanisms. I’m going to focus on Windows 7 x86 because that’s the OS I run in my VM. The other objective of this part is to give you the big picture: we are not going into too many details, just enough to write a working exception sentinel PoC later.
When your userland application does something wrong an exception is raised by your CPU: let’s say you are trying to do a division by zero (nt!KiTrap00 will handle that case), or you are trying to fetch a memory page that doesn’t exist (nt!KiTrap0E).
I’m sure you already know this, but on x86 Intel processors there is a table called the IDT that stores the routines that will handle the different exceptions. The virtual address of that table is stored in a special x86 register called IDTR, and that register is accessible only through the instructions sidt (Store Interrupt Descriptor Table Register) and lidt (Load Interrupt Descriptor Table Register).
The entry just above tells us that on processor 0, if a division-by-zero exception is raised, the kernel mode routine nt!KiTrap00 will be called with a flat-model code32 ring0 segment (cf. GDT dump).
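To make the content of such an entry a bit more concrete, here is a minimal sketch that decodes one 8-byte protected-mode gate descriptor, assuming the layout documented in the Intel manuals. The `idt_gate` type and the `decode_idt_gate` name are mine, they don't come from Windows or WinDbg.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of one 8-byte protected-mode IDT gate descriptor.
 * Hypothetical helper type, not a Windows structure. */
typedef struct {
    uint32_t offset;    /* address of the handler (where a routine like nt!KiTrap00 lives) */
    uint16_t selector;  /* code segment selector loaded into CS */
    uint8_t  type;      /* 0xE = 32-bit interrupt gate, 0xF = 32-bit trap gate */
    uint8_t  dpl;       /* privilege level required to reach the gate with `int n` */
    uint8_t  present;   /* P bit */
} idt_gate;

idt_gate decode_idt_gate(const uint8_t raw[8])
{
    idt_gate g;

    assert(raw != 0);

    /* The handler offset is split in two halves: bytes 0-1 and bytes 6-7. */
    g.offset = (uint32_t)raw[0]
             | ((uint32_t)raw[1] << 8)
             | ((uint32_t)raw[6] << 16)
             | ((uint32_t)raw[7] << 24);

    /* Bytes 2-3 hold the segment selector. */
    g.selector = (uint16_t)(raw[2] | (raw[3] << 8));

    /* Byte 5 packs P (bit 7), DPL (bits 6-5) and the gate type (bits 3-0). */
    g.type    = raw[5] & 0x0F;
    g.dpl     = (raw[5] >> 5) & 0x03;
    g.present = (raw[5] >> 7) & 0x01;

    return g;
}
```

Feeding it made-up bytes such as `00 12 08 00 00 8E 54 82` yields a present 32-bit interrupt gate with DPL 0, selector 0x0008 (the ring0 code segment) and a handler at 0x82541200. The address itself is invented for the example.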
Once the CPU is in nt!KiTrap00’s code it does a lot of things, and the same goes for all the other nt!KiTrap routines. Once they have created the nt!_KTRAP_FRAME structure associated with the fault, they all (more or less) end up in the kernel mode exception dispatcher: nt!KiDispatchException (remember gynvael’s tool? It was hooking that function!).
Now, you may already have asked yourself how the kernel gets back to userland in order to process the exception via the SEH mechanism, for example.
That’s kind of simple actually. The trick used by the Windows kernel is to check where the exception took place: if it comes from user mode, the kernel mode exception dispatcher sets the Eip field of the trap frame structure (passed as an argument) to the content of the symbol nt!KeUserExceptionDispatcher. Then, nt!Kei386EoiHelper will use that same trap frame to resume the execution (in our case at the address taken from nt!KeUserExceptionDispatcher).
But guess what? That symbol holds the address of ntdll!KiUserExceptionDispatcher, so it makes total sense!
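The redirection trick can be sketched as a toy model in plain C. This is not kernel code: `trap_frame_model`, `dispatch_exception_model` and the address stored in `user_exception_dispatcher` are all hypothetical, and everything else the real dispatcher does (such as copying the exception record and the context to the user stack) is left out.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { KERNEL_MODE = 0, USER_MODE = 1 } processor_mode;

/* Tiny stand-in for nt!_KTRAP_FRAME: only the field discussed above. */
typedef struct {
    uint32_t eip;
} trap_frame_model;

/* Plays the role of the nt!KeUserExceptionDispatcher variable, which holds
 * the address of ntdll!KiUserExceptionDispatcher. The value is made up. */
static uint32_t user_exception_dispatcher = 0x77000000u;

/* Toy version of the decision taken by the kernel mode dispatcher. */
void dispatch_exception_model(trap_frame_model *tf, processor_mode previous_mode)
{
    assert(tf != 0);

    if (previous_mode == USER_MODE) {
        /* The fault comes from userland: when the trap frame is restored,
         * execution resumes in the user mode dispatcher instead of at the
         * faulting instruction. */
        tf->eip = user_exception_dispatcher;
    }
    /* Faults coming from kernel mode are handled inside the kernel;
     * that path is not modelled here. */
}
```

Calling it with `USER_MODE` rewrites the `eip` of the frame, while calling it with `KERNEL_MODE` leaves the frame untouched.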
If like me you like illustrations, I’ve made a WinDbg session where I am going to show what we just talked about. First, let’s trigger our division-by-zero exception:
Now let’s go a bit further in the ISR, and more precisely when the nt!_KTRAP_FRAME is built:
The idea now is to track the modification of the nt!_KTRAP_FRAME.Eip field we discussed earlier with a hardware breakpoint (BTW, don’t try to put a breakpoint directly on nt!KiDispatchException with VMware, it just crashes my guest virtual machine):
OK, so here we can clearly see the trap frame has been modified (keep in mind WinDbg gives you control after the write has actually happened). That basically means that when the kernel resumes the execution via nt!KiExceptionExit (or nt!Kei386EoiHelper, two symbols for the same address) the CPU will directly execute the user mode exception dispatcher.
Great, I think we now have enough understanding to move on to the second part of the article.
In this part we are going to talk about Detours, what its API looks like and how you can use it to build a userland exception sentinel without too many lines of code. Here is the list of the features we want:
- To hook ntdll!KiUserExceptionDispatcher: we will use Detours for that,
- To generate a tiny readable exception report: for the disassembly part we will use Distorm (yet another easy cool library to use),
- To focus on the x86 architecture: unfortunately the Express version of Detours doesn’t work for x86_64.
Detours is going to modify the first bytes of the API you want to hook in order to redirect its execution to your own piece of code: it’s called an inline hook.
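To give an idea of what "modifying the first bytes" means, here is a minimal sketch of the classic x86 inline-hook patch: a 5-byte `jmp rel32` written at the start of the hooked function. I'm assuming this encoding for illustration; it is not taken from the Detours sources, and `encode_jmp_rel32` is a hypothetical helper. A real hooking library also has to save the overwritten instructions in a trampoline so the original function can still be called, which is not shown here.

```c
#include <assert.h>
#include <stdint.h>

#define JMP_REL32_SIZE 5  /* opcode 0xE9 followed by a 32-bit displacement */

/* Builds the bytes of `jmp detour_addr` as they would be written at
 * hooked_addr. Nothing is patched: only the 5 bytes are computed. */
void encode_jmp_rel32(uint32_t hooked_addr, uint32_t detour_addr,
                      uint8_t patch[JMP_REL32_SIZE])
{
    uint32_t rel;

    assert(patch != 0);

    /* The displacement is relative to the address of the NEXT instruction,
     * i.e. hooked_addr + 5. Unsigned arithmetic wraps modulo 2^32, which
     * gives the right two's complement value for backward jumps too. */
    rel = detour_addr - (hooked_addr + JMP_REL32_SIZE);

    patch[0] = 0xE9;                          /* jmp rel32 */
    patch[1] = (uint8_t)(rel & 0xFF);         /* displacement, little-endian */
    patch[2] = (uint8_t)((rel >> 8) & 0xFF);
    patch[3] = (uint8_t)((rel >> 16) & 0xFF);
    patch[4] = (uint8_t)((rel >> 24) & 0xFF);
}
```

For a function at 0x1000 redirected to a hook at 0x2000, the displacement is 0x2000 - 0x1005 = 0xFFB, so the patch is `E9 FB 0F 00 00`.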
Detours can work in two modes:
- A first mode where you don’t touch the binary you’re going to hook: you need a DLL module that you inject into your binary’s memory. Then, Detours modifies in memory the code of the APIs you hook. That’s what we are going to use.
- A second mode where you modify the binary file itself, more precisely the IAT. In that mode, you don’t need a DLL injector. If you are interested in the details of those tricks, they are described in the Detours.chm file in the installation directory, read it!
So our sentinel will be divided in two main parts:
- A program that will start the target binary and inject our DLL module (that’s where all the important things are),
- The sentinel DLL module that will hook the userland exceptions dispatcher and write the exception report.
The first one is really easy to implement using DetourCreateProcessWithDll: it’s going to create the process and inject the DLL we want.
To successfully hook a function you have to know its address of course, and you have to implement the hook function. Then, you have to call DetourTransactionBegin, DetourUpdateThread, DetourAttach and DetourTransactionCommit, and you’re done. Wonderful, isn’t it?
The only tricky thing, in our case, is that we want to hook ntdll!KiUserExceptionDispatcher, and that function has its own custom calling convention. Fortunately for us, in the samples directory of Detours you can find how you are supposed to deal with that specific case:
Here is what ntdll!KiUserExceptionDispatcher looks like in memory after the hook:
Disassembling some instructions pointed to by the CONTEXT.Eip field is also really straightforward to do with distorm_decode:
So the prototype works pretty well like that.
But once I encountered a behavior that I hadn’t planned for: a stack corruption in a stack frame protected by the /GS cookie. If the cookie has somehow been overwritten, the program calls __report_gsfailure (sometimes the implementation is directly inlined, so you can find the definition of the function in your binary) in order to kill the program because the stack frame is broken. Long story short, I was also hooking kernel32!UnhandledExceptionFilter so as not to miss that kind of exception, but I noticed while writing this post that it doesn’t work anymore. We are going to see why in the next part.
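If the /GS idea is new to you, here is a toy model of it in portable C. It is only a sketch of the principle (a secret value sits after the local buffer and is compared when the function returns), not what the compiler really emits. `frame_model` and `frame_survives_copy` are hypothetical names and the cookie value used below is arbitrary.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 8

/* Models a protected stack frame: the cookie lives right after the buffer,
 * so a linear overflow of buf has to run over it. */
typedef struct {
    char     buf[BUF_SIZE];
    uint32_t cookie;
} frame_model;

/* Returns 1 if the cookie is intact after the copy, 0 if it was overwritten
 * (the real thing would end up in __report_gsfailure at that point). */
int frame_survives_copy(const void *data, size_t len, uint32_t secret)
{
    frame_model frame;

    /* Keep the model well-defined: never write outside the frame itself. */
    assert(data != 0 && len <= sizeof frame);

    frame.cookie = secret;        /* "prologue": store the cookie */
    memcpy(&frame, data, len);    /* unchecked copy: bytes past BUF_SIZE spill into the cookie */
    return frame.cookie == secret; /* "epilogue": check the cookie */
}
```

Copying 8 bytes leaves the cookie alone, while copying 12 bytes of 'A' replaces it with 0x41414141 and the check fails.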
The untold story: Win8 and nt!KiFastFailDispatch
While writing this little post I also ran some tests on my personal machine: a Windows 8 host. But, as I said, the test for the /GS thing we just talked about wasn’t working at all. So I started my investigation by looking at the code of __report_gsfailure (generated by VS2012) and I saw this:
The first thing I wondered about was that weird int 29h. The next thing I did was to download a fresh Windows 8 VM and attach a kernel debugger in order to check the IDT entry 0x29:
As opposed to what I was used to seeing on my Win7 machine:
I opened my favorite IDE and wrote a bit of code to test whether Win7 and Win8 behave differently when handling this exception:
On Win7 I’m able to catch the exception via a SEH handler: it means the Windows kernel calls the user mode exception dispatcher for further processing by the user exception handlers (as we saw at the beginning of the post). But on Win8, to my surprise, I don’t get the message: the process is killed directly after displaying the usual message box “a program has stopped”. Definitely weird.
What happens on Win7
When interrupt 0x29 is triggered by my code, the CPU checks whether there is an IDT entry for that interrupt, and if there isn’t it raises a #GP (nt!KiTrap0D) that ends up in nt!KiDispatchException.
And as previously, the function checks where the fault happened, and because it happened in userland it modifies the trap frame structure to reach ntdll!KiUserExceptionDispatcher. That’s why we can catch it in our __except scope.
What happens on Win8
This time the kernel has defined an ISR for interrupt 0x29: nt!KiRaiseSecurityCheckFailure. This function calls nt!KiFastFailDispatch, which in turn calls nt!KiDispatchException:
BUT the exception is going to be processed as a second-chance exception because of the way nt!KiFastFailDispatch calls the kernel mode exception dispatcher. And if we look at the source of nt!KiDispatchException in ReactOS, we can see that this exception won’t get the chance to reach userland as it does on Win7 :)):
To convince yourself you can even modify the FirstChance argument passed to nt!KiDispatchException from nt!KiFastFailDispatch. You will see the SEH handler is called like in Win7:
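The role played by the FirstChance argument can be summed up with a toy model. It is neither the ReactOS source nor the Windows code, just a heavily simplified sketch of the user mode path as described above; `exception_outcome`, `user_exception_outcome_model` and the `debugger_handles_it` flag are hypothetical.

```c
#include <assert.h>

typedef enum {
    HANDLED_BY_DEBUGGER,   /* a debugger took care of the exception */
    TO_USER_DISPATCHER,    /* resumes in ntdll!KiUserExceptionDispatcher, SEH handlers run */
    PROCESS_TERMINATED     /* nobody handled it, the process is killed */
} exception_outcome;

/* Toy model of what happens to an exception raised from user mode. */
exception_outcome user_exception_outcome_model(int first_chance, int debugger_handles_it)
{
    assert(first_chance == 0 || first_chance == 1);

    if (debugger_handles_it) {
        /* Simplification: a debugger may step in for both chances. */
        return HANDLED_BY_DEBUGGER;
    }
    if (first_chance) {
        /* Only a first-chance exception is sent back to userland. */
        return TO_USER_DISPATCHER;
    }
    /* Second chance: the user mode dispatcher is never reached. */
    return PROCESS_TERMINATED;
}
```

With no debugger in the picture, `first_chance = 1` ends in the user mode dispatcher (the Win7 behavior for int 29h), while `first_chance = 0` ends with the process being terminated (the Win8 behavior).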
Cool, we now have our answer to the weird behavior! I guess if you want to monitor /GS exceptions you are going to have to find another trick :)).
I hope you enjoyed this little trip into the world of Windows exceptions, in both user and kernel mode. You will find the seems-to-be-working PoC on my github account here: The sentinel. By the way, you are highly encouraged to improve it, or to modify it in order to suit your use-case!
If you liked the subject of the post, I’ve made a list of really cool/interesting links you should check out:
- New Security Assertions in Windows 8 – @aionescu endless source of inspiration
- Exploiting the Otherwise Unexploitable on Windows – Yet another awesome article by Skywing and skape
- A catalog of NTDLL kernel mode to user mode callbacks, part 2: KiUserExceptionDispatcher
- Windows Exceptions, Part II: Exception Dispatching
- EasyHook – “EasyHook starts where Microsoft Detours ends.”