Yesterday afternoon, I was peacefully coding some stuff, you know, but I couldn’t get my code to work. As usual in that type of situation, you fire up your debugger to understand what is going on under the hood. To give you a bit of context, I was doing some inline x86 assembly, and I had put an int3 on purpose just before the piece of assembly code I thought was buggy. Once my file was loaded in OllyDbg2, I hit F9 to quickly reach the int3 I had slipped into the inline assembly code. A bit of single-stepping, and BOOM, I got a nasty crash. It happens sometimes, and that’s uncool. Then I relaunched my binary and tried to reproduce the bug: same actions and BOOM again. OK, this time it’s cool, I got a reproducible crash in OllyDbg2.
I like it when things like that happen to me (remember the crashes I found in OllyDbg/IDA here: PDB Ain’t PDD); it’s always a nice exercise where I have to:
- pinpoint the bug in the application: usually not trivial when it’s a real/big application
- reverse-engineer the code involved in the bug in order to figure out why it’s happening (sometimes I have the sources, sometimes I don’t, like this time)
In this post, I will show you how I managed to pinpoint where the bug was, using GFlags, PageHeap and WinDbg. Then, we will reverse-engineer the buggy code in order to understand why the bug is happening, and how we can code a clean trigger.
The first thing I did was to launch WinDbg to debug OllyDbg2 to debug my binary (yeah.). Once OllyDbg2 had started up, I reproduced exactly the same steps as before to trigger the bug, and here is what WinDbg told me:
We got a debug message from the heap allocator informing us that the process has written outside of its heap buffer. The thing is, this message and the breakpoint are not triggered when the faulty write happens, but later, when another call to the allocator is made. At that moment, the allocator checks that the chunks are OK and, if it sees something weird, it outputs a message and breaks. The stack trace should confirm that:
As we said just above, the message from the heap allocator was probably triggered when OllyDbg2 wanted to free a chunk of memory.
Basically, the problem with our issue is that we don’t know:
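To make that delayed detection concrete, here is a toy model in Python. It is not how the Windows heap is implemented: the ToyHeap class, its 8-byte trailing pattern and the method names are all made up for illustration. The only point is that the out-of-bounds write goes through silently, and the corruption is only noticed later, when the chunk is freed.

```python
# Toy model only (NOT the real Windows heap): every chunk is followed by a
# hypothetical fill pattern that is verified when the chunk is freed.
FILL = b"\xab" * 8


class ToyHeap:
    def __init__(self):
        self.chunks = {}      # handle -> user data followed by the fill pattern
        self.sizes = {}       # handle -> requested size
        self.next_handle = 1

    def alloc(self, size):
        handle = self.next_handle
        self.next_handle += 1
        self.chunks[handle] = bytearray(size) + bytearray(FILL)
        self.sizes[handle] = size
        return handle

    def write(self, handle, offset, data):
        # No bounds check, like a raw pointer write: writing past `size`
        # silently clobbers the trailing pattern.
        buf = self.chunks[handle]
        buf[offset:offset + len(data)] = data

    def free(self, handle):
        buf = self.chunks.pop(handle)
        size = self.sizes.pop(handle)
        # The corruption is only noticed here, long after the faulty write.
        if bytes(buf[size:size + len(FILL)]) != FILL:
            raise RuntimeError("heap corruption detected at free time")
```

Writing 4 bytes at offset 16 of a 16-byte chunk raises nothing at write time; the RuntimeError only shows up when free() is called, which is why neither the allocation site nor the faulty write appears in the stack trace.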
- where the heap chunk has been allocated
- where the faulty write has been made
That’s what makes our bug not trivial to debug without the suitable tools. If you want to have more information about debugging heap issues efficiently, you should definitely read the heap chapter in Advanced Windows Debugging (cheers `Ivan).
Pinpointing the heap issue: introducing full PageHeap
In a nutshell, the full PageHeap option is really powerful for diagnosing heap issues; here are at least two reasons why:
- it will save where each heap chunk has been allocated
- it will allocate a guard page at the end of our chunk (thus when the faulty write occurs, we might have a write access exception)
To do so, this option slightly changes how the allocator works (it adds more metadata for each heap chunk, etc.); if you want more information, try allocating stuff at home with and without PageHeap and compare the allocated memory. Here is what a heap chunk looks like when full PageHeap is enabled:
To enable it for ollydbg.exe, it’s trivial: just launch the gflags.exe binary (it’s in WinDbg’s directory) and tick the features you want to enable.
Now, you just have to relaunch your target in WinDbg and reproduce the bug; here is what I get now:
Woot, this is very cool, because now we know exactly where something is going wrong. Let’s get more information about the heap chunk now:
With this really handy command we got a lot of relevant information:
- This chunk has a size of 0x2d0 bytes, so it spans from 0xf00ed30 to 0xf00efff.
- The faulty write now makes sense: the application tries to write 4 bytes outside of its heap buffer (an off-by-one on an array of unsigned integers, I guess).
- The memory has been allocated in ollydbg!Memalloc (called by ollydbg!Getsourceline, PDB related?). We will study that routine later in the post.
- The faulty write occurs at address 0x4ce769.
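The numbers reported for the chunk can be double-checked with a bit of arithmetic. The sketch below is plain Python with the values copied from the list above; the 0x1000 page size is the usual x86 value and is my assumption, as is the idea that the guard page begins right where the chunk stops. The helper name is hypothetical.

```python
PAGE_SIZE = 0x1000        # assumed x86 page size

CHUNK_START = 0xF00ED30   # chunk address reported in the post
CHUNK_SIZE = 0x2D0        # chunk size reported in the post

chunk_end = CHUNK_START + CHUNK_SIZE   # first address past the chunk
last_valid_byte = chunk_end - 1        # last address that is still in bounds


def crosses_chunk_end(address, length):
    """Return True if a write of `length` bytes at `address` goes past the chunk."""
    return address + length > chunk_end
```

Here last_valid_byte is 0xf00efff, and chunk_end is 0xf00f000, which is a multiple of PAGE_SIZE. Under the assumptions above, a 4-byte write at chunk_end - 4 stays inside the chunk, while a 4-byte write at chunk_end lands entirely on the next page, which is why the access faults right away instead of corrupting memory silently.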
Looking inside OllyDbg2
We are kind of lucky: the routines involved in this bug are quite simple to reverse-engineer, and Hex-Rays works like a charm. Here is the C code (the interesting part at least) of the buggy function:
So, let me explain what this routine does:
- This routine is called by OllyDbg2 when it finds a PDB database for your binary and, more precisely, when it finds in this database the path of your application’s source code. That kind of information is useful when you are debugging: OllyDbg2 is able to tell you which line of your C code you are currently at.
- At line 10: “u->Sourcefile” is a string pointer to the path of your source code (found in the PDB database). The routine just reads the whole file, giving you its size and a pointer to the file content, now stored in memory.
- From line 12 to 18: we have a loop counting the total number of lines in your source code.
- At line 20: we have the allocation of our chunk. It allocates 12*(nb_lines + 1) bytes. We saw previously in WinDbg that the size of the chunk was 0x2d0: that should mean we have exactly ((0x2d0 / 12) - 1) = 59 lines in our source code.
- From line 24 to 39: we have a loop similar to the previous one. It’s basically counting lines again and initializing the memory we just allocated with some information.
- At line 41: we have our bug. Somehow, we can manage to get out of the loop with “nb_lines2 = nb_lines + 1”. That means line 41 will try to write one cell outside of our buffer. In our case, if we have “nb_lines2 = 60” and our heap buffer starts at 0xf00ed30, we are going to try to write at (0xf00ed30 + 60*3*4) = 0xf00f000. That’s exactly what we saw earlier.
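The arithmetic in the list above can be sketched in a few lines of Python. This is not the decompiled OllyDbg2 code: the helper names are hypothetical, and the only facts taken from the analysis are the 12-byte cell size, the nb_lines + 1 cells that get allocated, and the chunk base address seen in WinDbg.

```python
CELL_SIZE = 12   # bytes per line entry, as in 12*(nb_lines + 1)


def alloc_size(nb_lines):
    """Size of the chunk allocated for a file with `nb_lines` lines."""
    return CELL_SIZE * (nb_lines + 1)


def lines_from_size(size):
    """Recover the line count from the chunk size seen in the debugger."""
    return size // CELL_SIZE - 1


def cell_address(base, index):
    """Address of the first field of cell `index` in the chunk at `base`."""
    return base + index * CELL_SIZE


def is_out_of_bounds(nb_lines, index):
    """Valid cell indexes are 0 .. nb_lines (nb_lines + 1 cells in total)."""
    return index > nb_lines
```

With 59 lines, alloc_size gives 0x2d0 bytes, so cells 0 to 59 are valid. Leaving the second loop with an index of 60 means the final write targets cell_address(0xf00ed30, 60), which is 0xf00f000: the first byte after the chunk.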
At this point, we have fully explained the bug. If you want to do some dynamic analysis in order to follow the important routines, I’ve prepared several breakpoints; here they are:
In my environment, it gives me something like:
- Download the latest version of OllyDbg2 here, extract the files
- Download the three files from odb2-oob-write-heap, put them in the same directory as ollydbg.exe
- Launch WinDbg and open the latest version of OllyDbg2
- Set your breakpoints (or not), F5 to launch
- Open the trigger in OllyDbg2
- Press F9 when the binary is fully loaded
- BOOM :). Note that you may not have a visible crash (remember, that’s what made our bug not trivial to debug without full pageheap). Try to poke around with the debugger: restarting the binary or closing OllyDbg2 should be enough to get the message from the heap allocator in your debugger.
You can even trigger the bug with only the binary and the PDB database. The trick is to tamper with the PDB, and more precisely with the place where it keeps the path to your source code. That way, when OllyDbg2 loads the PDB database, it will read that same database as if it were the source code of the application. Awesome.
These kinds of crashes are always an occasion to learn new things. Either it’s trivial to debug/repro and you won’t waste much of your time, or it’s not and you will improve your debugger/reverse-engineer-fu on a real example. So do it!
By the way, I doubt the bug is exploitable and I didn’t even try to exploit it, but if you succeed I would be really glad to read your write-up! And even if we assume for a second that it is exploitable, you would still have to distribute the PDB file, the source file (I guess it would give you more control than the PDB alone) and the binary to your victim. So no big deal.
If you are too lazy to debug your crashes, send them to me, I may have a look at them!
Oh, I almost forgot: we are still looking for motivated contributors to write cool posts, spread the word.