Friday, September 18, 2009

Hackers Disassembling 1.1.6 (Step Six: Using a Disassembler with a Debugger)

Step Six: Using a Disassembler with a Debugger

There are two ways to analyze programs distributed without source code: disassembling (a static analysis), and debugging (a dynamic analysis). Generally, every debugger has a built-in disassembler; otherwise, we'd have to debug programs directly in machine code!

However, the disassemblers built into debuggers are usually primitive and offer few features. The disassembler built into the popular SoftIce debugger is not much better than DUMPBIN, whose disadvantages we have already experienced. The code becomes much more understandable when it's loaded in IDA!

When is the debugger useful? Disassemblers have several limitations because of their static nature.

First, we would have to execute the program on an "emulator" of the processor, "hardwired" into our own heads. In other words, we would need to mentally run the entire program. To do so, we would need to know the purpose of every processor instruction, as well as every function and structure of the operating system (including undocumented ones).

Second, it's not easy to start analysis at an arbitrary place in the program. We would need to know the contents of registers and memory, but how could we find these? For registers and local variables, we can scroll the disassembler window upward to see the values stored in these locations. But that won't work with global variables, which can be modified by anyone at any time. If only we could set a breakpoint… But what kind of breakpoint works in a disassembler?

Third, disassembling forces us to completely reconstruct the algorithm of each function, whereas debugging allows us to consider a function as a "black box" that only has input and output. Let's assume that we have a function that decrypts the main module of the program. If we're using a disassembler, we have to figure out the decryption algorithm. (This can be a difficult task.) Then, we need to port this function into IDA-C, debug it, and launch a decrypting program. In the debugger, it's possible to execute the function without trying to understand how it works and, after it finishes, to continue the analysis of decrypted code. We could continue the comparison, but it's clear that the debugger doesn't compete with the disassembler; they are partners.

Experienced hackers always use these tools in conjunction. The program's logic is reconstructed using a disassembler, and details are cleared up on the fly by running the program in a debugger. When doing so, hackers would like the debugger to show the symbolic names assigned in the disassembler.

Fortunately, IDA Pro allows this to happen! Select the Produce output file submenu from the File menu, then click Produce MAP file (or use the equivalent Shift keyboard shortcut). A dialog box prompting you for a file name will appear. (Enter simple.map or a similar file name.) Then, a modal dialog box will open, asking which names should be included in the MAP file. Confirm the dialog, leaving all the default checkboxes as they are. The simple.map file will contain all the necessary debug information in Borland's MAP format. The SoftIce debugger doesn't support this format, however. Therefore, before using the file, we need to convert it to the SYM format using the idasym utility, created for this purpose. It can be downloaded for free from http://www.idapro.com, or obtained from the distributor who sold you IDA.
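To get a feel for what a conversion tool has to read from a MAP file, here is a minimal Python sketch that extracts the public names from it. Everything in it is an assumption made for illustration: the sample text only imitates the general shape of a Borland-style MAP file (a segment table followed by a "Publics by Value" list of segment:offset pairs) and is not output copied from IDA, the offsets are hypothetical, and parse_publics is an invented helper name. It does not reproduce how idasym works and does not write the SYM format.

```python
import re

# Hypothetical excerpt imitating a Borland-style MAP file.
# The layout and the offsets are assumptions, not real IDA output.
SAMPLE_MAP = """\
 Start         Length     Name                   Class
 0001:00000000 000006000H CODE                   CODE

  Address         Publics by Value

 0001:00000000       start
 0001:0000007C       _Main
 0002:000000E9       aMygoodpassword
"""

# A public entry is assumed to look like "SSSS:OOOOOOOO name".
PUBLIC_LINE = re.compile(r"^\s*([0-9A-Fa-f]{4}):([0-9A-Fa-f]{8})\s+(\S+)\s*$")


def parse_publics(map_text):
    """Return (segment, offset, name) tuples found after 'Publics by Value'."""
    publics = []
    in_publics = False
    for line in map_text.splitlines():
        if "Publics by Value" in line:
            in_publics = True  # entries start after this header
            continue
        if not in_publics:
            continue  # skip the segment table above the header
        match = PUBLIC_LINE.match(line)
        if match:
            segment, offset, name = match.groups()
            publics.append((int(segment, 16), int(offset, 16), name))
    return publics


if __name__ == "__main__":
    for segment, offset, name in parse_publics(SAMPLE_MAP):
        print(f"{segment:04X}:{offset:08X} {name}")
```

A real converter would additionally have to map each segment number to a selector and base address before the debugger could use the names.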

Run idasym simple.map on the command line and make sure that simple.sym has been created. Then, load the simple.exe application in the debugger. Wait until the SoftIce window appears, then issue the SYM command to display the contents of the symbol table. SoftIce's response should look like this (abridged version):

:sym
CODE (001B)
001B:00401000 start
001B:00401074 __GetExceptDLLinfo
001B:0040107C _Main
001B:00401104 _memchr
001B:00401124 _memcpy
001B:00401148 _memmove
001B:00401194 _memset
001B:004011C4 _strcmp
001B:004011F0 _strlen
001B:0040120C _memcmp
001B:00401250 _strrchr
001B:00403C08 _printf
DATA(0023)
0023:00407000 aBorlandCCopyri
0023:004070D9 aEnterPassword
0023:004070E9 aMygoodpassword
0023:004070F9 aWrongPassword
0023:00407109 aPasswordOk
0023:00407210 aNotype
0023:00407219 aBccxh1

It works! The debugger now shows the symbolic names, which make the code much easier to understand. You can also set a breakpoint on any of them (for example, bpm aMygoodpassword), and the debugger will understand what you want. You no longer need to remember those hexadecimal addresses.
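What the debugger does with such a name can be pictured as a simple table lookup. The following Python sketch is only an illustration of that idea: the listing is a subset of the SYM output shown above, while build_symbol_table and resolve are hypothetical helper names and say nothing about how SoftIce is implemented internally.

```python
# A subset of the SYM listing shown above.
SYM_LISTING = """\
CODE (001B)
001B:00401000 start
001B:0040107C _Main
001B:004011C4 _strcmp
DATA(0023)
0023:004070D9 aEnterPassword
0023:004070E9 aMygoodpassword
"""


def build_symbol_table(listing):
    """Map each name to a (selector, offset) pair; hypothetical helper."""
    table = {}
    for line in listing.splitlines():
        parts = line.split()
        # Section headers such as "CODE (001B)" have no selector:offset field.
        if len(parts) != 2 or ":" not in parts[0]:
            continue
        selector, offset = parts[0].split(":")
        table[parts[1]] = (int(selector, 16), int(offset, 16))
    return table


def resolve(table, name):
    """Turn a symbolic name back into the address a user would otherwise type."""
    selector, offset = table[name]
    return f"{selector:04X}:{offset:08X}"


if __name__ == "__main__":
    symbols = build_symbol_table(SYM_LISTING)
    print(resolve(symbols, "aMygoodpassword"))
```

With a table like this, a command that names aMygoodpassword can be translated into the address 0023:004070E9 before the breakpoint is set.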
