Wednesday, August 5, 2009

Hackers disassembling 1.1.1 (Step One: Warming up)

Step One: Warming up



The simplest authentication algorithm consists of a character-by-character comparison of the password entered by the user with a reference value stored either in the program itself (which happens frequently) or outside of it, for example in a configuration file or the registry (which happens less often).

The advantage of such protection is its extremely simple software implementation. Its core is actually only one line of code that, in the C language, could be written as follows: if (strcmp (entered_password, reference_password)) {/* Password is incorrect */} else {/* Password is OK */}. Note that strcmp returns zero when the strings are identical, so the first branch handles a mismatch.
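As a minimal sketch of that one-line check, the comparison can be wrapped in a small function. The name password_matches is a hypothetical helper introduced here for illustration; it is not part of the original listing.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from the original listing).
   Returns 1 if the entered password is identical to the reference,
   0 otherwise. */
int password_matches(const char *entered, const char *reference)
{
    assert(entered != NULL && reference != NULL);
    /* strcmp returns 0 for identical strings and a nonzero value
       as soon as any character differs. */
    return strcmp(entered, reference) == 0;
}
```

Called as password_matches(buff, PASSWORD), it returns 1 only when every character, including a trailing newline if the reference has one, is the same.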

Let's supplement this comparison with code that prompts for a password and reports the result, and then examine the program for its vulnerability to cracking.

Listing 1: The Simplest System of Authentication


// Matching the password character by character

#include <stdio.h>
#include <string.h>

#define PASSWORD_SIZE 100
#define PASSWORD "myGOODpassword\n"
// The trailing newline above is needed
// because fgets keeps the newline
// that the user types after the password.

int main ()
{
    // The counter for authentication failures
    int count=0;
    // The buffer for the user-entered password
    char buff [PASSWORD_SIZE];

    // The main authentication loop
    for (;;)
    {
        // Prompting the user for a password
        // and reading it
        printf ("Enter password:");
        fgets (&buff [0], PASSWORD_SIZE, stdin);

        // Matching the entered password against the reference value
        if (strcmp (&buff [0], PASSWORD))
            // "Scolding" if the passwords don't match;
            printf ("Wrong password\n");
        // otherwise (if the passwords are identical),
        // getting out of the authentication loop
        else break;

        // Incrementing the counter of authentication failures
        // and terminating the program once more than 3 attempts have failed
        if (++count>3) return -1;
    }

    // Once we're here, the user has entered the right password.
    printf ("Password OK\n");
    return 0;
}
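Listing 1 copes with the newline that fgets leaves in the buffer by putting "\n" into the reference password. An alternative, shown here only as a sketch, is to cut the newline off the input before comparing. The helpers strip_newline and matches_without_newline are hypothetical names, not part of the original program.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: truncate the string at the first '\n', if any. */
void strip_newline(char *s)
{
    assert(s != NULL);
    /* strcspn returns the index of the first '\n', or the string
       length if there is none, so this write is always in bounds. */
    s[strcspn(s, "\n")] = '\0';
}

/* Compare user input (as returned by fgets) with a reference
   password that does NOT contain a trailing newline. */
int matches_without_newline(char *entered, const char *reference)
{
    strip_newline(entered);
    return strcmp(entered, reference) == 0;
}
```

With this variant the reference would be defined without the trailing "\n". The rest of this section keeps the original definition, since the dumps below were made from Listing 1 as written.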




In popular movies, cool hackers easily penetrate heavily protected systems by guessing the required password in just a few attempts. Can we do this in the real world?

Passwords can be common words, like "Ferrari", "QWERTY", or names of pet hamsters, geographical locations, etc. However, guessing the password is like looking for a needle in a haystack, and there's no guarantee of success — we can only hope that we get lucky. And lady luck, as we all know, can't be trifled with. Is there a more reliable way to crack this code?

Let's think. If the reference password is stored in the program and isn't encrypted in some clever way, it can be found by simply looking at the binary code. By going through all the text strings, especially those that look like a password, we'll quickly find the required key and easily "open" the program!

The area in which we need to look can be narrowed down using the fact that, in the overwhelming majority of cases, compilers put initialized variables in the data segment (in PE files, in the .data section). The only exception is, perhaps, early Borland compilers, with their maniacal passion for putting text strings in the code segment, directly where they're used. This simplifies the compiler, but creates a lot of problems. Modern operating systems, as opposed to our old friend MS-DOS, prohibit modifying the code segment. Therefore, all variables allocated in it are read-only. Apart from this, on processors with separate code and data caches (Pentiums, for example), these strings "litter" the code cache, where they are loaded during read-ahead, and, when they're accessed for the first time, they are loaded again from slow RAM (or the L2 cache) into the data cache. The result is slower operation and a drop in performance.

So, let's assume it's in the data section. Now, we just need a handy instrument to view the binary file. You can open the file in the viewer of your favorite shell (FAR, DOS Navigator) and page through it, admiring the digits scrolling by until it bores you. You can also use a hex editor (QVIEW, HIEW, etc.) but, in this book, for presentation purposes, I'll use the DUMPBIN utility supplied with Microsoft Visual Studio.

Let's print out the data section (the option is /SECTION:.data) as raw data (the option is /RAWDATA:BYTES), using the ">" character to redirect the output to a file. (The output takes up a lot of space, and only its "tail" would fit on the screen.)

> dumpbin /RAWDATA:BYTES /SECTION:.data simple.exe >filename

RAW DATA #3
00406000: 00 00 00 00 00 00 00 00 00 00 00 00 3B 11 40 00 ............;.@.
00406010: 64 40 40 00 00 00 00 00 00 00 00 00 70 11 40 00 d@@.........p.@.
00406020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00406030: 45 6E 74 65 72 20 70 61 73 73 77 6F 72 64 3A 00 Enter password:.
00406040: 6D 79 47 4F 4F 44 70 61 73 73 77 6F 72 64 0A 00 myGOODpassword..
00406050: 57 72 6F 6E 67 20 70 61 73 73 77 6F 72 64 0A 00 Wrong password..
00406060: 50 61 73 73 77 6F 72 64 20 4F 4B 0A 00 00 00 00 Password OK.....
00406070: 40 6E 40 00 00 00 00 00 40 6E 40 00 01 01 00 00 @n@.....@n@.....

Look! In the middle of the other stuff, there's a string that looks like a reference password (the line at address 00406040). Shall we try it? We hardly need to bother: judging from the source code, it really is the password. The compiler has chosen too prominent a place to store it; it wouldn't be a bad idea to hide the reference password better.

One way to do this is to manually place the reference password in a section that we choose ourselves. The ability to choose the location isn't standardized and, consequently, each compiler (strictly speaking, not the compiler but the linker, though that isn't really important here) is free to implement it in any way (or not to implement it at all). In Microsoft Visual C++, a special pragma, data_seg, is used for this; it indicates the section in which the initialized variables that follow it should be placed. Uninitialized variables are placed in the .bss section by default, and are controlled by the bss_seg pragma.

Let's add the following lines to Listing 1, and see how they run.

int count=0;
// From now on, all the initialized variables will be
// located in the .kpnc section.
#pragma data_seg (".kpnc")
// Note that the period before the name
// isn't mandatory, just customary.
char passwd[ ]=PASSWORD;
#pragma data_seg ()
// Now all the initialized variables will again
// be located in the default section (i.e., ".data").
char buff [PASSWORD_SIZE]=" ";
...
if (strcmp(&buff[0] , &passwd[0]))

> dumpbin /RAWDATA:BYTES /SECTION:.data simple2.exe >filename

RAW DATA #3
00406000: 00 00 00 00 00 00 00 00 00 00 00 00 45 11 40 00 ............E.@.
00406010: 04 41 40 00 00 00 00 00 00 00 00 00 40 12 40 00 .A@.........@.@.
00406020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00406030: 45 6E 74 65 72 20 70 61 73 73 77 6F 72 64 3A 00 Enter password:.
00406040: 57 72 6F 6E 67 20 70 61 73 73 77 6F 72 64 0A 00 Wrong password..
00406050: 50 61 73 73 77 6F 72 64 20 4F 4B 0A 00 00 00 00 Password OK.....
00406060: 20 6E 40 00 00 00 00 00 20 6E 40 00 01 01 00 00  n@..... n@.....
00406070: 00 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 ................

Aha! Now there's no password in the data section, and the hackers' attack has been held up. But don't jump to conclusions. Simply display the list of sections in the file:

> dumpbin simple2.exe

Summary
2000 .data
1000 .kpnc
1000 .rdata
4000 .text
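The section names in such a summary come from the section table in the PE headers. The following is a minimal sketch, under stated assumptions, of how a tool could read those names from a file image held in memory and flag the unusual ones. The function names pe_section_names and is_standard_section are hypothetical, and the list of "standard" names is an assumed, non-exhaustive sample rather than anything DUMPBIN itself uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read little-endian 16-bit and 32-bit values from a byte buffer. */
static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}
static unsigned long rd32(const unsigned char *p)
{
    return (unsigned long)rd16(p) | ((unsigned long)rd16(p + 2) << 16);
}

/* Hypothetical helper: copies up to max_names section names out of a
   PE image and returns the section count, or -1 if the headers look
   malformed. Each name is at most 8 bytes; we NUL-terminate it. */
int pe_section_names(const unsigned char *img, size_t len,
                     char names[][9], int max_names)
{
    unsigned long pe_off, table;
    unsigned nsec, opt_size, i;

    assert(img != NULL);
    if (len < 0x40 || img[0] != 'M' || img[1] != 'Z') return -1;
    pe_off = rd32(img + 0x3C);              /* e_lfanew */
    if (pe_off > len || len - pe_off < 24) return -1;
    if (memcmp(img + pe_off, "PE\0\0", 4) != 0) return -1;
    nsec     = rd16(img + pe_off + 4 + 2);  /* NumberOfSections */
    opt_size = rd16(img + pe_off + 4 + 16); /* SizeOfOptionalHeader */
    table    = pe_off + 24 + opt_size;      /* first section header */
    if (table > len || (len - table) / 40 < nsec) return -1;
    for (i = 0; i < nsec && (int)i < max_names; i++) {
        memcpy(names[i], img + table + 40UL * i, 8);
        names[i][8] = '\0';
    }
    return (int)nsec;
}

/* Assumed, non-exhaustive list of commonly seen section names. */
int is_standard_section(const char *name)
{
    static const char *known[] = { ".text", ".data", ".rdata", ".bss",
        ".idata", ".edata", ".rsrc", ".reloc", ".tls", ".pdata" };
    size_t i;
    for (i = 0; i < sizeof known / sizeof known[0]; i++)
        if (strcmp(name, known[i]) == 0) return 1;
    return 0;
}
```

Any name for which is_standard_section returns 0 is a natural first place to look, which is exactly why a custom section does little to hide anything.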

The nonstandard section .kpnc attracts our attention right away. Well, shall we check to see what's in it?

> dumpbin /SECTION:.kpnc /RAWDATA simple2.exe

RAW DATA #4
00408000: 6D 79 47 4F 4F 44 70 61 73 73 77 6F 72 64 0A 00 myGOODpassword..

There's the password! And we thought we had hidden it. It's certainly possible to put confidential data into the section of uninitialized data (.bss), the read-only data section (.rdata), or even the code section (.text). Not everyone will look there for the password, and such placement won't disturb the functioning of the program. But you shouldn't forget about the possibility of an automated search for text strings in a binary file. Wherever the reference password may be, such a filter will easily find it. (The only problem is determining which text string holds the required key; most likely, a dozen or so possible "candidates" will need to be tried.)
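The automated search mentioned above can be sketched in a few lines. This is an illustrative filter written for this section, not the code of any particular utility; the function name scan_ascii_strings and its parameters are assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical "strings"-style filter.
   Scans buf[0..len) for runs of at least min_len printable ASCII
   characters. Each run is printed to out (if out is not NULL) with
   its offset. Returns the number of runs found. */
size_t scan_ascii_strings(const unsigned char *buf, size_t len,
                          size_t min_len, FILE *out)
{
    size_t count = 0, start = 0, i;

    assert(buf != NULL && min_len > 0);
    /* i runs one past the end so that a run touching the end of the
       buffer is closed like any other. */
    for (i = 0; i <= len; i++) {
        if (i < len && isprint(buf[i]))
            continue;                     /* still inside a run */
        if (i - start >= min_len) {       /* run is buf[start..i) */
            if (out != NULL)
                fprintf(out, "%08lX: %.*s\n", (unsigned long)start,
                        (int)(i - start), (const char *)(buf + start));
            count++;
        }
        start = i + 1;                    /* next run starts after i */
    }
    return count;
}
```

Run over the whole file with a minimum length of four or so, a filter like this lists the prompt, the messages, and the password together, no matter which section each one lives in.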

If the password is written in Unicode, the search is somewhat more complicated, since not all such utilities support this encoding. But it'd be rather naive to hope that this obstacle will stop a hacker for long.
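To see why, here is a sketch of the same kind of filter adapted to wide strings. It assumes the text is stored as UTF-16LE (the usual layout of wide strings in Windows programs) and only recognizes characters from the ASCII range, where each character is a printable byte followed by a zero byte. The name count_utf16le_strings is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical filter for UTF-16LE text limited to ASCII characters.
   Counts runs of at least min_chars (printable byte, 0x00) pairs.
   A run may start at an odd or an even offset. */
size_t count_utf16le_strings(const unsigned char *buf, size_t len,
                             size_t min_chars)
{
    size_t i = 0, count = 0;

    assert(buf != NULL && min_chars > 0);
    while (i + 1 < len) {
        size_t j = i, chars = 0;
        /* Extend the run while we keep seeing "printable, zero". */
        while (j + 1 < len && isprint(buf[j]) && buf[j + 1] == 0) {
            chars++;
            j += 2;
        }
        if (chars >= min_chars) {
            count++;
            i = j;      /* continue after the run */
        } else {
            i++;        /* step one byte to handle either alignment */
        }
    }
    return count;
}
```

The change from the single-byte version is small, which is the point: encoding the password as wide characters hides it only from tools that never look for this pattern.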

1 comments:

Unknown said...

thank you for document! i have one question!
how to dump elf?
