Hacking KING: September 2009

Function Arguments

Identifying function arguments is a key part of the analysis of disassembled programs. So prepare yourself: This section may seem long and boring. Unfortunately, there's no way around it — knowing the basics of hacking has its price!

There are three ways to pass arguments to a function: via the stack, via registers, and via the stack and registers simultaneously. The transfer of implicit arguments through global variables comes close to joining this list, but this is covered in another section ("Global Variables").

Arguments can be transferred by value or by reference. In the first case, a copy of the corresponding variable is passed to the function; in the second case, a pointer is passed.

Conventions on passing arguments

To work successfully, the calling function must not only know the prototype of the called function, but also "agree" with it on the method of passing arguments: by reference or by value, via registers or via the stack. If arguments are passed via registers, the convention must specify which argument is placed in which register. If arguments are passed via the stack, it must define the order in which they are pushed. It must also establish who is "responsible" for clearing the arguments off the stack after the called function has executed.

The ambiguity of the mechanism for passing arguments is one of the reasons for incompatibility between various compilers. Why not force all compiler manufacturers to follow one method? Alas, this solution would pose more problems than it would solve.

Each mechanism has its own merits and drawbacks, and each is tied to the language. In particular, C's looseness concerning the observance of function prototypes is possible because the arguments are removed from the stack not by the called function (child), but by the calling one (parent), which remembers what it has passed. For example, two arguments are passed to the main function: the count of command-line arguments, and the pointer to an array that contains them. However, if a program doesn't work with the command line (or receives the arguments in another way), main can be declared without them: main().

In Pascal, such a trick would result either in a compilation error or in a program crash. In this language, the stack is cleared by the child function. If the function fails to do this (or does this incorrectly, popping a number of words different from the number passed to it), the stack will be unbalanced, and everything will come crashing down. (More specifically, all addressing of the local variables of the parent function will be impaired, and a random value will appear on the stack instead of the return address.)

The drawback of C's solution is a slight increase in the size of the generated code. We need to insert a processor instruction (and occasionally more than one) to pop arguments from the stack after each function call. In Pascal, this instruction is part of the function itself and, consequently, occurs only once in the program.

Having failed to find a suitable middle ground, compiler developers have decided to use all possible data-transfer mechanisms. To cope with compatibility problems, they have standardized each mechanism by adopting a number of conventions:

The C convention (designated as __cdecl) directs you to send arguments to the stack from right to left in the order in which they are declared. It charges the calling function with clearing the stack. The names of the functions that observe the C convention are preceded with the "_" character, automatically inserted by the compiler. The this pointer (in C++ programs) is transferred via the stack last.

The Pascal convention (designated as PASCAL) directs you to send arguments to the stack from left to right in the order in which they are declared. It charges the called function with clearing the stack.

The standard convention (designated as __stdcall) is a hybrid of the C and Pascal conventions. Arguments are sent to the stack from right to left, but clearing the stack is performed by the called function. The names of functions that adhere to the standard convention are preceded with the "_" character and end with the "@" suffix, which is followed by the number of bytes being transferred to the function. The this pointer (in C++ programs) is transferred via the stack last.

The fastcall convention dictates that you transfer arguments via registers. Compilers from Microsoft and Borland support the __fastcall keyword, but they interpret it differently. Watcom C++ doesn't understand the __fastcall keyword, but it has a special pragma — "aux" — in its pocket that allows you to manually choose the registers for transferring arguments (see the "Fastcall Conventions" explanation further on for more details). The names of the functions that adhere to the __fastcall convention are preceded by the "@" character, which is automatically inserted by the compiler.

The default convention. If there's no explicit declaration of the call type, the compiler usually uses its own conventions and chooses them at its own discretion. The this pointer is affected most: by default, most compilers transfer it via a register. For Microsoft, this is ECX; for Borland, it is EAX; and for Watcom, it is EAX, EDX, or both of them. Other arguments can be transferred via registers if the optimizer considers this a better way. The mechanism of transferring arguments and the logic of reading them differ from compiler to compiler. They are also unpredictable; we have to figure them out from the situation.
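The name decorations listed above can give a rough first guess at the convention. The following is a minimal sketch that assumes only the three decoration rules described in the text; the function name guess_from_name and the enum are hypothetical, and C++ mangled names or compiler-specific variations are not handled.

```c
#include <ctype.h>
#include <string.h>

/* Conventions that can be told apart from the name decoration alone. */
enum convention { CONV_UNKNOWN, CONV_CDECL, CONV_STDCALL, CONV_FASTCALL };

/* Guess the convention from a decorated symbol name.
   Assumption: only the "_name", "_name@N" and "@name" forms from
   the text exist; real toolchains may emit other forms. */
enum convention guess_from_name(const char *name)
{
    const char *at;

    if (name == NULL || name[0] == '\0')
        return CONV_UNKNOWN;
    if (name[0] == '@')                  /* "@" prefix: fastcall */
        return CONV_FASTCALL;
    if (name[0] != '_')                  /* no decoration we recognize */
        return CONV_UNKNOWN;

    /* "_" prefix: stdcall if the name ends in "@<decimal byte count>". */
    at = strrchr(name, '@');
    if (at != NULL && at[1] != '\0') {
        const char *p = at + 1;
        while (isdigit((unsigned char)*p))
            p++;
        if (*p == '\0')
            return CONV_STDCALL;
    }
    return CONV_CDECL;                   /* plain "_name" */
}
```

A guess made this way is only a hint: stripped binaries carry no names at all, so the stack-cleanup analysis remains the main tool.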

Goals and tasks

When analyzing a function, a code digger faces the following task: He or she must determine what type of convention is used for calling, count the number of arguments being transferred to the function (and/or being used by the function), and clarify the type and purpose of arguments. Shall we begin?

The convention type is roughly identified by the way the stack is cleared. If it's cleared by the calling function, we're dealing with cdecl; otherwise, we are dealing with stdcall or PASCAL. This uncertainty occurs because, if the original prototype of the function is unknown, the order in which arguments are placed onto the stack can't be determined. But if the compiler is known and the programmer has used the default call types, it's possible to determine the type of function call. In programs for Windows, both PASCAL and stdcall calls are widely used, so the uncertainty remains. However, this uncertainty about the order changes little: If both the calling and called functions are available, we can always establish a correspondence between transferred and received arguments. In other words, if the actual order of passing arguments is known (and it should be known; see the calling function), we don't even need to know the sequence of arguments in the function prototype.

Another matter is presented by library functions whose prototypes are known. If you know the order in which arguments are placed into the stack, it's possible to figure out their type and purpose from the prototype!

Determining the number of arguments and the way they are passed

As we mentioned above, arguments can be passed via the stack, via registers, or via both the stack and registers simultaneously. Implicit arguments can also be transferred via global variables.

If the stack were used only for passing arguments, it'd be easy to count them. Alas, the stack is actively used for temporary storage of the data from registers, too. Therefore, if you encounter the PUSH instruction, don't rush to identify it as an argument. Counting PUSH instructions alone can't tell us the number of bytes passed to the function as arguments, but we can easily determine the number of bytes that are removed from the stack after the function is completed!

If the function obeys the standard or Pascal convention, it clears the stack using the RET n instruction (n is simply the required value in bytes). Things are not as simple with cdecl functions. In general, their call is followed by the instruction ADD ESP, n (again, n is the required value in bytes). But variations are possible; there could be a delay in clearing the stack, or arguments could be popped into any free register. However, we'll defer optimizing riddles, being content with non-optimizing compilers.
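The cleanup rule just described can be written down as a tiny decision function. This is only a sketch under the assumptions stated in the text (non-optimizing compiler, no delayed cleanup); the names who_cleans_stack, ret_bytes, and add_esp_bytes are hypothetical and stand for values read off the disassembly by hand.

```c
enum cleanup { CLEANUP_UNKNOWN, CLEANUP_BY_CALLER, CLEANUP_BY_CALLEE };

/* ret_bytes:     operand n of the callee's RET n (0 for a plain RET).
   add_esp_bytes: operand n of ADD ESP, n found right after the CALL
                  (0 if there is no such instruction). */
enum cleanup who_cleans_stack(unsigned ret_bytes, unsigned add_esp_bytes)
{
    if (ret_bytes != 0)
        return CLEANUP_BY_CALLEE;   /* stdcall or Pascal */
    if (add_esp_bytes != 0)
        return CLEANUP_BY_CALLER;   /* cdecl */
    return CLEANUP_UNKNOWN;         /* no stack arguments, or delayed cleanup */
}
```

In both recognized cases, the operand n is also the number of argument bytes that were on the stack.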

We can assume that the number of bytes placed onto the stack equals the number of those popped out; otherwise, after the function is executed, the stack will become unbalanced, and the program will crash. (Optimizing compilers allow a misbalance of the stack in some parts, but we'll save this discussion for later.) Hence, the number of arguments equals the number of transferred bytes divided by the word size. Is this correct? No, it isn't! Not every argument occupies exactly one element of the stack. The type double, for example, consumes 8 bytes; a character string transferred by value, not by reference, will "eat" as many bytes as it needs. In addition, a string, data structure, array, or object can be copied onto the stack using the MOVS instruction instead of PUSH. (By the way, the use of MOVS is strong evidence that the argument was passed by value.)

Let's try to sort out the mess I've created in our heads. It's impossible to determine the number of arguments passed via the stack by analyzing the code of the calling function. Even the number of passed bytes cannot be determined definitively. The type of transfer is also veiled in obscurity. In the "Constants and Offsets" section, we'll return to this question. For now, we'll give the following example: PUSH 0x404040 / CALL MyFunc. What is this: an argument passed by value (the constant 0x404040), or a pointer to something located at the offset 0x404040 (passed by reference)? This problem can't be resolved, can it?

Don't worry; the curtain hasn't fallen yet, and we'll continue the fight. The majority of problems can be solved by an analysis of the called function. Having clarified how it treats the arguments passed to it, we'll determine both their type and quantity. For this, we'll have to become acquainted with addressing arguments on the stack. For an easy warm-up, let's consider the following example:

Listing 70: A Mechanism of Passing Arguments

#include <stdio.h>
#include <string.h>

struct XT{
char s0[20];
int x;
};

void MyFunc(double a, struct XT xt)
{
printf("%f, %x, %s\n", a, xt.x, &xt.s0[0]);
}

main()
{
struct XT xt;
strcpy(&xt.s0[0], "Hello, World!");
xt.x = 0x777;
MyFunc(6.66, xt);
}

The disassembled listing of this program compiled using the Microsoft Visual C++ compiler with its default configuration looks like this:

Listing 71: The Disassembled Code for Passing Arguments Using Visual C++

main proc near ; CODE XREF: start+AF↓p

var_18 = byte ptr -18h
var_4 = dword ptr -4
push ebp
mov ebp, esp
sub esp, 18h
; The first PUSH relates to the function prolog,
; not the arguments being passed.

push esi
push edi
; The lack of explicit initialization of registers indicates
; that they probably are saved on the stack, not passed as
; arguments. However, if arguments were passed to this function
; not only via the stack, but also via the ESI
; and EDI registers, placing them onto the stack might indicate
; that the arguments will be passed to the next function.

push offset aHelloWorld ; "Hello, World!"
; Aha! Here is the passing of the argument — a pointer to the
; string. (Strictly speaking, passing probably occurs. See the
; "Constants and Offsets" section for an explanation.)
; Theoretically, it's possible to save a constant temporarily on
; the stack, then pop it out into any of available registers.
; It's also possible to directly address it in the stack.
; However, I know no compilers capable of these
; cunning maneuvers. Placing a constant onto the stack
; is always an indication of passing an argument.

lea eax, [ebp+var_18]
; The pointer to a local buffer is placed in EAX.

push eax
; EAX is saved on the stack.
; The series of arguments is indivisible. Having recognized
; the first argument, we can be sure that everything pushed
; onto the stack after it is an argument, too.

call strcpy
; The prototype of the strcpy (char*, char*) function doesn't allow
; us to determine the order in which arguments are placed. However,
; since all library C functions obey the cdecl convention, the
; arguments are placed from right to left. Thus, the code initially
; looked like this: strcpy (&buff[0], "Hello, World!"). Could the
; programmer instead use a convention such as stdcall? This is
; extremely unlikely, since strcpy itself would have to be
; recompiled; otherwise, how would it learn that the order
; in which arguments are placed has changed? Although standard
; libraries are, as a rule, delivered with the source code
; included, practically nobody ever recompiles them.

add esp, 8
; Since 8 bytes are popped out of the stack, we can conclude that
; two words of arguments were passed to the function. Consequently,
; PUSH ESI and PUSH EDI were not arguments of the function!

mov [ebp+var_4], 777h
; The constant 0x777 is placed in a local variable.
; It's certainly a constant, not a pointer, because in Windows
; no user data can be stored in this memory area.

sub esp, 18h
; Memory is allocated for a temporary variable. Temporary variables
; are created when arguments are passed by value. Therefore, let's
; prepare ourselves for the next "candidate" to be an argument.
; (See the "Register and Temporary Variables" section.)

mov ecx, 6
; The constant 0x6 is placed in ECX. We don't yet know the purpose.

lea esi, [ebp+var_18]
; The pointer to the local buffer, which contains the copied
; string "Hello, World!", is placed in ESI.

mov edi, esp
; The pointer to the top of the stack is copied to EDI.

repe movsd
; Here it is - passing the string by value. The entire string is
; copied on the stack, swallowing 6*4 bytes of it (where 6 is the value
; of the ECX counter, and 4 is the size of the double word - movsd).
; Hence, this argument occupies 24 (0x18) bytes of stack space. We'll
; use this value to determine the number of arguments according to
; the number of bytes being popped out. The data from [ebp+var_18]
; to [ebp+var_18+0x17] (that is, from var_18 through var_4) is copied
; to the stack. But var_4 contains the constant 0x777!
; Therefore, it will be passed to the function together with the string.
; This will allow us to reconstruct the initial structure:
; struct x{
; char s0[20];
; int x;
; };
; It turns out that a structure is passed to the function,
; not a single string!
push 401AA3D7h
push 0A3D70A4h
; Two more arguments are placed onto the stack. Why two?
; This may be a single argument of type int64, or a double one.
; It's not really possible to determine from the code which type it is.

call MyFunc
; MyFunc is called. Unfortunately, we can't figure out the
; function's prototype. It's only clear that the first argument
; (from the left, or from the right?) is a structure, and it's followed
; either by two int, or by one 8-byte value (int64 or double).
; We can clear up this situation by analyzing the called function,
; but we'll defer this until after we've mastered addressing
; arguments on the stack.

add esp, 20h
; This popped out 0x20 bytes. Since 24 bytes (0x18) account for one
; structure, and 8 bytes for the following two arguments,
; we obtain 0x18+0x8=0x20, which is what we wanted to prove.
pop edi
pop esi
mov esp, ebp
pop ebp
retn
main endp

aHelloWorld db 'Hello, World!',0 ; DATA XREF: main+8↑o
align 4
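If the two constants pushed just before the call to MyFunc are suspected to form a single double, the guess can be checked by reassembling the 8 bytes and reinterpreting them. The sketch below assumes an IEEE 754 64-bit double with the same byte order as uint64_t (true on x86); the helper name dwords_to_double is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Rebuild a double from two 32-bit values in the order they were pushed.
   The dword pushed first lands at the higher address, so on a
   little-endian machine it is the high half of the 8-byte value. */
double dwords_to_double(uint32_t pushed_first, uint32_t pushed_second)
{
    uint64_t bits = ((uint64_t)pushed_first << 32) | pushed_second;
    double value;

    /* memcpy reinterprets the bit pattern without aliasing problems. */
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Calling dwords_to_double(0x401AA3D7, 0x0A3D70A4) yields a value of about 6.66, which supports the "one double" reading over "two int". A plausible floating-point value is still only circumstantial evidence; the called function has the final say.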

The disassembled listing of this program compiled using Borland C++ will be somewhat different. Let's look at it as well:

Listing 72: The Disassembled Code of Passing Arguments Using Borland C++

_main proc near ; DATA XREF: DATA:00407044↓o

var_18 = byte ptr -18h
var_4 = dword ptr -4
push ebp
mov ebp, esp
add esp, 0FFFFFFE8h
; This is addition with a minus sign.
; Having pressed <-> in IDA, we obtain ADD ESP, -18h.

push esi
push edi
; For now, everything is happening just as in the previous example.

mov esi, offset aHelloWorld ; "Hello, World!"
; Here, we see some differences.
; The strcpy call has vanished. The compiler didn't even expand
; the function by replacing it where the call takes place -
; it simply excluded the call!

lea edi, [ebp+var_18]
; The pointer to the local buffer is placed in EDI.

mov eax, edi
; The same pointer is placed in EAX.

mov ecx, 3
repe movsd
movsb
; Note: 4*3+1=13 bytes are copied - 13, not 20 as we would expect
; judging from the structure declaration.
; This is how the compiler has optimized the code:
; It has copied only the string into the buffer,
; ignoring its uninitialized "tail."

mov [ebp+var_4], 777h
; The value of the constant 0x777 is assigned to a local variable.

push 401AA3D7h
push 0A3D70A4h
; Same here. We can't determine whether these two numbers
; are one or two arguments.

lea ecx, [ebp+var_18]
; The pointer to the string's beginning is placed in ECX.

mov edx, 5
; The constant 5 is placed in EDX. (The purpose isn't yet clear.)

loc_4010D3: ; CODE XREF: _main+37↓j
push dword ptr [ecx+edx*4]
; What kind of awful code is this? Let's try to figure it out
; starting from its end. First of all, what does ECX+EDX*4 make?
; ECX is the pointer to the buffer,
; and we understand that pretty clearly, but EDX*4 == 5*4 == 20.
; Aha! So we obtained a pointer to the end of the string,
; not to its beginning. Actually it's a pointer not to the end,
; but to the variable ebp+var_4 (0x18-0x14=0x4). If this is
; the pointer to var_4, then why is it calculated in such an
; intricate manner? We're probably dealing with a structure.
; And look: The push instruction sends a double word onto the stack
; that is stored at the address according to this pointer.

dec edx
; Now we decrement EDX... Do you get the feeling
; that we're dealing with a loop?

jns short loc_4010D3
; This jump is taken until EDX becomes negative,
; which confirms our assumption about the loop. Yes, this
; unnatural construction is used by Borland to pass the argument -
; a structure - to the function by value!

call MyFunc
; Look: The stack isn't cleared! This is
; the last function called, and stack doesn't need
; to be cleared - so Borland doesn't bother.

xor eax, eax
; The result returned by the function is zeroed.
; In Borland, void functions always return zero.
; Actually, the code placed after their call zeroes EAX.

pop edi
pop esi
; The EDI and ESI registers that were stored previously are restored.

mov esp, ebp
; ESP is restored, which is why the stack wasn't cleared
; upon calling the last function!
pop ebp
retn
_main endp

Note that, by default, Microsoft C++ transfers arguments from right to left, and Borland C++ transfers them from left to right! There's no standard call type that, while passing arguments from left to right, would make the calling function clear the stack. Borland C++ uses its own call type, which isn't compatible with anything.

Addressing arguments in the stack

The basic concept of the stack includes two operations: pushing elements onto the stack, and popping the last element off it. Accessing an arbitrary element is something new! However, such a deviation from the rules significantly increases the operating speed. If we need, say, the third element, why can't we pull it from the stack directly, without removing the first two elements? The stack is not only a "pile", as popular tutorials on programming teach us, but also an array. Therefore, knowing the position of the stack pointer (as we must; otherwise, where would we put the next element?) and the size of the elements, we can calculate the offset of any element, then easily read it.

The stack, like any other homogeneous array, has a drawback: It can store only one type of data (double words, for example). If we need to place 1 byte (such as an argument of the char type), we must expand it to a double word and place it as a whole. If an argument occupies 8 bytes (double, int64), we need two stack elements to pass it.
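The rounding rule above explains why dividing the number of popped bytes by the element size does not give the number of arguments. Here is a minimal sketch, assuming 4-byte stack elements as in 32-bit Windows code; the names stack_bytes_for and total_stack_bytes are hypothetical.

```c
#include <stddef.h>

#define STACK_ELEMENT 4u   /* bytes per stack element in 32-bit code */

/* Stack space taken by one argument: its size rounded up
   to a whole number of stack elements. */
size_t stack_bytes_for(size_t arg_size)
{
    return (arg_size + STACK_ELEMENT - 1) / STACK_ELEMENT * STACK_ELEMENT;
}

/* Total stack space taken by a list of arguments passed by value. */
size_t total_stack_bytes(const size_t *arg_sizes, size_t count)
{
    size_t total = 0;
    size_t i;

    for (i = 0; i < count; i++)
        total += stack_bytes_for(arg_sizes[i]);
    return total;
}
```

For the MyFunc(double, struct XT) call analyzed earlier, the sizes 8 and 24 add up to 32 (0x20) bytes, which matches the ADD ESP, 20h after the call even though only two arguments were passed.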

Besides passing arguments, the stack also saves the return address of the function. This requires one or two stack elements, depending on the type of the function call (near or far). The near call operates within one segment; we need to save only the offset of the instruction that follows the CALL instruction. If the calling function is in one segment and the called one is in another segment, we need to remember both the segment and the offset to know where to return to. The return address is placed after the arguments; therefore, the arguments appear behind it relative to the top of the stack. Their offset varies depending on how many stack elements the return address occupies: one or two. Fortunately, the flat memory model of Windows NT/9x allows us to forget about the segmented memory model just as we would forget a bad dream; we can use only near calls everywhere.

Nonoptimizing compilers use a special register (usually EBP) for addressing arguments; the value of the stack pointer register is copied into it at the beginning of the function. Since the stack grows from higher addresses to lower ones, the offsets of all arguments (including the return address) are positive. The offset of the Nth argument is calculated using the following formula:

arg_offset = N*size_element+size_return_address

The argument number N counts from the top of the stack beginning from zero; the size of one stack element is size_element, generally equal to the bit capacity of the segment element (4 bytes in Windows NT/9x); and the space taken up by the return address in bytes is size_return_address (usually 4 bytes in Windows NT/9x). In addition, we often have to solve the opposite task: using a known offset of an element to determine the number of the argument being addressed. The following formula, easily derived from the previous one, is helpful for this:

N=(arg_offset-size_return_address)/(size_element)

However, since the old EBP value should be saved in the same stack before copying the current ESP value to EBP, we must correct this formula, adding the EBP register capacity (BP in 16-bit mode) to the size of the return address.
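With the correction for the saved EBP included, both formulas can be sketched in a few lines of C. The constants assume 32-bit code with near calls, as in Windows NT/9x; the names arg_offset and arg_number are hypothetical.

```c
#define SIZE_ELEMENT        4   /* one stack element, in bytes */
#define SIZE_RETURN_ADDRESS 4   /* near return address */
#define SIZE_SAVED_EBP      4   /* old EBP pushed by the prolog */

/* Offset of argument n (counted from zero) relative to EBP. */
int arg_offset(int n)
{
    return n * SIZE_ELEMENT + SIZE_RETURN_ADDRESS + SIZE_SAVED_EBP;
}

/* Inverse task: which argument does [EBP+offset] address? */
int arg_number(int offset)
{
    return (offset - SIZE_RETURN_ADDRESS - SIZE_SAVED_EBP) / SIZE_ELEMENT;
}
```

Under these assumptions, the first argument lives at [EBP+8], and an operand such as [EBP+0x10] addresses argument number 2.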

From the hacker's point of view, there's a key advantage of such addressing of arguments: Having seen an instruction like MOV EAX, [EBP+0x10] somewhere in the middle of the code, we can instantly calculate which argument is being addressed. However, to save the EBP register, the optimizing compilers address arguments directly via ESP. The difference is basic! The ESP value changes during the function's execution; it changes every time data is pushed onto or popped off the stack. Thus, the offset of arguments relative to ESP doesn't remain constant either. To determine exactly which argument is addressed, we need to know the value of ESP at a given point of the program. For this, we have to trace all of its changes from the beginning of the function. We'll discuss this "artful" addressing in greater detail later. (See the "Local Stack Variables" section.) For now, let's return to the previous example (it's time to complete it) and analyze the called function:
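For ESP-based addressing, the bookkeeping described above amounts to subtracting everything pushed since the function's entry. The following sketch assumes 4-byte elements, a near return address, and that the analyst has already summed the ESP changes by hand; the name esp_arg_number is hypothetical.

```c
/* disp:         displacement in an operand such as [ESP+disp].
   pushed_bytes: net number of bytes pushed (or subtracted from ESP)
                 between function entry and this instruction.
   Returns the zero-based argument number, or -1 if the operand
   does not land on the start of an argument slot. */
int esp_arg_number(int disp, int pushed_bytes)
{
    int offset_at_entry = disp - pushed_bytes;  /* as if ESP had not moved */
    int from_first_arg  = offset_at_entry - 4;  /* skip the return address */

    if (from_first_arg < 0 || from_first_arg % 4 != 0)
        return -1;
    return from_first_arg / 4;
}
```

At function entry, [ESP+4] is argument 0; after a single PUSH, the same argument is found at [ESP+8], which is why the same displacement can mean different arguments at different points of the code.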

Listing 73: The Disassembled Code of a Function Receiving Arguments

MyFunc proc near ; CODE XREF: main+39↑p

arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch
arg_8 = byte ptr 10h
arg_1C = dword ptr 24h
; IDA recognized four arguments passed to the function.
; However, we shouldn't blindly trust IDA. If one argument
; (int64, for example) is passed as several words, IDA will accept it
; as several arguments, not as one! Therefore, the result produced
; by IDA should be interpreted as follows: no less than four arguments
; were passed to the function. However, again, everything is not that easy!
; Nothing prevents the called function from accessing
; the stack of the parent function as deeply as it wants.
; Perhaps nobody passed us any arguments,
; and we've rushed into the stack and stolen something from it.
; This mainly results from programming errors
; that occur because of confusion over prototypes. However, we need
; to take into account such a possibility. (In any case, you'll
; encounter it sometimes, so be informed). The number next to 'arg'
; represents the offset of the argument relative to the beginning
; of the stack frame. Note: the stack frame is shifted
; by 8 bytes relative to EBP - 4 bytes hold the saved
; return address, and an additional 4 bytes are used
; for saving the EBP register.

push ebp
mov ebp, esp
lea eax, [ebp+arg_8]
; A pointer to an argument is obtained. Attention: a pointer to
; an argument, not an argument pointer! Now, let's figure out
; for which argument we're obtaining this pointer. IDA has already
; calculated that this argument is displaced by 8 bytes relative to
; the beginning of the stack frame. In the original code,
; the bracketed expression looked like ebp+0x10 - just as it is shown
; by most disassemblers. If IDA were not so clever, we would have had
; to manually and constantly subtract 8 bytes from each
; address expression. We'll still have a chance to practice this.
; What we pushed onto the stack last is on top.
; Let's look at the calling function to find what we pushed (see the
; variant compiled by Microsoft Visual C++). Aha! The last items
; were the two unclear arguments. Before them, a structure
; consisting of a string and a variable of the int type was placed
; onto the stack. Thus, EBP+ARG_8 points to a string.

push eax
; The obtained pointer is pushed onto the stack.
; The pointer likely will be passed to the next function.

mov ecx, [ebp+arg_1C]
; The contents of the EBP+ARG_1C argument are placed in ECX.
; What does it point to?
; You may recall that the int type is in the structure at an offset
; of 0x14 bytes from the beginning, and ARG_8 is simply
; its beginning. Consequently, 0x8+0x14 == 0x1C. That is,
; the value of the variable of the int type is a member
; of the structure, and is placed in ECX.

push ecx
; The obtained variable is placed onto the stack. It was passed
; by value, because ECX stores the value, not the pointer.

mov edx, [ebp+arg_4]
; Now, we take one of the two unclear arguments
; that were placed last onto the stack...

push edx
; ... and push them onto the stack again to pass
; the argument to the next function.

mov eax, [ebp+arg_0]
push eax
; The second unclear argument is pushed onto the stack.

push offset aFXS ; "%f, %x, %s\n"
call _printf
; Oops! Here we have the call of printf, passing a format
; specification string! The printf function, as you probably know,
; has a variable number of arguments, the type and quantity
; of which are specified by this string. Remember
; that we first placed the pointer to the string on the stack.
; The rightmost specifier %s indicates the output
; of a string. Then, a variable of the int type was placed onto the
; stack. The second specifier is %x - the output of an integer
; in hexadecimal representation. Then comes the last specifier -
; %f - which corresponds to placing two arguments onto the stack.
; If we look into the programmer's guide for Microsoft Visual C++,
; we'll see that the %f specifier outputs a floating-point value,
; which, depending on the type, can occupy 4 bytes (float)
; or 8 bytes (double). In this case, it obviously occupies
; 8 bytes, making it a double. Thus, we've reconstructed
; the prototype of our function. Here it is:
; cdecl MyFunc (double a, struct B b)
; The call type is cdecl - that is, the stack was cleared by
; the calling function. Alas, the original order of
; passing arguments can't be figured out. Remember that
; Borland C++ cleared the stack using the calling function,
; but changed the order of passing parameters.
; It seems likely that if a program was compiled by Borland C++,
; we can simply reverse the order of arguments. Unfortunately,
; it's not so easy. If there was an explicit conversion
; of the function type to cdecl, Borland C++
; would follow its orders. Then, reversing the
; order of arguments would give an incorrect result! However, the
; original order of arguments in the function prototype doesn't play
; a role. It's only important to establish a correspondence between
; the passed and accepted arguments, which we have done.
; Note: This was possible only with the combined analysis
; of the called and calling functions.
; Analysis of just one of them wouldn't give us any results.
; Note: Never completely rely on the format specification string.
; Since the specifiers are formed manually by the programmer,
; errors sometimes are hard to detect
; and give an extremely mysterious code after compilation!

add esp, 14h
pop ebp
retn
MyFunc endp
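The reasoning used in the listing above, matching format specifiers against the bytes placed on the stack, can be approximated in code. This is a rough sketch that assumes 32-bit cdecl and handles only bare %f, %d, %x, %c, and %s specifiers (no flags, widths, or length modifiers); the name printf_stack_bytes is hypothetical.

```c
#include <stddef.h>

/* Estimate how many bytes a printf call places on the stack,
   including the pointer to the format string itself. */
size_t printf_stack_bytes(const char *format)
{
    size_t bytes = 4;                  /* pointer to the format string */
    const char *p;

    for (p = format; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;                           /* look at the conversion character */
        if (*p == '\0')
            break;                     /* stray '%' at the end of the string */
        switch (*p) {
        case 'f':                      /* passed as double: two elements */
            bytes += 8;
            break;
        case 'd': case 'x': case 'c': case 's':
            bytes += 4;                /* one element each */
            break;
        default:                       /* "%%" and unhandled specifiers */
            break;
        }
    }
    return bytes;
}
```

For the format string "%f, %x, %s\n" the estimate is 0x14 bytes, the same value the caller removes with ADD ESP, 14h. As the listing warns, the format string is written by hand and may not match what was actually pushed, so treat the result as a cross-check rather than proof.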

We've made some progress; we successfully reconstructed the prototype of our first function. However, we have many miles to go before we reach the end of the section. If you're tired, take a rest and clear your head. We're going to move on to an important, but boring subject — the comparative analysis of various types of function calls and their implementation in popular compilers.

Let's begin by learning the standard convention on calls — stdcall. Take a look at the following example:

Listing 74: A Demonstration of the stdcall Call

#include <stdio.h>
#include <string.h>

__stdcall MyFunc(int a, int b, char *c)
{
return a+b+strlen(c);
}

main()
{
printf("%x\n", MyFunc(0x666, 0x777, "Hello, World!"));
}

The disassembled listing of this example compiled with Microsoft Visual C++ using the default settings should look like this:

Listing 75: The Disassembled Code for the stdcall Call

main proc near ; CODE XREF: start+AF↓p
push ebp
mov ebp, esp

push offset aHelloWorld ; const char *
; The pointer to the aHelloWorld string is placed onto the stack.
; By examining the source code (fortunately, we have it),
; we'll find this is the rightmost argument passed
; to the function. Therefore, we have a call of stdcall
; or cdecl type, not Pascal. Notice that the string is passed
; by reference, not by value.

push 777h ; int
; One more argument is placed onto the stack - a constant of type int.
; (IDA, from version 4.17, automatically determines its type.)

push 666h ; int
; The last, leftmost argument is passed to the function -
; a constant of type int.

call MyFunc
; Note that the function call isn't followed by any instructions
; for clearing the stack from the arguments placed into it.
; If the compiler hasn't decided on
; a delayed cleanup, it is likely that the stack is cleared
; by the called function. Consequently, the type of call
; is stdcall, which was what we wanted to prove.

push eax
; The value returned by the function is passed
; to the following function as an argument.

push offset asc_406040 ; "%x\n"
call _printf
; OK, this is the next printf function. The format string shows
; that the passed argument has the int type.

add esp, 8
; This popped 8 bytes from the stack. Of these, 4 bytes relate
; to the argument of type int, and 4 bytes to the pointer
; to the format string.

pop ebp
retn
main endp

; int __cdecl MyFunc(int, int, const char *)
MyFunc proc near ; CODE XREF: main+12↑p
; Beginning with version 4.17, IDA automatically reconstructs the
; function prototypes. However, it does not always do this correctly.
; In this case, IDA has made a gross error - the call type cannot
; be cdecl, since the stack is cleared up by the called function!
; It seems likely that IDA doesn't even attempt
; to analyze the call type. Instead, it probably takes
; the call type from the default settings
; of the compiler that it has recognized. In general, the results
; of IDA's work should be cautiously interpreted.

arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch
arg_8 = dword ptr 10h

push ebp
mov ebp, esp
push esi
; This, apparently, is saving the register on the stack.
; It's not passing it to the function because the register hasn't
; been explicitly initialized, neither by the calling function,
; nor by the called one.

mov esi, [ebp+arg_0]
; The last argument pushed onto the stack is placed into
; the ESI register.

add esi, [ebp+arg_4]
; The next-to-last argument pushed onto the stack is added
; to the contents of ESI.

mov eax, [ebp+arg_8]
; The argument pushed first (the rightmost one) is written into EAX...

push eax ; const char *
; ... and pushed onto the stack.

call _strlen
; Since strlen expects a pointer to a string, we can conclude
; that the rightmost argument is a string passed by reference.

add esp, 4
; The strlen argument is removed from the stack.

add eax, esi
; As you'll remember, ESI stores the sum of the first two arguments, and EAX
; contains the returned string length. Thus, the function
; sums up two of its arguments with the string length.

pop esi
pop ebp
retn 0Ch
; The stack is cleared by the called function; therefore,
; the call type is stdcall or Pascal. Let's assume it's stdcall.
; Then, the function prototype should look like this:
; int MyFunc (int a, int b, char *c)
;
; Two variables of the int type, followed by a string, are on the top
; of the stack. Since the top of the stack always contains what
; was placed on it last, and, according to stdcall, the arguments are
; pushed from right to left, we obtain exactly this order of arguments.

MyFunc endp

Now let's examine how the cdecl function is called. Let's replace the stdcall keyword in the previous example with cdecl:

Listing 76: A Demonstration of the cdecl Call

#include <stdio.h>
#include <string.h>

__cdecl MyFunc (int a, int b, char *c)
{
return a+b+strlen(c);
}

main()
{
printf ("%x\n", MyFunc(0x666, 0x777, "Hello, World!"));
}

The disassembled listing of the compiled example should look like this:

Listing 77: The Disassembled Code for the cdecl Call

main proc near ; CODE XREF: start+AF↓p
push ebp
mov ebp, esp

push offset aHelloWorld ; const char *
push 777h ; int
push 666h ; int
; The arguments are passed to the function via the stack.

call MyFunc
add esp, 0Ch
; The stack is cleared by the calling function.
; This means that the call type is cdecl, since the other two
; conventions charge the called function with clearing the stack.

push eax
push offset asc_406040 ; "%x\n"
call _printf
add esp, 8
pop ebp
retn
main endp

; int __cdecl MyFunc (int, int, const char *)
; This time, IDA has correctly determined the call type.
; However, as previously shown, it could have made a mistake.
; So we still shouldn't rely on it.

MyFunc proc near ; CODE XREF: main+12↑p

arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch
arg_8 = dword ptr 10h
; Since the function has the cdecl type,
; arguments are passed from right to left. Its prototype looks
; like this: MyFunc (int arg_0, int arg_4, char *arg_8).

push ebp
mov ebp, esp
push esi
; ESI is saved on the stack.

mov esi, [ebp+arg_0]
; The arg_0 argument of the int type is placed into ESI.

add esi, [ebp+arg_4]
; It's added to arg_4.

mov eax, [ebp+arg_8]
; The pointer to the string is placed into EAX.

push eax ; const char *
; It's passed to the strlen function via the stack.

call _strlen
add esp, 4

add eax, esi
; The string length arg_8 is added to the sum of arg_0 and arg_4.

pop esi
pop ebp
retn
MyFunc endp
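
The reasoning used on the last two listings can be condensed into a small sketch. The helper guess_convention and its enum are hypothetical names introduced only for illustration; they are not part of IDA or any other tool. The inputs are the operand of the callee's final "retn N" and the operand of the "add esp, N" that follows the call in the caller. The sketch deliberately ignores delayed stack cleanup by an optimizing compiler, so treat its answer as a first guess.

```c
/* Hypothetical helper: guesses who clears the stack from two facts
   that are visible in a disassembled listing. */
enum convention {
    CONV_CDECL,             /* the calling function clears the stack  */
    CONV_STDCALL_OR_PASCAL, /* the called function clears the stack   */
    CONV_UNKNOWN            /* no arguments, or no cleanup identified */
};

/* callee_ret_bytes: N from the callee's "retn N" (0 for a plain retn).
   caller_add_bytes: N from "add esp, N" after the call (0 if absent). */
enum convention guess_convention(unsigned callee_ret_bytes,
                                 unsigned caller_add_bytes)
{
    if (callee_ret_bytes != 0)
        return CONV_STDCALL_OR_PASCAL;
    if (caller_add_bytes != 0)
        return CONV_CDECL;
    return CONV_UNKNOWN;
}
```

For the stdcall example, guess_convention(0x0C, 0) reports that the called function clears the stack; for the cdecl example, guess_convention(0, 0x0C) reports the opposite.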

Before we proceed to the really serious things, let's consider the last standard type — PASCAL:

Listing 78: A Demonstration of the PASCAL Call

#include <stdio.h>
#include <string.h>

// Attention! Microsoft Visual C++ no longer supports the PASCAL call
// type. Instead, it uses the similar WINAPI call type
// defined in the windows.h file.

#if defined(_MSC_VER)
#include <windows.h>
// We include windows.h only when we compile using Microsoft
// Visual C++; a more effective solution for other compilers is
// using the PASCAL keyword - if they support it (as Borland does).

#endif

// This kind of programming trick makes the listing less readable,
// but it allows us to compile the code with more than one compiler.
#if defined(_MSC_VER)
WINAPI
#else
__pascal
#endif

MyFunc(int a, int b, char *c)
{
return a+b+strlen(c);
}

main()
{
printf("%x\n", MyFunc(0x666, 0x777, "Hello, World!"));
}

The disassembled listing of this program compiled with Borland C++ should look like this:

Listing 79: The Disassembled Code for the PASCAL Call Using Borland C++

; int __cdecl main(int argc, const char **argv, const char *envp)
_main proc near ; DATA XREF: DATA:00407044↓o
push ebp
mov ebp, esp

push 666h ; int
push 777h ; int
push offset aHelloWorld ; s
; The arguments are passed to the function. Reviewing
; the source code, we notice that the arguments are passed
; from left to right. However, if the source code isn't available,
; it's impossible to establish this! Fortunately, the original
; function prototype is not of much importance.

call MyFunc
; The function doesn't clear the stack!
; If this is not the result of optimization, the call type is
; PASCAL or stdcall. Since PASCAL is already out of the question,
; we'll assume we're dealing with stdcall.

push eax
push offset unk_407074 ; format
call _printf
add esp, 8

xor eax, eax
pop ebp
retn
_main endp

; int __cdecl MyFunc(const char *s,int,int)
; IDA has given an incorrect result again!
; The call type is obviously not cdecl!
; Although the order of arguments is the reverse,
; everything else about the function prototype
; is suitable for use.

MyFunc proc near ; CODE XREF: _main+12↑p

s = dword ptr 8
arg_4 = dword ptr 0Ch
arg_8 = dword ptr 10h

push ebp
mov ebp, esp
; The stack frame is opened.

mov eax, [ebp+s]
; The pointer to the string is placed into EAX.

push eax
call _strlen
; It's passed to the strlen function.

pop ecx
; One argument is deleted by popping it from the stack
; into any available register.
mov edx, [ebp+arg_8]
; The arg_8 argument of type int is placed in EDX.

add edx, [ebp+arg_4]
; It's added to the arg_4 argument.

add eax, edx
; The string length is added to the sum of arg_8 and arg_4.

pop ebp
retn 0Ch
; The stack is cleared by the called function.
; This means that its type is PASCAL or stdcall.
MyFunc endp
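
The only difference between PASCAL and stdcall that is visible here is the order of pushing, which a short sketch can make explicit. The function source_order is a hypothetical helper, not anything produced by a compiler or by IDA: it takes the arguments in the order the push instructions appear and returns them in the order they would be written in the source under the assumed convention.

```c
#include <stddef.h>

/* pushed[0] is the argument pushed first, pushed[n-1] the one pushed last.
   right_to_left is nonzero for stdcall and cdecl, zero for PASCAL.
   out[] receives the arguments in source (left-to-right) order. */
void source_order(const char *const pushed[], size_t n,
                  int right_to_left, const char *out[])
{
    for (size_t i = 0; i < n; i++)
        out[i] = right_to_left ? pushed[n - 1 - i] : pushed[i];
}
```

Fed with the three pushes from the Borland listing (666h, 777h, the string), the PASCAL assumption gives MyFunc(int, int, char *), while the stdcall assumption gives the reversed prototype that IDA showed.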

As you can see, the identification of basic call types and the reconstruction of the function prototypes are rather simple. The only thing that might spoil the mood is confusion between PASCAL and stdcall. However, the order of placing arguments onto the stack is of no importance, except in special cases. We'll give one here:

Listing 80: Distinguishing PASCAL from stdcall

#include
#include
#include

// CALLBACK procedure for receiving timer messages
VOID CALLBACK TimerProc(
HWND hwnd, // Handle of the window for timer messages
UINT uMsg, // WM_TIMER message
UINT idEvent, // Timer identifier
DWORD dwTime // Current system time
)
{
// All beeps
MessageBeep((dwTime % 5)*0x10);
// The time elapsed, in seconds, from the moment
// the system starts is displayed.
printf("\r:=%d", dwTime/1000);
}

main()
// This is a console application, but it also can have
// a message loop and can set the timer!
{
int a;
MSG msg;

// The timer is set by passing the address of the TimerProc
// procedure to it.

SetTimer (0,0,1000, TimerProc);

// This is the message loop. When you're fed up with it,
// press + to break the loop.

while (GetMessage(&msg, (HWND) NULL, 0, 0))
{
TranslateMessage(&msg);
DispatchMessage(&msg);
}
}

Let's compile an example like this — cl pascal.callback.c USER32.lib — and see what results:

Listing 81: The Disassembled Code for Distinguishing PASCAL from stdcall

main proc near ; CODE XREF: start+AF↓p
; This time, IDA hasn't determined the function prototype.

Msg = MSG ptr -20h
; IDA recognized one local variable and even determined
; its type.

push ebp
mov ebp, esp
sub esp, 20h

push offset TimerProc ; lpTimerFunc
; The pointer is passed to the TimerProc function.

push 1000 ; uElapse
; The timer time delay is passed.

push 0 ; nIDEvent
; The nIDEvent argument is always ignored in console applications.

push 0 ; hWnd
; There are no windows, so we're passing NULL.

call ds:SetTimer
; The Win32 API functions are called according to the stdcall
; convention. Knowing their prototype (described in the SDK),
; we can determine the type and purpose of arguments.
; In this case, the source code looked like this:

; SetTimer (NULL, NULL, 1000, TimerProc);

loc_401051: ; CODE XREF: main+42↓j
push 0 ; wMsgFilterMax
; NULL - no filter

push 0 ; wMsgFilterMin
; NULL - no filter

push 0 ; hWnd
; NULL - no windows in the console application

lea eax, [ebp+Msg]
; Get the pointer to the msg local variable.
; The type of this variable is determined only
; on the basis of the prototype of the GetMessageA function.
push eax ; lpMsg
; Pass the pointer to msg.

call ds:GetMessageA
; The GetMessageA(&msg, NULL, NULL, NULL) function is called.

test eax, eax
jz short loc_40107B
; This is the check for WM_QUIT.

lea ecx, [ebp+Msg]
; ECX contains the pointer to the filled MSG structure...

push ecx ; lpMsg
; ... and passes it to the TranslateMessage function.

call ds:TranslateMessage
; The TranslateMessage(&msg) function is called.

lea edx, [ebp+Msg]
; EDX contains the pointer to msg...

push edx ; lpMsg
; ... and passes it to the DispatchMessageA function.

call ds:DispatchMessageA
; The DispatchMessageA function is called.

jmp short loc_401051
; This is the message handling loop...

loc_40107B: ; CODE XREF: main+2C↑j
; ... and the exit from it.

mov esp, ebp
pop ebp
retn
main endp
TimerProc proc near ; DATA XREF: main+6↑o
; IDA hasn't automatically reconstructed the prototype of TimerProc as
; a consequence of the implicit call of this function by the operating
; system - we'll have to do this ourselves. We know TimerProc is
; passed to the SetTimer function. Looking into the description
; of SetTimer (SDK should always be near at hand!), we'll find
; its prototype:
;
; VOID CALLBACK TimerProc(
; HWND hwnd, // Handle of window for timer messages
; UINT uMsg, // WM_TIMER message
; UINT idEvent, // Timer identifier
; DWORD dwTime // Current system time
; )
;
; Now we just have to clarify the call type. This time, it's important.
; Since we don't have the code of the calling function
; (it's located deep under the hood of the operating system),
; we'll be able to find out the argument types
; only if we know the order in which they are passed.
; We already mentioned above that all CALLBACK functions obey the
; Pascal convention. Don't confuse CALLBACK functions with Win32 API
; functions! The former are called by the operating system,
; the latter by an application.
;
; OK, the call type of this function is PASCAL. This means that arguments
; will be pushed from left to right, and the stack is cleared by the
; called function. (You should make sure that this is really the case.)

arg_C = dword ptr 14h
; IDA has revealed only one argument - although, judging by the prototype,
; four of them are passed. Why? It's simple: The function used
; only one argument. It didn't even address the rest of them.
; It appears that IDA was not able to determine them!
; By the way, what kind of argument is this?
; Let's see: Its offset is 0xC. On the top of the stack, we find
; what was pushed onto it last. On the bottom, we should see
; the opposite. But it turns out
; that dwTime was placed onto the stack first! (Since we have the source code,
; we know for certain that arg_C is dwTime.) The Pascal convention
; requires pushing arguments in the reverse order. Something is wrong
; here... The program works, however (launch it to check). The SDK
; says CALLBACK is an analog of FAR PASCAL. So everything is clear
; with FAR - all calls are near in WinNT/9x . But how can we explain
; the inversion of pushing arguments? Let's look into the Windows header
; files and see how the PASCAL type is defined there:
;
; #elif (_MSC_VER >= 800) || defined(_STDCALL_SUPPORTED)
; #define CALLBACK __stdcall
; #define WINAPI __stdcall
; #define WINAPIV __cdecl
; #define APIENTRY WINAPI
; #define APIPRIVATE __stdcall
; #define PASCAL __stdcall
;
; Well, who would have thought it! The call declared
; as PASCAL is actually stdcall! And CALLBACK is also defined
; as stdcall. At last, everything is clear. (Now, if someone tells you
; that CALLBACK is PASCAL, you can smile and say that a hedgehog is
; a bird, although a proud one - it won't fly until you kick it!)
; It seems likely that rummaging in the jungle of include files may
; be beneficial. By the way, perversions with overlapping types
; create a big problem when adding modules written in an environment
; that supports call conventions of the PASCAL function to a C project.
; Since PASCAL in Windows is stdcall, nothing will work!
; However, there's still the PASCAL keyword.
; It isn't overlapping, but it also isn't supported by the
; most recent versions of Microsoft Visual C++. The way out is to use
; the assembly inserts or Borland C++, which, like many other
; compilers, continues to support the Pascal convention.
;
; So, we've clarified that arguments
; are passed to the CALLBACK functions from right to left,
; but the stack is cleared by the called function,
; as must be done according to the stdcall convention.

push ebp
mov ebp, esp
mov eax, [ebp+arg_C]
; The dwTime argument is placed into EAX. How did we get this?
; There are three arguments before it on the stack.
; Each has a size of 4 bytes. Consequently, 4*3=0xC.

xor edx, edx
; EDX is zeroed.

mov ecx, 5
; A value of 5 is placed in ECX.
div ecx
; dwTime (in EAX) is divided by 5.

shl edx, 4
; EDX contains the remainder from division; using the left-shift
; instruction, we multiply it by 0x10 (2 to the 4th power).

push edx ; uType
; The obtained result is passed to the MessageBeep function.
; In the SDK, we'll find that MessageBeep accepts
; the constants such as MB_OK, MB_ICONASTERISK, MB_ICONHAND, etc.,
; but nothing is said about the immediate values
; of each constant. However, the SDK informs us that MessageBeep
; is described in the WINUSER.h file. Let's open it and search
; for MB_OK using the context search:

;
; #define MB_OK 0x00000000L
; #define MB_OKCANCEL 0x00000001L
; #define MB_ABORTRETRYIGNORE 0x00000002L
; #define MB_YESNOCANCEL 0x00000003L
; #define MB_YESNO 0x00000004L
; #define MB_RETRYCANCEL 0x00000005L
;
; #define MB_ICONHAND 0x00000010L
; #define MB_ICONQUESTION 0x00000020L
; #define MB_ICONEXCLAMATION 0x00000030L
; #define MB_ICONASTERISK 0x00000040L
; All the constants that we're interested in
; have values of 0x0, 0x10, 0x20, 0x30, and 0x40. Now we can
; get a sense of the program. We divide by 5 the time elapsed
; from the system startup (in milliseconds). The remainder
; is a number belonging to the interval from 0 to 4. This number
; is multiplied by 0x10, giving 0x0, 0x10, 0x20, 0x30, or 0x40.

call ds:MessageBeep
; All possible types of beeps.

mov eax, [ebp+arg_C]
; dwTime is placed into EAX.

xor edx, edx
; EDX is zeroed.

mov ecx, 3E8h
; The decimal equivalent of 0x3E8 is 1000.

div ecx
; dwTime is divided by 1000;
; that is, milliseconds are converted into seconds and...

push eax
; ... the result passed to the printf function.

push offset aD ; "\r:=%d"
call _printf
add esp, 8
; printf("\r:=%d")

pop ebp
retn 10h
; Please turn the lights off when you leave - i.e.,
; clear the stack yourself!

TimerProc endp
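
The arithmetic recovered from TimerProc can be written back as C to check the reading of the disassembly. This is a sketch only: the function names beep_type and elapsed_seconds are made up for illustration, and uint32_t stands in for the DWORD type.

```c
#include <stdint.h>

/* Mirrors "xor edx, edx / mov ecx, 5 / div ecx / shl edx, 4":
   the remainder of dwTime / 5, shifted left by 4 bits (times 0x10). */
uint32_t beep_type(uint32_t dwTime)
{
    uint32_t remainder = dwTime % 5;  /* DIV leaves the remainder in EDX */
    return remainder << 4;            /* SHL EDX, 4                      */
}

/* Mirrors "mov ecx, 3E8h / div ecx": milliseconds to whole seconds. */
uint32_t elapsed_seconds(uint32_t dwTime)
{
    return dwTime / 0x3E8;            /* 0x3E8 is 1000; quotient in EAX */
}
```

Whatever the value of dwTime, beep_type can return only 0x0, 0x10, 0x20, 0x30, or 0x40, which are exactly the MB_ constants found in the header.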

An important remark on the types defined in the Windows header files! We spoke about this in the comments on the previous listing, but repetition is justified; after all, not all readers work through the analysis of disassembled listings.

The CALLBACK and WINAPI functions are said to obey the Pascal calling convention, but PASCAL is defined in the Windows header files as stdcall (and as cdecl on some platforms). Thus, on the Intel platform, Windows functions follow the same convention: Arguments are pushed onto the stack from right to left, and the stack is cleared by the called function.
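
Because the arguments are pushed from right to left, the position of an argument in the prototype translates directly into its offset from EBP. The following sketch shows the arithmetic; the function names are hypothetical, and it assumes a 32-bit near call, the standard "push ebp / mov ebp, esp" prolog, and arguments of 4 bytes each.

```c
/* index is the zero-based position of the argument in the prototype.
   [ebp+0] holds the saved EBP and [ebp+4] the return address, so the
   leftmost argument (pushed last) sits at [ebp+8]. */
unsigned ebp_offset(unsigned index)
{
    return 8 + 4 * index;
}

/* The suffix IDA uses in names such as arg_0, arg_4, arg_8, arg_C:
   the offset counted from the first argument. */
unsigned ida_arg_suffix(unsigned index)
{
    return ebp_offset(index) - 8;
}
```

For dwTime, the fourth argument of TimerProc (index 3), this gives offset 14h and the name arg_C, matching the listing.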

To make ourselves familiar with the Pascal convention, let's create a simple Pascal program and disassemble it (PASCAL calls occur in other programs, but it makes sense to study PASCAL calls in Pascal programs):

Listing 82: A Demonstration of the PASCAL Call

USES WINCRT;
Procedure MyProc(a:Word; b:Byte; c:String);
begin
WriteLn(a+b,' ',c);
end;

BEGIN
MyProc($666,$77,'Hello, Sailor!');
END.

The disassembled code of this program compiled using Turbo Pascal for Windows should look like this:

Listing 83: The Disassembled Code for the PASCAL Call Compiled Using Turbo Pascal for Windows

PROGRAM proc near
call INITTASK
; INITTASK is called from KRNL386.EXE to initialize a 16-bit task.

call @__SystemInit$qv ; __SystemInit(void)
; The SYSTEM module is initialized.

call @__WINCRTInit$qv ; __WINCRTInit(void)
; The WinCRT module is initialized.

push bp
mov bp, sp
; The function prolog is in the middle of the function.
; This is Turbo Pascal!

xor ax, ax
call @__StackCheck$q4Word ; Stack overflow check (AX)

push 666h
; Note that the arguments are passed from left to right.

push 77h ; 'w'
mov di, offset aHelloSailor ; "Hello, Sailor!"
; DI contains a pointer to the string "Hello, Sailor!"

push ds
push di
; The FAR pointer is passed, not NEAR-
; that is, both segment and offset of the string.

call MyProc
; The stack is cleared by the called function.

leave
; The function's epilog closes the stack frame.

xor ax, ax
call @Halt$q4Word ; Halt (Word)
; The program ends!

PROGRAM endp

MyProc proc near ; CODE XREF: PROGRAM+23↑p
; IDA hasn't determined the function prototype.
; We'll just have to do this ourselves!

var_100 = byte ptr -100h
; This is a local variable. Since it's located at 0x100 bytes
; above the stack frame, it seems to be an array of 0x100 bytes
; (the maximum string length in Pascal is 0xFF bytes).
; It's likely to be the buffer allocated for the string.

arg_0 = dword ptr 4
arg_4 = byte ptr 8
arg_6 = word ptr 0Ah
; The function accepted three arguments.

push bp
mov bp, sp
; The stack frame is opened.

mov ax, 100h
call @__StackCheck$q4Word ; Stack overflow check (AX)
; Here, we find out if there are 0x100 bytes available on the stack,
; which we need for local variables.

sub sp, 100h
; Space is allocated for local variables.

les di, [bp+arg_0]
; The pointer to the rightmost argument is obtained.

push es
push di
; We passed the far pointer to the arg_0 argument,
; with its segment address not even popped from the stack!

lea di, [bp+var_100]
; The pointer to the local buffer is obtained.

push ss
; Its segment address is pushed onto the stack.

push di
; The buffer offset is pushed onto the stack.

push 0FFh
; The maximum string length is pushed.

call @$basq$qm6Stringt14Byte ; Store string
; The string is copied into the local buffer (consequently,
; arg_0 is a string). This way of achieving the goal, however,
; seems a little strange. Why not use a reference?
; Turbo Pascal won't let us -
; the strings are passed by value in Pascal.
; :-(

mov di, offset unk_1E18
; A pointer is obtained to the output buffer.
; Here, we need to become acquainted with the output system
; of Pascal - it is strikingly different from the C output
; system. First, the left-to-right order of pushing arguments
; onto the stack doesn't allow us (without using additional
; tricks) to organize support for procedures that have
; a variable number of arguments.
; But WriteLn is just a procedure with a variable number
; of parameters, isn't it?
; No, it's not a procedure! It's an operator.
; At compile time, the compiler divides it into several
; procedure calls to output each argument separately.
; Therefore, in the compiled code, each procedure takes a fixed
; number of arguments. There will be three of them in our case:
; The first one will be used to output the sum of two numbers
; (accepted by the WriteLongint procedure), the second one
; to output the blank space as a character (WriteChar),
; and the last one to output the string (WriteString).
; In Windows, it's impossible
; to output the string directly into the window and forget about it,
; because the window may require redrawing.
; The operating system doesn't save its contents - this would
; require a big memory space in a graphic environment with a high
; resolution. The code that outputs the string should know how
; to repeat the output on request. If you have ever programmed
; in Windows, you likely remember that all output should be placed
; into the WM_PAINT message handler. Turbo Pascal allows us to treat
; the window under Windows as a console. In this case,
; everything displayed earlier should be stored somewhere.
; Since local variables cease to exist as soon as their procedures
; are executed, they are not suitable for storing the buffer.
; Either the heap or the data segment remains. Pascal uses the latter -
; we've just received the pointer to such a buffer. In addition,
; to boost the output performance,
; Turbo Pascal creates a simple cache. The WriteLongint, WriteChar,
; and WriteString functions merge the results of their activity,
; represented by characters in this buffer. In the end, the call
; of WriteLn follows, which outputs the buffer contents into the window.
; The run-time system tracks the redrawing of the window,
; and, if necessary, repeats the output
; without involving the programmer.

push ds
push di
; The buffer address is pushed onto the stack.

mov al, [bp+arg_4]
; The type of the arg_4 argument is byte.

xor ah, ah
; The high byte of AX (the AH register) is zeroed.

add ax, [bp+arg_6]
; This summed up arg_4 and arg_6. Since al was previously extended
; to AX, arg_6 has the Word type. (When summing two numbers of different
; types, Pascal extends the smaller number to the size of the larger
; one.) Apart from this, the calling procedure passes the value 0x666
; with this argument, which would not fit in 1 byte.

xor dx, dx
; DX is zeroed...

push dx
; ... and pushed onto the stack.

push ax
; The sum of two left arguments is pushed onto the stack.
push 0
; One more zero!

call @Write$qm4Text7Longint4Word ; Write(var f; v: Longint; width: Word)
; The WriteLongint function has the following prototype:
; WriteLongint(Text far &, a: Longint, count: Word).
; Text far & - the pointer to the output buffer
; a - the long integer being output
; count - how many variables should be output
; (if zero - one variable)
;
; Consequently, in our case, we output one variable - the sum of two
; arguments. A small addition - the WriteLongint function doesn't
; follow the Pascal convention, since it doesn't clear the stack
; completely, but leaves the pointer to the buffer on the stack.
; The compiler developers have accepted this solution to achieve
; better performance: If other functions need the pointer to the
; buffer (at least one of them does - WriteLn), why should we pop it,
; then push it back again each time? If you look into the end
; of the WriteLongint function, you'll see RET 6. The function
; pops two arguments from the stack - two words for Longint,
; and one word for count. Such a lovely technical detail! It's
; small, but it can lead to great confusion, especially if a code
; digger is not familiar with the Pascal input/output system!

push 20h ; ' '
; The next argument is pushed onto the stack for passing it to
; the WriteChar function. The pointer to the buffer is still on the stack.

push 0
; We need to output only one character.

call @Write$qm4Text4Char4Word ; Write(var f; c: Char; width: Word)

lea di, [bp+var_100]
; A pointer is obtained to the local copy of the string
; passed to the function.
push ss
push di
; Its address is pushed onto the stack.

push 0
; This is the output of only one string!

call @Write$qm4Textm6String4Word ; Write(var f; s: String; width: Word)

call @WriteLn$qm4Text ; WriteLn(var f: Text)
; It seems likely that no parameters are passed to the functions.
; Actually, the pointer to the buffer lies on the top of the stack
; and waits for its "hour of triumph."
; Upon completion of WriteLn, it will be removed from the stack.

call @__IOCheck$qv ; Exit if error.
; Check if the output is successful.

leave
; The stack frame is closed.

retn 8
; Since 8 bytes are popped from the stack, we now have everything
; we need to reconstruct the prototype of our procedure.
; It looks like this:
; MyProc(a: Word; b: Byte; c: String)

MyProc endp

Turbo Pascal turned out to be very artful! This analysis has taught us a good lesson: We can never be sure that the function will pop all arguments passed to it from the stack. In addition, it's impossible to determine the number of arguments by the number of machine words popped from the stack!
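
For a procedure that does follow the Pascal convention, such as MyProc, the frame layout can be checked with a little arithmetic. The sketch below uses a hypothetical helper, pascal_frame, and assumes a 16-bit near call with the "push bp / mov bp, sp" prolog. The sizes are those seen in the listing: a Word takes 2 bytes, a Byte still occupies a whole 2-byte stack slot, and the String arrives as a 4-byte far pointer.

```c
#include <stddef.h>

/* Saved BP is at [bp+0], the 2-byte return address at [bp+2], and the
   arguments start at [bp+4].
   sizes[] lists the stack bytes of each argument in source order (left
   to right); offsets[] receives each argument's BP-relative offset.
   Returns the byte count N that the callee's "retn N" must pop. */
unsigned pascal_frame(const unsigned sizes[], size_t n, unsigned offsets[])
{
    unsigned offset = 4;
    /* PASCAL pushes left to right, so the rightmost argument was pushed
       last and lies closest to the return address. */
    for (size_t i = n; i-- > 0; ) {
        offsets[i] = offset;
        offset += sizes[i];
    }
    return offset - 4;
}
```

With sizes of 2, 2, and 4 bytes, the helper places a at 0Ah, b at 8, and c at 4, and returns 8, in agreement with arg_6, arg_4, arg_0, and "retn 8". The same 8 bytes could equally belong to two far pointers or to four words, which is why the byte count alone says nothing about the number of arguments.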

Fastcall conventions. However unproductive the transfer of arguments via the stack might be, the stdcall and cdecl call types are standardized, and they should be observed. Otherwise, the modules compiled by one compiler (libraries, for example) will be incompatible with modules compiled by other compilers. However, if the called function is compiled by the same compiler as the calling one, we don't need to follow the standard conventions. Instead, we can take advantage of the more effective passing of arguments via registers.

Beginners may wonder: Why hasn't the passing of arguments via registers been standardized? Is anyone planning to do so? The response is: Who would standardize it? Committees on the standardization of C and C++? Certainly not! All platform-dependent solutions are left to compiler developers — each developer is free to implement them as desired, or not to implement them. Readers still may ask: What prevents the compiler developers for one specific platform from reaching common agreements?

Developers have agreed to pass the value returned by the function through [E]AX:[[E]DX], although the standard doesn't discuss specific registers. At least, they have a partial agreement: Most manufacturers of 16-bit compilers have observed conventions without making compatibility claims. But the fastcall is so named because it is aimed at providing maximum performance. The optimization technique doesn't stand still, and introducing a standard is equivalent to anchoring a ball and chain to your leg. On the other hand, the average gain from passing arguments via registers is slight, and many compiler developers forsake speed for simplicity of implementation. If performance is crucial, we can use the inline functions.

This reasoning likely will interest programmers, but code diggers are worried about the reconstruction of function prototypes, not about performance. Is it possible to find out what arguments the fastcall function receives without analyzing its code (that is, looking only at the calling function)? The popular answer, "No, because the compiler passes arguments via the most ‘convenient’ registers," is wrong, and the speaker clearly shows his or her ignorance of the compilation procedure.

In compiler development, there is the notion of a translation unit: Depending on the implementation, the compiler may translate the program code in its entirety, or it may translate each function separately. The first type incurs substantial overhead, since the entire parse tree must be stored in memory. The second type keeps in memory only each function's name and a reference to the code generated for it. Compilers of the first type are rare; I've never come across (although I have heard about) such a C/C++ compiler for Windows. Compilers of the second type are more efficient, require less memory, and are easier to implement; they are good in all respects except for their intrinsic inability to optimize across function boundaries. Each function is optimized individually and independently. Therefore, the compiler can't choose the optimal registers for passing arguments, since it doesn't know how they're handled by the called function. Functions translated independently should follow conventions, even if this isn't advantageous.

Thus, knowing the "handwriting" of the particular compiler, we can reconstruct the function prototype with minimal effort.

Borland C++ 3.x passes arguments via the AX(AL), DX(DL), and BX(BL) registers. When no free registers remain, arguments are pushed onto the stack from left to right. Then they're popped by the called function (stdcall).

The method of passing arguments is rather interesting. The compiler doesn't assign each argument its "own" register; instead, it gives each argument access to the "pile" of candidates stacked in order of preference. Each argument takes as many registers from the pile as it needs, and when the pile is exhausted, the stack is used. The only exception is arguments of the long int type, which are always passed via DX:AX (the higher word is passed via DX) or, if that's impossible, via the stack.

If each argument occupies no more than 16 bits (as is often the case), the first argument from the left is placed into AX(AL), the second one into DX(DL), and the third one into BX(BL). If the first argument from the left is of the long int type, it takes two registers from the pile at once: DX:AX. The second argument gets the BX(BL) register. Nothing remains for the third argument, so it is passed via the stack. When long int is passed as the second argument, it is sent to the stack, since the AX register it needs is already occupied by the first argument. In this case, the third argument is passed via DX. Finally, if long int is the third argument from the left, it goes onto the stack. The first two arguments are passed via AX(AL) and DX(DL), respectively.
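
The "pile" is easier to follow when it is simulated. The following is only a sketch of the rules as stated above, not the compiler's actual algorithm; all of the names (b3_kind, b3_loc, borland3_assign) are hypothetical, and only 16-bit and long int arguments are modeled.

```c
#include <stddef.h>

enum b3_kind { B3_WORD, B3_LONG };
enum b3_loc  { B3_AX, B3_DX, B3_BX, B3_DX_AX, B3_STACK };

/* kinds[] holds the argument kinds in source order (left to right);
   locs[] receives the place in which each argument is passed. */
void borland3_assign(const enum b3_kind kinds[], size_t n, enum b3_loc locs[])
{
    static const enum b3_loc pile[3] = { B3_AX, B3_DX, B3_BX };
    int used[3] = { 0, 0, 0 };

    for (size_t i = 0; i < n; i++) {
        locs[i] = B3_STACK;             /* default: nothing suitable left */
        if (kinds[i] == B3_LONG) {
            /* long int needs DX:AX as a pair, or it goes to the stack */
            if (!used[0] && !used[1]) {
                used[0] = used[1] = 1;
                locs[i] = B3_DX_AX;
            }
        } else {
            /* a 16-bit argument takes the first free register in the pile */
            for (int r = 0; r < 3; r++) {
                if (!used[r]) {
                    used[r] = 1;
                    locs[i] = pile[r];
                    break;
                }
            }
        }
    }
}
```

Running it on (int, long, int) yields AX, the stack, and DX, which is the least obvious of the four cases described in the text.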

Floating-point values and far pointers are always passed via the main stack (not via the stack of the coprocessor, as common sense would tell us).

Borland C++ 5.x is similar to its predecessor, Borland C++ 3.x. However, it prefers the CX register to BX and places arguments of int and long int types in any suitable 32-bit registers, not in DX:AX. This is the result of converting the compiler from 16-bit to 32-bit mode.

Microsoft Visual C++ 4.x–6.x, when possible, passes the first argument from the left via the ECX register, the second one via the EDX register, and the rest via the stack. Floating-point values and far pointers are always transferred via the stack. The argument of the __int64 type (a nonstandard, 64-bit integer introduced by Microsoft) is always passed via the stack.

If __int64 is the first argument from the left, the second argument is passed via ECX, and the third one via EDX. If __int64 is the second argument, the first one is passed via ECX, and the third one via EDX.
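
The same kind of simulation works for the Microsoft rules. Again, this is a sketch of the description above under simplifying assumptions, with hypothetical names (ms_kind, ms_loc, msvc_fastcall_assign): 32-bit integer arguments compete for ECX and EDX, while __int64 and floating-point arguments always go to the stack without consuming a register.

```c
#include <stddef.h>

enum ms_kind { MS_INT32, MS_INT64, MS_FLOAT };
enum ms_loc  { MS_ECX, MS_EDX, MS_STACK };

/* kinds[] holds the argument kinds in source order (left to right);
   locs[] receives the place in which each argument is passed. */
void msvc_fastcall_assign(const enum ms_kind kinds[], size_t n,
                          enum ms_loc locs[])
{
    int next = 0;   /* 0: ECX is free, 1: EDX is free, 2: none left */

    for (size_t i = 0; i < n; i++) {
        if (kinds[i] == MS_INT32 && next < 2) {
            locs[i] = (next == 0) ? MS_ECX : MS_EDX;
            next++;
        } else {
            locs[i] = MS_STACK;   /* __int64, float, or registers used up */
        }
    }
}
```

For (__int64, int, int) the result is the stack, ECX, and EDX; for (int, __int64, int) it is ECX, the stack, and EDX, as stated above.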

Watcom C greatly differs from compilers from Borland and Microsoft. In particular, it doesn't support the fastcall keyword. (This results in serious compatibility problems.) By default, Watcom always passes arguments via registers. Instead of the commonly used "pile of preferences," Watcom strictly assigns a certain register to each argument: The EAX register is assigned to the first argument, EDX to the second one, EBX to the third one, and ECX to the fourth one. If it is impossible to place an argument into the specified register, this argument, and all other arguments to the right of it, are pushed onto the stack! In particular, by default, the float and double types are pushed onto the stack of the main processor, which spoils the whole thing.
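
Watcom's strict assignment differs from the "pile" in one important way, which the following sketch tries to capture: once one argument fails to fit, everything to the right of it is spilled as well. The names (wat_kind, wat_loc, watcom_default_assign) are hypothetical, and the model covers only the default behavior as described in this paragraph, with floating-point arguments treated as unplaceable.

```c
#include <stddef.h>

enum wat_kind { WAT_INT, WAT_FLOAT };
enum wat_loc  { WAT_EAX, WAT_EDX, WAT_EBX, WAT_ECX, WAT_STACK };

/* kinds[] holds the argument kinds in source order (left to right);
   locs[] receives the place in which each argument is passed. */
void watcom_default_assign(const enum wat_kind kinds[], size_t n,
                           enum wat_loc locs[])
{
    static const enum wat_loc regs[4] = { WAT_EAX, WAT_EDX, WAT_EBX, WAT_ECX };
    int spilled = 0;   /* set once any argument has gone to the stack */

    for (size_t i = 0; i < n; i++) {
        if (!spilled && i < 4 && kinds[i] == WAT_INT) {
            locs[i] = regs[i];    /* argument i owns register i */
        } else {
            spilled = 1;          /* this one and all to its right */
            locs[i] = WAT_STACK;
        }
    }
}
```

A float in the second position therefore costs the third argument its register too: (int, float, int) gives EAX, the stack, and the stack.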

The programmer may arbitrarily set his or her own order for passing arguments using the aux pragma, which has the following format: #pragma aux function_name parm [the list of registers]. The list of registers allowable for each type of argument is given in the following table.

I'll give a few explanations. First, arguments of the char type are passed via 32-bit registers, not via 8-bit ones. Second, the unexpectedly large number of possible pairs of registers for passing far pointers is striking. Third, the segment address may be passed not only via segment registers, but also via 16-bit, general-purpose registers.

Floating-point arguments can be passed via the stack of the coprocessor — just specify 8087 instead of the register name and compile the program using the -7 key (or -fpi, or -fpu87) to inform the compiler that the coprocessor's instructions are allowed. The documentation on Watcom says that arguments of the double type can also be passed via pairs of 32-bit, general-purpose registers, but I have failed to force the compiler to generate such a code. Maybe I don't know Watcom well enough, or perhaps an error occurred. I also have never encountered any program in which floating-point values have been passed via general-purpose registers. However, these are subtleties.

Thus, when analyzing programs compiled using Watcom, remember that arguments can be passed via practically any register.

Identifying arguments sent to and received from registers

Both the called and calling functions must follow conventions when passing arguments via registers. The compiler should place arguments into the registers where the called function expects them to be, rather than into those "convenient" for the compiler. As a result, before each call of a function that follows the fastcall convention, code appears that "shuffles" the contents of registers in a strictly determined manner. The manner depends on the specific compiler. The most popular methods of passing arguments were considered above. If your compiler isn't in the list (which is quite probable, since compilers spring up like mushrooms after a rain), experiment to figure out its "nature" yourself or consult its documentation. Developers rarely disclose such subtleties, not because of the desire to keep them secret, but because the documentation for each byte of the compiler wouldn't fit into a freight train.

Analyzing the code of the calling function does not help us recognize passing arguments via registers unless their initialization is evident. Therefore, we need to analyze the called function. In most cases, the registers saved on the stack just after the function receives control did not pass arguments, and we can strike them off the list of "candidates." Among the remaining registers, we need to find the ones whose contents are used without obvious initialization. At first, the function appears to receive arguments via just these registers. Upon closer examination, however, several issues emerge. First, implicit arguments of the function (the this pointer, pointers to the object virtual tables, etc.) often are passed via registers. Second, an unskilled programmer might believe the value should be equal to zero upon its declaration. If he or she forgets about initialization, the compiler places the value into the register. During program analysis, this value might be mistaken for the function argument passed via the register. Interestingly, this register accidentally may be explicitly initialized by the calling function. The programmer, for example, could call some function before this one, whose return value (placed into EAX by the compiler) wasn't used. The compiler could place the uninitialized variable into EAX. When, upon the normal completion of the execution, the function returns zero, everything may work. To catch such a bug, the code digger should analyze the algorithm and figure out whether the code of the successful function's completion is really placed into EAX, or if the variables were overwritten.

If we discard "clinical" cases, passing arguments via registers doesn't strongly complicate the analysis.

A practical investigation of the mechanism of passing arguments via registers

Let's consider the following example. Note the conditional compilation directives used for compatibility with various compilers:

Listing 84: Passing Arguments via Registers

#include <stdio.h>
#include <string.h>

#if defined(__BORLANDC__) || defined (_MSC_VER)
// This branch of the program should be compiled only by Borland C++
// or Microsoft C++ compilers that support the fastcall keyword.

__fastcall
#endif

// Next is the MyFunc function, which has various types of arguments
// for demonstrating the mechanism of passing them.
MyFunc(char a, int b, long int c, int d)
{

#if defined(__WATCOMC__)
// This branch is specially intended for Watcom C.
// The aux pragma forcefully sets the order of passing arguments
// via the following registers: EAX, ESI, EDI, EBX.
#pragma aux MyFunc parm [EAX] [ESI] [EDI] [EBX];
#endif
return a+b+c+d;
}

main()
{

printf("%x\n", MyFunc(0x1, 0x2, 0x3, 0x4));
return 0;
}

The disassembled code of this example compiled using the Microsoft Visual C++ 6.0 compiler should look like this:

Listing 85: The Disassembled Code for Passing Arguments Compiled Using Microsoft Visual C++

main proc near ; CODE XREF: start+AF↓p
push ebp
mov ebp, esp

push 4
push 3
; If you run out of registers, the arguments are pushed onto the stack
; from right to left, passed to the called function,
; then cleared from the stack by the called function (that is,
; everything is done just as if the stdcall convention were observed).

mov edx, 2
; EDX is used for passing the argument second from the left.
; It's easy to determine its type - this is int.
; It's certainly not char, and it's not a pointer.
; (A value of 2 is strange for a pointer.)

mov cl, 1
; The CL register is used for passing the argument
; first from the left (that is, of the char type -
; only variables of the char type have a size of 8 bits).

call MyFunc
; Already, we can reconstruct the prototype of the function:
; MyFunc(char, int, int, int).
; We've made a mistake by taking the long int type for int,
; but these types are identical
; in the Microsoft Visual C++ compiler.

push eax
; The result just obtained is passed to the printf function.

push offset asc_406030 ; "%x\n"
call _printf
add esp, 8
xor eax, eax
pop ebp
retn
main endp

MyFunc proc near ; CODE XREF: main+E↑p

var_8 = dword ptr -8
var_4 = byte ptr -4

arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch
; Only two arguments are passed to the function
; via the stack, and IDA successfully recognized them.

push ebp
mov ebp, esp
sub esp, 8
; This allocates 8 bytes for local variables.

mov [ebp+var_8], edx
; The EDX register was not explicitly initialized before its
; contents were loaded into the var_8 local variable.
; Therefore, it is used for passing arguments!
; This program was compiled by Microsoft Visual C++,
; and, as you probably know, it passes arguments via
; the ECX:EDX registers. Therefore, we can infer that we're
; dealing with the second-from-the-left argument
; of the function. Somewhere below, we'll probably come across
; a reference to ECX - to the first-from-the-left argument
; of the function (although not necessarily -
; the first argument might not be used by the function).

mov [ebp+var_4], cl
; Actually, the reference to CL kept us from waiting long for it.
; Since the argument of the char type is passed via CL,
; the first function argument is probably char.
; However, the function simply may be accessing
; the lower byte of the argument (for example, of the int type).
; However, looking at the code of the calling function, we can
; make sure that only char, not int, is passed to the function.
; Incidentally, note the stupidity of the compiler - was it really
; necessary to pass arguments via registers to send them
; immediately into local variables? After all, addressing
; the memory negates all the benefits of the fastcall convention!
; It's even hard to describe such a call as "fast."

movsx eax, [ebp+var_4]
; EAX is loaded with the first-from-the-left argument passed
; via CL, which is of the char type with a signed extension to
; a double word. Hence, it's signed char (that is, char,
; by default, for Microsoft Visual C++).

add eax, [ebp+var_8]
; The contents of EAX are added with the argument second from the left.

add eax, [ebp+arg_0]
; The argument third from the left, passed via the stack,
; is added to the previous sum...

add eax, [ebp+arg_4]
; ... and all this is added to the fourth argument,
; also passed via the stack.

mov esp, ebp
pop ebp
; The stack frame is closed.

retn 8
; We cleared up the stack,
; as required by the fastcall convention.
MyFunc endp

Now, let's compare this with the result of disassembling the code generated by the Borland C++ compiler.

Listing 86: The Disassembled Code for Passing Arguments Compiled Using Borland C++

; int __cdecl main(int argc, const char **argv, const char *envp)
_main proc near ; DATA XREF: DATA:00407044↓o

argc = dword ptr 8
argv = dword ptr 0Ch
envp = dword ptr 10h

push ebp
mov ebp, esp

push 4
; Arguments are passed via the stack. Glancing downward,
; we discover explicit initialization of the ECX, EDX,
; and AL registers. There were no registers left for the fourth
; argument, so it had to be passed via the stack.
; Hence, the argument fourth from the left of the function is 0x4.

mov ecx, 3
mov edx, 2
mov al, 1
; All this code can do is pass arguments via registers.
call MyFunc

push eax
push offset unk_407074 ; format
call _printf
add esp, 8

xor eax, eax

pop ebp
retn
_main endp

MyFunc proc near ;CODE XREF: _main+11↑p

arg_0 = dword ptr 8
; Only one argument has been passed to the function
; via the stack.

push ebp
mov ebp, esp
; The stack frame is opened.

movsx eax, al
; Borland has generated code that is more optimized than
; the code generated by Microsoft. Borland saved memory
; by not copying the register argument into a local variable.
; However, Microsoft Visual C++ is also capable of doing so,
; provided that the optimization key is specified.
; Also, note that Borland handles arguments in expressions
; from left to right as they are listed in the function prototype,
; whereas Microsoft Visual C++ acts in the opposite manner.

add edx, eax
add ecx, edx
; The EDX and ECX registers haven't been initialized.
; Hence, the arguments were passed to the function via them.

mov edx, [ebp+arg_0]
; EDX is loaded with the last function argument
; passed via the stack...

add ecx, edx
; ... summed again,

mov eax, ecx
; ... and passed to EAX. (EAX is the register in which
; the function places the result of its execution.)

pop ebp
retn 4
; The stack is cleared.

MyFunc endp

And last, the result of disassembling the same example compiled with Watcom C should look like this.

Listing 87: The Disassembled Code for Passing Arguments via Registers Compiled Using Watcom C

main_ proc near ; CODE XREF: __CMain+40↓p
push 18h
call __CHK
; Checking for stack overflow.

push ebx
push esi
push edi
; The registers are pushed onto the stack.

mov ebx, 4
mov edi, 3
mov esi, 2
mov eax, 1
; The arguments are passed via the registers we specified!
; Note that the first argument of the char type
; is passed via the 32-bit EAX register.
; Watcom's behavior significantly complicates
; the reconstruction of function prototypes. In this case,
; the values are placed into the registers in the order
; in which the arguments were declared in the function prototype,
; beginning from the right. Alas, this happens relatively rarely.

call MyFunc

push eax
push offset unk_420004
call printf_

add esp, 8
xor eax, eax
pop edi
pop esi
pop ebx
retn
main_ endp

MyFunc proc near ; CODE XREF: main_+21↑p
; The function doesn't receive even a single argument from the stack.

push 4
call __CHK

and eax, 0FFh
; Zeroing the higher 24 bits and referencing the register before
; initializing it suggests that the char type is passed via EAX.
; Unfortunately, we can't yet say which argument it is.

add esi, eax
; The ESI register has not been initialized by our function.
; Therefore, an argument of the int type is transferred via it.
; We can assume it's the argument second from the left
; in the function prototype, since the registers in the calling
; function are initialized in the order in which they are listed
; in the prototype (if nothing hinders this beginning from
; the right), and expressions are calculated from left to right.
; The original order of arguments is not crucial, but,
; it's nice if we succeed in determining it.

lea eax, [esi+edi]
; Oops! Do you believe that
; the pointer is being loaded into EAX? And that ESI and EDI,
; passed to the function, are also pointers? EAX with its char
; type becomes similar to an index. Alas! The Watcom compiler
; is too artful, and it's easy to run into gross errors when
; analyzing programs compiled using it. Yes, EAX is a pointer in
; the sense that LEA is used to calculate the sum of ESI and EDI.
; But neither the calling function nor the called one access
; the memory by this pointer. Therefore, the function
; arguments are constants, rather than pointers!

add eax, ebx
; Similarly, EBX contains the argument that was passed
; to the function.
; The function prototype should look like this:

; MyFunc(char a, int b, int c, int d)
; However, the order of arguments might differ.

retn
MyFunc endp

As you can see, passing arguments via registers isn't especially complex; it's even possible to reconstruct the original prototype of the called function. However, we've considered a rather idealized situation. In real programs, passing immediate values only is rarely done. Now, having mastered fastcall, let's disassemble a more difficult example.

Listing 88: A Difficult fastcall Example

#if defined(__BORLANDC__) || defined (_MSC_VER)
__fastcall
#endif
MyFunc(char a, int *b, int c)
{
#if defined(__WATCOMC__)
#pragma aux MyFunc parm [EAX] [EBX] [ECX];
#endif
return a+b[0]+c;
}

main()
{
int a=2;
printf("%x\n", MyFunc (strlen("1"), &a, strlen("333")));
}

The result of disassembling the compiled code of this example should look like this:

Listing 89: The Disassembled Code of the Difficult fastcall Example

main proc near ; CODE XREF: start+AF↓p

var_4 = dword ptr -4

push ebp
mov ebp, esp
; The stack frame is opened.

push ecx
push esi
; The registers are pushed onto the stack.

mov [ebp+var_4], 2
; A value of 2 is placed into the var_4 local variable.
; The type is determined from the fact that the variable occupies
; 4 bytes. (See the "Local Stack Variables" section
; for more details.)

push offset a333 ; const char *
; A pointer to the "333" string is passed to the strlen function.
; The arguments of MyFunc are passed from right to left as required.

call _strlen
add esp, 4
push eax
; Here, the value returned by the function is either saved
; onto the stack or passed to the next function.

lea esi, [ebp+var_4]
; The pointer to the var_4 local variable is placed into ESI.

push offset a1 ; const char *
; The pointer to the "1" string is passed to the strlen function.

call _strlen
add esp, 4

mov cl, al
; The returned value is copied to the CL register, and EDX is
; initialized. Since ECX:EDX are used for passing arguments to
; fastcall functions, the initialization of these two registers
; prior to calling the function is not accidental!
; We can assume that the leftmost argument of the char type
; is transferred via CL.

mov edx, esi
; ESI contains the pointer to var_4. Therefore, the second
; argument of the int type, placed into EDX, is passed by reference.

call MyFunc
; The preliminary function prototype looks like this:
; MyFunc(char a, int *b, int c)
; Where did the c argument come from? Do you remember the code
; in which EAX was pushed onto the stack? Neither before nor after
; the function call was it popped out! To be sure of this,
; we need to see how many bytes the called function removes
; from the stack. Another interesting fact is that the values
; returned by the strlen function were not assigned to
; local variables, but were directly passed to MyFunc.
; This suggests that the source code of the
; program looked like this:
; MyFunc(strlen("1"),&var_4,strlen("333"));
; This is not necessarily the case - the compiler might optimize the
; code, throwing out the local variable if it isn't used anymore.
; However, judging from the code of the called function, the
; compiler works without optimization. In addition, if the values
; returned by the strlen functions are used only once as arguments
; of MyFunc, assigning them to local variables simply
; obscures the essence of the program. Moreover, for a code digger,
; it's more important to understand the algorithm of a program
; than to restore its source code.

push eax
push offset asc_406038 ; "%x\n"
call _printf
add esp, 8

pop esi

mov esp, ebp
pop ebp
; The stack frame is closed.

retn
main endp

MyFunc proc near ; CODE XREF: main+2E↑p

var_8 = dword ptr -8
var_4 = byte ptr -4
arg_0 = dword ptr 8
; The function accepts one argument.
; Hence, EAX has been pushed onto the stack.

push ebp
mov ebp, esp
; The stack frame is opened.

sub esp, 8
; This allocated 8 bytes for local variables.

mov [ebp+var_8], edx
; Since EDX is used without explicit initialization,
; the function argument second from the left is passed
; via it (according to the fastcall convention of the Microsoft
; Visual C++ compiler). Having analyzed the code of the calling
; function, we know that EDX contains the pointer to var_4.
; Therefore, var_8 contains the pointer to var_4.

mov [ebp+var_4], cl
; The leftmost argument of the function is passed via CL,
; and then immediately placed into the var_4 local variable.

movsx eax, [ebp+var_4]
; var_4 is extended to signed int.

mov ecx, [ebp+var_8]
; ECX is loaded with the pointer stored in var_8, that is, with the
; pointer to the caller's local variable that arrived via EDX.

add eax, [ecx]
; EAX, which stores the first-from-the-left argument of the function,
; is added with the contents of the memory location referenced
; by the ECX pointer.

add eax, [ebp+arg_0]
; Here is a reference to the function argument
; that was passed via the stack.

mov esp, ebp
pop ebp
; The stack frame is closed.

retn 4
; One argument was passed to the function via the stack.
MyFunc endp

Simple? Yes, it is! Then let's consider the result of creative work with Borland C++, which should look like this:

Listing 90: The Disassembled Code for fastcall Compiled Using Borland C++

; int __cdecl main(int argc, const char **argv, const char *envp)
_main proc near ; DATA XREF: DATA:00407044↓o

var_4 = dword ptr -4
argc = dword ptr 8
argv = dword ptr 0Ch
envp = dword ptr 10h

push ebp
mov ebp, esp
; The stack frame is opened.

push ecx
; ECX is saved... Just a moment! This is something new! In previous
; examples, Borland never saved ECX when entering a function.
; It seems likely that some argument has been passed to the
; function via ECX, and this function is passing it
; to other functions via the stack. However convincing
; such a solution might look, it's incorrect! The compiler simply
; allocates 4 bytes for local variables. Why? How did we determine this?
; Look: IDA recognized one local variable - var_4. But memory
; for it was not explicitly allocated. In any case, there was
; no SUB ESP, 4 instruction. But wait: PUSH ECX results in a decrease
; of the ESP register by four! Oh, this optimization...

mov [ebp+var_4], 2
; A value of 2 is placed into a local variable.

push offset a333 ; s
; A pointer to the "333" string is passed to the function.

call _strlen
pop ecx
; The argument is popped from the stack.

push eax
; Here we are either passing the value
; returned by the strlen function
; to the following function as the stack argument,
; or we are temporarily saving EAX onto the stack.
; (Later, it will become clear that the latter assumption is true.)

push offset a1 ; s
; The pointer to the "1" string is passed to the strlen function.

call _strlen
pop ecx
; The argument is popped from the stack.

lea edx, [ebp+var_4]
; The offset of the var_4 local variable is loaded into EDX.

pop ecx
; Something is popped from the stack, but what exactly? Scrolling
; the screen of the disassembler upward, we find that EAX was pushed
; last onto the stack and contained the value returned by the strlen
; ("333") function. It is now located in the ECX register.
; (Borland passes the argument third from the left via it.)
; Incidentally, a note for fastcall fans: fastcall
; doesn't always provide the anticipated call acceleration -
; Intel 80x86 doesn't have enough registers, and they continually
; need to be saved onto the stack. Passing an argument via
; the stack would require only one reference to memory: PUSH EAX.
; Here we have two - PUSH EAX and POP ECX!

call MyFunc
; When reconstructing the function prototype, don't forget about
; the EAX register - it's not initialized explicitly,
; but it stores the value returned by the last call of strlen.
; Since the Borland C++ 5.x compiler
; uses the preferences EAX, EDX, and ECX, we can conclude
; that the function argument first from the left is passed to EAX,
; and the other two arguments - to EDX and ECX, respectively.
; Note that Borland C++, unlike Microsoft Visual C++,
; doesn't handle arguments in the order in which they appear in the list.
; Instead, it computes the values of all functions, "pulling" them
; out from right to left, then proceeds to variables and constants.
; This stands to reason: Functions change the values of many
; general-purpose registers. Until the last function is called,
; the passing of arguments via registers should not begin.

push eax
push offset asc_407074 ; format
call _printf
add esp, 8

xor eax, eax
; The zero value is returned.

pop ecx
pop ebp
; The stack frame is closed.

retn
_main endp

MyFunc proc near ; CODE XREF: _main+26↑p
push ebp
mov ebp, esp
; The stack frame is opened.

movsx eax, al
; EAX is extended to the signed double word.

mov edx, [edx]
; EDX is loaded with the contents of the memory location
; referenced by the EDX pointer.

add eax, edx
; The first argument of the function is added to the variable
; of the int type, passed by reference as the second argument.

add ecx, eax
; The third argument of the int type is added to the previous sum.

mov eax, ecx
; The result is placed back into EAX.
; What a stupid compiler this is! Wouldn't it be simpler
; to swap the arguments in the previous instruction?

pop ebp
; The stack frame is closed.

retn
MyFunc endp

Now let's consider the disassembled code of the same example compiled using Watcom C, which always has something new to teach us.

Listing 91: The Disassembled Code for fastcall Compiled Using Watcom C

main_ proc near ; CODE XREF: __CMain+40↓p

var_C = dword ptr -0Ch

push 18h
call __CHK
; The stack overflow is checked.

push ebx
push ecx
; The registers being modified are saved -
; or maybe memory is allocated for local variables?

sub esp, 4
; This is certainly the allocation of memory for one local variable.
; Therefore, the two PUSH instructions we saw above
; save the registers.

mov [esp+0Ch+var_C], 2
; A value of 2 is placed into the local variable.

mov eax, offset a333; "333"
call strlen_
; Note that Watcom passes the pointer to the string
; to the strlen function via the register!

mov ecx, eax
; The value returned by the function is copied into the ECX register.
; Watcom knows that the next call of strlen won't spoil this register!

mov eax, offset a1 ; "1"
call strlen_

and eax, 0FFh
; Since strlen returns the int type, here we have
; an explicit type conversion: int -> char.

mov ebx, esp
; EBX is loaded with the pointer to the var_C variable.

call MyFunc
; Which arguments were passed to the function?
; EAX (probably the leftmost argument), EBX (explicitly
; initialized prior to calling the function), and probably ECX
; (although this is not necessarily the case).
; ECX might contain a register variable, but in that case
; the called function should not access it.

push eax
push offset asc_42000A ; "%x\n"

call printf_

add esp, 8
add esp, 4
; And they say Watcom is an optimizing compiler!
; It can't even unite two instructions into one!
pop ecx
pop ebx

retn
main_ endp

MyFunc proc near ; CODE XREF: main_+33↑p
push 4
call __CHK
; The stack is checked.

and eax, 0FFh
; The 24 higher bits are zeroed repeatedly. It would not be bad
; if Watcom were more certain about where to perform this
; operation - in the called function or in the calling one.
; However, such doubling simplifies the reconstruction
; of the function prototypes.

add eax, [ebx]
; EAX of type char, now extended to int, is added with the
; variable of the int type passed by reference via the EBX register.

add eax, ecx
; Aha! Here is the reference to ECX. We now know
; that this register was used for passing arguments.

retn
; The function prototype should look like this:
; MyFunc (char EAX, int *EBX, int ECX)
; Notice that it was possible to reconstruct it
; only by performing the combined analysis
; of the called and calling functions!

MyFunc endp

Passing floating-point values

Most code breakers don't know the particulars of floating-point arithmetic and avoid it like the plague. There's nothing terribly complex about it, and mastering the coprocessor takes only a couple of days. However, it's much more difficult to master the mathematical libraries that emulate floating-point calculations (especially if IDA doesn't recognize the names of the library functions). But what contemporary compiler makes use of such libraries? The microprocessor and coprocessor are integrated within the same chip. Therefore, the coprocessor, starting from 80486DX (if my memory doesn't fail me), is always available; there's no need to programmatically emulate it.

Until the end of the 1990s, many hackers thought it possible to live their entire lives without coming across floating-point arithmetic. Indeed, in the good old days, processors were as slow as turtles, few people had coprocessors, and the tasks computers had to solve allowed hackers to use tricks and solutions that employed integer arithmetic.

Today, everything has changed. Floating-point calculations performed by the coprocessor at the same time as the execution of the main program are completed even faster than the integer calculations processed by the main processor. Programmers, inspired by such prospects, began to use floating-point data types even where integer ones had been more than sufficient. Contemporary code diggers can hardly do without knowledge of coprocessor instructions.

80x87 coprocessors support three types of floating-point data: short 32-bit, long 64-bit, and extended 80-bit. These correspond to the following types of the C language: float, double, and long double[i] (see Table 8).

Arguments of the float and double types can be passed to the function in three ways: via general-purpose registers of the main processor, via the stack of the main processor, or via the stack of the coprocessor. Arguments of the long double type require too many general-purpose registers to be passed using this method. In most cases, they are pushed onto the stack of the main processor or that of the coprocessor.

The first two ways are already familiar to us, but the third one is something new! The 80x87 coprocessor has eight 80-bit registers (designated as ST(0), ST(1), ST(2), ST(3), ST(4), ST(5), ST(6), and ST(7)) organized as a wraparound stack. This means that most of the coprocessor instructions don't operate with the register indexes; their destination is the top of the stack. For example, to add two floating-point numbers, we need to push them onto the stack of the coprocessor. Then we must call the addition instruction that adds the two numbers lying on the top of the stack and returns the result via the stack again. We have the option of adding the number that lies on the stack of the coprocessor to the number located in the RAM, but it's impossible to directly add two numbers located in the RAM!

Thus, the first stage of floating-point operations is pushing the operands onto the coprocessor stack. This operation is performed by the instructions of the FLDxx series (listed with brief explanations in Table 9). In most cases, we use the FLD source instruction, which pushes a floating-point number from the RAM or the coprocessor register onto the coprocessor stack. Strictly speaking, this is not one instruction; it's four instructions in one package, which have the opcodes 0xD9 0x0?, 0xDD 0x0?, 0xDB 0x0?, and 0xD9 0xCi for loading the short, long, and extended-real values and the FPU register, respectively. The ? character is an address field that specifies whether the operand is in the register or in memory, and the i character is an index of the FPU register.

The impossibility of loading floating-point numbers from CPU registers makes it senseless to use them for passing float, double, or long double arguments. In any case, to push these arguments onto the coprocessor stack, the called function would have to copy the contents of registers to the RAM. No matter what you do, there's no way to get rid of memory calls. Therefore, passing floating-point types via registers is rarely done. They are passed predominantly via the CPU stack or via the coprocessor stack along with the usual arguments. (This can be done only by advanced compilers — Watcom, in particular — and not by Microsoft Visual C++ or Borland C++.)

However, certain "peculiar" values can be loaded without addressing the memory; in particular, there are instructions for pushing numbers (zero, one, π, and certain others — the complete list is given in Table 9) onto the coprocessor stack.

An interesting feature of the coprocessor is support for integer calculations. I don't know of any compiler that uses this capability, but sometimes it's used in assembly inserts; therefore, it's unwise to neglect learning the integer coprocessor instructions.

The double and long double types occupy more than one word, and transferring them via the CPU stack takes several iterations. As a result, we can't always determine the type and number of arguments passed to the called function by analyzing the code of the calling function. Instead, investigate the algorithm of the called function. Since the coprocessor can't determine the type of the operand located in the memory (that is, the coprocessor doesn't know how many bytes it occupies), a separate instruction is assigned to each type. The assembler syntax hides these distinctions, allowing the programmer to ignore the subtleties of implementation. (Nevertheless, some people say that the assembler is a low-level language.) Few people know that FADD [float] and FADD [double] are different machine instructions having the opcodes 0xD8 ??000??? and 0xDC ??000???, respectively. Analyzing the disassembled listing doesn't give us any information on the floating-point types; to obtain this information, we need to get down to the machine level and sink our teeth into hexadecimal dumps of instructions.

Table 10 presents the opcodes of the main coprocessor instructions that work with memory. Note that performing arithmetic operations directly over floating-point values of the long double type is impossible; they must first be loaded onto the coprocessor stack.

(The second byte of the opcode is presented in binary form. The ? character denotes any bit.)

A note on floating-point types of the Turbo Pascal language

Since the C language is machine-oriented, its floating-point types coincide with the coprocessor floating-point types. The main floating-point type of Turbo Pascal is Real; it occupies 6 bytes, which is not "native" to the computer. Therefore, for calculations carried out using the coprocessor, Real is programmatically converted to the Extended type (long double in terms of C). This takes up the lion's share of the performance. Unfortunately, the built-in mathematical library, intended to replace the coprocessor, does not support other types. When a "live" coprocessor is available, pure coprocessor types (Single, Double, Extended, and Comp) appear that correspond to float, double, long double, and __int64.

The mathematical library functions that provide support for floating-point calculations receive floating-point arguments from the registers. The first argument from the left is placed into AX, BX, DX; the second argument, if there is one, is placed into CX, SI, DI. The system functions that implement the interface to the processor (in particular, the functions for converting the Real type into the Extended type) receive arguments from registers and return the result via the coprocessor stack. Finally, the application functions and procedures receive floating-point arguments from the CPU stack.

Depending on the settings of the compiler, the program may be compiled either using the built-in mathematical library (the default), or by employing direct calls of the coprocessor instructions. (This is the /$N+ key.) In the first case, the program doesn't use the coprocessor's capabilities, even though it's installed in the computer. In the second case, if the coprocessor is available, the compiler uses its computational capabilities; if the coprocessor isn't available, any attempt to call a coprocessor instruction results in the generation of the int 0x7 exception by the main processor. This will be caught by the software coprocessor emulator, which is essentially the same built-in library that supports floating-point calculations.

Now that you have a general outline of how floating-point arguments are passed, you are burning with the desire to see it "live," right? To begin with, let's consider a simple example.

Listing 92: Passing Floating-Point Arguments to a Function

#include <stdio.h>

float MyFunc(float a, double b)
{
#if defined (__WATCOMC__)
#pragma aux MyFunc parm [8087];
// To be compiled using the -7 key
#endif
return a+b;
}

main()
{
printf("%f\n", MyFunc(6.66, 7.77));
}

The disassembled listing of this code, compiled with Microsoft Visual C++, should look as follows:

Listing 93: The Disassembled Code for Passing Floating-Point Arguments

main proc near ; CODE XREF: start+AF↓p

var_8 = qword ptr -8
; A local variable, this is likely to occupy 8 bytes.

push ebp
mov ebp, esp
; The stack frame is opened.

push 401F147Ah
; Unfortunately, IDA can't represent an operand as a floating-point
; number. Besides which, we can't determine
; whether or not this number is floating-point.
; It can be of any type: either int or a pointer.

push 0E147AE14h
push 40D51EB8h
; A draft of the prototype looks like this:
; MyFunc(int a, int b, int c)

call MyFunc
add esp, 4
; Here we go! Only one machine word is taken from the stack,
; whereas three words are pushed there!

fstp [esp+8+var_8]
; A floating-point number is pulled from the coprocessor stack.
; To find out which one, we need to press +,
; select Text representation from the pop-up menu,
; choose the Number of opcode bytes item, and enter
; the number of characters for opcode instructions (4, for example).
; To the left of FSTP, its machine representation -
; DD 1C 24 - appears. Using Table 10, we can determine the data
; type with which this instruction works. It's double.
; Therefore, the function has returned a floating-point value
; via the coprocessor stack.
; Since the function returns floating-point values, it's possible
; that it receives them as arguments. We can't confirm this
; assumption without carrying out an analysis of MyFunc.

push offset aF ; "%f\n"
; A pointer is passed to the format specification string,
; which orders the printf function to output one floating-point number.
; But we're not placing it in the stack!
; How can this be? Let's scroll the disassembler window
; upward while thinking over the ways of solving the problem.
; Closely examining the FSTP [ESP+8+var_8] instruction,
; let's figure out where it places the result of its work.
; IDA has determined var_8 as qword ptr -8. Therefore, [ESP+8-8] is
; the equivalent of [ESP] - that is, the floating-point variable
; is pushed directly onto the top of the stack.
; And what's on the top? Two arguments that were passed
; to MyFunc and not popped off the stack.
; What an artful compiler! It hasn't bothered to create
; a local variable, and it used the function arguments
; to temporarily store data!

call _printf
add esp, 0Ch
; Three machine words are popped off the stack.

pop ebp
retn
main endp

MyFunc proc near ; CODE XREF: sub_401011+12↑p

var_4 = dword ptr -4
arg_0 = dword ptr 8
arg_4 = qword ptr 0Ch
; IDA detected only two arguments, while three machine words
; were passed to the function! One of the arguments is likely to
; occupy 8 bytes.

push ebp
mov ebp, esp
; The stack frame is opened.

push ecx
; No, this is not saving ECX - it's allocating memory for
; a local variable, since the var_4 variable is
; where the saved ECX is located.

fld [ebp+arg_0]
; The floating-point variable at the [ebp+8] address
; (the leftmost argument),
; is pushed onto the coprocessor stack.
; To learn the type of this variable, let's look at the opcode
; of the FLD instruction - D9 45 08. Aha! D9 - hence, float.
; It turns out that the argument first from the left is float.

fadd [ebp+arg_4]
; The float type arg_0 is added to the argument
; second from the left of the type...
; If the first argument is float, should the second one
; also be float? Not necessarily! Let's peep into the opcode -
; it's DC 45 0C. Hence, the second argument is double, not float!

fst [ebp+var_4]
; The value from the top of the coprocessor stack (where the result
; of addition is located) is copied into the var_4 local
; variable. Why? We suddenly might need it.
; The value is not popped off, but copied! It still remains
; in the stack. Thus, the prototype of MyFunc looked like
; this: double MyFunc (float a, double b);

mov esp, ebp
pop ebp
; The stack frame is closed.

retn

MyFunc endp
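
IDA shows the constants pushed in Listing 93 as raw hexadecimal numbers. To check the guess that they are floating-point values, the bit patterns can be reinterpreted outside the disassembler. The following is only a sketch: it assumes that float and double use the 32-bit and 64-bit IEEE 754 formats and share the byte order of integers (true for x86), and the helper names are invented for illustration.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>

// Reinterpret a 32-bit pattern as an IEEE 754 single-precision value.
float bits_to_float(std::uint32_t bits)
{
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Reinterpret two 32-bit halves as an IEEE 754 double-precision value.
// 'high' is the dword pushed first; it ends up at the higher address.
double halves_to_double(std::uint32_t high, std::uint32_t low)
{
    std::uint64_t bits = (static_cast<std::uint64_t>(high) << 32) | low;
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Hypothetical helper: prints the three constants pushed in Listing 93.
void show_listing_93_constants()
{
    std::printf("%f\n", bits_to_float(0x40D51EB8u));
    std::printf("%f\n", halves_to_double(0x401F147Au, 0xE147AE14u));
}
```

Under those assumptions, the two values should come out close to 6.66 and 7.77, the arguments used in Listing 92. The single dword is therefore the float argument, and the pair of dwords is the double.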

The result of compiling with Borland C++ 5.x is almost identical to that of the example we just considered using Microsoft Visual C++ 6.x. Therefore, we'll proceed to the analysis of an example compiled using Watcom C (as always, Watcom is highly instructive).

Listing 94: The Disassembled Code for Passing Floating-Point Arguments Compiled Using Watcom

main_ proc near ; CODE XREF: __CMain+40↓p

var_8 = qword ptr -8
; A local variable, this is likely to occupy 8 bytes.

push 10h
call __CHK
; Checking for stack overflow

fld ds:dbl_420008
; A variable of the double type taken from the data segment
; is loaded onto the top of the coprocessor stack. IDA has
; successfully determined the variable type,
; having added the dbl prefix to it.
; If IDA hadn't determined the type, we would have to examine
; the opcode of the FLD instruction.

fld ds:flt_420010
; A variable of the float type is loaded onto the top of the
; stack.

call MyFunc
; MyFunc is called, and two arguments are passed via
; the coprocessor stack. The prototype looks like this:
; MyFunc(float a, double b).

sub esp, 8
; Memory is allocated for a local variable of 8 bytes.

fstp [esp+8+var_8]
; A floating-point variable of the double type is popped off the top
; of the stack. The type is determined by the variable's size.

push offset unk_420004
call printf_
; This is a trick we already know - passing var_8
; to the printf function!

add esp, 0Ch
retn
main_ endp

MyFunc proc near ; CODE XREF: main_+16↑p

var_C = qword ptr -0Ch
var_4 = dword ptr -4
; IDA has found two local variables.

push 10h
call __CHK

sub esp, 0Ch
; Space is allocated for local variables.

fstp [esp+0Ch+var_4]
; A floating-point variable of the float type
; is popped off the top of the coprocessor stack.
; (As you may recall, it was placed there last.)
; Let's make sure of this by peeping into the opcode
; of the FSTP instruction. It is D9 5C 24 08.
; If 0xD9 is there, then it's float.

fstp [esp+0Ch+var_C]
; A floating-point variable of the double type
; is popped off the top of the coprocessor stack.
; (As you may remember, it was placed there before float.)
; To be on the safe side,
; check the opcode of the FSTP instruction. It is DD 1C 24.
; If 0xDD is there, it must be double.

fld [esp+0Ch+var_4]
; The float is pushed back onto the top of the stack...

fadd [esp+0Ch+var_C]
; ... and added to double. They dare to say that Watcom C
; is an optimizing compiler! It's difficult to agree with this
; when the compiler doesn't know that swapping the terms
; doesn't change the sum!

add esp, 0Ch
; Memory that was allocated for local variables is released.

retn
MyFunc endp

dbl_420008 dq 7.77 ; DATA XREF: main_+A↑r
flt_420010 dd 6.6599998 ; DATA XREF: main_+10↑r
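
The opcode checks made in the comments of Listings 93 and 94 can be collected into a small lookup routine. This sketch covers only the memory forms of the x87 instructions met so far (FLD, FST, FSTP, and the arithmetic group that includes FADD); it does not replace a full opcode table, and the enumeration and function names are my own.

```cpp
#include <cstdint>

enum X87Type { X87_UNKNOWN, X87_FLOAT, X87_DOUBLE, X87_LONG_DOUBLE };

// Classifies the memory operand of an x87 instruction from its first
// opcode byte and the ModR/M byte that follows it.
X87Type x87_real_operand_type(std::uint8_t opcode, std::uint8_t modrm)
{
    unsigned mod = modrm >> 6;          // 3 means a register operand
    unsigned reg = (modrm >> 3) & 7;    // opcode extension (/digit)

    if (mod == 3)
        return X87_UNKNOWN;             // no memory operand to classify

    switch (opcode) {
    case 0xD8:                          // FADD, FMUL, FSUB, ... with m32real
        return X87_FLOAT;
    case 0xDC:                          // FADD, FMUL, FSUB, ... with m64real
        return X87_DOUBLE;
    case 0xD9:                          // FLD /0, FST /2, FSTP /3 with m32real
        return (reg == 0 || reg == 2 || reg == 3) ? X87_FLOAT : X87_UNKNOWN;
    case 0xDD:                          // FLD /0, FST /2, FSTP /3 with m64real
        return (reg == 0 || reg == 2 || reg == 3) ? X87_DOUBLE : X87_UNKNOWN;
    case 0xDB:                          // FLD /5, FSTP /7 with m80real
        return (reg == 5 || reg == 7) ? X87_LONG_DOUBLE : X87_UNKNOWN;
    default:
        return X87_UNKNOWN;             // integer forms and others: not covered
    }
}
```

For instance, the byte pairs DD 1C, D9 45, and DC 45 quoted in the listings classify as double, float, and double, respectively.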

Now comes Turbo Pascal for Windows 1.0. Let's enter the following example in the text editor:

Listing 95: Passing Floating-Point Values Using Turbo Pascal

USES WINCRT;

Procedure MyProc(a:Real);
begin
WriteLn(a);
end;

VAR
a: Real;
b: Real;

BEGIN
a:=6.66;
b:=7.77;
MyProc(a+b);
END.

Now, we'll compile it without coprocessor support. (This is the default.)

Listing 96: The Disassembled Code for Passing Floating-Point Values

PROGRAM proc near

call INITTASK
call @__SystemInit$qv ; __SystemInit (void)
; The System unit is initialized.

call @__WINCRTInit$qv ; __WINCRTInit(void)
; The WINCRT unit is initialized.

push bp
mov bp, sp
; The stack frame is opened.

xor ax, ax
call @__StackCheck$q4Word ; Stack overflow check (AX)
; This checks if there are at least 0 free bytes in the stack.

mov word_2030, 0EC83h
mov word_2032, 0B851h
mov word_2034, 551Eh
; A variable of the Real type is initialized.
; We know that it's Real
; only from the source code of the program.
; It's impossible to visually distinguish this series
; of instructions from three variables of the Word type.

mov word_2036, 3D83h
mov word_2038, 0D70Ah
mov word_203A, 78A3h
; Another variable of the Real type is initialized.
mov ax, word_2030
mov bx, word_2032
mov dx, word_2034
mov cx, word_2036
mov si, word_2038
mov di, word_203A
; Two variables of the Real type are passed via registers.

call @$brplu$q4Realt1 ; Real(AX:BX:DX)+= Real(CX:SI:DI)
; Fortunately, IDA recognized the addition operator
; in this function. It has even prompted us as to its prototype.
; If IDA hadn't helped us, it would be difficult to understand
; what this long and intricate function does.

push dx
push bx
push ax
; The returned value is passed to the MyProc procedure via
; the stack. Consequently, the MyProc prototype looks like this:
; MyProc(a:Real).

call MyProc

pop bp
; The stack frame is closed.

xor ax, ax
call @Halt$q4Word ; Halt(Word)
; The program's execution is halted.

PROGRAM endp

MyProc proc near ; CODE XREF: PROGRAM+5C↑p

arg_0 = word ptr 4
arg_2 = word ptr 6
arg_4 = word ptr 8
; The three arguments passed to the procedure,
; as we have already clarified, represent three "sections"
; of one argument of the Real type.

push bp
mov bp, sp
; The stack frame is opened.

xor ax, ax
call @__StackCheck$q4Word ; Stack overflow check (AX)
; Are there 0 bytes in the stack?

mov di, offset unk_2206
push ds
push di
; The pointer to the string output buffer is pushed onto the stack.

push [bp+arg_4]
push [bp+arg_2]
push [bp+arg_0]
; All three received arguments are pushed onto the stack.

mov ax, 11h
push ax
; The output width is 17 characters.

mov ax, 0FFFFh
push ax
; The number of digits after the point is maximal.

call @Write$qm4Text4Real4Wordt3
; Write(var f; v: Real; width, decimals: Word)
; The floating-point number is output into the unk_2206 buffer.

call @WriteLn$qm4Text ; WriteLn(var f: Text)
; The string is sent from the buffer to the display.

call @__IOCheck$qv ; Exit if error
pop bp
retn 6
MyProc endp
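
The three words that Listing 96 stores into word_2030 through word_2034 look arbitrary, but they can be decoded by hand. The sketch below assumes the usual layout of the 6-byte Real: the lowest byte holds an exponent biased by 129, the top bit of the highest byte is the sign, and the 39 bits in between are the fraction of a mantissa with an implied leading 1. The function name is hypothetical.

```cpp
#include <cstdint>

// Converts a 6-byte Turbo Pascal Real, given as its three 16-bit words
// in memory order (lowest address first), to a double.
double real48_to_double(std::uint16_t w0, std::uint16_t w1, std::uint16_t w2)
{
    unsigned exponent = w0 & 0xFFu;            // byte 0: biased exponent
    if (exponent == 0)
        return 0.0;                            // exponent 0 encodes zero

    bool negative = (w2 & 0x8000u) != 0;       // top bit of byte 5: sign

    // 39 fraction bits: 15 from w2, 16 from w1, 8 from the high byte of w0.
    std::uint64_t fraction =
        (static_cast<std::uint64_t>(w2 & 0x7FFFu) << 24) |
        (static_cast<std::uint64_t>(w1) << 8) |
        static_cast<std::uint64_t>(w0 >> 8);

    // The mantissa is 1.fraction; 549755813888 is 2 to the power 39.
    double value = 1.0 + static_cast<double>(fraction) / 549755813888.0;

    // Scale by 2^(exponent - 129) with plain loops, avoiding <cmath>.
    for (int e = static_cast<int>(exponent) - 129; e > 0; --e) value *= 2.0;
    for (int e = static_cast<int>(exponent) - 129; e < 0; ++e) value /= 2.0;

    return negative ? -value : value;
}
```

Feeding it the words 0EC83h, 0B851h, 551Eh gives approximately 6.66, and 3D83h, 0D70Ah, 78A3h gives approximately 7.77, which agrees with the assignments in Listing 95.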

Now, using the /$N+ key, let's put the coprocessor instructions into action, and see how this will affect the code.

Listing 97: The Disassembled Code for Passing Floating-Point Values Compiled to Use Coprocessor Instructions

PROGRAM proc near

call INITTASK
call @__SystemInit$qv ; __SystemInit(void)
; The System unit is initialized.

call @__InitEM86$qv ; Initialize software emulator
; The coprocessor emulator is turned on.

call @__WINCRTInit$qv ; __WINCRTInit(void)
; The WINCRT unit is initialized.

push bp
mov bp, sp
; The stack frame is opened.

xor ax, ax
call @__StackCheck$q4Word ; Stack overflow check (AX)
; Checking for stack overflow

mov word_21C0, 0EC83h
mov word_21C2, 0B851h
mov word_21C4, 551Eh
mov word_21C6, 3D83h
mov word_21C8, 0D70Ah
mov word_21CA, 78A3h
; We're not yet able to determine the type of initialized
; variables. They could be Word or Real.

mov ax, word_21C0
mov bx, word_21C2
mov dx, word_21C4
call @Extended$q4Real ; Convert Real to Extended
; Now we transfer word_21C0, word_21C2, and word_21C4
; to the function that converts Real to Extended,
; loading the latter to the coprocessor stack. Therefore,
; word_21C0 through word_21C4 is a variable of the Real type.

mov ax, word_21C6
mov bx, word_21C8
mov dx, word_21CA
call @Extended$q4Real ; Convert Real to Extended
; Similarly, the word_21C6 through word_21CA variable is of the Real type.

wait
; Now we wait for the coprocessor to finish its work.

faddp st(1), st
; Two numbers of the Extended type that are located on the top
; of the coprocessor stack are added;
; the result is saved on the same stack.

call @Real$q8Extended
; Extended is converted to Real.
; The argument is passed via the coprocessor stack
; and returned into the AX, BX, and DX registers.

push dx
push bx
push ax
; The AX, BX, and DX registers contain a value of the Real type.
; Therefore, the procedure prototype looks like this:
; MyProc(a:Real);

call MyProc

pop bp
xor ax, ax
call @Halt$q4Word ; Halt(Word)
PROGRAM endp

MyProc proc near ; CODE XREF: PROGRAM+6D↑p

arg_0 = word ptr 4
arg_2 = word ptr 6
arg_4 = word ptr 8
; As we already know, these three arguments are actually
; one argument of the Real type.

push bp
mov bp, sp
; The stack frame is opened.

xor ax, ax
call @__StackCheck$q4Word ; Stack overflow check (AX)
; Checking for stack overflow

mov di, offset unk_2396
push ds
push di
; The pointer to the string output buffer is pushed onto the stack.

mov ax, [bp+arg_0]
mov bx, [bp+arg_2]
mov dx, [bp+arg_4]
call @Extended$q4Real
; Real is converted to Extended.

mov ax, 17h
push ax
; The output width is 0x17 characters.

mov ax, 0FFFFh
push ax
; This is for the number of digits after the decimal point.
; Everything we have is to be outputted.

call @Write$qm4Text8Extended4Wordt3
; Write(var f; v: Extended {st(0)}; width, decimals: Word)
; The floating-point number from the coprocessor stack
; is outputted into the buffer.

call @WriteLn$qm4Text ; WriteLn(var f: Text)
; The string from the buffer is printed.

call @__IOCheck$qv ; Exit if error
pop bp
retn 6
MyProc endp

The conventions on thiscall usage, and the conventions on default calling In C++ programs, each function of the object implicitly accepts the this argument—a pointer to the object instance from which the function was called. We already discussed this topic in detail in the section "The this Pointer."

As far as I know, all C++ compilers use the combined calling convention by default, passing explicit arguments via the stack (if the function isn't declared as fastcall). The this pointer is passed via the register that has the greatest preference. (See Tables 2–7.)

In contrast, the cdecl and stdcall conventions require that you transfer all arguments via the stack — including the this implicit argument, placed on the stack after all explicit arguments. (In other words, this is the leftmost argument.)

Let's consider the following example:

Listing 98: Passing the this Implicit Argument

#include

class MyClass{
public:
void demo(int a);
// The prototype of demo actually looks like this:
// demo (this, int a)

void __stdcall demo_2(int a, int b);
// The prototype of demo_2 looks like this:
// demo_2(this, int a, int b)

void __cdecl demo_3(int a, int b, int c);
// The prototype of demo_3 looks like this:
// demo_3(this, int a, int b, int c)
};

// To save space, the implementation of the demo, demo_2,
// and demo_3 functions is not given here.

main()
{
MyClass *zzz = new MyClass;
zzz->demo(1);
zzz->demo_2(1, 2);
zzz->demo_3(1, 2, 3);
}

The disassembled code of this example, compiled using Microsoft Visual C++ 6.0, is given in the following listing. (I show only the main function; the rest of the program isn't of interest now.)

Listing 99: The Disassembled Code for Passing the this Implicit Argument

main proc near ; CODE XREF: start+AF↓p
push esi
; ESI is saved in the stack.

push 1
call ??2@YAPAXI@Z ; operator new(uint)
; This allocates 1 byte for the object instance.

mov esi, eax
; ESI contains the pointer to the object instance.

add esp, 4
; An argument is popped off the stack.

mov ecx, esi
; Via ECX, the this pointer is passed to the demo function.
; As you may remember, the Microsoft Visual C++ compiler uses
; the ECX register to pass the first argument of the function.
; In this case, the this pointer is just that argument.
; The Borland C++ 5.x compiler would pass this via
; the EAX register, since this compiler gives it
; the greatest preference. (See Table 4.)

push 1
; The explicit argument of the function is pushed onto the stack.
; If this was the fastcall function, this argument
; would have been placed into the EDX register.
; It turns out that we are dealing with the type of the default
; calling convention.

call demo

push 2
; The rightmost argument is pushed onto the stack.

push 1
; The argument second from the right is pushed onto the stack.

push esi
; The this implicit argument is pushed onto the stack.
; Such a method of passing arguments indicates that an explicit
; conversion of the function type to stdcall or cdecl
; has taken place. Scrolling the disassembler screen downward,
; we can see that the stack is cleared by the called function.
; Therefore, it complies with the stdcall convention.

call demo_2

push 3
push 2
push 1
push esi
call sub_401020
add esp, 10h
; If the stack is cleared by the calling function after the call,
; the called function has the default type or cdecl. Passing the this
; pointer via the stack suggests that the second assumption is correct.

xor eax, eax
pop esi
retn
main endp
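
The way the stdcall and cdecl variants treat this can be imitated at the source level. In the following sketch (the Counter type and both function names are invented for the illustration), a member function is paired with a free function that receives the object pointer as an explicit leftmost argument; both do the same work.

```cpp
struct Counter {
    int total;

    // Member function: the this pointer is passed implicitly.
    int add(int a)
    {
        total += a;
        return total;
    }
};

// Hypothetical free-function equivalent: the object pointer becomes
// an ordinary, explicit leftmost argument.
int Counter_add(Counter *self, int a)
{
    self->total += a;
    return self->total;
}
```

A call such as counter.add(5) and a call such as Counter_add(&counter, 5) change the object in the same way; only the route by which the pointer reaches the function differs.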

Default arguments To simplify calling functions that have a "crowd" of arguments, the C++ language provides the capability of specifying default arguments. Two questions arise: Does calling functions with default arguments differ from calling other functions? Who initializes the omitted arguments — the called function, or the calling one?

If functions with default arguments are called, the compiler adds the missing arguments on its own. Therefore, the calls of such functions don't differ from the calls of other functions.

Let's prove this in the following example.

Listing 100: Passing Default Arguments

#include <stdio.h>

MyFunc(int a=1, int b=2, int c=3)
{
printf("%x %x %x\n", a, b, c);
}

main()
{
MyFunc();
}

The result of disassembling the example is shown in the following listing. (Only the calling function is given.)

Listing 101: The Disassembled Code for Passing Default Arguments

main proc near ; CODE XREF: start+AF↓p
push ebp
mov ebp, esp
push 3
push 2
push 1
; Apparently, all omitted arguments have been passed
; to the function by the compiler on its own.

call MyFunc

add esp, 0Ch
pop ebp
retn
main endp
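
The same conclusion can be reached without a disassembler. The sketch below uses a hypothetical function, sum3, with the defaults of Listing 100. Default values are not part of a function's type, so a call through a function pointer must supply every argument; this is consistent with the caller, not the callee, filling in the omitted values.

```cpp
// Hypothetical function with the same defaults as MyFunc in Listing 100.
int sum3(int a = 1, int b = 2, int c = 3)
{
    return a + b + c;
}

// A pointer to sum3 carries no defaults, so every argument
// has to be spelled out at the call.
int call_through_pointer()
{
    int (*fp)(int, int, int) = sum3;
    return fp(1, 2, 3);      // fp() would not compile
}

// Here the compiler supplies 1, 2, and 3 at the call site.
int call_with_defaults()
{
    return sum3();
}
```

Both wrappers return the same value, because both calls end up passing the same three arguments.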

Analyzing how an unknown compiler passes arguments The diversity of existing compilers, and the continual emergence of new ones, doesn't allow me to give a comprehensive table for the features of each compiler. What should you do if you come across a program compiled by a compiler not covered in this book?

If you can identify the compiler (for example, using IDA or the text strings contained in the file), you need to get a copy of it. Then, you should run a series of test examples on it, passing arguments of various types to an "experimental" function. You also might want to study the compiler's documentation; it may briefly describe all the argument-passing mechanisms that the compiler supports.
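
Such an "experimental" function might look like the following sketch. The names and constants are arbitrary; the only idea is that every argument has a different type and a value that is easy to recognize in the disassembled call. It is probably wise to compile it with optimization disabled, or to keep the two functions in separate files, so that the call isn't inlined.

```cpp
#include <cstdint>

// Hypothetical "experimental" function: one argument of each basic kind.
// The body combines all arguments so that none of them is discarded.
double probe(char c, short s, int i, std::int64_t q, float f, double d,
             const char *p)
{
    return c + s + i + static_cast<double>(q) + f + d + p[0];
}

// Each argument carries a distinctive constant that is easy to find
// in the disassembly of the call site.
double call_probe()
{
    return probe(0x11, 0x2222, 0x33333333, 0x4444444455555555LL,
                 6.66f, 7.77, "probe-marker");
}
```

Searching the disassembled call_probe for 11h, 2222h, 33333333h, and so on shows which arguments travel via registers, which go via the stack, and in what order.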

If you can't identify the compiler or get a copy of it, you'll have to investigate carefully and thoroughly the interaction of the called and calling functions.

Values Returned by Functions
The value returned by a function is traditionally a value returned by the return operator. However, this statement is only the tip of the iceberg, and doesn't give a complete picture of the functions' interactions. The following example, taken from real program code, illustrates this.

Listing 102: Returning a Value via an Argument Passed by Reference

int xdiv(int a, int b, int *c=0)
{
if (!b) return -1;
if (c) c[0]=a % b;
return a / b;
}

The xdiv function returns the result of integer division of the a argument by the b argument, but it also assigns the remainder to the c variable, passed to the function by reference. How many values has the function returned? Why is it worse or less permissible to return a result by reference than by the classical return?
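
To see both results arrive from a single call, xdiv can be wrapped as shown below. The DivResult structure and the divide function are hypothetical additions; xdiv itself is repeated from Listing 102 only to keep the fragment self-contained.

```cpp
// xdiv, repeated from Listing 102.
int xdiv(int a, int b, int *c = 0)
{
    if (!b) return -1;
    if (c) c[0] = a % b;
    return a / b;
}

// Hypothetical holder for the two values that one call produces.
struct DivResult {
    int quotient;
    int remainder;
};

DivResult divide(int a, int b)
{
    DivResult r = { 0, 0 };
    // The quotient comes back through return; the remainder
    // comes back through the pointer argument.
    r.quotient = xdiv(a, b, &r.remainder);
    return r;
}
```

Dividing 17 by 5 in this way yields a quotient of 3 and a remainder of 2 from one call, so the function has, in effect, returned two values.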

Popular books tend to simplify the problem of identifying the value returned by a function, considering only the case that uses the return operator. In particular, Matt Pietrek, in his book "Windows 95 System Programming Secrets," follows this approach, leaving all other options out of the frame. Nevertheless, we will consider the following mechanisms:

Returning values using the return operator (via a register or coprocessor stack)

Returning values via arguments passed by reference

Returning values via the heap

Returning values via global variables

Returning values via CPU flags

"Returning values via the disk drive and memory-mapped files" could be included in this list, but that's beyond the topic of discussion. (However, if you consider a function as a "black box" with an input and an output, the result of the function's work written into a file is actually the value returned by the function.)

Returning values using the return operator According to convention, the value returned by the return operator is placed into the EAX register (AX in 16-bit mode). If the result exceeds the register's bit capacity, the higher 32 bits of an operand are loaded to EDX. (In 16-bit mode, the higher word is loaded to DX.)

In most cases, float results are returned via the coprocessor stack. They also may be returned via the EDX:EAX registers (DX:AX in 16-bit mode).

If a function returns a structure that consists of hundreds of bytes, or an object of similar size, then neither the registers nor the coprocessor stack will be sufficient. This is true for results larger than 8 bytes.

If there is no room for the return value in the registers, then the compiler, without telling the programmer, passes an implicit argument (the reference to the local variable storing the return result) to the function. Thus, the functions struct mystruct MyFunc (int a, int b) and void MyFunc (struct mystruct *my, int a, int b) are compiled in nearly identical code, and it is impossible to extract the original prototype from the machine code.

Microsoft Visual C++ is the only compiler that gives a clue. In this case, the function returns the pointer to the variable being returned; the reconstructed prototype looks like struct mystruct* MyFunc (struct mystruct* my, int a, int b). It seems strange that the programmer, despite having just passed the argument to the function, would return the pointer to the argument. In this situation, Borland C++ returns a void result, erasing the distinction between an argument returned by value and an argument returned by reference. However, the "original prototype" asserts that a function returns a value, when it actually returns a reference, which is rather like seeing a cat and calling it a mouse.
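
The three prototypes discussed above can be written side by side. In this sketch, the structure layout and the function names are invented; the point is only that all three forms leave the same data in the same place, which is why the machine code alone gives so little to distinguish them.

```cpp
// Hypothetical structure, too large for the EDX:EAX pair.
struct mystruct {
    int values[16];
};

// Form 1: the structure is returned by value.
mystruct fill_by_value(int a, int b)
{
    mystruct result;
    for (int i = 0; i < 16; ++i)
        result.values[i] = a + b * i;
    return result;
}

// Form 2: the caller passes a pointer to the place for the result.
void fill_by_pointer(mystruct *my, int a, int b)
{
    for (int i = 0; i < 16; ++i)
        my->values[i] = a + b * i;
}

// Form 3: like form 2, but the pointer is also handed back. This mirrors
// the prototype reconstructed from Visual C++ output.
mystruct *fill_and_return_pointer(mystruct *my, int a, int b)
{
    fill_by_pointer(my, a, b);
    return my;
}
```

Called with the same a and b, the three functions produce identical structures; the third merely returns the address it was given.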

A few words about identifying the returned value are necessary. If a function explicitly stores a value in the EAX or EDX register (AX or DX in 16-bit mode) and terminates its execution, the value's type can be determined roughly by Tables 11 and 12. If the registers are left undefined, the most likely result is a void-type value (i.e., nothing will be returned). An analysis of the calling function will produce more accurate information, because it shows how the caller accesses the EAX [EDX] register (AX [DX] in 16-bit mode) after the call. For example, code that works with char types typically addresses the lowest byte of the EAX [AX] register (i.e., the AL register), or zeroes the higher bytes of the EAX register using the logical AND operation. It would seem that, if the calling function doesn't use the value left by the called function in the EAX [EDX] registers, its type is void. However, this assumption is incorrect. Programmers often ignore the returned value, confusing code diggers.
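
What the caller does with the returned register can be modeled in a few lines. The sketch below (the function names are my own) imitates the two typical treatments of a char result: sign extension of AL, as performed by MOVSX, and masking with AND.

```cpp
#include <cstdint>

// Models "movsx reg, al": only the low byte of EAX matters, and it is
// sign-extended. This suggests that the callee returned signed char.
std::int32_t as_signed_char(std::uint32_t eax)
{
    std::int32_t low = static_cast<std::int32_t>(eax & 0xFFu);
    return (low >= 0x80) ? low - 0x100 : low;
}

// Models "and eax, 0FFh": the low byte is kept and the higher bytes
// are zeroed. This suggests that the callee returned unsigned char.
std::uint32_t as_unsigned_char(std::uint32_t eax)
{
    return eax & 0xFFu;
}
```

In both models, whatever the callee left in the three higher bytes of EAX is ignored, which is exactly the behavior that betrays a char-sized result.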

The next example shows the mechanism used to return the main value types.

Listing 103: Returning the Main Value Types

#include <stdio.h>
#include <stdlib.h>

// A demonstration of returning a value of a char-type variable
// by the return operator
char char_func (char a, char b)

{
return a+b;
}

// A demonstration of returning an int-type variable
// by the return operator
int int_func(int a, int b)
{
return a+b;
}

// A demonstration of returning an int64-type variable
// by the return operator
__int64 int64_func(__int64 a, __int64 b)
{
return a+b;
}

// A demonstration of returning a pointer to int
// by the return operator
int* near_func(int* a, int* b)
{
int *c;
c=(int *)malloc(sizeof(int));
c[0]=a[0]+b[0];
return c;
}

main()
{
int a;
int b;
a=0x666;
b=0x777;
printf("%x\n",
char_func(0x1,0x2)+
int_func(0x3,0x4)+
int64_func(0x5,0x6)+
near_func(&a,&b)[0]);
}

The disassembled code of this example, compiled using Microsoft Visual C++ 6.0 with default settings, will give the following result:

Listing 104: The Disassembled Code for Returning the Main Value Types Compiled Using Visual C++

char_func proc near ; CODE XREF: main+1A↓p

arg_0 = byte ptr 8
arg_4 = byte ptr 0Ch

push ebp
mov ebp, esp
; The stack frame is opened.

movsx eax, [ebp+arg_0]
; The arg_0 argument, of the signed char type, is loaded into EAX
; and, incidentally, extended to int.

movsx ecx, [ebp+arg_4]
; The arg_4 argument, of the signed char type, is loaded into ECX
; and, incidentally, extended to int.

add eax, ecx
; The arg_0 and arg_4 arguments, extended to int, are added
; and saved in the EAX register, producing the value to be
; returned by the function. Unfortunately, its type is impossible
; to determine precisely. It could be int or char.
; Of the two options, int is more probable: The sum
; of two char arguments should be placed into int
; for safety reasons; otherwise, an overflow is possible.

pop ebp
retn
char_func endp

int_func proc near ; CODE XREF: main+29↓p

arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch

push ebp
mov ebp, esp
; The stack frame is opened.

mov eax, [ebp+arg_0]
; The value of the arg_0 argument is loaded into EAX.

add eax, [ebp+arg_4]
; The arg_0 and arg_4 arguments are added, and the result
; is left in the EAX register. This is the value returned
; by the function. Its type probably is int.

pop ebp
retn
int_func endp

int64_func proc near ; CODE XREF: main+40↓p

arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch
arg_8 = dword ptr 10h
arg_C = dword ptr 14h

push ebp
mov ebp, esp
; The stack frame is opened.

mov eax, [ebp+arg_0]
; The value of the arg_0 argument is loaded into EAX.

add eax, [ebp+arg_8]
; The arg_0 and arg_8 arguments are added.

mov edx, [ebp+arg_4]
; The value of the arg_4 argument is loaded into EDX.

adc edx, [ebp+arg_C]
; The arg_4 and arg_C arguments are added, taking into account
; the carry, which remained after the addition of arg_0 and
; arg_8. Hence, arg_0 and arg_4, as well as arg_8 and arg_C,
; are the halves of two arguments of the __int64 type that
; will be summed. Therefore, the result of computation is
; returned via the EDX:EAX registers.

pop ebp
retn
int64_func endp

near_func proc near ; CODE XREF: main+54↓p

var_4 = dword ptr -4
arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch

push ebp
mov ebp, esp
; The stack frame is opened.

push ecx
; ECX is saved.

push 4 ; size_t
call _malloc
add esp, 4
; Four bytes are allocated on the heap.

mov [ebp+var_4], eax
; The pointer to the memory just allocated
; is placed into the var_4 variable.

mov eax, [ebp+arg_0]
; The value of the arg_0 argument is loaded into EAX.

mov ecx, [eax]
; The int value referenced by the EAX register
; is loaded into ECX. Hence, the arg_0 argument is an int * type.

mov edx, [ebp+arg_4]
; The value of the arg_4 argument is loaded into EDX.

add ecx, [edx]
; The int value of the memory cell pointed to by the EDX register
; is added to *arg_0. Hence, the arg_4 argument is an int * type.

mov eax, [ebp+var_4]
; The pointer to the memory block allocated on the heap
; is loaded into EAX.

mov [eax], ecx
; The sum of *arg_0 and *arg_4 is copied onto the heap.

mov eax, [ebp+var_4]
; The pointer to the memory block allocated on the heap is loaded
; into EAX. This is the value to be returned by the function.
; Its prototype might look as follows: int* MyFunc(int *a, int *b)

mov esp, ebp
pop ebp
retn
near_func endp

main proc near ; CODE XREF: start+AF↓p

var_8 = dword ptr -8
var_4 = dword ptr -4

push ebp
mov ebp, esp
; The stack frame is opened.

sub esp, 8
; Space is allocated for local variables.

push esi
push edi
; Registers are saved on the stack.

mov [ebp+var_4], 666h
; The 0x666 value is placed into var_4, an int local variable.

mov [ebp+var_8], 777h
; The 0x777 value is placed into var_8, an int local variable.

push 2
push 1
call char_func
add esp, 8
; The char_func(1,2) function is called. As previously
; mentioned, it is impossible to know the type
; of the value it returns. It could return int or char.

movsx esi, al
; The value returned by the function is extended to signed int.
; Hence, it has returned signed char.

push 4
push 3
call int_func
add esp, 8
; The int_func(3,4) function is called. It returns the int value.

add eax, esi
; The contents of ESI are added to the value returned by
; the function.

cdq
; The double word in the EAX register is converted to
; a quadruple word, then placed into the EDX:EAX register.
; This proves that the value returned by the function
; from int was converted into int64, although the purpose
; of this action is, as yet, unclear.

mov esi, eax
mov edi, edx
; The extended quadruple word is copied to the EDI:ESI registers.

push 0
push 6
push 0
push 5
call int64_func
add esp, 10h
; The int64_func(5,6) function is called. It returns a value
; of the __int64 type. Now, the purpose of the extension
; of the previous result becomes clear.

add esi, eax
adc edi, edx
; The result returned by the int64_func is added
; to the quadruple word in the EDI:ESI registers.

lea eax, [ebp+var_8]
; The pointer to the var_8 variable is loaded into EAX.

push eax
; The var_8 pointer is passed as an argument to near_func.

lea ecx, [ebp+var_4]
; The pointer to the var_4 variable is loaded into ECX.

push ecx
; The var_4 pointer is passed as an argument to near_func.

call near_func
add esp, 8
; The near_func function is called.

mov eax, [eax]
; As previously mentioned, the function has returned the pointer
; to an int variable into the EAX register. Now, the value
; of this variable is loaded into the EAX register.

cdq
; EAX is extended to a quadruple word.

add esi, eax
adc edi, edx
; Two quadruple words are added.

push edi
push esi
; The result of addition is passed to the printf function.

push offset unk_406030
; The pointer is passed to the format-specification string.

call _printf
add esp, 0Ch

pop edi
pop esi
mov esp, ebp
pop ebp
retn
main endp
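
The ADD/ADC pairs and the CDQ instruction in main are easier to follow once they are modeled in a high-level language. The following sketch is a model, not compiler output; RegPair and the function names are invented. It treats a 64-bit value as two 32-bit halves, just as the EDX:EAX and EDI:ESI pairs do in Listing 104.

```cpp
#include <cstdint>

// A 64-bit value as it is kept in a register pair.
struct RegPair {
    std::uint32_t low;    // EAX (or ESI)
    std::uint32_t high;   // EDX (or EDI)
};

// Models CDQ: the sign bit of EAX is copied into every bit of EDX.
RegPair cdq(std::uint32_t eax)
{
    RegPair r;
    r.low = eax;
    r.high = (eax & 0x80000000u) ? 0xFFFFFFFFu : 0u;
    return r;
}

// Models "add low, low" followed by "adc high, high":
// a 64-bit addition carried out in two 32-bit steps.
RegPair add_adc(RegPair a, RegPair b)
{
    RegPair r;
    r.low = a.low + b.low;                          // ADD: may wrap around
    std::uint32_t carry = (r.low < a.low) ? 1u : 0u; // carry flag after ADD
    r.high = a.high + b.high + carry;               // ADC: adds the carry in
    return r;
}
```

Adding the pair for 0FFFFFFFFh (with a zero high half) to the pair for 1 gives a zero low half and a high half of 1, which shows why ADC, and not a second ADD, is needed for the high halves.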

As you can see, identifying the type of value returned by the return operator is rather straightforward. However, consider the following example. Try to predict what will be returned, and in which registers.

Listing 105: Returning a Structure by Value

#include <stdio.h>
#include <string.h>

struct XT
{
char s0[4];
int x;
};

struct XT MyFunc(char *a, int b)
// The function returns a value of the XT structure by value.
{
struct XT xt;
strcpy(&xt.s0[0], a);
xt.x = b;
return xt;
}

main()
{
struct XT xt;
xt=MyFunc("Hello, Sailor!", 0x666);
printf("%s %x\n", &xt.s0[0], xt.x);
}

The disassembled listing of the compiled code is as follows:

Listing 106: The Disassembled Code for Returning a Structure by Value

MyFunc proc near ; CODE XREF: sub_401026+10↓p

var_8 = dword ptr -8
var_4 = dword ptr -4
; These local variables are the elements of the "split"
; XT structure. As mentioned in the section
; "Objects, Structures, and Arrays," the compiler always tends
; to access the elements of a structure by their actual addresses,
; not via the base pointer. Therefore, distinguishing
; a structure from independent variables
; is not an easy task; sometimes it is an impossible one.

arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch
; The function takes two arguments.

push ebp
mov ebp, esp
; The stack frame is opened.

sub esp, 8
; Space is allocated for local variables.

mov eax, [ebp+arg_0]
; The arg_0 argument is loaded into EAX.
push eax
; The arg_0 argument is passed to the strcpy function;
; hence, arg_0 is a pointer to a string.

lea ecx, [ebp+var_8]
; The pointer to the var_8 local variable is loaded into ECX.

push ecx
; This pointer is passed to the strcpy function.
; Therefore, var_8 is a string buffer with a size of 4 bytes.

call strcpy
add esp, 8
; The string passed via arg_0 to var_8 is copied.

mov edx, [ebp+arg_4]
; The value of the arg_4 argument is loaded into the EDX register.

mov [ebp+var_4], edx
; The arg_4 argument is copied to the var_4 local variable.

mov eax, [ebp+var_8]
; The contents of (not the pointer to) the string buffer
; are loaded.

mov edx, [ebp+var_4]
; The value of var_4 is loaded into EDX. Loading
; the EDX:EAX registers before exiting the function indicates
; var_4 has the value returned by the function. Unexpectedly,
; the function returns two variables of different types
; in EDX and EAX, and not __int64, which might seem logical
; after a cursory analysis of the program. The second surprise is
; that the char[4] type is returned via the register,
; not via the pointer or the reference. This is fortunate: If the
; structure were declared as struct XT {short int a; char b; char c;},
; as many as three variables of two types
; would be returned in EAX.

mov esp, ebp
pop ebp
retn
MyFunc endp

main proc near ; CODE XREF: start+AF↓p

var_8 = dword ptr -8
var_4 = dword ptr -4
; These are two local variables of the int type.
; Their type has been determined by calculating their respective sizes.

push ebp
mov ebp, esp
; The stack frame is opened.

sub esp, 8
; Eight bytes are allocated for local variables.

push 666h
; An int argument is passed to the MyFunc function.
; Therefore, arg_4 is of the int type, which wasn't obvious
; from the called function's code - arg_4 easily could
; be the pointer. Hence, the function returns
; an int type into the EDX register.

push offset aHelloSailor ; "Hello, Sailor!"
; A pointer to the string is passed to MyFunc.
; Caution! The string occupies more than 4 bytes; therefore,
; I don't recommend making this example "live."

call MyFunc
add esp, 8
; The MyFunc function is called. Somehow, it modifies
; the EDX and EAX registers. The returned value types
; are already known, so it only remains to make sure
; that the calling function uses them "correctly."

mov [ebp+var_8], eax
; The contents of the EAX register are placed
; into the var_8 local variable.

mov [ebp+var_4], edx
; The contents of the EDX register are placed
; into the var_4 local variable.
; It seems that the function returns __int64.

mov eax, [ebp+var_4]
; The contents of var_4 are loaded to EAX (i.e., this is loading
; the contents of the EDX returned by the MyFunc function) and...

push eax
; ...passed to the printf function.
; According to the format-specification string,
; var_4 is of the int type.
; Hence, the function has returned int,
; or at least its higher part, into EDX.

lea ecx, [ebp+var_8]
; The pointer to the var_8 variable is loaded into ECX.
; This variable stores the value returned by the function
; via the EAX register. The format-specification string
; indicates that ECX is a pointer to a string. Thus, the values
; returned via the EDX:EAX registers are of different types.
; With a little thought, it is possible
; to reconstruct the original prototype:
; struct X{char a [4]; int b} MyFunc(char* c, int d);

push ecx
push offset aSX ; "%s %x\n"
call _printf
add esp, 0Ch

mov esp, ebp
pop ebp
; The stack frame is closed.

retn
main endp

Now, let's modify the XT structure slightly, replacing char s0[4] with char s0[10] (which won't fit into the EDX:EAX registers), then see how the code changes.

Listing 107: The Disassembled Code of the Modified and Compiled Version for Returning a Structure by Value

main proc near ; CODE XREF: start+AF↓p

var_20 = byte ptr -20h
var_10 = dword ptr -10h
var_C = dword ptr -0Ch
var_8 = dword ptr -8
var_4 = dword ptr -4

push ebp
mov ebp, esp
; The stack frame is opened.

sub esp, 20h
; Here, 0x20 bytes are allocated for local variables.

push 666h
; The rightmost int argument is passed
; to the MyFunc function.

push offset aHelloSailor ; "Hello, Sailor!"
; The second argument from the right (a pointer
; to the string) is passed to the MyFunc function.

lea eax, [ebp+var_20]
; The address of a local variable is loaded into EAX.

push eax
; A pointer is passed to the var_20 variable.
; This argument was not present in the function's prototype!
; Where has it come from? The compiler has inserted it
; to return the structure by value. The previous sentence
; could have been placed in quotation marks to accentuate
; its irony: The structure that will be returned by value
; actually is returned by reference.

call MyFunc
add esp, 0Ch
; The MyFunc function is called.

mov ecx, [eax]
; The function has returned in EAX a pointer to the structure
; that it returned by reference. This trick is used only by
; Microsoft Visual C++; most compilers leave the value of EAX
; undefined or equal to zero. In any case, ECX will contain
; the first double word pointed to by the pointer placed in EAX.
; At first glance, this is an element of the int type.
; However, it is unwise to draw hasty conclusions.

mov [ebp+var_10], ecx
; The contents of ECX are saved in the var_10 local variable.

mov edx, [eax+4]
; EDX is loaded with the second double word pointed to
; by the EAX pointer.

mov [ebp+var_C], edx
; It is copied to the var_C variable.
; The second element of the structure likely has the
; int type as well. A comparison with the source code of the
; program under consideration shows something is wrong.

mov ecx, [eax+8]
; The third double word using the EAX pointer is loaded, and...

mov [ebp+var_8], ecx
; ...it is copied to var_8. Yet another element of the int type?
; Where are they coming from? The original had one!
; And where is the string?

mov edx, [eax+0Ch]
mov [ebp+var_4], edx
; Yet another element of the int type is moved from the structure
; into the local variable. This is too much!

mov eax, [ebp+var_4]
; EAX is loaded with the value of the var_4 local variable.

push eax
; The value of var_4 is passed to the printf function.
; The format-specification string shows
; var_4 really has the int type.

lea ecx, [ebp+var_10]
; A pointer to the var_10 is obtained, and...

push ecx
; ...it is passed to the printf function. According to
; the format-specification string, ECX is of the char * type;
; hence, var_10 is the string we are looking for. Intuition
; suggests that var_C and var_8, located below var_10
; (i.e., at higher addresses), also contain strings. The compiler,
; instead of calling strcpy, has decided it would be faster
; to copy the structure double word by double word,
; which is what has caused the confusion.
; Never be hasty when identifying the types of elements of
; structures! Carefully check how each byte is initialized
; and used. The operations of transfer to local variables alone
; are not informative!

push offset aSX ; "%s %x\n"
call _printf
add esp, 0Ch

mov esp, ebp
pop ebp
; The stack frame is closed.

retn
main endp

MyFunc proc near ; CODE XREF: main+14↑p

var_10 = dword ptr -10h
var_C = dword ptr -0Ch
var_8 = dword ptr -8
var_4 = dword ptr -4

arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch
arg_8 = dword ptr 10h
; Note that three arguments are passed to the function,
; not two, as declared in the prototype.

push ebp
mov ebp, esp
; The stack frame is opened.

sub esp, 10h
; Memory is allocated for local variables.

mov eax, [ebp+arg_4]
; EAX is loaded with the second argument from the right.

push eax
; The arg_4 argument (a pointer to a string) is passed to the strcpy function.

lea ecx, [ebp+var_10]
; ECX is loaded with the pointer to the var_10 local variable.

push ecx
; The pointer to the var_10 local variable is passed
; to the strcpy function.

call strcpy
add esp, 8
; The string passed to the MyFunc function,
; via the arg_4 argument, is copied.

mov edx, [ebp+arg_8]
; EDX is loaded with the value of the rightmost argument
; passed to MyFunc.

mov [ebp+var_4], edx
; The arg_8 argument is copied to the var_4 local variable.

mov eax, [ebp+arg_0]
; The value of the arg_0 argument is loaded into EAX.
; As you already know, the compiler uses this argument to pass
; a pointer to the local variable without notifying the programmer.
; The function places the structure returned by value
; in this variable.

mov ecx, [ebp+var_10]
; The contents of the var_10 local variable are loaded into ECX.
; As previously mentioned, a string has been copied to the var_10
; local variable; thus, this is likely double-word copying!

mov [eax], ecx
mov edx, [ebp+var_C]
mov [eax+4], edx
mov ecx, [ebp+var_8]
mov [eax+8], ecx
; Exactly! The var_10 local variable is copied "manually"
; to the buffer pointed to by arg_0, without using strcpy!
; In total, 12 bytes have been copied; hence,
; the first element of the structure looks like this:
; char s0[12]. The source code contained 'char s0[10]'.
; When the compiler aligned the elements of the structure
; on addresses that are multiples of four, it placed
; the second element, int x, at the address base+0x0C, creating
; a "hole" between the end of the string and the beginning of
; the second element. It is not possible to reconstruct
; the structure's actual form by analyzing the disassembled
; listing. The only thing that can be stated for sure is that
; the length of the s0 string falls in the range of 9 to 12.

mov edx, [ebp+var_4]
mov [eax+0Ch], edx
; The var_4 variable, which contains the arg_8 argument,
; is copied into [EAX+0C]. The second element of the structure,
; int x, is at an offset of 12 bytes from the start.

mov eax, [ebp+arg_0]
; The value of the arg_0 argument is loaded into EAX to be returned.
; This argument contains the pointer to the returned structure.

mov esp, ebp
pop ebp
; The stack frame is closed.

retn
; The function's prototype looks like this:
; struct X {char s0[12]; int a;} *MyFunc(struct X *x, char *y, int z)

MyFunc endp

How are structures that contain hundreds or thousands of bytes returned? They are copied, using the MOVS instruction, to the local variable that the compiler has implicitly passed by reference. This can be confirmed by replacing char s0[10] in the source code of the previous example with char s0[0x666]. The result of recompiling the example should look like this:

Listing 108: The Disassembled Code, Remodified and Recompiled, for Returning a Structure by Value

MyFunc proc near ; CODE XREF: main+1C↑p

var_66C = byte ptr -66Ch
var_4 = dword ptr -4
arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch
arg_8 = dword ptr 10h

push ebp
mov ebp, esp
; The stack frame is opened.

sub esp, 66Ch
; Memory is allocated for local variables.

push esi
push edi
; The registers are saved on the stack.

mov eax, [ebp+arg_4]
push eax
lea ecx, [ebp+var_66C]
push ecx
call strcpy
add esp, 8
; The string passed to the function is copied
; to the var_66C local variable.

mov edx, [ebp+arg_8]
mov [ebp+var_4], edx
; The arg_8 argument is copied to the var_4 local variable.

mov ecx, 19Bh
; The 0x19B value is placed into ECX; the purpose of this is unclear.

lea esi, [ebp+var_66C]
; ESI is set to point to the var_66C local variable.

mov edi, [ebp+arg_0]
; The EDI register is set to point to the variable
; referenced by the pointer passed to the arg_0 argument.

repe movsd
; The ECX double words are copied from the ESI address to the EDI
; address. In bytes, this is 0x19B*4 = 0x66C.
; Thus, both the var_66C variable and the var_4 variable are copied.

mov eax, [ebp+arg_0]
; The pointer to the returned structure
; is returned to EAX.

pop edi
pop esi

mov esp, ebp
pop ebp
; The stack frame is closed.
retn
MyFunc endp

Note that many compilers (such as Watcom) use registers, rather than the stack, to pass the pointer that references the buffer allocated for the return value of the function. Furthermore, these compilers use the register intended for this purpose (Watcom, for example, uses the ESI register), rather than choosing registers from the queue of candidates in the order of preference. (See Table 7).

Returning floating-point values The cdecl and stdcall conventions require floating-point values (float, double, long double) to be returned via the coprocessor stack. The EAX and EDX registers may hold any values when the function exits. (In other words, functions that return real values leave the EAX and EDX registers in an undefined state.)

Theoretically, fastcall functions can return floating-point variables via registers as well. In practice, this rarely occurs. The coprocessor can't read the main processor's registers directly; the values have to be passed through RAM, which cancels out any benefit from fastcall.

The following example illustrates this:

Listing 109: Returning Floating-Point Values

#include <stdio.h>

float MyFunc(float a, float b)
{
return a+b;
}

main()
{
printf("%f\n", MyFunc(6.66,7.77));
}

The disassembled listing of this example, compiled using Microsoft Visual C++, looks as follows:

Listing 110: The Disassembled Code for Returning Floating-Point Values Compiled with Visual C++

main proc near ; CODE XREF: start+AF↑p

var_8 = qword ptr -8

push ebp
mov ebp, esp
; The stack frame is opened.

push 40F8A3D7h
push 40D51EB8h
; Arguments are passed to the MyFunc function.
; Their type has yet to be determined.

call MyFunc

fstp [esp+8+var_8]
; The floating-point value, placed into the coprocessor
; stack by the MyFunc function, is retrieved. To determine
; the operand's type, look at the instruction's opcode: DD 1C 24.
; According to Table 10, its type must be double. But wait!
; Is it really double? The function should return float!
; In theory, this is true. However, the type is converted
; implicitly when the argument is passed to the printf
; function, which is expecting double. Note where the return
; value of the function is placed: [esp+8-8] == [esp].
; It is allocated on the top of the stack, the
; equivalent of pushing it using the PUSH instructions.

call _printf
add esp, 0Ch

pop ebp
retn
main endp

MyFunc proc near ; CODE XREF: main+D↑p

arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch

push ebp
mov ebp, esp
; The stack frame is opened.

fld [ebp+arg_0]
; The arg_0 argument is placed on the top of the stack.
; To determine its type, let's look at the FLD instruction's opcode:
; D9 45 08. Its type must be float.

fadd [ebp+arg_4]
; The arg_0 argument just placed on the top
; of the coprocessor stack is added to arg_4.
; The result is placed on the same stack, and...

pop ebp
retn
; ...it is returned from the function. The result of adding
; two floats is left on the top of the coprocessor stack.
; Strangely, the same code would have been obtained
; if the function had been declared double.
MyFunc endp

Returning values in Watcom C Watcom C allows the programmer to choose manually the register or registers in which the function will return the value. This seriously complicates the analysis. Conventionally, the function must preserve the EBX, ESI, and EDI registers (BX, SI, and DI in 16-bit mode). When you see, next to the function call, an instruction that reads the ESI register, it is tempting to conclude it was initialized before the function was called, as is typical in most cases. Watcom, however, may force the function to return the value in any general-purpose register except EBP (BP). Because of this, it is necessary to analyze both the calling and the called functions.

The register or registers used by default are marked in bold. Note that only the size of the returned value can be determined by the register used. The type of this value cannot be determined directly. In particular, the EAX register may be used to return an int variable, as well as a structure consisting of four char variables, or of two char variables and one short int variable.

What does this mean? Consider the following example:

Listing 111: Returning a Value via Any Valid Register

#include <stdio.h>
int MyFunc(int a, int b)
{
#pragma aux MyFunc value [ESI]
// The AUX pragma, along with the value keyword, allows us
// to define manually the register via which
// the result will be returned.
// In this case, the result will be returned via ESI.
return a+b;
}

main()
{
printf("%x\n", MyFunc(0x666, 0x777));
}

The disassembled code of the compiled version of this example should look as follows:

Listing 112: The Disassembled Code for Returning a Value via Any Valid Register

main_ proc near ; CODE XREF: __CMain+40↓p
push 14h
call __CHK
; This is a check for stack overflow.

push edx
push esi
; ESI and EDX are saved.
; This is evidence that the given compiler obeyed the convention
; on saving ESI. There's no instruction for saving EBX or EDI, however.
; These registers aren't modified by this particular function;
; therefore, there's no need to save them.

mov edx, 777h
mov eax, 666h
; Two arguments of the int type are passed to the MyFunc function.

call MyFunc
; The MyFunc function is called. By convention, after exiting
; the function, EAX, EDX, and, sometimes, ECX may contain
; values that are uncertain or returned by the function.
; Generally, the remaining registers must be retained.

push esi
; The value of the ESI register is passed to the printf function.
; It is impossible to know whether it contains the value
; returned by the function, or it was initialized
; prior to calling the function.

push offset asc_420004 ; "%x\n"
call printf_
add esp, 8

pop esi
pop edx

retn
main_ endp

MyFunc proc near ; CODE XREF: main_+16↑p
push 4
call __CHK
; The stack overflow is checked.

lea esi, [eax+edx]
; Here is a familiar, artful trick with addition.
; The address EAX+EDX is loaded into ESI.
; However, the address EAX+EDX is simply
; the sum of the two registers (i.e., this instruction is equivalent to
; ADD EAX, EDX/MOV ESI, EAX). This is the value returned
; by the function, as ESI was modified, not saved!
; As required, the calling function passed
; the sum of 0x666 and 0x777 to printf, using
; the PUSH ESI instruction.

retn
MyFunc endp

Returning values by inline assembler functions The creator of the assembly function is free to return values in any register. However, because the calling functions of a high-level language expect to see the computation result in strictly defined registers, the creator must observe certain conventions. Internal assembly functions are another matter — they may not follow any rules, as shown in the following example:

Listing 113: Returning Values by Inline Assembler Functions

#include <stdio.h>
// This is a naked function that has no prototype;
// the programmer should take care of everything!
__declspec( naked ) int MyFunc()
{
__asm{
lea ebp, [eax+ecx] ; The sum of EAX and ECX is returned into EBP.
; Such a trick can be used only if
; the function is called from an assembly
; function that knows which registers
; will be used to pass arguments,
; and in which registers
; the computation result will be returned.
ret
}
}

main()
{
int a=0x666;
int b=0x777;
int c;
__asm{
push ebp
push edi

mov eax,[a];
mov ecx,[b];
lea edi,c

call MyFunc;
; The MyFunc function is called from the assembler function.
; The arguments are passed to it
; via whatever registers it "wants."

mov [edi],ebp
; The value returned is received into EBP and saved
; in a local variable.

pop edi
pop ebp
}
printf("%x\n", c);
}

The disassembled code of this example, compiled using Microsoft Visual C++ (other compilers will fail to compile it because they don't support the naked keyword), looks as follows:

Listing 114: The Disassembled Code for Returning Values by Inline Assembler Functions

MyFunc proc near ; CODE XREF: main+25↓p

lea ebp, [eax+ecx]
; Arguments are received via the EAX and ECX registers.
; Their sum is returned via EBP.
; This example is artificial,
; but illustrative.

retn
MyFunc endp

main proc near ; CODE XREF: start+AF↓p

var_C = dword ptr -0Ch
var_8 = dword ptr -8
var_4 = dword ptr -4

push ebp
mov ebp, esp
; The stack frame is opened.

sub esp, 0Ch
; Memory is allocated for local variables.

push ebx
push esi
push edi
; The modified registers are saved.

mov [ebp+var_4], 666h
mov [ebp+var_8], 777h
; The var_4 and var_8 variables are initialized.

push ebp
push edi
; Are the registers saved, or passed to the function?
; This question cannot be answered yet.

mov eax, [ebp+var_4]
mov ecx, [ebp+var_8]
; The value of the var_4 variable is loaded into EAX,
; and the value of var_8 is loaded into ECX.

lea edi, [ebp+var_C]
; The pointer to the var_C variable is loaded into EDI.

call MyFunc
; The MyFunc function is called. It's unclear from the analysis
; of the calling function how the arguments are passed to it:
; via the stack, or via registers.
; Only an analysis of the code of MyFunc confirms that the latter
; assumption is true. Yes, the arguments are passed via registers!

mov [edi], ebp
; What does this mean? An analysis of the calling function
; can't give an exhaustive answer.
; Only an analysis of the called function suggests
; that it returns the computation result via EBP.

pop edi
pop ebp
; The modified registers are restored.
; This is evidence that the registers were saved previously on
; the stack; they were not passed to the function as arguments.

mov eax, [ebp+var_C]
; The contents of the var_C variable are loaded into EAX.

push eax
push offset unk_406030
call _printf
add esp, 8
; Calling printf

pop edi
pop esi
pop ebx
; The registers are restored.

mov esp, ebp
pop ebp
; The stack frame is closed.

retn
main endp

Returning values via arguments passed by reference The identification of values returned via arguments passed by reference is linked closely with identification of the arguments themselves. (See the "Function Arguments" section.) Let's detect pointers among the arguments passed to the function and include them on the list of candidates for returned values.

Now, let's see whether there are any pointers to uninitialized variables among them; obviously, they're initialized by the called function itself. However, pointers to initialized variables, especially those equal to zero, should not be discounted; they also may return values. An analysis of the called function can clarify the situation: All operations that modify the variables passed by reference will be of interest, and they should not be confused with modifications of the variables passed by value. The latter automatically cease to exist when function execution is completed (or, more precisely, when the arguments are removed from the stack). They are local variables of the function, which may change them as it sees fit.

Listing 115: Returning Values via Variables Passed by Reference

#include <stdio.h>
#include <string.h>

void Reverse(char *dst, const char *src)
{
strcpy(dst,src);
_strrev( dst);
}
// The src string is reversed and written
// into the dst string.

void Reverse(char *s) {
_strrev( s );
}
// The s string is reversed.
// (The result is written into the same s string.)

int sum(int a,int b)
//This function returns the sum of two arguments.

{
a+=b; return a;
}
// The arguments passed by value can be modified
// and treated as standard local variables.

main()
{
char s0[]="Hello, Sailor!";
char s1 [100];

Reverse(&s1[0], &s0[0]);
printf("%s\n", &s1[0]);
// The s0 string is reversed and written into s1.

Reverse(&s1[0]);
printf("%s\n", &s1 [0]);
// The s1 string is rewritten and, therefore, reversed.

printf("%x\n", sum(0x666, 0x777));
// The sum of two numbers is printed.
}

The disassembled code of the compiled version of the previous example should look as follows:

Listing 116: The Disassembled Code for Returning Values via Variables Passed by Reference

main proc near ; CODE XREF: start+AF↓p

var_74 = byte ptr -74h
var_10 = dword ptr -10h
var_C = dword ptr -0Ch
var_8 = dword ptr -8
var_4 = word ptr -4

push ebp
mov ebp, esp
; The stack frame is opened.

sub esp, 74h
; Memory is allocated for local variables.

mov eax, dword ptr aHelloSailor ; "Hello, Sailor!"
; The first 4 bytes of the string "Hello, Sailor!" are placed
; into EAX. The compiler probably copies the string
; into a local variable.

mov [ebp+var_10], eax
mov ecx, dword ptr aHelloSailor+4
mov [ebp+var_C], ecx
mov edx, dword ptr aHelloSailor+8
mov [ebp+var_8], edx
mov ax, word ptr aHelloSailor+0Ch
mov [ebp+var_4], ax
; The string "Hello, Sailor!" is copied to the var_10
; local variable of char s[0x10] type as expected.
; The 0x10 has been obtained by counting the number
; of bytes copied - 4 iterations with 4 bytes
; in each make a total of 16!

lea ecx, [ebp+var_10]
; The pointer to the var_10 local variable, which contains
; the "Hello, Sailor!" string, is loaded into ECX.

push ecx ; int
; The pointer to the "Hello, Sailor!" string is passed
; to the Reverse_1 function. IDA has determined
; the type incorrectly: How can a char * be an int?
; However, recalling how the string was copied
; clarifies why IDA made a mistake.

lea edx, [ebp+var_74]
; The pointer to the uninitialized var_74 local variable
; is loaded into EDX.

push edx ; char *
; The pointer to the uninitialized char variable s1[100]
; is passed to the Reverse_1 function. The value 100 was obtained
; by subtracting the offset of the var_74 variable from the offset
; of the var_10 variable, which is next to it and contains
; the "Hello, Sailor!" string: 0x74 - 0x10 = 0x64, which is 100
; in decimal representation. Passing a pointer to the unassigned
; variable suggests that the function will return
; some value in it - something that should be noted.

call Reverse_1
add esp, 8
; The Reverse_1 function is called.

lea eax, [ebp+var_74]
; The pointer to the var_74 variable is loaded into EAX.

push eax
; The pointer to the var_74 variable is passed to the printf
; function. Because the calling function has not initialized this
; variable, it can be assumed that the called function has returned
; its value via the variable. The Reverse_1 function might modify
; the var_10 variable as well, but it is impossible to be certain
; about this before the function's code is studied.

push offset unk_406040
call _printf
add esp, 8
; The printf function is called for the string output.

lea ecx, [ebp+var_74]
; ECX is loaded with the pointer to the var_74 variable, which
; apparently contains the value returned by the Reverse_1 function.

push ecx ; char *
; The pointer to the var_74 variable is passed to the Reverse_2
; function. Reverse_2 also may return its value into the var_74
; variable, may modify the variable, or may not return any value!
; Analyzing the code of the called function will clarify this.

call Reverse_2
add esp, 4
; The Reverse_2 function is called.

lea edx, [ebp+var_74]
; The pointer to the var_74 variable is loaded into EDX.

push edx
; The pointer to the var_74 variable is passed to the printf function.
; Since the value returned by the function via the EDX:EAX
; registers isn't used, the function may return
; it into the var_74 variable, rather than via
; the registers. However, this is only an assumption.

push offset unk_406044
call _printf
add esp, 8
; The printf function is called.

push 777h
; The 0x777 value of the int type is passed to the Sum function.

push 666h
; The 0x666 value of the int type is passed to the Sum function.

call Sum
add esp, 8
; The Sum function is called.

push eax
; The EAX register contains the value returned by the Sum function.
; It is passed to the printf function as an argument.

push offset unk_406048
call _printf
add esp, 8
; The printf function is called.

mov esp, ebp
pop ebp
; The stack frame is closed.

retn
main endp

; int __cdecl Reverse_1 (char *,int)
; Note that the function's prototype is defined incorrectly!
; Actually, as already inferred from the analysis of the calling
; function, it looks like this: Reverse(char *dst, char *src).
; The names of arguments are based on the fact that the left
; argument is a pointer to an uninitialized buffer,
; and is probably used as a destination;
; the right argument is a source in such a case.

Reverse_1 proc near ; CODE XREF: main+32↑p

arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch

push ebp
mov ebp, esp
; The stack frame is opened.

mov eax, [ebp+arg_4]
; The arg_4 argument is loaded into EAX.

push eax
; The arg_4 argument is passed to the strcpy function.

mov ecx, [ebp+arg_0]
; The value of the arg_0 argument is loaded into ECX.

push ecx
; The arg_0 argument is passed to the strcpy function.

call strcpy
add esp, 8
; The contents of the string pointed to by arg_4
; are copied to the buffer pointed to by arg_0.

mov edx, [ebp+arg_0]
; EDX is loaded with the contents of the arg_0 argument, which
; points to the buffer that contains the string just copied.

push edx ; char *
; The arg_0 argument is passed to the __strrev function.

call __strrev
add esp, 4
; The strrev function reverses the string pointed to
; by arg_0. Therefore, the Reverse_1 function returns
; its value via the arg_0 argument passed by reference.
; The string pointed to by arg_4 remains unchanged.
; Therefore, the prototype of the Reverse_1 function
; looks like this: void Reverse_1 (char *dst, const char *src).
; The const qualifier should never be neglected: It presents
; clear evidence that the given pointer references
; a read-only variable. This considerably facilitates work with a
; disassembler listing, especially if you return to it after some
; time and have forgotten the algorithm of the analyzed program.

pop ebp
; The stack frame is closed.

retn
Reverse_1 endp
; int __cdecl Reverse_2(char *)
; This time, the function's prototype is defined correctly
; (apart from the returned type being void, not int).

Reverse_2 proc near ; CODE XREF: main+4F↑p

arg_0 = dword ptr 8

push ebp
mov ebp, esp
; The stack frame is opened.

mov eax, [ebp+arg_0]
; The contents of the arg_0 argument are loaded into EAX.

push eax ; char *
; The arg_0 argument is passed to the strrev function.

call __strrev
add esp, 4
; The string is reversed. The result is placed at the same location.
; Therefore, the Reverse_2 function returns the value
; via arg_0, and our hypothesis proves to be correct.

pop ebp
; The stack frame is closed.

retn
; According to the last investigation, the prototype of
; the Reverse_2 function looks like this: void Reverse_2 (char *s)

Reverse_2 endp

Sum proc near ; CODE XREF: main+72↑p

arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch

push ebp
mov ebp, esp
; The stack frame is opened.

mov eax, [ebp+arg_0]
; The value of the arg_0 argument is loaded into EAX.
add eax, [ebp+arg_4]
; The arg_0 and arg_4 arguments are added.
; The result is placed into EAX.

mov [ebp+arg_0], eax
; The sum of arg_0 and arg_4 is copied back into arg_0.
; Inexperienced hackers may think that this is the return of
; values via an argument. However, this assumption is incorrect.
; The arguments passed to the function are popped off the stack.
; They "die" after this action is completed.
; An important point to remember:
; The arguments passed by value behave as local variables do.

mov eax, [ebp+arg_0]
; Now, the returned value really is copied to the EAX register.
; Therefore, the prototype of the function looks like this:
; int Sum(int a, int b);

pop ebp
; The stack frame is closed.

retn
Sum endp

Returning values via the heap Returning values via an argument passed by reference hardly improves the function prototype. Such an argument is not intuitive; it demands detailed explanations such as, "You don't need to pass anything with this argument; on the contrary, be ready to receive something from it." (Who said that being a programmer is easy?) Clarity and aesthetics aside, there is a more serious problem. In some cases, the size of the returned data is not known beforehand; often it is determined only at run time by the called function. Should the buffer be allocated with a surplus? This is an ugly and inexpedient solution; even in systems with virtual memory, size is limited. It would be much simpler if the called function could allocate as much memory as it needed, then return a pointer to it. This is, in fact, easy to do. Many novice programmers make the mistake of trying to return pointers to local variables; unfortunately, these variables "die" as soon as the function returns, and the pointers end up pointing at nothing. The correct solution is to allocate memory on the heap (dynamic memory), for example, by calling the malloc function or using the new operator. Memory allocated in this way "survives" until it is released explicitly by the free function or the delete operator.

This memory-allocation mechanism is not essential for analyzing the program; the main role is played by the type of the returned value. It is easy to distinguish a pointer from other types: Only a pointer can be used in an address expression.
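The point about sizes that become known only at run time can be sketched in portable C. This is a minimal illustration, not one of the book's listings: the FormatHex name is hypothetical, and the code assumes a C99 library in which snprintf reports the required length when given a NULL buffer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a heap-allocated hexadecimal string.
   The caller does not know the required size in advance; the callee
   measures it, allocates exactly that much, and returns the pointer.
   The caller owns the memory and must release it with free(). */
char *FormatHex(unsigned long value)
{
    /* With a NULL buffer and a zero size, snprintf only reports
       how many characters the output would need. */
    int len = snprintf(NULL, 0, "%lx", value);
    if (len < 0)
        return NULL;

    char *buf = malloc((size_t)len + 1);   /* +1 for the terminator */
    if (buf == NULL)
        return NULL;

    snprintf(buf, (size_t)len + 1, "%lx", value);
    return buf;   /* heap memory outlives the function's stack frame */
}
```

A caller would write something like char *s = FormatHex(0x666); and later free(s). The returned value is the only trace of the allocation that the calling code sees.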

Let's consider the following example:

Listing 117: Returning a Value via the Heap

#include <stdio.h>
#include <malloc.h>
#include <stdlib.h>

char* MyFunc(int a)
{
char *x;
x = (char *) malloc(100);

_ltoa(a, x, 16);
return x;
}

main()
{
char *x;
x=MyFunc(0x666);
printf("0x%s\n", x);
free(x);
}

The disassembled code of the compiled version of the previous example looks as follows:

Listing 118: The Disassembled Code for Returning a Value via the Heap

main proc near ; CODE XREF: start+AF↓p

var_4 = dword ptr -4

push ebp
mov ebp, esp
; The stack frame is opened.

push ecx
; Four bytes of memory are allocated
; for a local variable. (See var_4.)

push 666h
; The 0x666 value of the int type is passed to the MyFunc function.

call MyFunc
add esp, 4
; The MyFunc function is called. Note that no argument
; has been passed to the function by reference!

mov [ebp+var_4], eax
; The value returned by the function is copied to var_4.

mov eax, [ebp+var_4]
; Outstanding! The value returned by the function
; is loaded back into EAX!

push eax
; The value returned by the function is passed to the printf
; function. The %s specifier in the format string indicates that
; the returned value is of the char * type. Since none of the
; arguments were passed to the MyFunc function by reference, it
; allocated memory on its own, then wrote the received string to
; that memory. What if one or more arguments had been passed by
; reference to the MyFunc function? The function
; could have modified, then returned, one of these arguments.
; However, modification is not obligatory.
; For example, pointers to two strings could be passed
; to the function, which could return the pointer to
; the shorter string, or to the string that contained more vowels.
; Therefore, not every case of returning the pointer
; is a sign of modification.

push offset a0xS ; "0x%s\n"
call _printf
add esp, 8
; The printf function is called; the string returned
; by the MyFunc function is printed.

mov ecx, [ebp+var_4]
; ECX is loaded with the value of the pointer returned
; by the MyFunc function.

push ecx ; void *
; The pointer returned by MyFunc is passed to the free function.
; This means that MyFunc allocated memory by
; calling malloc.

call _free
add esp, 4
; Memory allocated by MyFunc
; to return the value is released.

mov esp, ebp
pop ebp
; The stack frame is closed.

retn
; Thus, the prototype of MyFunc looks like this:
; char* MyFunc(int a)

main endp

MyFunc proc near ; CODE XREF: main+9↑p

var_4 = dword ptr -4
arg_0 = dword ptr 8

push ebp
mov ebp, esp
; The stack frame is opened.

push ecx
; Memory is allocated for local variables.

push 64h ; size_t
call _malloc
add esp, 4
; On the heap, 0x64 bytes are allocated, either for the needs
; of the function, or for returning the result. Because
; the analysis of the calling function's code has shown
; that MyFunc returns a pointer, malloc likely
; allocates memory for this purpose.
; However, there might be several calls of malloc,
; and the pointer might be returned only by one of them.

mov [ebp+var_4], eax
; The pointer is saved in the var_4 local variable.

push 10h ; int
; The rightmost argument, 0x10, is passed to the __ltoa function;
; it specifies the radix for converting the number.

mov eax, [ebp+var_4]
; EAX is loaded with the contents of the pointer to memory
; allocated on the heap.

push eax ; char *
; The pointer to the buffer is passed to the ltoa function
; for returning the result.

mov ecx, [ebp+arg_0]
; The value of the arg_0 argument is loaded into ECX.

push ecx ; __int32
; The int argument, arg_0, is passed to
; the ltoa function.

call __ltoa
add esp, 0Ch
; The ltoa function converts the number into the string and
; writes it into the buffer referenced by the returned pointer.

mov eax, [ebp+var_4]
; The function returns the pointer to the memory area
; that has been allocated by MyFunc on the heap
; and that contains the result of the work of ltoa.

mov esp, ebp
pop ebp
; The stack frame is closed.

retn
MyFunc endp

Returning values via global variables In general, the use of global variables is bad style. Such a programming style is indicative mainly of programmers whose minds have been irreversibly crippled by the ideology of BASIC, with its poor mechanism for calling subroutines.

The identification of global variables is considered in more detail in the "Global Variables" section of this chapter. Here, you'll learn the mechanisms of returning values via global variables.

All global variables can be implicit arguments of each called function and, at the same time, returned values. Any function may arbitrarily read and modify them. Neither passing nor returning global variables can be revealed by an analysis of the code of the calling function; instead, a careful investigation of the code of the called function is required. In particular, it is necessary to determine if the called function manipulates global variables, and which ones are modified. The problem can be approached from the other side: By reviewing the data segment, it may be possible to find all the global variables and their offsets, then, via a context search on the whole file, to reveal the functions that reference them. (See the "Global Variables" section for more details.)

Besides global variables, there are static ones. These also reside in the data segment, but they are directly accessible only to the function that has declared them. This limitation is not imposed on the variables, but rather on their names. To give other functions access to their own static variables, it is enough to pass a pointer. Fortunately, this trick doesn't create any problems for hackers (although some spoilsports call it "a hole in the protection"). The absence of immediate access to the static variables of "another," and the necessity for cooperation with the function owner via a predictable interface (a returned pointer), allows a program to be divided into independent units that may be analyzed separately. The following example provides an illustration of this:

Listing 119: Returning Values via Global and Static Variables

#include <stdio.h>
char* MyFunc(int a)
{
static char x[7][16]={"Monday", "Tuesday", "Wednesday", \
"Thursday", "Friday", "Saturday", "Sunday"};
return &x[a-1][0];
}

main()
{
printf("%s\n", MyFunc(6));
}

The disassembled code of this example, compiled using Microsoft Visual C++ 6.0 with default settings, looks as follows:

Listing 120: The Disassembled Code for Returning Values via Global and Static Variables

MyFunc proc near ; CODE XREF: main+5↓p

arg_0 = dword ptr 8

push ebp
mov ebp, esp
; The stack frame is opened.

mov eax, [ebp+arg_0]
; The value of the arg_0 argument is loaded into EAX.

sub eax, 1
; EAX is decremented by one. This is indirect evidence that arg_0
; is not a pointer, although mathematical operations over
; pointers are allowed and used actively in C language.

shl eax, 4
; Here, (arg_0 -1) is multiplied by 16.
; A shift of 4 bits to the left is the equivalent of
; multiplying by 2 to the 4th power, or 16.

add eax, offset aMonday; "Monday"
; The obtained value is added to the base pointer that references
; the table of strings in the data segment. The data segment
; contains either static or global variables. Since the value
; of the arg_0 argument is multiplied by some value
; (in this case, by 16), you can assume this is
; a two-dimensional array of fixed length strings.
; Thus, EAX contains a pointer to the string that has
; the index arg_0 -1, or arg_0, if the count starts from one.

pop ebp
; The stack frame is closed, and the pointer is returned to
; the corresponding element of the array via the EAX register.
; As you can see, there is no basic difference between returning
; the pointer to the memory area allocated on the heap
; and returning the pointer to the static variables
; allocated in the data segment.

retn
MyFunc endp

main proc near ; CODE XREF: start+AF↓p

push ebp
mov ebp, esp
; The stack frame is opened.

push 6
; This value of the int type is passed to the MyFunc function.
; (The sixth day is Saturday.)

call MyFunc
add esp, 4
; The MyFunc function is called.

push eax
; The value returned by MyFunc is passed to the printf function.
; The format-specification string indicates that
; this is a pointer to the string.

push offset aS ; "%s\n"
call _printf
add esp, 8

pop ebp
; The stack frame is closed.

retn
main endp

aMonday db 'Monday',0, 0, 0, 0, 0 ; DATA XREF: MyFunc+C↑o
; The presence of a cross-reference to one function suggests
; that this variable is of the static type.

aTuesday db 'Tuesday',0,0,0,0,0,0,0,0,0
aWednesday db 'Wednesday',0,0,0,0,0,0,0,0,0,0,0
aThursday db 'Thursday',0,0,0,0,0,0,0,0,0
aFriday db 'Friday',0,0,0,0,0,0,0,0,0
aSaturday db 'Saturday',0,0,0,0,0,0,0,0,0
aSunday db 'Sunday',0,0,0,0,0
aS db '%s', 0Ah, 0 ; DATA XREF: main+E↑o
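The address arithmetic seen in MyFunc (sub eax, 1; shl eax, 4; add eax, offset aMonday) can be checked against the C expression it was compiled from. The following is only a sketch: the days, DayByIndex, and DayByShift names are hypothetical, and the argument is assumed to lie in the range from 1 to 7.

```c
#include <stddef.h>

/* The same table as in Listing 119: seven strings, 16 bytes each. */
static char days[7][16] = {"Monday", "Tuesday", "Wednesday",
                           "Thursday", "Friday", "Saturday", "Sunday"};

/* What the programmer wrote. */
char *DayByIndex(int a)
{
    return &days[a - 1][0];
}

/* What the disassembled code does:
   sub eax, 1 / shl eax, 4 / add eax, offset of the table. */
char *DayByShift(int a)
{
    size_t offset = (size_t)(a - 1) << 4;   /* (a - 1) * 16 */
    return (char *)days + offset;
}
```

Both functions return the same pointer for every valid index, which is why a multiplication by the row size followed by the addition of a fixed base is a reliable sign of a two-dimensional array with fixed-length rows.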

Compare that example with one that uses true global variables.

Listing 121: Returning a Value via a Global Variable

#include <stdio.h>
int a;
int b;
int c;
Sum ( )
{
c=a+b;
}
MyFunc ( )
{
a=0x666;
b=0x777;
}
main ( )
{
MyFunc ( );
Sum ( );
printf ("%x\n", c);
}

The disassembled code of the compiled version of the previous example looks as follows:

Listing 122: The Disassembled Code Returning a Value via a Global Variable

main proc near ; CODE XREF: start+AF↓p
push ebp
mov ebp, esp
; The stack frame is opened.

call MyFunc
; The MyFunc function is called. Note that nothing is passed
; to the function, and nothing is returned. Therefore,
; the preliminary conclusion is that its prototype
; looks like this: void MyFunc ( )

call Sum
; The Sum function is called. This function doesn't receive
; or return any values. Its preliminary prototype
; looks like this: void Sum( )

mov eax, c
; The value of the c global variable is loaded into EAX.
; Now, examine the data segment. The c global variable
; equals zero. However, this value should be questioned
; because previously called functions could have changed it.
; The assumption about modification is strengthened by a pair of
; cross-references, one of which points to the Sum function.
; The w suffix that ends the cross-reference indicates that Sum
; assigns some value to the c variable, which can be worked out
; by analyzing the code of the Sum function.

push eax
; The value returned by the Sum function is passed
; to the printf function via the c global variable.
; The format-specification string indicates that
; the argument is of the int type.

push offset asc_406030 ; "%x\n"
call _printf
add esp, 8
; The result returned by Sum is printed.

pop ebp
; The stack frame is closed.

retn
main endp

Sum proc near ; CODE XREF: main+8↑p
; The Sum function doesn't receive any arguments via the stack!

push ebp
mov ebp, esp
; The stack frame is opened.

mov eax, a
; The value of the a global variable is loaded into EAX.
; Now, find a in the data segment. A cross-reference
; to MyFunc assigns something to the a variable.
; Since MyFunc was called prior to the call of Sum,
; MyFunc presumably has returned some value into a.

add eax, b
; EAX (which stores the value of the a global variable) is added
; to the contents of the b global variable.

mov c, eax
; The result of a + b is assigned to the c variable.
; As you already know (from the analysis of the main function),
; the Sum function uses the c variable to return its results;
; now, it is clear which results.

pop ebp
; The stack frame is closed.

retn
Sum endp

MyFunc proc near ; CODE XREF: main+3↑p
push ebp
mov ebp, esp
; The stack frame is opened.

mov a, 666h
; The 0x666 value is assigned to the a global variable.

mov b, 777h
; The 0x777 value is assigned to the b global variable.
; As you discovered by analyzing the two previous functions,
; the MyFunc function returns its computation result into
; the a and b variables. Now, the result in question has been
; clarified. You also know how these three functions interact.
; First, main ( ) calls MyFunc ( ), which initializes the a and b
; global variables. Then, main ( ) calls Sum ( ), placing
; the sum of a and b into the c global variable. Finally,
; main ( ) takes c and passes it via the stack to printf.
; Even an elementary example of three functions creates a knotty
; problem! What can be said about a real program that
; incorporates thousands of such functions, whose calling
; order and behavior are far from obvious?

pop ebp

retn
MyFunc endp
a dd 0 ; DATA XREF: MyFunc+3↑w Sum+3↑r
b dd 0 ; DATA XREF: MyFunc+D↑w Sum+8↑r
c dd 0 ; DATA XREF: Sum+E↑w main+D↑r
; The cross-references indicate that all three variables are global;
; each can be accessed by more than one function.

Returning values via processor flags Assembly functions typically use the CPU flags register to return the result (success or failure) of the function execution. By convention, the carry flag (CF) indicates an error. The zero flag (ZF) is the next most popular one. Other flags are rarely used.

The carry flag is set by the STC instruction, or by any mathematical operation that results in a carry (for example, CMP a, b, where a < b). This flag is reset by the CLC instruction, or by any appropriate mathematical operation.
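C gives no direct access to the flags register, but the conditions under which arithmetic sets the carry flag are easy to model with unsigned 32-bit values. The sketch below is an illustration only; the CarryAfterCmp and CarryAfterAdd names are hypothetical.

```c
#include <stdint.h>

/* CF after "cmp a, b" (or "sub a, b"): set when an unsigned borrow
   is needed, that is, when a < b. */
int CarryAfterCmp(uint32_t a, uint32_t b)
{
    return a < b;
}

/* CF after "add a, b": set when the 32-bit sum wraps around. */
int CarryAfterAdd(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;   /* stored modulo 2^32 */
    return sum < a;         /* a wrapped sum is smaller than either operand */
}
```

For instance, comparing 1 with 2 sets the carry flag, while comparing 2 with 1 or 5 with 5 leaves it clear.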

The carry flag is usually checked by the JC xxx and JNC xxx jump instructions, executed, respectively, depending on whether the carry is present or not. The JB xxx and JNB xxx branches are their syntactic synonyms, which give identical code after assembling.

Listing 123: Returning Values via Processor Flags

#include <stdio.h>

Err () { printf ("-ERR: DIV by Zero\n") ; }
// This function gives a division-error message.

Ok (int a) { printf ("%x\n", a) ; }
// The result of division is printed.

__declspec (naked) MyFunc ( )
{
// This assembler function implements division.
// It divides EAX by EBX, then returns
// the result into EAX and the remainder into EDX.
// An attempt to divide by zero causes the function to set
// the carry flag.

__asm{
xor edx, edx ; EDX is zeroed, because the div instruction
; expects the dividend to be in EDX:EAX.
test ebx, ebx ; The divisor is checked for zero.
jz _err ; If the divisor is equal to zero,
; jump to _err.

div ebx ; EDX:EAX is divided by EBX.
; (EBX is not equal to zero.)

ret ; Upon exiting, the quotient is returned into EAX
; and the remainder is returned into EDX.
_err: ; This code takes control
; when an attempt is made to divide by zero.
stc ; The carry flag is set, which signals
; the error and...
ret ; ...quits.
}
}
// This is a wrapper for MyFunc.
// Two arguments - the dividend and the divisor -
// are received via the stack. The result of division
// (or the error message) is displayed on the screen.
__declspec(naked) MyFunc_2 (int a, int b)
{
__asm{
mov eax, [esp+4] ; The contents of the a argument
; are loaded into EAX.
mov ebx, [esp+8] ; The contents of the b argument
; are loaded into EBX.

call MyFunc ; This is an attempt to divide a by b.
jnc _ok ; If the carry flag is reset,
; the result is displayed; otherwise,...

call Err ; ...the error message is displayed.

ret ; Returning
_ok:
push eax ; The result of division is passed and
call Ok ; displayed on the screen.
add esp, 4 ; The stack is cleared.

ret ; Returning
}
}

main ( ) {MyFunc_2(4, 0);}
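The listing above relies on the Microsoft inline assembler. For readers who want to experiment with the same calling protocol on another compiler, here is a rough portable analog, not a translation: a bool return value plays the role of the carry flag, and two pointers stand in for EAX and EDX. The DivWithCarry name and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Portable analog of MyFunc. The returned bool models the carry flag
   (true means "error"); the quotient and the remainder travel through
   pointers instead of the EAX and EDX registers. */
bool DivWithCarry(uint32_t dividend, uint32_t divisor,
                  uint32_t *quotient, uint32_t *remainder)
{
    if (divisor == 0)
        return true;                 /* corresponds to stc / ret */

    *quotient  = dividend / divisor;
    *remainder = dividend % divisor;
    return false;                    /* corresponds to returning with CF clear */
}
```

A wrapper in the spirit of MyFunc_2 would test the returned value just as the assembly code tests the flag with jnc: print the quotient on success and the error message otherwise.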

Local Stack Variables
Local variables are placed onto the stack (also known as automatic memory), then removed when the function completes execution. First, any arguments passed to the function are placed onto the stack. The CALL instruction that calls this function places the return address on top of the arguments. Upon gaining control, the function opens the stack frame (i.e., saves the previous value of the EBP register and sets it equal to the ESP register, which points to the top of the stack). The free stack area is above EBP (i.e., at a lower address); the service data (the stored value of EBP and the return address), as well as the arguments, are below it.

The stack area above the stack-top pointer (the ESP register) can be erased or distorted at any moment. For example, it can be used by hardware-interrupt handlers called at unpredictable places in the program and at unpredictable times. Likewise, if the function itself uses the stack (to save registers or pass arguments), anything it has kept above ESP will be overwritten. The way to avoid this is to move the stack pointer upward until the area holding the local variables lies below it.

The integrity of the memory "below" ESP is guaranteed (against unintentional distortions): The next call of the PUSH instruction will place the data on top of the stack without erasing local variables.

At the end of execution, the function is obliged to return the ESP value to its former place. If it does not, the RET instruction will be unable to read the return address off the stack; rather, it will read the value of the "uppermost" local variable, and will pass control nowhere.

Note The left part of Fig. 15 shows the stack at the moment the function is called. The function opens the stack frame, saving the old value of the EBP register and setting it equal to ESP. The right part of Fig. 15 represents the allocation of 0x14 bytes of the stack memory for local variables. This is done by moving the ESP register upward — into the area of lower addresses. Local variables are allocated in the stack as if they were pushed there by the PUSH instruction. After execution, the function increases the value of the ESP register and returns the value to its former position, releasing memory occupied by local variables. Then, the function restores the EBP value from the stack, closing the stack frame.

Figure 15: The mechanism for allocating local variables in the stack
Addressing local variables Local variables and stack arguments are addressed similarly. (See the "Function Arguments" section.) The only difference is arguments are located "below" EBP, and local variables reside "above" it. In other words, arguments have a positive offset relative to EBP, and local variables have a negative offset. Therefore, they can easily be distinguished. For example, [EBP+xxx] is an argument, and [EBP−xxx] is a local variable.
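The rule can be written down as a small classifier. This is a sketch under stated assumptions: a 32-bit function with a standard EBP frame and a near call, so that the saved EBP occupies [EBP+0], the return address occupies [EBP+4], and the first argument starts at [EBP+8] (compare arg_0 = dword ptr 8 in the listings). The SlotKind type and the ClassifyEbpOffset name are hypothetical.

```c
typedef enum {
    SLOT_LOCAL,         /* [EBP-xxx]                          */
    SLOT_SAVED_EBP,     /* [EBP+0] to [EBP+3]: caller's EBP   */
    SLOT_RETURN_ADDR,   /* [EBP+4] to [EBP+7]: return address */
    SLOT_ARGUMENT       /* [EBP+8] and beyond                 */
} SlotKind;

/* Classifies a displacement relative to EBP. */
SlotKind ClassifyEbpOffset(int offset)
{
    if (offset < 0) return SLOT_LOCAL;
    if (offset < 4) return SLOT_SAVED_EBP;
    if (offset < 8) return SLOT_RETURN_ADDR;
    return SLOT_ARGUMENT;
}
```

With these assumptions, [EBP-4] is a local variable, while [EBP+8] and [EBP+0Ch] are the first two arguments.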

The register that points at the stack frame serves as a barrier: The function's arguments are on one side of it, and the local variables are on the other (Fig. 16). It's clear why ESP is copied to EBP when the stack frame is opened: If copying didn't occur, the addressing of local variables and arguments would be complicated considerably. Compiler developers are humans (strange as it may seem), and they don't want to complicate their lives unnecessarily. However, optimizing compilers are capable of addressing local variables and arguments directly via ESP, freeing the EBP register for more useful work.

Figure 16: Addressing local variables
Implementation details There are plenty of ways to allocate and clear memory for local variables. For example, SUB ESP, xxx can be used on entry to the function, and ADD ESP, xxx can be used on exit. Striving, perhaps, to be distinguished, Borland C++ and some other compilers allocate memory by increasing ESP, not decreasing it… by a negative number! By default, most disassemblers display this number as a large positive one. When allocating a small amount of memory, optimizing compilers replace SUB ESP, xxx with PUSH reg, which is shorter by a few bytes. This creates identification problems: Is this saving registers on the stack, passing arguments, or allocating memory for local variables?
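The "large positive number" is simply the two's-complement form of a negative one. A minimal sketch of the conversion follows; the AsSigned32 and BytesReservedByAdd names are hypothetical helpers introduced only for this illustration.

```c
#include <stdint.h>

/* Reinterprets a 32-bit immediate as a signed two's-complement number. */
int64_t AsSigned32(uint32_t imm)
{
    if (imm & 0x80000000u)
        return (int64_t)imm - 0x100000000LL;   /* subtract 2^32 */
    return (int64_t)imm;
}

/* Bytes reserved by "add esp, imm"; zero if the instruction
   releases memory rather than allocating it. */
uint32_t BytesReservedByAdd(uint32_t imm)
{
    int64_t delta = AsSigned32(imm);
    return delta < 0 ? (uint32_t)(-delta) : 0;
}
```

For the immediate 0FFFFFFCCh, AsSigned32 yields -34h, so an ADD ESP with this operand reserves 0x34 bytes.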

The algorithm for clearing memory is also ambiguous. In addition to encountering an increase in the register of the stack-top pointer due to the ADD ESP, xxx instruction (or a decrease in it due to a negative number, as previously mentioned), you may find the construction MOV ESP, EBP. (This works because ESP was copied to EBP when the stack frame was opened, and EBP was not modified during the execution of the function.) Finally, memory may be released by the POP instruction, which pops out local variables one by one into any unused register. (Such a method is justified only when the number of local variables is small.)

Identifying the mechanism that allocates memory When memory is allocated and released using the SUB and ADD instructions, the code is consistent and can be interpreted unequivocally. If memory is allocated using the PUSH instruction and is cleared by POP, this construction becomes indistinguishable from simply saving registers on the stack and restoring them. As a complication, the function may also contain instructions for saving registers, mingled with memory-allocation instructions. Is it possible to ascertain how many bytes are allocated for local variables, or whether any bytes have been allocated? (The function may not contain local variables.)

The search for references to the memory locations "above" the EBP register (i.e., with a negative relative offset) might be helpful. Let's consider two examples.

Listing 124: Identifying the Memory-Allocation Mechanism

push ebp
push ecx
xxx
xxx
xxx
pop ecx
pop ebp
ret

push ebp
push ecx
xxx
mov [ebp-4], 0x666
xxx
pop ecx
pop ebp
ret

In the first example, there is no reference to local variables; in the second, the MOV [EBP-4], 0x666 construction copies the 0x666 value to the var_4 local variable. If there's a local variable, memory must have been allocated for it. As there are no instructions such as SUB ESP, xxx or ADD ESP, xxx in the body of the function, the memory must have been allocated by PUSH ECX. (The contents of the ECX register are stored on the stack 4 bytes "above" EBP.) Only one instruction, PUSH ECX, can be cited, because PUSH EBP is not fit for the role of "allocator." What can be done if there are several "suspects"?

The amount of allocated memory can be determined by the offset of the "highest" local variable in the body of the function. In other words, of all the [EBP-xxx] expressions, the greatest xxx offset generally is equal to the number of bytes of memory allocated for local variables. However, local variables can be declared and not used. Memory is allocated for them (although optimizing compilers remove such variables as superfluous), but no reference occurs to them. In this case, the algorithm for calculating the amount of allocated memory produces a result that is too low. However, this error has no effect on the results of analyzing the program.
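The estimate described above amounts to finding the negative displacement with the greatest magnitude. Here is a minimal sketch, assuming that the EBP-relative displacements have already been collected from the body of the function into an array; the EstimateLocalsSize name is hypothetical.

```c
#include <stddef.h>

/* offsets[] holds the EBP-relative displacements found in the body
   of the function: negative for locals, positive for arguments.
   Returns the estimated number of bytes allocated for locals. */
unsigned EstimateLocalsSize(const int *offsets, size_t count)
{
    unsigned largest = 0;
    for (size_t i = 0; i < count; i++) {
        if (offsets[i] < 0) {
            /* Magnitude computed in unsigned arithmetic
               to avoid overflow on negation. */
            unsigned magnitude = 0u - (unsigned)offsets[i];
            if (magnitude > largest)
                largest = magnitude;
        }
    }
    return largest;
}
```

As noted, the result is a lower bound: a variable that is declared but never referenced contributes nothing to it.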

Initializing local variables There are two ways to initialize local variables: Assign the necessary value by the MOV instruction (such as MOV [EBP-04], 0x666), or directly push the values onto the stack using the PUSH instruction (such as PUSH 0x777). This allows the allocation of memory for local variables to be favorably combined with their initialization (if there are only a few of these variables).

In most cases, popular compilers perform initialization using MOV; PUSH is more typical of hand-written assembly code, where it is sometimes used in protection aimed at misleading hackers (although any hacker led astray by such a trick must be a beginner).

Allocating structures and arrays Structures and arrays (i.e., their elements) are placed consecutively on the stack in adjacent memory locations. The element with the smaller index is at the smaller address, but it is addressed by an offset of larger magnitude relative to the stack-frame pointer register. This is no surprise; because local variables are addressed by a negative offset, [EBP-0x4] > [EBP-0x10].

The mess grows because IDA omits the minus sign when it gives names to local variables. For example, of the variables var_4 and var_10, the latter occupies the smaller address, although the number in its name is larger. If var_4 and var_10 are two ends of an array, instinct would place var_4 at the head of the array, and var_10 at the end, although they belong in the opposite locations.
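A short calculation helps keep the direction straight. The sketch below assumes an array of 4-byte elements whose first element IDA has named var_10; the ElementOffset and HigherIndexHasHigherAddress names are hypothetical.

```c
/* EBP-relative offset of element `index` of an array whose first
   element sits at [EBP - base], that is, at the variable that IDA
   names var_<base>. */
int ElementOffset(int base, int index, int elem_size)
{
    return -base + index * elem_size;
}

/* In C itself, a larger index always means a larger address. */
int HigherIndexHasHigherAddress(void)
{
    int a[4] = {0, 1, 2, 3};
    return &a[3] > &a[0];
}
```

For an array of four int elements that starts at var_10, element 0 is at [EBP-10h] and element 3 is at [EBP-4]: the head of the array is var_10, and var_4 is its tail.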

Alignment in the stack In some cases, elements of a structure, an array, or even particular variables must be aligned by addresses that are multiples of a specific power of 2. However, the stack-top pointer value is not known beforehand. How can the compiler, which does not know this value, fulfill this requirement? It simply discards the lower bits of ESP.

The lower bit of even numbers is zero. To ensure that the value of the stack-top pointer is divisible by two without a remainder, simply force its lower bit to zero. If two lower bits are set to zero, the resulting value will be a multiple of four; if three lower bits are set to zero, the resulting value will be a multiple of eight; and so on.

In most cases, bits are reset using the AND instruction. For example, AND ESP, 0FFFFFFF0h makes ESP a multiple of 16. Why? Converting 0xFFFFFFF0 to binary form gives the following: 11111111 11111111 11111111 11110000. The four trailing zeroes mean that the four lower bits of any number will be cleared. The result will be divisible by 2 to the power of 4, which equals 16.
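The same masking can be tried out in C. This is a minimal sketch; the AlignDown name is hypothetical, and the sample stack-pointer value mentioned afterward is arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds a stack-pointer value down to a multiple of `alignment`,
   which must be a nonzero power of two. Equivalent to
   "and esp, mask", where mask = ~(alignment - 1). */
uint32_t AlignDown(uint32_t esp, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return esp & ~(alignment - 1);
}
```

For an alignment of 16, the mask is 0xFFFFFFF0, so a value such as 0x0012FF7C becomes 0x0012FF70. The pointer only ever moves toward lower addresses, which on the stack means that a few extra bytes are allocated, never released.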

How IDA identifies local variables Although local variables have been used in the previous listings, an example of how they are identified may be helpful.

Listing 125: Identifying Local Variables

#include <stdio.h>
#include <stdlib.h>

int MyFunc (int a, int b)
{
int c; // A local variable of the int type.
char x[50]; // An array (shows the method of
// allocating arrays in memory)

c = a + b; // The sum of a + b is placed into c.

ltoa (c, &x[0], 0x10) ; // The sum of a + b is converted into
// a string.

printf ("%x == %s == ", c, &x[0]); // The string is displayed.
return c;
}
main ( )
{
int a = 0x666; // The a and b local variables are declared,
int b = 0x777; // demonstrating the mechanism by which
// the compiler initializes them.

int c [1]; // Tricks like this are necessary
// to prevent the optimizing compiler from placing
// the local variable into the register. (See the
// "Register and Temporary Variables" section.)
// Because the pointer to c is passed to the printf
// function, and a pointer to the register can't be
// passed, the compiler has to leave the variable
// in memory.

c [0] = MyFunc (a, b);
printf ("%x\n", &c [0]);

return 0;
}

The disassembled code of this example, compiled using Microsoft Visual C++ 6.0 with default settings, should look as follows:

Listing 126: The Disassembled Code for Identifying Local Variables Compiled Using Visual C++ 6.0

MyFunc proc near ; CODE XREF: main+1C↓p

var_38 = byte ptr -38h
var_4 = dword ptr -4
; Local variables are allocated at the negative offset, relative to
; EBP; function arguments are allocated at the positive offset.
; Note that the "higher" the variable's location,
; the larger the absolute value of its offset.

arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch

push ebp
mov ebp, esp
; The stack frame is opened.

sub esp, 38h
; The ESP value is decreased by 0x38, and 0x38 bytes
; are allocated for local variables.

mov eax, [ebp+arg_0]
; The value of the arg_0 argument is loaded into EAX.
; This clearly is an argument, as shown by
; its positive offset relative to the EBP register.

add eax, [ebp+arg_4]
; The value of the arg_4 argument is added to EAX.
mov [ebp+var_4], eax
; Here is the first local variable.
; It is just a local variable, as shown by its negative
; offset relative to the EBP register. Why is it negative?
; Look how IDA has determined var_4. It would be better
; if the negative offsets of local variables were marked clearly.

push 10h ; int
; The 0x10 value (the radix of the numeration system)
; is passed to the ltoa function.

lea ecx, [ebp+var_38]
; The pointer to the var_38 local variable is loaded into ECX.
; What kind of a variable is this? Let's scroll
; the disassembler screen upward to find the description
; of local variables that IDA has recognized:
; var_38 = byte ptr -38h
; var_4 = dword ptr -4
;
; The nearest local variable below it (at a higher address)
; has an offset of -4. The var_38 variable has an offset of -38h.
; Subtracting the latter from the former gives the size of var_38.
; It is easy to calculate that the size is equal to 0x34.
; In addition, the ltoa function expects a pointer
; of the char* type. Thus, it is possible to write the following
; comment to var_38: "char s [0x34]". This is done as follows:
; Open the Edit menu, then open the Functions submenu, and
; select Stack variables. A window will open that lists all
; recognized local variables. Bring the cursor to var_38,
; press <;> to enter a repeatable comment, and write
; "char s[0x34]". Finish the input, then close
; the local-variables window. Now, each reference to var_38
; will be accompanied by the "char s [0x34]" comment.

push ecx ; char *
; The pointer to the local buffer for var_38
; is passed to the ltoa function.

mov edx, [ebp+var_4]
; The value of the var_4 local variable is loaded into EDX.

push edx ; __int32
; The value of the var_4 local variable is passed
; to the ltoa function. Using the prototype of this function,
; IDA already has determined that the variable type is int.
; Open the stack-variables window again, and comment var_4.

call __ltoa
add esp, 0Ch
; The contents of var_4 are converted to a hexadecimal number
; represented as a string. The result is placed
; in the local buffer for var_38.

lea eax, [ebp+var_38] ; char s [0x34]
; The pointer to the local buffer for var_38 is loaded into EAX.

push eax
; The pointer to var_38 is passed to the printf function,
; which displays the contents on the screen.

mov ecx, [ebp+var_4]
; The value of the var_4 local variable is loaded into ECX.

push ecx
; The value of the var_4 local variable is passed to printf.

push offset aXS ; "%x == %s == "
call _printf
add esp, 0Ch

mov eax, [ebp+var_4]
; The value of the var_4 local variable is returned into EAX.

mov esp, ebp
; Memory occupied by local variables is released.

pop ebp
; The former value of EBP is restored.

retn
MyFunc endp

main proc near ; CODE XREF: start+AF↓p

var_C = dword ptr -0Ch
var_8 = dword ptr -8
var_4 = dword ptr -4

push ebp
mov ebp, esp
; The stack frame is opened.

sub esp, 0Ch
; The local variables are allocated 0xC bytes of memory.

mov [ebp+var_4], 666h
; The var_4 local variable is initialized and assigned the value 0x666.

mov [ebp+var_8], 777h
; The var_8 local variable is initialized and assigned the value 0x777.
; Note that the order of local variables in the memory
; is the reverse of the order in which they were referenced -
; not declared. The variables are not always
; allocated in this order; this depends on the compiler,
; which is why it should never be relied on.

mov eax, [ebp+var_8]
; The value of var_8 is copied to the EAX register.

push eax
; The value of var_8 is passed to the MyFunc function.

mov ecx, [ebp+var_4]
; The value of var_4 is copied to the ECX register.

push ecx
; The value of var_4 is passed to the MyFunc function.

call MyFunc
add esp, 8
; The MyFunc function is called.

mov [ebp+var_C], eax
; The returned value is copied to the var_C local variable.

lea edx, [ebp+var_C]
; The pointer to the var_C local variable is loaded into EDX.

push edx
; The pointer to var_C is passed to the printf function.

push offset asc_406040 ; "%x\n"
call _printf
add esp, 8

xor eax, eax
; Zero is returned.

mov esp, ebp
; Memory occupied by local variables is released.

pop ebp
; The stack frame is closed.

retn
main endp
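The subtraction used in the comments to size var_38 generalizes to any list of recognized locals. The following is a sketch only: it assumes the offsets are sorted from the lowest address to the highest, and the LocalSizes name is hypothetical.

```c
#include <stddef.h>

/* offsets[] lists the EBP-relative offsets of the recognized locals,
   sorted from the lowest address to the highest (for MyFunc in
   Listing 126: -0x38, -4). sizes[i] receives the distance to the
   next variable, or to EBP itself for the last one. */
void LocalSizes(const int *offsets, size_t count, int *sizes)
{
    for (size_t i = 0; i < count; i++) {
        int next = (i + 1 < count) ? offsets[i + 1] : 0;
        sizes[i] = next - offsets[i];
    }
}
```

For MyFunc, this gives 0x34 bytes for var_38 and 4 bytes for var_4. The figure is an upper bound on the declared size: the source declares char x[50], that is, 0x32 bytes, and the compiler has rounded the buffer up to 0x34.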

That was rather easy, wasn't it? The disassembled code of this example compiled using Borland C++ 5.0 will be more difficult.

Listing 127: The Disassembled Code for Identifying Local Variables Compiled Using Borland C++ 5.0

MyFunc proc near ; CODE XREF: _main+14↓p

var_34 = byte ptr -34h
; Note that there is one local variable, although as many as three
; were declared! Where are the others? This compiler
; has placed them into the registers, rather than onto the stack,
; to speed up the process of addressing them.
; (See the "Register
; and Temporary Variables" section for more details.)

push ebp
mov ebp, esp
; The stack frame is opened.

add esp, 0FFFFFFCCh
; After this allocation, press <-> in IDA to convert the number
; into the signed one, which gives -34h. Therefore, 0x34 bytes
; were allocated for local variables. Note that memory
; was allocated using ADD, not SUB!

push ebx
; Does this store EBX on the stack, or does it allocate memory
; for local variables? Because memory previously was allocated
; using ADD, PUSH must save the register onto the stack.

lea ebx, [edx+eax]
; This tricky addition gives the sum of EDX and EAX.
; Because EDX and EAX were not initialized explicitly,
; the arguments were passed via them.
; (See the "Function Arguments" section.)

push 10h
; The radix of the chosen number system (16)
; is passed to the ltoa function.

lea eax, [ebp+var_34]
; The pointer to the local buffer for var_34 is loaded into EAX.

push eax
; The pointer to the buffer for writing the result
; is passed to the ltoa function.

push ebx
; The sum of the two arguments (not a pointer)
; is passed to the ltoa function.

call _ltoa
add esp, 0Ch

lea edx, [ebp+var_34]
; The pointer to the local buffer for var_34 is loaded into EDX.

push edx
; The pointer to the local buffer for var_34,
; which contains the sum of MyFunc's arguments converted
; into a string, is passed to the printf function.

push ebx
; The sum of the arguments is passed to the printf function.

push offset aXS ; format
call _printf
add esp, 0Ch

mov eax, ebx
; The sum of the arguments is returned into EAX.

pop ebx
; EBX is popped off the stack, restoring its former state.

mov esp, ebp
; Memory occupied by local variables is released.

pop ebp
; The stack frame is closed.

retn
MyFunc endp
; int __cdecl main(int argc, const char **argv, const char *envp)
_main proc near ; DATA XREF: DATA:00407044↓o

var_4 = dword ptr -4
; IDA has recognized at least one local variable,
; which should be noted.

argc = dword ptr 8
argv = dword ptr 0Ch
envp = dword ptr 10h

push ebp
mov ebp, esp
; The stack frame is opened.

push ecx
push ebx
push esi
; The registers are saved on the stack.

mov esi, 777h
; The value 0x777 is placed into the ESI register.

mov ebx, 666h
; The value 0x666 is placed into the EBX register.
mov edx, esi
mov eax, ebx
; The arguments are passed to MyFunc via the registers.
call MyFunc

; The MyFunc function is called.

mov [ebp+var_4], eax
; The result returned by MyFunc is copied to the var_4 local
; variable. Wait! Which local variable? How has memory
; been allocated for it? Only one of the PUSH instructions
; could have done this. But which one? Look at the offset
; of the variable: It resides 4 bytes below EBP in memory (at EBP-4),
; and its memory area is occupied by the contents of
; the register saved by the first PUSH instruction, which
; came after the stack frame was opened. (The second PUSH places
; the value of the register at an offset of -8, and so on.)
; The first instruction was PUSH ECX. Therefore, this
; does not save the register on the stack; it allocates
; memory for a local variable. Since the var_8 and var_C
; local variables do not seem to have been accessed,
; the PUSH EBX and PUSH ESI instructions likely
; save the registers.

lea ecx, [ebp+var_4]
; The pointer to the var_4 local variable is loaded into ECX.

push ecx
; The pointer to var_4 is passed to the printf function.

push offset asc_407081 ; format
call _printf
add esp, 8

xor eax, eax
; Zero is returned into EAX.

pop esi
pop ebx
; The values of the ESI and EBX registers are restored.

pop ecx
; Memory allocated for the var_4 local variable is released.

pop ebp
; The stack frame is closed.

retn

_main endp
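The reasoning about which PUSH creates var_4 can be checked with a small model. The following is only a sketch, assuming 32-bit pushes; the helper name push_slot_offset is hypothetical and does not come from the listing. It computes the EBP-relative offset of the value stored by the n-th PUSH executed after MOV EBP, ESP.

```c
#include <assert.h>

/* Hypothetical helper: EBP-relative offset of the value stored by the
   n-th 32-bit PUSH executed after "mov ebp, esp" (n starts at 1).
   Each PUSH lowers ESP by 4 and then writes at the new ESP. */
int push_slot_offset(int n)
{
    assert(n >= 1);
    return -4 * n;
}
```

Applied to the Borland listing, PUSH ECX is the first push and lands at offset -4 (var_4), while PUSH EBX and PUSH ESI land at -8 and -0xC.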

Frame Pointer Omission (FPO) The EBP register traditionally is used to address local variables. As there are only seven general-purpose registers, it is undesirable to designate one of them permanently for addressing local variables. Is there another, more elegant solution?

Consideration of this problem leads to the conclusion that a dedicated register for addressing local variables is not necessary; this goal can be reached (with some tricks) by using only ESP, the stack-pointer register.

The only problem is the floating stack frame. After allocating memory for local variables, ESP may point to the top of the allocated area. In this case, the buff variable (Fig. 17) will appear at the ESP+0xC address. As soon as something is placed onto the stack (an argument for a called function, or the register contents for temporary storage), the frame will "move," and buff will appear at the ESP+0x10 address, not at ESP+0xC!

Figure 17: Addressing local variables via the ESP register forms a floating stack frame

Figure 18: Types of operands
Contemporary compilers are capable of addressing local variables via ESP, and dynamically tracing the ESP value (unless tricky assembler inserts in the function's body unpredictably modify the ESP value).

This complicates an analysis of the code. After pointing to any part of the code, it is impossible to determine which local variable is being addressed; the whole function must be thoroughly worked out, and the ESP value must be watched closely. (Often, massive errors will nullify all preceding work.) Fortunately, the IDA disassembler knows how to treat such variables. Nevertheless, hackers never rely entirely on automatics; rather, they try to understand how things work.

Let's turn to our good old file simple.c and compile it with the /O2 key, which optimizes performance by having the compiler use all registers and address local variables via ESP.

>cl sample.c /O2
00401000: 83 EC 64 sub esp,64h

Memory is allocated for local variables. Note that there are no instructions such as PUSH EBP or MOV EBP, ESP!

00401003: A0 00 69 40 00 mov al, [00406900] ; mov al, 0
00401008: 53 push ebx
00401009: 55 push ebp
0040100A: 56 push esi
0040100B: 57 push edi

The registers are saved.

0040100C: 88 44 24 10 mov byte ptr [esp+10h], al

The zero value is placed into the [ESP+0x10] variable. (Let's call it buff.)

00401010: B9 18 00 00 00 mov ecx, 18h
00401015: 33 C0 xor eax, eax
00401017: 8D 7C 24 11 lea edi, [esp+11h]

EDI is set to point to the local variable [ESP+0x11] (an uninitialized tail of buff).

0040101B: 68 60 60 40 00 push 406060h ; "Enter password"

The offset of the "Enter password" string is placed onto the stack. Note that the ESP register creeps 4 bytes upward.

00401020: F3 AB rep stos dword ptr [edi]
00401022: 66 AB stos word ptr [edi]
00401024: 33 ED xor ebp, ebp
00401026: AA stos byte ptr [edi]

The buffer is zeroed.

00401027: E8 F4 01 00 00 call 00401220

The "Enter password" string is displayed on the screen. Note that the arguments have not been popped off the stack!

0040102C: 68 70 60 40 00 push 406070h

The offset of the pointer to the stdin pointer is placed onto the stack. Note that ESP creeps another 4 bytes upward.

00401031: 8D 4C 24 18 lea ecx, [esp+18h]

The pointer to the [ESP+0x18] variable is loaded into ECX. Is this just another buffer? No; this is the [ESP+0x10] variable, which has changed its appearance because ESP has been modified. Subtracting 8 bytes (which ESP crept upward) from 0x18 gives 0x10, our old acquaintance [ESP+0x10]. (Should old acquaintance be forgot?)
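The bookkeeping performed by hand above can be sketched in C. This is only an illustration; the helper name esp_offset_now and its parameters are hypothetical. Given the ESP-relative offset at which a variable was seen at one point, and the number of bytes pushed since the locals were allocated at that point and at another point, it returns the offset at which the same variable appears at the other point.

```c
#include <assert.h>

/* Hypothetical helper: a variable was seen at [esp + offset_then] when
   pushed_then bytes had been pushed since the locals were allocated.
   Return its ESP-relative offset when pushed_now bytes are pushed. */
unsigned esp_offset_now(unsigned offset_then,
                        unsigned pushed_then,
                        unsigned pushed_now)
{
    assert(pushed_then % 4 == 0 && pushed_now % 4 == 0);
    assert(offset_then >= pushed_then);
    /* Undo the earlier displacement, then apply the current one. */
    return offset_then - pushed_then + pushed_now;
}
```

With the values from the listing, esp_offset_now(0x10, 0, 8) gives 0x18, and going backward, esp_offset_now(0x18, 8, 0) gives 0x10 again.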

Analyzing a procedure that contains a dozen lines is fairly straightforward, but a program of a million lines would be enough to drive anyone mad. The alternative is to use IDA. Consider the following example:

.text:00401000 main proc near ; CODE XREF: start+AF↓p
.text:00401000
.text:00401000 var_64 = byte ptr -64h
.text:00401000 var_63 = byte ptr -63h

IDA revealed two local variables located at the offsets -0x63 and -0x64 relative to the stack frame; that's why they were given the names var_63 and var_64.

.text:00401000 sub esp, 64h
.text:00401003 mov al, byte_0_406900
.text:00401008 push ebx
.text:00401009 push ebp
.text:0040100A push esi
.text:0040100B push edi
.text:0040100C mov [esp+74h+var_64], al

IDA automatically combined the local variable name and its offset in the stack frame.

.text:00401010 mov ecx, 18h
.text:00401015 xor eax, eax
.text:00401017 lea edi, [esp+74h+var_63]

IDA failed to recognize the initialization of the first byte of the buffer and mistook it for a separate variable. Only a human can figure out how many variables are used here.

.text:0040101B push offset aEnterPassword ; "Enter password:"
.text:00401020 repe stosd
.text:00401022 stosw
.text:00401024 xor ebp, ebp
.text:00401026 stosb
.text:00401027 call sub_0_401220
.text:0040102C push offset off_0_406070
.text:00401031 lea ecx, [esp+7Ch+var_64]

Note that IDA correctly recognized that the var_64 variable was accessed, even though its offset, 0x7C, differs from 0x64.
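The operands that IDA prints can be decoded with simple arithmetic. The sketch below assumes that the first number is the amount by which ESP has dropped since function entry and that var_64 stands for -0x64 relative to the ESP value at entry; the helper name raw_esp_displacement is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: evaluate an IDA-style operand [esp+delta+var].
   delta: bytes by which ESP has dropped since function entry.
   var:   the variable's (negative) offset from the entry ESP.
   Returns the raw displacement encoded in the instruction. */
int raw_esp_displacement(int delta, int var)
{
    assert(delta >= 0 && var < 0 && delta + var >= 0);
    return delta + var;
}
```

Here, [esp+74h+var_64] evaluates to ESP+0x10 and [esp+7Ch+var_64] to ESP+0x18, which matches the raw offsets in the earlier dump of the same function.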

Register and Temporary Variables
In an attempt to minimize the number of memory access operations, optimizing compilers place the most intensively used local variables into general-purpose registers, saving them on the stack only in extreme cases (and, ideally, never).

What kind of difficulties does this create during analysis? First, it introduces a context dependence into the code. In an instruction such as MOV EAX, [EBP+var_10], the contents of the var_10 variable are being copied to the EAX register. The variable type can be found by searching the function body for every occurrence of var_10, which may indicate the purpose of the variable.

This trick, however, will not work with register variables. Suppose that we encountered the MOV EAX, ESI instruction and want to trace all references to the variable of the ESI register. Searching the function body for the substring "ESI" gives nothing or, even worse, produces a set of false hits. What can be done?

One register (in this case, ESI) may be used to store many different variables temporarily. There are only seven general-purpose registers; EBP is assigned to the stack frame, and EAX and EDX are used for the returned value of the function. Therefore, only four registers are available to store local variables. There are even fewer free registers when programs written in C++ are executed: One of these four registers is used as the pointer to the virtual table, and another holds the this pointer to the object instance. Pressing ahead with just two registers is not really possible; there are dozens of local variables in a typical function. This is why the compiler uses registers as a cache. Cases of each local variable residing in a register are exceptional; variables often are scattered chaotically around the registers, sometimes stored on the stack, and frequently popped off into a different register (rather than the one in which the contents were stored).

No contemporary disassembler (including IDA) is capable of tracing the "migration" of register variables; this operation has to be done manually. It is simple, although tiresome, to determine the contents of a particular register at any point in the program: Just work through the program mentally from startup to the point in question, tracing all the passing operations. It is more difficult to find out how many local variables are stored in a particular register. When a large number of variables are mapped on a small number of registers, it becomes impossible to reconstruct the map unambiguously. For example, the programmer declares the a variable, and the compiler places it into the X register. Later, the programmer declares the b variable. If the a variable is no longer used (as is often the case), the compiler may place the b variable into the X register without worrying about saving the value of a. As a result, one variable is lost. At first glance, there are no problems; losing one variable is not a disaster. But if a was sufficient, why has the programmer introduced b? If the a and b variables are of the same type, no problems arise; if they are different, the analysis of the program becomes extremely complicated.
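The "lost variable" effect can be pictured at the source level. The two functions below are a hypothetical illustration (the names and the arithmetic are not from the text): In the first, a and b have lifetimes that do not overlap; the second shows what the disassembly may suggest when both have been mapped onto one location X. Whether a given compiler actually merges them depends on the compiler and its optimization settings.

```c
/* Two source variables whose lifetimes do not overlap. */
int with_two_variables(int x)
{
    int a = x * 3;   /* a is needed only to compute b */
    int b = a + 7;   /* after this line, a is no longer used */
    return b;
}

/* What the disassembly may suggest: one location X used twice. */
int with_one_location(int x)
{
    int X = x * 3;   /* X plays the role of a ... */
    X = X + 7;       /* ... and then the role of b */
    return X;
}
```

Both functions compute the same result, so nothing in the machine code need reveal that the programmer wrote two variables rather than one.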

Let's look at techniques for identifying register variables. Many hacker manuals assert that register variables differ from other variables in that they never deal with memory. This is incorrect. Register variables can be stored on the stack temporarily by the PUSH instruction and restored by the POP instruction. In some ways, a variable of this sort ceases to be a register variable; nevertheless, it does not become a stack variable. To avoid defining hundreds of variable classes, let's agree that a register variable is a variable contained in a general-purpose register that may be stored on the stack, but always at its top; it can never be stored in the stack frame. In other words, register variables are never addressed via EBP. If the variable is addressed via EBP, it "lives" in the stack frame and is a stack variable. Is this correct? Not entirely. If the value of the b stack variable is assigned to the a register variable, the compiler will generate code similar to the following: MOV REG, [EBP-xxx]. Accordingly, the assignment of the value of the register variable to the stack variable will look like this: MOV [EBP-xxx], REG. Despite reference to the stack frame, the REG variable remains a register variable. Consider the following code:

Listing 128: Distinguishing Register Variables from Temporary Variables

...
mov [ebp-0x4], 0x666
mov esi, [ebp-0x4]
mov [ebp-0x8], esi
mov esi, 0x777
sub esi, [ebp-0x8]
mov [ebp-0xC], esi

This code can be interpreted in two ways: Either the ESI register is being used as a temporary variable for passing data (the first fragment of Listing 129), or there is an ESI register variable (the second fragment of Listing 129).

Listing 129: The Source Code When ESI Is a Temporary Variable (First Fragment) and a Register Variable (Second Fragment)

int var_4=0x666;
int var_8=var_4;
int var_C=0x777 - var_8;

int var_4=0x666;
register int ESI = var_4;
int var_8=ESI;
ESI=0x777-var_8;
int var_C = ESI;

Although the algorithms of the listings are identical, the first fragment is substantially more illustrative than the second. The main objective of disassembling is to reconstruct the algorithm of a program, not to reproduce the source code of a program. It does not matter whether ESI represents a register or a temporary variable. The main thing is that everything works smoothly. In general, you should choose the most understandable interpretation if there are several versions.

Before examining temporary variables in detail, let's summarize our knowledge of register variables by analyzing the following example:

Listing 130: Identifying Register Variables

main ()
{
int a=0x666;
int b=0x777;
int c;
c=a+b;
printf ("%x + %x = %x\n", a, b, c);
c = b - a;
printf ("%x - %x = %x\n", a, b, c);
}

The disassembled code of this example, compiled using Borland C++ 5.x, gives the following result:

Listing 131: The Disassembled Code for Identifying Register Variables

; int __cdecl main(int argc, const char **argv, const char *envp)
_main proc near ; DATA XREF: DATA:00407044↓o

argc = dword ptr 8
argv = dword ptr 0Ch
envp = dword ptr 10h
; Note that no stack variable has been recognized by IDA,
; although several were declared in the program. It seems likely
; that the compiler has allocated them in registers.

push ebp
mov ebp, esp
; The stack frame is opened.
push ebx
push esi
; What happened here? Were the registers saved
; on the stack, or was memory allocated for the stack
; variables? Since no stack variable has been recognized by IDA,
; this code likely saved the registers.

mov ebx, 666h
; The register is initialized. Compare this with Listing 126
; (in the "Local Stack Variables" section),
; which contained the following line:
; mov [ebp+var_4], 666h
; Hence, EBX is likely a register variable.
; The variable's existence can be proven: Had the
; value 0x666 been passed directly to the function - for example,
; printf ("%x %x %x\n", 0x666) - the compiler would have placed
; the PUSH 0x666 instruction into the code.
; This did not occur; therefore, the value 0x666 is passed via
; the variable. Thus, the reconstructed source code should contain:
; 1. int a=0x666

mov esi, 777h
; Similarly, ESI likely represents a register variable:
; 2. int b=0x777

lea eax, [esi+ebx]
; The sum of ESI and EBX is loaded into EAX.
; EAX is not a pointer; this is just a tricky addition.

push eax
; The sum of the ESI and EBX register variables is passed to the
; printf function. However, the contents of EAX are interesting:
; They could be an independent variable, or the sum
; of the a and b variables, which is passed
; to the printf function directly.
; For better readability, let's choose the latter:
; 3. printf (,,,, a+b)

push esi
; The register variable ESI, denoted as b
; in the preceding code, is passed to the printf function.
; 3. printf (,,, b, a+b)

push ebx
; The register variable EBX, denoted as a
; in the preceding code, is passed to the printf function.
; 3. printf (,, a, b, a+b)

push offset aXXX ; "%x + %x = %x"
; The pointer to the format-specification string
; is passed to the printf function. This string indicates
; that all three variables are of the int type.
; 3. printf ("%x + %x = %x", a, b, a+b)

call _printf
add esp, 10h

mov eax, esi
; The register variable, previously denoted as b,
; is copied to EAX.
; 4. int c=b

sub eax, ebx
; The value of the variable contained in EBX (a) is subtracted
; from the variable contained in the EAX register (c).
; 5. c=c-a

push eax
; The difference between the variables contained
; in EAX and EBX is passed to the printf function.
; Because this difference between the b and a values was passed
; directly, it is clear that the c variable is unnecessary.
; Line 5 can be omitted (and, thus, a rollback can be performed).
; Instead of line 4, the following can be inserted:
; 4. printf (,,,, b-a)

push esi
; The value of the variable in the ESI register (b)
; is passed to the printf function.
; 4. printf (,,, b, b-a)

push ebx
; The value of the variable in the EBX register (a)
; is passed to the printf function.
; 4. printf (,, a, b, b-a)

push offset aXXX_0 ; "%x - %x = %x"
; The pointer to the format-specification string is passed to
; the printf function. This string indicates that all three
; variables are of the int type.
; 4. printf ("%x - %x = %x", a, b, b-a)

call _printf
add esp, 10h

xor eax, eax
; The zero value is returned into the EAX register.
; return 0

pop esi
pop ebx
; The registers are restored.

pop ebp
; The stack frame is closed.

retn
; The reconstructed code should look as follows:
; 1. int a=0x666
; 2. int b=0x777
; 3. printf ("%x + %x = %x", a, b, a + b)
; 4. printf ("%x - %x = %x", a, b, b - a)
;
; Comparing the result with the original source code shows that
; removing the c variable introduced a slight mistake.
; This did not ruin the work. On the contrary,
; it improved the order of the listing, making it easier
; to understand. To stick more closely to the assembler code,
; the c variable can be reintroduced. Doing this
; has the benefit of removing the rollback (i.e.,
; already reconstructed lines need not be rewritten
; to remove a superfluous variable).
_main endp

Temporary variables Here, temporary variables will be defined as variables embedded into the code of a program by the compiler. Why are they necessary? Consider the following example: int b=a. If a and b are stack variables, copying the value directly from one to the other is impossible because the "memory-memory" addressing mode is not available in 80x86 microprocessors. Therefore, the operation must be carried out in two stages: memory to register, followed by register to memory. Actually, the compiler generates the following code:

register int tmp=a; mov eax, [ebp+var_4]
int b=tmp; mov [ebp+var_8], eax

Here, tmp is a temporary variable created to execute the operation b=a, then eliminated as superfluous.

Compilers (especially optimizing ones) tend to allocate temporary variables in registers; they only push temporary variables onto the stack in extreme cases. Mechanisms for allocating memory and the techniques for reading and writing temporary variables vary.

Typically, compilers react to an acute lack of registers by saving variables on the stack. Most often, integer variables are placed on the top of the stack by the PUSH instruction, then pulled from there by the POP instruction. If a program's code contains this sort of "push-pop" pair, it is possible to assert with confidence that an integer temporary variable is being dealt with, provided that the PUSH does not simply pass the contents of an initialized register as a function's stack argument. (See the "Function Arguments" section.) In most cases, the allocation of memory for floating-point variables and the initialization of these variables occur separately. This is because an instruction that allows the compiler to transfer data from the top of the coprocessor stack to the top of the CPU stack doesn't exist; the operation must be carried out manually. First, the stack-top pointer (in the ESP register) is "lifted" slightly (usually by the SUB ESP, xxx instruction). Then, the floating-point value is written in the allocated memory (usually, FSTP [ESP]). Finally, when the temporary variable becomes unnecessary, it is deleted from the stack by the ADD ESP, xxx instruction or something similar (such as SUB ESP, -xxx).

Advanced compilers (such as Microsoft Visual C++) are capable of allocating variables in the arguments that remain on the top of the stack after the most recently called function completes execution. This trick applies to cdecl functions, but not to stdcall functions; the latter clear arguments from the stack independently. (See the "Function Arguments" section for more details.) This type of trick appeared during the analysis of the mechanism of returning values by functions (in the "Values Returned by Functions" section).

Temporary variables larger than 8 bytes (strings, arrays, structures, objects) almost always are allocated on the stack. They are distinguished from other types by their initialization mechanism: Instead of the traditional MOV, one of the string move instructions, such as MOVSx, is used. If necessary, it is preceded by the REP repeat prefix (Microsoft Visual C++, Borland C++). Alternatively, several MOVSx instructions can be used consecutively (Watcom C).

The mechanism of allocating memory for temporary variables is almost identical to the mechanism of allocating memory for stack local variables. Nevertheless, correct identification is not a problem. First, memory is allocated for stack variables immediately after the stack frame is opened. For temporary variables, memory allocation takes place at any point of the function. Second, temporary variables are addressed not via the stack-frame pointer, but via the pointer to the stack top.

Different compilers create temporary variables in different instances. However, it is possible to identify two instances in which the creation of temporary variables is unavoidable: when performing assignment, addition, or multiplication operations, and when an argument of a function or a part of an expression is another function. Let's consider each case in more detail.

Creating temporary variables when moving data or computing expressions As previously mentioned, 80x86 microprocessors do not support the direct transfer of data from memory to memory. Therefore, assigning one variable's value to another variable requires a temporary register variable (if there are no other register variables).

Computing expressions (especially complex ones) requires temporary variables to store intermediate results. How many temporary variables are required to compute the following expression?

int a=0x1; int b=0x2;
int c = 1/ ((1-a) / (1-b));

Let's begin from the parentheses, and rewrite the expression in the following way: int tmp_d = 1; tmp_d = tmp_d-a; and int tmp_e = 1; tmp_e=tmp_e-b; then int tmp_f = tmp_d/tmp_e; and, finally, tmp_j = 1; c = tmp_j/tmp_f. It turns out that there are four temporary variables. This seems a little excessive; is it possible to write it in a shorter way?

int tmp_d = 1; tmp_d=tmp_d-a; // (1-a);
int tmp_e=1; tmp_e=tmp_e-b; // (1-b);
tmp_d=tmp_d/tmp_e; // (1-a) / (1-b);
tmp_e=1; tmp_e=tmp_e/tmp_d;

We can manage with two temporary variables. What if the expression were more complex, employing ten pairs of parentheses, rather than three: How many temporary variables would that require?

There is no need to count: No matter how complex the expression is, two temporary variables are sufficient. If the parentheses are removed, we can manage with one variable, although excessive computation will be required. (This question will be considered in more detail in the "Mathematical Operators" section.) Now, let's see the results of compilation.

Listing 132: The Disassembled Code for Computing Complex Expressions

mov [ebp+var_4], 1
mov [ebp+var_8], 2
mov [ebp+var_C], 3
; The local variables are initialized.

mov eax, 1
; Here, the first variable is introduced.
; An intermediate value is placed into it, since the SUB
; instruction always places the result of computation
; at the location of the minuend because of architectural
; peculiarities of the 80x86 microprocessors.
; The minuend cannot be a direct value;
; therefore, a temporary variable must be introduced.

sub eax, [ebp+var_4]
; tEAX := 1 - var_4
; The computed value (1-a).

mov ecx, 1
; Yet another temporary variable is introduced
; because EAX is already occupied.

sub ecx, [ebp+var_8]
; tECX := 1 - var_8
; The computed value (1-b) is stored in the ECX register.

cdq
; The double word that resides in EAX is converted into
; a quad word and placed into EDX:EAX.
; (The idiv machine instruction always expects to see the
; dividend in these registers.)

idiv ecx
; The computed value (1-a) is divided by (1-b), and the quotient
; is placed into tEAX. Inevitably, the old value of the temporary
; variable has been overwritten. This does not create a problem,
; because it is not needed for further computation.

mov ecx, eax
; The value (1-a) / (1-b) is copied to the ECX register.
; This is a new temporary variable, t2ECX, located in the same
; register. (The old contents of ECX are no longer needed.)
; The 2 index is given after the t prefix to show
; that t2ECX is not the same as tECX, even though
; these temporary variables are stored in the same register.

mov eax, 1
; The immediate value 1 is placed into EAX.
; This is yet another temporary variable: t2EAX.
cdq
; EDX is zeroed.

idiv ecx
; The value 1 is divided by ((1-a) / (1-b)).
; The quotient is placed into EAX.

mov [ebp+var_10], eax
; c :=1 / ((1-a) / (1-b))
; Thus, only four temporary variables and two general-purpose
; registers were required to compute this expression.
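The claim that two temporary variables suffice can be checked in C. The sketch below follows the two-variable rewrite given earlier; the function names are hypothetical. Note that the values a=1 and b=2 used in the text make (1-a) equal to zero, so the outer division would divide by zero; the usage note therefore assumes other inputs.

```c
#include <assert.h>

/* Evaluate 1 / ((1-a) / (1-b)) using only two temporaries. */
int eval_two_temps(int a, int b)
{
    int tmp_d = 1;  tmp_d = tmp_d - a;   /* (1-a) */
    int tmp_e = 1;  tmp_e = tmp_e - b;   /* (1-b) */
    assert(tmp_e != 0);                  /* inner divisor must be nonzero */
    tmp_d = tmp_d / tmp_e;               /* (1-a) / (1-b) */
    assert(tmp_d != 0);                  /* outer divisor must be nonzero */
    tmp_e = 1;  tmp_e = tmp_e / tmp_d;   /* 1 / ((1-a) / (1-b)) */
    return tmp_e;
}

/* The same expression written directly, for comparison. */
int eval_direct(int a, int b)
{
    return 1 / ((1 - a) / (1 - b));
}
```

For a=5 and b=-3, both functions return -1: (1-5) is -4, (1+3) is 4, their quotient is -1, and 1 divided by -1 is -1.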

Creating temporary variables to store a value returned by a function and the results of computing expressions Most high-level languages (including C/C++) allow functions and expressions to be used as immediate arguments, such as myfunc (a + b, myfunc_2 (c)). Prior to calling myfunc, the compiler should compute the value of the expression a + b. This is straightforward, but there is a problem: Where should the result of addition be written? Let's see how the compiler solves this.

Listing 133: The Disassembled Code Illustrating How the Compiler Stores the Results of Computing Expressions and Values Returned by Functions

mov eax, [ebp+var_C]
; A temporary variable, tEAX, is created. The value
; of the var_C local variable is copied into it.

push eax
; The tEAX temporary variable is stored on the stack.
; The value of the var_C local variable is passed as an argument
; to the myfunc function. (Theoretically, the var_C local
; variable could be passed directly by the PUSH [ebp+var_C]
; instruction without using temporary variables.)

call myfunc
add esp, 4
; The value of the myfunc function is returned into the EAX register.
; This can be regarded as a kind of temporary variable.

push eax
; The results returned by the myfunc function
; are passed to the myfunc_2 function.

mov ecx, [ebp+var_4]
; The value of the var_4 local variable is copied into ECX.
; ECX is yet another temporary variable. However, it is
; unclear why the compiler did not use the EAX register.
; The previous temporary variable is no longer needed;
; therefore, the EAX register that it occupied has become free.

add ecx, [ebp+var_8]
; ECX := var_4 + var_8

push ecx
; The sum of the two local variables
; is passed to the myfunc_2 function.

call _myfunc_2
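A source fragment that could produce a call sequence of this shape is sketched below. It is an assumption, not the original source: The function bodies are invented for illustration, and the roles of myfunc and myfunc_2 follow the comments in the listing (myfunc is called first, and its result becomes the last argument of myfunc_2). The second caller spells out the temporaries that the compiler introduces silently.

```c
/* Hypothetical bodies; only the call structure matters here. */
int myfunc(int c)                { return c * 2; }
int myfunc_2(int sum, int inner) { return sum + inner; }

/* Nested form, as a programmer would write it. */
int caller(int a, int b, int c)
{
    return myfunc_2(a + b, myfunc(c));
}

/* The same call with the compiler's temporaries made explicit. */
int caller_explicit(int a, int b, int c)
{
    int t_eax = myfunc(c);   /* value returned in EAX, then pushed */
    int t_ecx = a + b;       /* sum computed in ECX, then pushed */
    return myfunc_2(t_ecx, t_eax);
}
```

Both callers return the same value; with the invented bodies, caller(1, 2, 3) yields 9.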

The scope of temporary variables. Temporary variables are, to a certain extent, local variables. In most cases, their scope is limited to several lines of code; outside of this context, temporary variables are meaningless. In general, a temporary variable only obscures the code (myfunc (a+b) is much shorter and more intelligible than int tmp=a+b; myfunc (tmp)). Therefore, to avoid cluttering the disassembler listing, temporary variables should not be used in comments; it is better to substitute actual values for them. It is a good idea to denote temporary variables with a prefix, for example tmp_ (or t, for those who love brevity).

Listing 134: An Example of Good Comment Style

mov eax, [ebp+var_4] ; var_8 := var_4
; ^ tEAX := var_4
add eax, [ebp+var_8] ; ^ tEAX += var_8

push eax ; MyFunc (var_4 + var_8)
call MyFunc

Global Variables
Tackling a program stuffed with global variables is probably the worst task for hackers. Instead of a tree with a strict hierarchy, the program components are interlaced. To solve one algorithm, the entire listing must be combed out and searched for cross-references. No disassembler, not even IDA, is capable of reconstructing cross-references perfectly.

Identifying global variables is much easier than identifying any other construction in high-level languages. Global variables give themselves away immediately by addressing memory directly. In other words, references to them look like MOV EAX, [401066], where 0x401066 is just the address of the global variable.

It is more difficult to understand the purpose of this variable and its content at a certain moment. Unlike local variables, global variables are context dependent. Each local variable is initialized by its parent function; it is not dependent on the functions called before it. On the contrary, a global variable can be modified by anyone — its value is not defined at any point of the program. To work it out, it is necessary to analyze all the functions that handle it and to reconstruct the order in which they were called. This question will be considered in more detail further on; now, let's examine how to reconstruct cross-references.

Reconstructing cross-references. In most cases, IDA copes well with cross-reference reconstruction, and manual reconstruction becomes unnecessary. However, even IDA makes mistakes occasionally, and not everyone has this disassembler at hand. Therefore, you should learn to deal with global variables manually.

Tracking references to global variables by searching for their offsets in the code (data) segment. Addressing global variables directly facilitates the search for the machine instructions that handle them. Consider the construction MOV EAX, [0x41B904]. Assembling it gives A1 04 B9 41 00. The offset of a global variable is written "as is" (while observing the reverse byte order: A higher byte is placed at a greater address, and a lower one is set at a smaller address).
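Building the search pattern from an offset is mechanical. The following sketch (the function name offset_to_pattern is hypothetical) writes a 32-bit offset as the four bytes that appear in the machine code, lowest byte first.

```c
#include <stdint.h>

/* Hypothetical helper: store a 32-bit offset in the byte order used
   inside 80x86 instructions (lowest byte at the lowest address). */
void offset_to_pattern(uint32_t offset, uint8_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(offset >> (8 * i));
}
```

For the offset 0x41B904, the pattern is 04 B9 41 00, the same four bytes that follow the A1 opcode in the example above.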

A simple context search will reveal all references to the global variable of interest. You can find its offset, rewrite it from right to left and… get a load of garbage along with the useful information. Not every number that coincides with the offset of a global variable is a pointer to it. For example, a search for 04 B9 41 00 may return the following:

83EC04 sub esp,004
B941000000 mov ecx,000000041

The mistake is obvious: The value that we have found is not an operand of the instruction. Moreover, it spans two instructions! Rejecting all occurrences that cross instruction boundaries immediately removes a significant part of the garbage. The problem is how to determine the instruction boundaries; it is impossible to say anything about the instruction if you only have a part of it.
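The false hit is easy to reproduce. The sketch below is a plain byte search, not a disassembler; the function name find_bytes and the test buffer mentioned in the usage note are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the index of the first occurrence of pat in buf at or after
   start, or -1 if there is none. */
long find_bytes(const uint8_t *buf, size_t len,
                const uint8_t *pat, size_t plen, size_t start)
{
    if (plen == 0 || plen > len)
        return -1;
    for (size_t i = start; i + plen <= len; i++)
        if (memcmp(buf + i, pat, plen) == 0)
            return (long)i;
    return -1;
}
```

If the buffer holds A1 04 B9 41 00 followed by 83 EC 04 B9 41 00 00 00, the pattern 04 B9 41 00 is found at index 1 and again at index 7. The first hit is the operand of MOV EAX, [0x41B904]; the second straddles SUB ESP, 4 and MOV ECX, 0x41, and the search alone cannot tell them apart.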

Consider the following construction: 8D 81 04 B9 41 00 00. Ignoring the trailing zero, this sequence can be interpreted as LEA EAX, [ECX+0x41B904]. If, however, 0x8D belongs to the "tail" of the previous instruction, the following instruction will be ADD D, [ECX] [EDI]*4, 000000041. There even may be several instructions here.

The most reliable way to determine the boundaries of machine instructions is to disassemble with tracing; unfortunately, this operation demands lots of resources, and not every disassembler is capable of tracing the code. Therefore, another method is required.

Machine code can be represented figuratively as typewritten text printed without spaces. An attempt to read from a random position will likely start in the middle of a word and won't produce anything intelligible. The first several syllables may form an intelligible word (or even two), but continuous nonsense will appear further on.

The differences between constants and pointers, or salvaging the remaining garbage. At last, we have removed all the false hits. The heap of garbage has diminished appreciably, but artifacts such as PUSH 0x401010 keep turning up. What is 0x401010 — a constant or an offset? It could be either; it is impossible to tell until we reach the code that handles it. If 0x401010 is addressed by the handling code as a value, it is a constant; if it is addressed by reference, it is a pointer. (Here, it is an offset.)

This problem will be discussed in detail in the "Constants and Offsets" section. For now, I would like to note — with great relief — that the minimal address for loading a file in Windows 9x is 0x400000, and there are few constants expressed by such a large number.

Note The minimal address for loading a file in Windows NT is 0x10000. However, for a program to work successfully under both Windows NT and Windows 9x, loading should start from an address no lower than 0x400000.

The trials and tribulations of 16-bit mode It is not as simple to distinguish a constant from a pointer in 16-bit mode as it is in 32-bit mode. In 16-bit mode, one or more segments of 0x10000 bytes are allocated for data. Admissible values of offsets are confined to a narrow range — 0x0 to 0xFFFF — and most variables have offsets that are very small and visually indistinguishable from constants.

Another problem is that one segment often cannot accommodate all the data; therefore, several segments must be initialized. Two segments are tolerable: One is addressed via the DS register, the other is addressed via ES, and no difficulties arise in determining which variable belongs to which segment. For example, if all references to the X global variable, located in the base segment at the 0x666 offset, are of interest, all instructions such as MOV AX, ES: [0x666] can be rejected at once. In this case, the base segment is addressed via DS (by default), whereas this instruction refers to ES. However, addressing also may occur in two stages, such as MOV BX, 0x666/xxx—xxx/MOV AX, ES: [BX]. Having seen MOV BX, 0x666, it is impossible to determine the segment, or even to tell whether this is an offset. Nevertheless, this does not overcomplicate the analysis.

The situation becomes worse if there are a dozen data segments in a program. (It is conceivable that 640 KB of static memory could be required.) There are not enough segment registers for this; they have to be reloaded many times. To figure out which segment is being addressed, the value of the segment register must be determined. The simplest way to do this is to scroll the disassembler screen slightly upward and look for the initialization of the segment register in question. Bear in mind that initialization often is done by POP, rather than by the MOV segREG, REG instruction. For example, PUSH ES/POP DS does the job of a hypothetical MOV DS, ES; the 80x86 instruction set has no MOV segREG, segREG instruction. There is no MOV segREG, CONST instruction either, which is why it must be emulated as follows: MOV AX, 0x666/MOV ES, AX. Another possible method is the following: PUSH 0x666/POP ES.

Thankfully, 16-bit mode almost has become a thing of the past, and its problems have been buried by the sands of time. Programmers and hackers breathed a sigh of relief after the transition to 32-bit mode.

Addressing global variables indirectly Often, a claim is made that global variables are always addressed directly. This is far from true. For one thing, the programmer may address a variable in any way desired in inline assembler inserts. For another, if a global variable is passed by reference to a function (and nothing prevents a programmer from doing so), it will be addressed indirectly, via a pointer. At this point, an objection may be raised: Why should a global variable be passed explicitly to a function? Surely, any function can address a global variable without passing it. This is true, but only if the function knows beforehand which variable it needs. Suppose that the xchg function swaps its arguments, and two global variables urgently need to be swapped. The xchg function can access all global variables, but it does not know which of them to change, or whether doing so is necessary. This is why global variables sometimes must be explicitly passed as arguments to functions. This also means that it is impossible to find all the references to global variables by using a simple context search. IDA Pro will not find them either; to do so, it would need a full-featured processor emulator, or at least one capable of emulating the basic instructions, as can be seen in the following example.

Listing 135: Passing Global Variables Explicitly

int a; int b; // Global variables a and b

xchg (int *a, int *b)
// The function that swaps the values of the arguments
{
int c; c=*a; *b=*a; *b=c;
// The arguments are addressed indirectly,
// using a pointer. If the arguments
// of the function are global variables,
// they will be addressed indirectly.
}

main ( )
{
a=0x666; b=0x777; // The global variables are addressed directly.
xchg (&a, &b); // The global variables are passed by reference.
}

The disassembled code of this example, compiled using Microsoft Visual C++, will look as follows:

Listing 136: The Disassembled Code for Passing Global Variables Explicitly

main proc near ; CODE XREF: start+AF↓p
push ebp
mov ebp, esp
; The stack frame is opened.

mov dword_405428, 666h
; The dword_405428 global variable is initialized.
; The direct addressing indicates that this is
; a global variable.

mov dword_40542C, 777h
; The dword_40542C global variable is initialized.

push offset dword_40542C
; Note that this passes the offset of the dword_40542C global
; variable to the function as an argument (i.e., it is passed
; by reference). This means that the function will address the
; variable indirectly - via the pointer - in the same way as
; it addresses local variables.

push offset dword_405428
; The offset of the dword_405428 global variable
; is passed to the function.

call xchg
add esp, 8

pop ebp
retn
main endp

xchg proc near ; CODE XREF: main+21↑p

var_4 = dword ptr -4
arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch

push ebp
mov ebp, esp
; The stack frame is opened.

push ecx
; Memory is allocated for the var_4 local variable.

mov eax, [ebp+arg_0]
; The contents of the arg_0 argument are loaded into EAX.

mov ecx, [eax]
; A global variable is addressed indirectly. Now you can see
; that, in contrast to common opinion, this can happen.
; Only analysis of the code of the calling function can reveal
; that a global variable was addressed (and which one).

mov [ebp+var_4], ecx
; The *arg_0 value is copied into the var_4 local variable.

mov edx, [ebp+arg_4]
; The contents of the arg_4 argument are loaded into EDX.

mov eax, [ebp+arg_0]
; The contents of the arg_0 argument are loaded into EAX.

mov ecx, [eax]
; The *arg_0 argument is copied into ECX.

mov [edx], ecx
; The arg_0[0] value is copied into [arg_4].

mov edx, [ebp+arg_4]
; The arg_4 value is loaded into EDX.

mov eax, [ebp+var_4]
; The value of the var_4 local variable is loaded into EAX
; (stores *arg_0).

mov [edx], eax
; The *arg_0 value is loaded into *arg_4.

mov esp, ebp
pop ebp
retn
xchg endp

dword_405428 dd 0 ; DATA XREF: main+3↑w main+1C↑o
dword_40542C dd 0 ; DATA XREF: main+D↑w main+17↑o
; IDA has found all the references to both global variables.
; The first two, main+3↑w and main+D↑w, reference
; the initialization code. (The w character
; stands for "write," which refers to addressing for writing.)
; The second two are main+1C↑o and main+17↑o.
; (The o stands for "offset," which refers to obtaining an offset
; to a global variable.)

If there are references with the offset-designating suffix "o" (analogous to the offset assembler instruction) among the cross-references to global variables, these should be noted immediately. An offset means that the global variable has been passed by reference. Passing by reference signifies indirect addressing, which entails tiresome manual analysis, and no advanced tools will be helpful.

Static variables Static variables are similar to global variables, but they have a limited scope: They are accessible only from the function in which they were declared. In all other respects, static and global variables are nearly identical: Both are placed in the data segment, both are addressed directly (except when addressed by reference), and so on.

There is only one essential difference: Any function may address a global variable, but only one function may address a static one. But what about a global variable that is used by only one function? This exposes a flaw in the program's source code: If a variable is used by one function, it does not need to be declared as global.
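
The difference in scope is easy to see in source form. The following is a minimal sketch; the names g_total, next_id, and reset_total are hypothetical and chosen only for illustration. A compiler would normally place both variables in the data segment, so the disassembled code gives little hint of which one was declared static.

```c
/* Hypothetical names, chosen for illustration only. */
int g_total;                 /* global: any function may address it */

int next_id(void)
{
    static int last_id;      /* static: keeps its value between calls,
                                but is visible only inside next_id() */
    last_id++;
    g_total += last_id;
    return last_id;
}

void reset_total(void)
{
    g_total = 0;             /* legal: g_total is global */
    /* last_id = 0; */       /* would not compile: last_id is out of scope */
}
```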

A memory location addressed directly is a global variable (although there are exceptions), but not all global variables are addressed directly.

Constants and Offsets
The 80x86 microprocessor family supports three types of operands: immediate, register, and memory. An operand's type is specified in a special field of the machine instruction, called mod; therefore, it is not difficult to identify operand types.

You likely know what a register looks like. Conventionally, a pointer to a memory location is enclosed in square brackets, and an immediate operand is written without them. For example:

mov ecx, eax ; Register operand
mov ecx, 0x666 ; The left operand is register. The right operand
; is immediate.
mov [0x401020], eax ; The left operand is a pointer. The right
; operand is register.

In addition, 80x86 microprocessors support two memory-addressing modes: direct and register indirect. If the address is specified as an immediate value, the addressing is direct. If the address is taken from a register, the addressing is register indirect. For example:

mov ecx, [0x401020] ; Direct addressing mode
mov ecx, [EAX] ; Register indirect addressing mode

To initialize the register pointer, microprocessor developers introduced a special command, LEA REG, [addr], that computes the value of the addr address expression and writes it into the REG register. For example:

lea eax, [0x401020] ; The value 0x401020 is written into the EAX
; register.
mov ecx, [EAX] ; Indirect addressing: The double word
; at the offset 0x401020 is loaded into ECX.

The right operand of the LEA instruction always represents a near pointer (except when LEA is used to sum the constants). Everything would be fine, except an internal representation of the near pointer is equal to a constant of the same value. Hence, LEA EAX, [0x401020] is the equivalent of MOV EAX, 0x401020. For certain reasons, MOV has surpassed LEA in popularity, and has knocked it almost completely out of use.

The expulsion of LEA has given rise to a fundamental problem of assembling: the offset problem. Constants and offsets (near pointers) are syntactically indistinguishable. A construction such as MOV EAX, 0x401020 may load EAX either with the constant 0x401020 (an example of the corresponding C code would be a=0x401020), or with the pointer to the memory location at the offset 0x401020 (an example of the corresponding C code would be a=&x). Obviously, a=0x401020 is different from a=&x. What would happen if the x variable in the newly assembled program appears at another offset, not at 0x401020? The program would fail, because the a pointer still points to the memory location 0x401020, which now contains a different variable.

Why may a variable change its offset? There are two principal reasons. First, the assembler language is ambiguous and allows interpretation. For example, the ADD EAX, 0x66 construction may be represented by two machine instructions, 83 C0 66 and 05 66 00 00 00, of 3 and 5 bytes, respectively. The assembler may choose either encoding, which may not be the one in the initial program (before it was disassembled). If it picks an encoding of a different size, all subsequent instructions, as well as data, will float away. Second, modifying the program (really changing it, not just substituting JNZ for JZ) will inevitably cause the pointers to float away. The offset instruction may help return the program to a functioning state. If MOV EAX, 0x401020 loads a pointer into EAX, not a constant, then a label such as loc_401020 needs to be created at the offset 0x401020, and MOV EAX, 0x401020 needs to be replaced with MOV EAX, offset loc_401020. Now, the EAX pointer is not bound to the fixed offset; rather, it is bound to the label.
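
The two encodings of ADD EAX, 0x66 can be reproduced with a short sketch. The encode_add_eax function is a hypothetical helper written for this illustration, not part of any real assembler; it assumes 32-bit mode and simply emits one of the two byte sequences quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encodes ADD EAX, imm into out (at least 5 bytes).
   If use_short is nonzero and imm fits in a signed byte, the 3-byte form
   83 C0 ib is emitted; otherwise the 5-byte form 05 id is used.
   Returns the number of bytes written. */
size_t encode_add_eax(int32_t imm, int use_short, uint8_t *out)
{
    if (use_short && imm >= -128 && imm <= 127) {
        out[0] = 0x83;            /* ADD r/m32, imm8 (sign-extended) */
        out[1] = 0xC0;            /* ModR/M byte selecting EAX */
        out[2] = (uint8_t)imm;
        return 3;
    }
    out[0] = 0x05;                /* ADD EAX, imm32 */
    for (int i = 0; i < 4; i++)
        out[1 + i] = (uint8_t)((uint32_t)imm >> (8 * i));
    return 5;
}
```

Both results behave identically at run time, yet they differ by two bytes, and that difference is enough to shift everything that follows in the reassembled file.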

What happens if the offset instruction is put before a constant that has been misidentified as a pointer? The program will fail, or it will work incorrectly. Imagine that the number 0x401020 represents the volume of water in a pool that has an inlet pipe and an outlet pipe. Replacing the constant with the pointer makes the volume of the pool equal the offset of the label in the newly assembled program, and computation becomes impossible.

Thus, it is important to determine every immediate operand's type, and even more important to determine it correctly. One mistake can kill a program's operability, and a typical program contains thousands or tens of thousands of operands. Two questions arise: How are operand types determined? Is it possible to determine them automatically, or at least semiautomatically?

Determining the type of an immediate operand An immediate operand of the LEA instruction is a pointer. (However, to mislead hackers, some forms of protection use LEA to load a constant.)

Immediate operands of the MOV and PUSH instructions may be either constants or pointers. To determine the type of an immediate operand, it is necessary to analyze how its value is used in the program. If it is used for addressing memory operands, it is a pointer; otherwise, it is a constant.

Suppose that the MOV EAX, 0x401020 instruction turns up in the code of a program (Fig. 19). What is it — a constant or a pointer? The answer to this question is given by the MOV ECX, [EAX] line, which indicates that the value 0x401020 is used to address the memory indirectly. Hence, the immediate operand can only be a pointer.

Figure 19: Addressing modes
There are two types of pointers: pointers to data and pointers to a function. Pointers to data are used to extract values from memory locations. They occur in arithmetic and move instructions (such as MOV, ADD, SUB). Pointers to functions are used in indirect call instructions (CALL) and, less often, in indirect jump instructions (JMP).

Consider the following example:

Listing 137: An Example of Constants and Pointers

main ( )
{
static int a=0x777;
int *b = &a;
int c=b [0];
}

Disassembling the compiled code of this example gives the following:

Listing 138: The Disassembled Code That Illustrates Constants and Pointers

main proc near

var_8 = dword ptr -8
var_4 = dword ptr -4

push ebp
mov ebp, esp
sub esp, 8
; The stack frame is opened.

mov [ebp+var_4], 410000h
; The value 0x410000 is loaded into the var_4 local variable.
; As yet, it is not possible to say
; whether it is a constant or a pointer.

mov eax, [ebp+var_4]
; The contents of the var_4 local variable
; are loaded into the EAX register.

mov ecx, [eax]
; ECX is loaded with the contents of the memory location pointed
; to by the EAX pointer. This means that EAX is a pointer.
; Therefore, the var_4 local variable from which it was loaded
; is also a pointer, and the immediate operand 0x410000
; is a pointer, not a constant. To preserve the program's
; operability, the loc_410000 label must be created at the offset
; 0x410000. The label will convert the memory location at this
; address into a double word. In addition,
; the MOV [ebp+var_4], 410000h instruction must be replaced with
; MOV [ebp+var_4], offset loc_410000.

mov [ebp+var_8], ecx
; The value of *var_4 (offset loc_410000)
; is assigned to the var_8 local variable.

mov esp, ebp
pop ebp
; The stack frame is closed.

retn
main endp

The following example calls a procedure indirectly:

Listing 139: An Indirect Call of a Procedure

func(int a, int b)
{
return a+b;
};

main ( )
{
int (*zzz) (int a, int b) = func;

// The function is called indirectly using the zzz pointer.
zzz (0x666, 0x777);
}

The disassembled code of the compiled example looks as follows:

Listing 140: The Disassembled Code That Illustrates Indirect Procedure Calls

.text:0040100B main proc near ; CODE XREF: start+AF↓p
.text:0040100B
.text:0040100B var_4 = dword ptr -4
.text:0040100B
.text:0040100B push ebp
.text:0040100C mov ebp, esp
.text:0040100C ; The stack frame is opened.
.text:0040100C
.text:0040100E push ecx
.text:0040100E ; Memory is allocated for
.text:0040100E ; the var_4 local variable.
.text:0040100E
.text:0040100F mov [ebp+var_4], 401000h
.text:0040100F ; The value 0x401000 is assigned to
.text:0040100F ; the local variable.
.text:0040100F ; It is not yet possible to say
.text:0040100F ; whether it is a constant or an offset.
.text:0040100F
.text:00401016 push 777h
.text:00401016 ; The value 0x777 is placed onto the stack.
.text:00401016 ; Is it a constant, or a pointer?
.text:00401016 ; This cannot be determined
.text:00401016 ; before the called function is analyzed.
.text:00401016
.text:0040101B push 666h
.text:0040101B ; The immediate value 0x666 is placed
.text:0040101B ; onto the stack.
.text:0040101B
.text:00401020 call [ebp+var_4]
.text:00401020 ; The function is called indirectly.
.text:00401020 ; Hence, the var_4 variable is a pointer.
.text:00401020 ; Therefore, the immediate value
.text:00401020 ; assigned to it, 0x401000, is also a pointer.
.text:00401020 ; In addition, 0x401000 is the address
.text:00401020 ; where the called function is located.
.text:00401020 ; Let's name it MyFunc,
.text:00401020 ; and replace mov [ebp+var_4], 401000h with
.text:00401020 ; mov [ebp+var_4], offset MyFunc.
.text:00401020 ; Now, the program can be modified
.text:00401020 ; without any fear of collapse.
.text:00401020
.text:00401023 add esp, 8
.text:00401023
.text:00401026 mov esp, ebp
.text:00401028 pop ebp
.text:00401028 ; The stack frame is closed.
.text:00401028
.text:00401029 retn
.text:00401029 main endp

.text:00401000 MyFunc proc near
.text:00401000 ; Here is the indirectly called MyFunc function.
.text:00401000 ; Let's examine it to determine the type of the
.text:00401000 ; immediate values passed to it.
.text:00401000
.text:00401000 arg_0 = dword ptr 8
.text:00401000 arg_4 = dword ptr 0Ch
.text:00401000 ; Here are the arguments.
.text:00401000
.text:00401000 push ebp
.text:00401001 mov ebp, esp
.text:00401001 ; The stack frame is opened.
.text:00401001
.text:00401003 mov eax, [ebp+arg_0]
.text:00401003 ; The value of the arg_0 argument is loaded into EAX.
.text:00401003
.text:00401006 add eax, [ebp+arg_4]
.text:00401006 ; The value of the arg_4 argument is added to
.text:00401006 ; EAX (arg_0). This operation indicates that
.text:00401006 ; at least one of the two arguments is not
.text:00401006 ; a pointer; adding two pointers is senseless.
.text:00401006
.text:00401009 pop ebp
.text:00401009 ; The stack frame is closed.
.text:00401009
.text:0040100A retn
.text:0040100A ; The sum of the two arguments
.text:0040100A ; is returned into EAX. The immediate
.text:0040100A ; values 0x666 and 0x777 were used neither here
.text:0040100A ; nor in the calling function for addressing
.text:0040100A ; memory, which means that they are constants.
.text:0040100A
.text:0040100A MyFunc endp
.text:0040100A

Complex cases of addressing, or arithmetic operations over pointers C/C++ and some other programming languages allow arithmetic operations over pointers, which complicates the identification of the immediate operand type. If such operations on pointers were forbidden, the occurrence of any arithmetic instruction that handles an immediate operand would indicate a constant-type operand.

Fortunately, even in languages that allow arithmetic operations over pointers, only a limited number of such operations are carried out. For example, it makes no sense to add two pointers — and even less to multiply or divide them. Subtraction is another matter. The compiler allocates functions in memory in the order they were declared in the program; therefore, it is possible to calculate the size of a function by subtracting the pointer to the function from the pointer to the next function (Fig. 20). Such a trick sometimes is used in packers (unpackers) of executable files and protection with self-modifying code, but it is rarely used in application programs.

Figure 20: Subtracting pointers to calculate the size of a function (a data structure)

Figure 21: The main types of strings

Figure 22: A schematic representation of the nest
A pointer also may be combined with a constant. These combinations are so popular that 80x86 microprocessors have a special addressing mode for the purpose. Suppose that we have a pointer to an array and the index of a certain element of the array. To obtain the value of the element, the index, multiplied by the size of the element, must be added to the pointer. Subtraction of a constant from a pointer is used rarely: few calculations require it, and it often results in serious problems. The following technique is popular among beginners: To get an array whose index begins with one, they declare a standard array, obtain a pointer to it, and decrease the pointer by one. This appears to be an elegant solution. Nevertheless, consider what happens if the pointer to the array is equal to zero. In this situation, "the snake will bite itself by the tail", and the decremented pointer will wrap around to a very large positive number. Generally, under Windows NT/9x, an array cannot be allocated at an offset of zero. However, it is unwise to get used to tricks that work on one platform and not on others.
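
The pointer-plus-scaled-index computation can be written out explicitly. This is a sketch only; element_addr and read_int32 are hypothetical helpers that spell out what a compiler does implicitly for a[i].

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: address of element `index` in an array that starts
   at `base` and holds elements of `elem_size` bytes each.
   This is the pointer + index * size computation described above. */
const void *element_addr(const void *base, size_t index, size_t elem_size)
{
    return (const unsigned char *)base + index * elem_size;
}

/* Reads a[index] of an int32_t array through the computed address. */
int32_t read_int32(const int32_t *a, size_t index)
{
    return *(const int32_t *)element_addr(a, index, sizeof a[0]);
}
```

On 80x86, the whole computation often fits into a single memory operand such as [EBX+ECX*4], which is why the pointer and the index tend to appear together inside one address expression.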

"Normal" programming languages forbid the mixing of different types. Such mixing can result in a mishmash and a fundamental problem of disassembling — determining types in combined expressions. Consider the following example:

mov eax, 0x...
mov ebx, 0x...
add eax, ebx
mov ecx, [eax]

It seems to be a two-headed camel! The sum of two immediate values is used for indirect addressing. It is logical to assume that the two values cannot both be pointers. One of the immediate values must be a pointer to an array (a data structure or an object); the other one must be an index to this array. To preserve the program's operability, the pointer must be replaced with the offset to the label, and the index must be left unchanged because it is of a constant type.

How can the pointer be distinguished from the index? Unfortunately, there is no universal answer; it is impossible in the context of the above example.

Instead, consider the following example:

Listing 141: Determining Types in Combined Expressions

MyFunc (char *a, int i)
{
a[i]='\n';
a[i+1]=0;
}
main ( )
{
static char buff [ ] ="Hello, Sailor!";
MyFunc (&buff[0], 5);
}

The disassembled code of this example, compiled using Microsoft Visual C++, gives the following:

Listing 142: The Disassembled Code for Determining Types in Combined Expressions Compiled Using Visual C++

main proc near ; CODE XREF: start+AF↓p
push ebp
mov ebp, esp
; The stack frame is opened.

push 5
; The immediate value 0x5 is passed to MyFunc.

push 405030h
; The immediate value 0x405030 is passed to MyFunc.

call MyFunc
add esp, 8

; The MyFunc (0x405030, 0x5) function is called.

pop ebp
; The stack frame is closed.

retn
main endp

MyFunc proc near ; CODE XREF: main+A↑p

arg_0 = dword ptr 8
arg_4 = dword ptr 0Ch
push ebp
mov ebp, esp
; The stack frame is opened.

mov eax, [ebp+arg_0]
; The value of the arg_0 argument is loaded into EAX.
; (The arg_0 argument contains the immediate value 0x405030.)

add eax, [ebp+arg_4]
; The value of the arg_4 argument is added to EAX.
; (The arg_4 argument contains the value 0x5.)
; This operation indicates that one argument is a constant,
; and the other is either a constant or a pointer.

mov byte ptr [eax], 0Ah
; The sum of immediate values is used to indirectly address
; the memory, meaning that this is a case of a constant and a pointer.
; But which is which? To answer this question, it is necessary
; to understand the sense of the program code: What did
; the programmer want to achieve by adding pointers?
; Assume that the value 0x5 is a pointer. Is this logical?
; Not quite; if this is a pointer, then where does it point?
; The first 64 KB of the address space of Windows NT
; are reserved for "catching" uninitialized and null pointers.
; It is clear that a pointer cannot be equal to five in any case,
; unless the programmer has used some cunning trick.
; And if 0x405030 is a pointer? It looks like a fair
; and legal offset. But what is located there?
; 00405030 db 'Hello, Sailor!',0
;
; Now everything matches - a pointer to the "Hello, Sailor!" string
; (value 0x405030) and the index of a character of this string
; (value 0x5) are passed to the function; the function has added
; the index to the pointer, and has written the \n character into
; the memory location thus obtained.

mov ecx, [ebp+arg_0]
; The value of the arg_0 argument is placed into ECX
; (a pointer, as was established previously).

add ecx, [ebp+arg_4]
; The arg_0 and arg_4 arguments are added.
; (The arg_4 argument is an index, as was established previously.)

mov byte ptr [ecx+1], 0
; The sum stored in ECX is used to indirectly address the memory
; (or, to be more exact, for the indirect-based addressing,
; because 1 is added to the sum of the pointer and index, and 0 is
; placed into this memory location). As suspected, the pointer to
; the string and the index of the first character of the string
; being cut off are passed to the function. Therefore, to
; preserve the program's operability, a loc_s0 label needs to be
; created at the offset 0x405030. In addition, PUSH 0x405030 must
; be replaced in the calling function with PUSH offset loc_s0.
pop ebp
retn
MyFunc endp

Now let's compile the same example in Borland C++ 5.0, and see the difference compared to the code obtained from Microsoft Visual C++. To save space, the code of only one function, MyFunc, is presented; the main function is almost identical to the one in the previous listing.

Listing 143: The Disassembled Code for Determining Types in Combined Expressions Compiled Using Borland C++ 5.0

MyFunc proc near ; CODE XREF: _main+D↑p
push ebp
; The empty stack frame is opened; there are no local variables.

mov byte ptr [eax+edx], 0Ah
; Borland C++ has immediately summed the pointer and the constant
; right in the address expression! Which register stores
; the constant, and which stores the pointer? As in
; the previous listing, this needs an analysis of their values.
mov byte ptr [eax+edx+1], 0
mov ebp, esp
pop ebp
; The stack frame is closed.

retn
MyFunc endp

The order of indexes and pointers A little secret: When summing a pointer with a constant, most compilers put the pointer in the first position and the constant in the second, regardless of their order in the program. In other words, the expressions a [i], (a+i) [0], * (a+i), and * (i+a) are compiled into the same code. Even if (0) [i+a] is used, the compiler will put a in the first place. Why? The answer is ridiculously simple: The addition of a pointer to a constant gives a pointer. Therefore, the result of the computation is always written into a pointer-type variable.
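
The equivalence of these spellings is guaranteed by the C language itself and can be checked directly. In the sketch below, the function names are hypothetical; each function uses one of the spellings listed above.

```c
/* All five spellings denote the same element: pointer a plus index i. */
char spell_index(const char *a, int i)    { return a[i]; }
char spell_shifted(const char *a, int i)  { return (a + i)[0]; }
char spell_deref(const char *a, int i)    { return *(a + i); }
char spell_swapped(const char *a, int i)  { return *(i + a); }
char spell_reversed(const char *a, int i) { return (0)[i + a]; }
```

Whether a particular compiler emits byte-for-byte identical code for all five is an assumption worth verifying on the compiler at hand, but the value returned is the same in every case.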

Let's return to the previous listing and apply this new rule in the analysis.

Listing 144: The Result of Adding the Constant to the Pointer Is Written into the Pointer-Type Variable

mov eax, [ebp+arg_0]
; The value of the arg_0 argument is loaded into EAX.
; (The arg_0 argument contains the immediate value 0x405030.)

add eax, [ebp+arg_4]
; The value of the arg_4 argument (containing
; the value 0x5) is added to EAX. This operation
; indicates that one argument is a constant,
; while the other is either a constant or a pointer.

mov byte ptr [eax], 0Ah
; The sum of immediate values is used to address
; the memory indirectly, hence one of them is a constant and
; the other is a pointer. But which is which? arg_0 is most likely
; the pointer because it is positioned in the first place,
; and arg_4 is likely an index because it comes second.

Using LEA to sum constants The LEA instruction is widely used by compilers not only to initialize pointers, but also to sum constants. Because the internal representation of constants and pointers is identical, the result of adding two pointers is the same as the sum of the constants that match them (i.e., LEA EBX, [EBX+0x666] == ADD EBX, 0x666). However, LEA is considerably more capable than ADD. Consider LEA ESI, [EAX*4+EBP-0x20]. Try to feed the same to the ADD instruction.
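
What LEA computes can be modeled as plain integer arithmetic. The lea32 function below is a hypothetical model written for this illustration; it assumes 32-bit mode and ignores everything except the address calculation itself.

```c
#include <stdint.h>

/* Hypothetical model of LEA's address arithmetic in 32-bit mode:
   result = base + index * scale + disp, truncated to 32 bits.
   On real hardware, scale is limited to 1, 2, 4, or 8.
   No memory is read; only the address expression is evaluated. */
uint32_t lea32(uint32_t base, uint32_t index, uint32_t scale, int32_t disp)
{
    return base + index * scale + (uint32_t)disp;
}
```

With EAX equal to 3 and EBP equal to 0x1000, the model of LEA ESI, [EAX*4+EBP-0x20] is lea32(0x1000, 3, 4, -0x20), which yields 0xFEC. Nothing in that result says whether it is an address or an ordinary number.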

After you encounter the LEA instruction in the code of a program, do not hurry to stick the tag "pointer" on its result; the value may be a constant. If the "suspect" is never used in indirect-addressing expressions, it is not a pointer; rather, it is a true constant.

Identifying the constants and pointers "visually" Here are some hints that may help you distinguish pointers from constants:

In 32-bit Windows programs, pointers can accept a limited range of values. The region of address space accessible to processes begins with the offset 0x10000 and stretches to the offset 0x80000000; in Windows 9x/ME, the accessible space is even smaller: from 0x400000 to 0x80000000. Therefore, all immediate values smaller than 0x10000 and larger than 0x80000000 represent constants, rather than pointers. There is one exception: the number 0, which designates the null pointer[i].

If an immediate value looks like a pointer, check where it points. If a function prolog or a meaningful text string is located at this offset, it is likely that this is a pointer, although this may be coincidence only.

Look at the table of relocatable elements. (See "Step Four: Getting Acquainted with the Debugger.") If the address of the "suspected" immediate value is present in the table, it is a pointer. However, most executable files are not relocatable. Such an approach can be used only to examine DLLs, since these are relocatable by definition.

Incidentally, the IDA Pro disassembler uses all three methods just described to identify the pointers automatically.
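
The first of these hints, the range check, is simple enough to sketch in code. The may_be_pointer function and the constant names are hypothetical; the bounds are the ones quoted above for 32-bit Windows and should be treated as assumptions on any other platform.

```c
#include <stdbool.h>
#include <stdint.h>

/* Address bounds quoted in the text for 32-bit Windows user space. */
#define NT_MIN_ADDR   0x00010000u
#define W9X_MIN_ADDR  0x00400000u
#define MAX_ADDR      0x80000000u

/* Hypothetical first-pass filter: returns false for values that cannot be
   pointers under the limits above, and true for values that still have to
   be checked by examining how the code uses them. */
bool may_be_pointer(uint32_t value, bool win9x)
{
    uint32_t lo = win9x ? W9X_MIN_ADDR : NT_MIN_ADDR;

    if (value == 0)
        return true;              /* the null pointer is the one exception */
    return value >= lo && value <= MAX_ADDR;
}
```

A result of true proves nothing by itself; it only means that the value has survived the cheapest test and deserves a closer look.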

Literals and Strings
At first glance, identifying strings would seem to present few difficulties: If the object referred to by a pointer (see "Constants and Offsets") looks like a string, it certainly is a string. Moreover, in most cases, strings are revealed and identified simply by looking through the dump of a program (if it is not encrypted; encryption is a theme for a separate discussion). This is all true, but there are some complications.

The first task is automatic detection of strings in the program — megabyte-size dumps cannot be examined manually. There is a set of algorithms for identifying strings. The simplest, although not the most reliable, is based on the following two ideas:

The string consists of a limited set of characters. As a rough approximation, the characters are digits and letters of the alphabet (including blanks), punctuation marks, and control characters, such as tabulation or carriage-return characters.

The string should consist of at least several characters.

Let's agree that if the minimal length of the string is N bytes, it is enough to find all sequences of N or more valid string characters. If N is small (about 3 or 4 bytes, for example), the search will generate plenty of false hits. If N is large (about 6 or 8 bytes, for example), the number of false hits will be close to zero and can be ignored, but all short strings (such as "OK", "YES", or "NO") will not be recognized. In addition to digits and letters, strings may contain pseudo-graphic elements (an especially frequent feature in console applications), faces, arrows, marks — almost everything that the ASCII table contains. Is there, therefore, any difference between a string and a random sequence of bytes? Frequency analysis is useless here; for normal work, it needs at least 100 bytes of text, not strings of just two or three characters. The problem can be approached from the other side as well: If a string is present in a program, there must be a reference to it. It is possible to search among immediate values for the pointer to the recognized string. If it is found, then the chances that it is a string, and not just a random sequence of bytes, increase sharply.
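The simplest algorithm described above can be sketched in C as follows. This is only an illustration under stated assumptions: the names is_string_char and find_strings are hypothetical, and the set of valid characters is narrowed to printable ASCII plus tabulation, carriage return, and line feed (pseudo-graphics and national characters are deliberately left out).

```c
#include <stddef.h>

/* Hypothetical helper: nonzero if c may appear in a text string.
   Assumption: printable ASCII plus tab, CR, and LF only. */
static int is_string_char(unsigned char c)
{
    return (c >= 0x20 && c <= 0x7E) || c == '\t' || c == '\r' || c == '\n';
}

/* Scans buf and records every run of at least min_len valid characters.
   Up to max_hits (offset, length) pairs are stored; the return value is
   the total number of runs found, which may exceed max_hits. */
static size_t find_strings(const unsigned char *buf, size_t size,
                           size_t min_len,
                           size_t *offsets, size_t *lengths, size_t max_hits)
{
    size_t hits = 0, run = 0;

    if (min_len == 0)
        min_len = 1;                 /* an empty run is never a string */

    for (size_t i = 0; i <= size; i++) {
        if (i < size && is_string_char(buf[i])) {
            run++;                   /* the current run continues */
            continue;
        }
        /* The run has ended (invalid byte or end of buffer). */
        if (run >= min_len) {
            if (hits < max_hits) {
                offsets[hits] = i - run;
                lengths[hits] = run;
            }
            hits++;
        }
        run = 0;
    }
    return hits;
}
```

Calling find_strings with min_len set to 4 on a dump that contains "Hello, Sailor!" and "OK" reports only the first string; lowering min_len to 2 finds both, at the price of more false hits on real data.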

However, it is not quite that easy. Consider the following example:

Listing 145: A Text String within a Program

BEGIN
WriteLn ('Hello, Sailor!');
END.

Compile this example using any suitable Pascal compiler (Delphi or Free Pascal, for example). After loading the compiled file into the disassembler, walk through the data segment. Soon, the following will appear:

Listing 146: The Contents of the Data Segment of the Compiled Example

.data:00404040 unk_404040 db 0Eh ;
.data:00404041 db 48h ; H
.data:00404042 db 65h ; e
.data:00404043 db 6Ch ; l
.data:00404044 db 6Ch ; l
.data:00404045 db 6Fh ; o
.data:00404046 db 2Ch ; ,
.data:00404047 db 20h ;
.data:00404048 db 53h ; S
.data:00404049 db 61h ; a
.data:0040404A db 69h ; i
.data:0040404B db 6Ch ; l
.data:0040404C db 6Fh ; o
.data:0040404D db 72h ; r
.data:0040404E db 21h ; !
.data:0040404F db 0 ;
.data:00404050 word_404050 dw 1332h

This is the sought string, and there is no doubt that it is a string. Now, let's try to work out how it was referred to. In IDA Pro, this is done by using the ALT+I key combination and entering the offset of the beginning of the string — 0x404041 — into the search field.

"Search Failed?" How can that be? What is passed to the WriteLn function in that case? Has IDA become faulty? Looking through the disassembled code also fails to return a result.

It fails because in Pascal, there is a byte at the beginning of strings that contains the length of the string. The value 0xE (14 in the decimal system) is contained in the dump at the offset 0x404040. And how many characters are there in the string "Hello, Sailor!"? Fourteen. Pressing the ALT+I combination again and searching for the immediate operand equal to 0x404040 returns the following:

Listing 147: The Result of Searching for the Immediate Operand

.text:00401033 push 404040h
.text:00401038 push [ebp+var_4]
.text:0040103B push 0
.text:0040103D call FPC_WRITE_TEXT_SHORTSTR
.text:00401042 push [ebp+var_4]
.text:00401045 call FPC_WRITELN_END
.text:0040104A push offset loc_40102A
.text:0040104F call FPC_IOCHECK
.text:00401054 call FPC_DO_EXIT
.text:00401059 leave
.text:0040105A retn

Identifying a string appears to be insufficient; in addition, at least its boundaries must be determined.
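A rough way to look for the boundaries of a detected run of characters is to inspect the bytes around it. The sketch below is a heuristic only, not a reliable classifier; the names string_guess and guess_string_kind are hypothetical. It assumes that a run of valid characters has already been found at offset start with length len, and it also covers the case in which the length byte is itself a valid character and was absorbed into the run.

```c
#include <stddef.h>

enum string_guess { GUESS_UNKNOWN, GUESS_ASCIIZ, GUESS_SHORT_PASCAL };

/* Guesses the kind of string occupying buf[start .. start+len-1].
   The result is only a hint: padding zeros and coincidences can mislead it. */
static enum string_guess guess_string_kind(const unsigned char *buf, size_t size,
                                           size_t start, size_t len)
{
    if (len == 0 || start > size || len > size - start)
        return GUESS_UNKNOWN;

    /* The byte just before the run equals the run length:
       probably a one-byte Pascal length field. */
    if (start > 0 && len <= 255 && buf[start - 1] == len)
        return GUESS_SHORT_PASCAL;

    /* The first byte of the run equals the number of bytes after it:
       the length byte looked like a character and joined the run. */
    if (len - 1 <= 255 && buf[start] == len - 1)
        return GUESS_SHORT_PASCAL;

    /* A zero byte immediately follows the run: possibly ASCIIZ. */
    if (start + len < size && buf[start + len] == 0)
        return GUESS_ASCIIZ;

    return GUESS_UNKNOWN;
}
```

The length checks come first because a trailing zero alone proves little: it may be alignment padding rather than a terminator.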

The following types of strings are most popular: C strings, ending in zero; DOS strings, ending in $; and Pascal strings, beginning with a one-, two-, or four-byte field that contains the string length. Let's consider each of these types in more detail.

C strings Also called ASCIIZ strings (Z means Zero at the end), C strings are widely used in operating systems of the Windows and Unix families. The character "\0" (not to be confused with "0") has a special task: it is interpreted as a string terminator. The length of ASCIIZ strings is limited only by the size of the address space allocated for the process, or by the size of the segment. Accordingly, the maximum size of an ASCIIZ string is only a little less than 2 GB in Windows NT/9x, and it is about 64K in Windows 3.1 and MS-DOS. The ASCIIZ string is only 1 byte longer than the initial ASCII string.

Despite these advantages, ASCIIZ strings have certain drawbacks. First, an ASCIIZ string cannot contain zero bytes; therefore, it is not suitable for processing binary data. Second, performing copying, comparison, and concatenation over C strings incurs significant overhead. Working with single bytes is not the best option for modern processors; it is better for them to deal with double words. Unfortunately, the length of ASCIIZ strings is unknown beforehand; it must be computed "on the fly", checking each byte to see whether or not it is a string terminator.

However, certain compilers use a trick: They terminate the string with seven zeros, making it possible to work with double words, thus increasing the speed noticeably. Initially, it seems strange to add seven trailing zeros rather than four, as a double word contains 4 bytes. However, if the last significant character of the string falls on the first byte of the double word, its end will be taken up with three 0 bytes, but the double word will not equal zero because of the presence of the first character. Therefore, the following double word should be given four more 0 bytes, in which case it certainly will equal zero. However, seven auxiliary bytes for each string is too much.
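The reasoning behind the seven trailing zeros can be checked with a small C sketch. The function name padded_strlen is hypothetical, and the code assumes that the caller guarantees at least seven zero bytes after the last character; it is not how any particular compiler's library is written.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Computes the length of a string that is followed by at least seven
   zero bytes, examining one double word at a time.
   Assumption: the seven-zero padding is guaranteed by the caller. */
static size_t padded_strlen(const char *s)
{
    size_t n = 0;
    uint32_t dw;

    for (;;) {
        memcpy(&dw, s + n, sizeof dw);  /* read one double word */
        if (dw == 0)
            break;                      /* four zero bytes: past the end */
        n += 4;
    }

    /* n is the length rounded up to a multiple of four;
       step back over the padding zeros that were counted. */
    while (n > 0 && s[n - 1] == '\0')
        n--;
    return n;
}
```

With only four trailing zeros, a five-character string would occupy bytes 0 through 4, the second double word would hold one character and three zeros, and the third double word would lie outside the guaranteed padding. Seven zeros ensure that an all-zero double word is always reached.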

DOS strings In MS-DOS, the function that outputs lines reads the "$" character as the end-of-line character, which is why programmers call them DOS strings. The term is not absolutely correct: all other MS-DOS functions work exclusively with ASCIIZ strings. This strange terminator character was chosen when there was no graphic interface in sight, and a console terminal was considered a rather advanced system for interaction with the user. <Enter> could not be used to end the line, since it was sometimes necessary to enter several lines into the program at once. Combinations like <Ctrl>+<Z> or <Alt>+<000> were also unsuitable since many keyboards at that time did not contain the <Ctrl> and <Alt> keys. Computers were mainly used to solve engineering tasks, not accounting ones, and the dollar sign was the least-used character. Therefore, it was used to signal that the user had finished entering the line; in other words, it served as a string terminator. (Yes, the string terminator was entered by the user; it was not added by the program, as is the case with ASCIIZ strings.) Now, DOS strings are encountered very rarely.

Pascal strings Pascal strings have no terminator character; instead, they are preceded by a special field containing the string length. The advantages of this approach are the possibility of storing any characters in the string (including 0 bytes), and the high speed of processing the string variables. Instead of constantly checking each byte to find a terminator, memory is addressed only once — when the string length is read. If the string length is known, then it is possible to work with double words that are the native data type for 32-bit processors, not with single bytes. The only question is how many bytes to allocate for the size field. Allocating only 1 byte is economical, but the maximum length of the string will be limited to 255 characters, an insufficient amount in many cases. This type of string is used by practically all Pascal compilers (Borland Turbo Pascal and Free Pascal, for example); therefore, such strings are called Pascal strings, or, more exactly, short Pascal strings.

Delphi strings Realizing the absurdity of restricting the length of Pascal strings to 255 characters, the Delphi developers expanded the size field to 2 bytes, thus increasing the greatest possible length to 65,535 characters. Although such strings are supported by other compilers (Free Pascal, for example), they are traditionally called Delphi strings or two-byte Pascal strings.

The restriction to more than 60K can hardly be called a restriction. Most strings are much shorter, and the heap (dynamic memory), as well as a number of specialized functions, can be used to process large data files (text files, for example). The overhead (two auxiliary bytes for each string variable) is not substantial enough to be taken into account. Therefore, Delphi strings, which combine the best features of C and Pascal strings (practically unlimited length and high processing speed, respectively), seem to be the most convenient and practical type.

Wide Pascal strings Wide Pascal strings have as many as 4 bytes for the size field, thus "limiting" the length to 4,294,967,295 characters, or 4 GB, even more than the amount of memory that Windows NT/9x allocates for "personal use" by an application process. However, this luxury comes at a high price, as each string has four extra bytes, three of which will remain empty in most cases. The overhead incurred by using Wide Pascal strings becomes rather substantial; therefore, this type is rarely used.

Combined types Certain compilers use a combined C-Pascal type. On one hand, the combined C-Pascal type allows you to process strings at a high speed and store any characters in such strings. On the other hand, it provides compatibility with a huge quantity of C libraries that work with ASCIIZ strings. Each combined string is forcefully terminated with zero, but this zero does not appear in the string. Regular libraries (operators) of the language work with it, as with a Pascal string. When calling functions of C libraries, the compiler passes a pointer to the first character of the string, not to its true beginning.
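A combined string can be pictured with the following C sketch. The layout is an assumption made for illustration (a one-byte size field followed by the characters and an uncounted zero); real compilers may use a wider size field or additional bookkeeping, and the names combined_string, combined_assign, and combined_c_str are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical combined C-Pascal string with a one-byte size field. */
struct combined_string {
    unsigned char length;     /* Pascal-style size field               */
    char          text[256];  /* characters plus an uncounted zero     */
};

/* Stores n bytes of src (which may include zero bytes) and appends
   a terminating zero that is not reflected in the size field. */
static void combined_assign(struct combined_string *dst,
                            const char *src, size_t n)
{
    if (n > 255)
        n = 255;                        /* limit of a one-byte size field */
    dst->length = (unsigned char)n;
    memcpy(dst->text, src, n);
    dst->text[n] = '\0';
}

/* What is passed to C library functions: a pointer to the first
   character, not to the true beginning of the string. */
static const char *combined_c_str(const struct combined_string *s)
{
    return s->text;
}
```

In a disassembly, this shows up as two different offsets referring to the same string: native routines receive the address of the size field, while calls to C library functions receive the address of the first character.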

Determining string types It is rather difficult to determine the type of a string by its appearance. The presence of zero terminators at the end of a string is not a sufficient reason to label it an ASCIIZ string: Pascal compilers often add one or several zeroes to the end of a string to align data on boundaries that are multiples of power-of-2 values.

The string type can be determined roughly by the type of compiler (C or Pascal), and precisely by the processing algorithm (i.e., by an analysis of the code that handles it). Consider the following example.

Listing 148: Identifying Strings

VAR
s0, s1 : String;

BEGIN
s0 :='Hello, Sailor!';
s1 :='Hello, World!';
IF s0=s1 THEN WriteLN('OK') ELSE Writeln('Woozl');
END.

After compiling this using Free Pascal, look in the data segment, where the following line can be found:

.data:00404050 aHelloWorld db 0Dh, 'Hello, World!',0 ; DATA XREF:_main+2B↑o

Isn't it reminiscent of an ASCIIZ string? Even if the compiler has yet to be identified, no one would think that 0xD is the length field rather than the carriage-return character. To test the hypothesis concerning type, proceed according to the cross-reference found by IDA Pro, or find the immediate operand 0x404050 (the offset of the string) in the disassembled code manually.

push offset _S1 ; A pointer is passed
; to the string destination.
push offset aHelloWorld ; "\rHello, World!" A pointer is passed
; to the string source.
push 0FFh ; This is the maximum length
; of the string.
call FPC_SHORTSTR_COPY

The pointer to the string was passed to the FPC_SHORTSTR_COPY function. From the documentation supplied with Free Pascal, it is clear that this function works with short Pascal strings. Therefore, the 0xD byte is not a carriage-return character, but the string length. How would it be possible to discover this without the Free Pascal documentation? It is hardly possible to get documentation for every compiler. Incidentally, the regular delivery of IDA Pro, including version 4.17, does not contain the signatures of FPP libraries, which have to be created manually.

When the string function is unidentified or does not have a description, the only way out is to investigate the code to find its operation algorithm. This is shown in the following example.

Listing 149: The Code of the FPC_SHORTSTR_COPY Function

FPC_SHORTSTR_COPY proc near ; CODE XREF: sub_401018+21↑p

arg_0 = dword ptr 8 ; Maximum length of the string
arg_4 = dword ptr 0Ch ; A source string
arg_8 = dword ptr 10h ; A destination string

push ebp
mov ebp, esp
; The stack frame is opened.

push eax
push ecx
; Registers are saved.

cld
; The direction flag is reset (i.e.,
; the LODS, STOS, and MOVS instructions are forced to increment
; the register pointer).

mov edi, [ebp+arg_8]
; The value of the arg_8 argument is loaded into the EDI register
; (the offset of the destination buffer).

mov esi, [ebp+arg_4]
; The value of the arg_4 argument is loaded into the ESI register
; (the offset of the source string).

xor eax, eax
; The EAX register is forced to be zero.

mov ecx, [ebp+arg_0]
; The value of the arg_0 argument is loaded into ECX
; (the maximum allowable length).

lodsb
; The first byte of the source string, pointed to by ESI,
; is loaded into AL, and ESI is incremented by one.
cmp eax, ecx
; The first byte of the string is compared with the maximum string
; length. It is already clear that the first character of the
; string is the length. However, let's pretend the purpose
; of the arg_0 argument was unclear, and continue the analysis.

jbe short loc_401168
; if (ESI[0] <= arg_0) goto loc_401168

mov eax, ecx
; The ECX value is copied to EAX.
loc_401168: ; CODE XREF: sub_401150+14↑j
stosb
; The first byte of the source string is written into the
; destination buffer, and EDI is incremented by one.

cmp eax, 7
; The string length is compared with the 0x7 constant.

jl short loc_401183
; Is the string length less than 7 bytes?
; Then it is being copied byte by byte!

mov ecx, edi
; ECX is loaded with the pointer to the destination buffer,
; which was incremented by one. (It was incremented
; by the STOSB instruction when a byte was written.)

neg ecx
; ECX is negated (two's complement); for example, NEG(0xFFFFFFFF) = 1.
; ECX := 1

and ecx, 3
; The two least significant bits are left in ECX,
; and the others are reset. ECX := 1

sub eax, ecx
; The "castrated" ECX is subtracted from EAX (which contains
; the first byte of the string).
repe movsb
; ECX bytes are copied from the source string
; into the destination buffer. In this case, 1 byte is copied.

mov ecx, eax
; Now, ECX contains the value of the first byte of the string,
; which is decremented by one.

and eax, 3
; The two least significant bits are left in EAX.
; The others are reset.

shr ecx, 2
; Using the logical right-shift instruction,
; ECX is divided by four (2 to the second power is 4).

repe movsd
; ECX double words are copied from ESI to EDI.
; It becomes clear that ECX contains the string length.
; Since the value of the first byte of the string is loaded
; into ECX, it is possible to state confidently
; that the first byte of the string (just
; the byte, not the word) contains the length of this string.
; Therefore, it is a short Pascal string.

loc_401183: ; CODE XREF: sub_401150+1C↑j
mov ecx, eax
; If the string length is less than 7 bytes, then EAX contains
; the string length for its byte-by-byte copying (see the branch
; jl short loc_401183). Otherwise, EAX contains the remainder of
; the string's "tail," which could not fill the last double word
; with itself. In one way or another, ECX is loaded
; with the number of bytes to be copied.

repe movsb
; ECX bytes are copied from ESI to EDI.

pop ecx
pop eax
; The registers are restored.

leave
; The stack frame is closed.

retn 0Ch
FPC_SHORTSTR_COPY endp
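The algorithm recovered from the listing above can be restated in C. This is a reconstruction made for illustration, not the actual Free Pascal source; the name shortstr_copy and the parameter order are assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies a short Pascal string (length byte followed by characters),
   truncating it to max_len characters if necessary. */
static void shortstr_copy(unsigned char *dst, const unsigned char *src,
                          unsigned max_len)
{
    unsigned len = *src++;              /* lodsb: read the length byte     */
    if (len > max_len)                  /* cmp eax, ecx / jbe / mov        */
        len = max_len;
    *dst++ = (unsigned char)len;        /* stosb: store the length byte    */

    if (len >= 7) {                     /* cmp eax, 7 / jl                 */
        /* Bytes needed to align the destination on a 4-byte boundary:
           mov ecx, edi / neg ecx / and ecx, 3 */
        unsigned head = (unsigned)(-(uintptr_t)dst & 3u);
        len -= head;                    /* sub eax, ecx                    */
        memcpy(dst, src, head);         /* rep movsb                       */
        dst += head;
        src += head;

        unsigned dwords = len >> 2;     /* shr ecx, 2                      */
        memcpy(dst, src, dwords * 4u);  /* rep movsd                       */
        dst += dwords * 4u;
        src += dwords * 4u;

        len &= 3u;                      /* and eax, 3: the tail            */
    }
    memcpy(dst, src, len);              /* final rep movsb                 */
}
```

The decisive observation is the same as in the assembly code: the very first byte read from the source is used as a counter for everything that follows, so it can only be a length field.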

The next example will be helpful in the identification of C strings.

Listing 150: Identifying C Strings

#include <stdio.h>
#include <string.h>

main()
{
char s0[]="Hello, World!";
char s1[]="Hello, Sailor!";
if (strcmp(&s0[0], &s1[0])) printf("Woozl\n"); else printf("OK\n");
}

Compile this example using any suitable C compiler, such as Borland C++ 5.0. (Microsoft Visual C++ is not suitable in this case.) Then, look for the strings in the data segment. It should not take long to find them.

DATA:00407074 aHelloWorld db 'Hello, World!',0 ; DATA XREF:_main+16↑o
DATA:00407082 aHelloSailor db 'Hello, Sailor!',0 ; DATA XREF:_main+22↑o
DATA:00407091 aWoozl db 'Woozl',0Ah, 0 ; DATA XREF:_main+4F↑o
DATA:00407098 aOk db 'OK',0Ah,0 ; DATA XREF: _main+5C↑o

Note that the strings follow one another, each ends in a "0" character, and the value of the first byte of the string does not match its length. These are indubitably ASCIIZ strings. However, analyzing the code that handles them will not hurt.

Listing 151: An Analysis of the Code That Handles ASCIIZ Strings

_main proc near ; DATA XREF: DATA:00407044↓o
var_20 = byte ptr -20h
var_10 = byte ptr -10h
push ebp
mov ebp, esp
; The stack frame is opened.

add esp, 0FFFFFFE0h
; Space is allocated for local variables.

mov ecx, 3
; The value 0x3 is placed into the ECX register.

lea eax, [ebp+var_10]
; EAX is loaded with the pointer to the var_10 local buffer.

lea edx, [ebp+var_20]
; EDX is loaded with the pointer to the var_20 local buffer.

push esi
; The ESI register is saved -
; not passed to the function,
; since ESI has not been initialized yet!

push edi
; The EDI register is saved.

lea edi, [ebp+var_10]
; EDI is loaded with the pointer to the var_10 local buffer.

mov esi, offset aHelloWorld ; "Hello, World!"
; IDA has recognized the immediate operand as an offset
; of the "Hello, World!" string. If it had not, it would be
; possible to do it manually, given that
; the immediate operand coincides with the offset of the string,
; and that the next instruction uses ESI
; to address memory indirectly.
; Hence, a pointer is loaded into ESI.

repe movsd
; ECX double words are copied from ESI to EDI.
; What does ECX equal? It equals 0x3.
; To convert double words into bytes, multiply 0x3 by 0x4.
; This obtains 0xC, which is one byte shorter than the copied
; "Hello, World!" string, pointed to by ESI.

movsw
; The last byte of the "Hello, World!" string
; is copied, with the terminating zero.

lea edi, [ebp+var_20]
; EDI is loaded with the pointer to the var_20 local buffer.

mov esi, offset aHelloSailor ; "Hello, Sailor!"

; The ESI register is loaded with the pointer to the
; "Hello, Sailor!" string.

mov ecx, 3
; ECX is loaded with the number of complete double words contained
; in the "Hello, Sailor!" string.

repe movsd
; The 0x3 double words are copied.

movsw
; A word is copied.

movsb
; The last byte is copied.
; The inlined code for comparing the strings follows.

loc_4010AD: ; CODE XREF: _main+4B↓j
mov cl, [eax]
; The contents of the next byte of the
; "Hello, World!" string are loaded.

cmp cl, [edx]
; Is CL equal to the contents of the next byte of the
; "Hello, Sailor!" string?

jnz short loc_4010C9
; If the characters of both strings do not match, jump
; to the loc_4010C9 label.

test cl, cl
jz short loc_4010D8
; Is the CL register equal to zero? (In other words, has
; the "0" character been seen in the string?).
; If so, jump to loc_4010D8.
; Now the string type can be determined.
; The first byte of the string contains the first character
; of the string, not the string length. In addition, each byte of
; the string is checked for being a "0" character. Hence, these are
; ASCIIZ strings!

mov cl, [eax+1]
; The next character of the "Hello, World!" string is loaded into CL.

cmp cl, [edx+1]
; It is compared with the next character of "Hello, Sailor!".

jnz short loc_4010C9
; If the characters do not match, the comparison finishes.

add eax, 2
; The pointer of the "Hello, World!" string is moved ahead
; by two characters.

add edx, 2
; The pointer of the "Hello, Sailor!" string is moved ahead
; by two characters.
test cl, cl
jnz short loc_4010AD
; Repeat matching until the terminating character of the string
; is reached.

loc_4010C9: ; CODE XREF: _main+35↑j _main+41↓j
jz short loc_4010D8
; See the "Conditional IF-THEN-ELSE Statements" section.

; Outputting the string "Woozl"
push offset aWoozl ; format
call _printf
pop ecx
jmp short loc_4010E3

loc_4010D8: ; CODE XREF: _main+39↑j _main+4D↓j
; Outputting the string "OK"
push offset aOk ; format
call _printf
pop ecx

loc_4010E3: ; CODE XREF: _main+5A↑j
xor eax, eax
; The function returns zero.

pop edi
pop esi
; The registers are restored.

mov esp, ebp
pop ebp
; The stack frame is closed.

retn
_main endp
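The comparison loop that the compiler inlined in place of the strcmp call can be restated in C. This is a reconstruction of the loop's logic only; the function name inline_strcmp_differs is hypothetical, and the result is reduced to "differs or not", which is all the listing uses.

```c
/* Returns 1 if the strings differ and 0 if they are equal.
   The loop is unrolled to handle two characters per iteration,
   mirroring the code at loc_4010AD. */
static int inline_strcmp_differs(const char *a, const char *b)
{
    char c;

    do {
        c = a[0];                 /* mov cl, [eax]                     */
        if (c != b[0])            /* cmp cl, [edx] / jnz               */
            return 1;
        if (c == '\0')            /* test cl, cl / jz: both ended      */
            return 0;

        c = a[1];                 /* mov cl, [eax+1]                   */
        if (c != b[1])            /* cmp cl, [edx+1] / jnz             */
            return 1;

        a += 2;                   /* add eax, 2                        */
        b += 2;                   /* add edx, 2                        */
    } while (c != '\0');          /* test cl, cl / jnz loc_4010AD      */

    return 0;                     /* terminator reached on both sides  */
}
```

Each byte is compared with zero, and no length is read anywhere: this is the signature of code working with ASCIIZ strings.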

Turbo-initialization of string variables Distinguishing strings is not always simple. To illustrate this, it is enough to compile the previous example with Microsoft Visual C++ and open the resulting file in any suitable disassembler (IDA Pro, for example).

It will take you an eternity to scroll through the data section. There is no trace of strings like "Hello, Sailor!" and "Hello, World!". The striking feature, however, is a strange bulk of double words:

.data:00406030 dword_406030 dd 6C6C6548h
.data:00406034 dword_406034 dd 57202C6Fh
.data:00406038 dword_406038 dd 646C726Fh
.data:0040603C word_40603C dw 21h
.data:0040603E align 4
.data:00406040 dword_406040 dd 6C6C6548h
.data:00406044 dword_406044 dd 53202C6Fh
.data:00406048 dword_406048 dd 6F6C6961h
.data:0040604C word_40604C dw 2172h
.data:0040604E byte_40604E db 0

What can they be? They are not pointers, since they do not point anywhere. Nor are they int-type variables, since none were declared in the program. Switching to hex mode reveals the strings:

.data:00406030 48 65 6C 6C 6F 2C 20 57-6F 72 6C 64 21 00 00 00 "Hello, World!..."
.data:00406040 48 65 6C 6C 6F 2C 20 53-61 69 6C 6F 72 21 00 00 "Hello, Sailor!.."
.data:00406050 57 6F 6F 7A 6C 0A 00 00-4F 4B 0A 00 00 00 00 00 "Woozl...OK......"
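The relation between the double words and the text is easy to verify: on the 80x86, the least significant byte of a double word is stored first. The following C sketch unpacks double words in that order. The function name dwords_to_bytes is hypothetical, and the trailing word 21h is widened to a double word here purely for convenience.

```c
#include <stddef.h>
#include <stdint.h>

/* Unpacks count double words into bytes, least significant byte first
   (the order in which the 80x86 stores them), and appends a zero.
   out must have room for count * 4 + 1 bytes. */
static void dwords_to_bytes(const uint32_t *dwords, size_t count, char *out)
{
    for (size_t i = 0; i < count; i++) {
        for (int b = 0; b < 4; b++)
            out[i * 4 + b] = (char)((dwords[i] >> (8 * b)) & 0xFFu);
    }
    out[count * 4] = '\0';
}
```

Feeding it the values 6C6C6548h, 57202C6Fh, 646C726Fh, and 21h yields the text "Hello, World!" followed by zero bytes.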

So why has IDA Pro treated them like double words? Analyzing the code that handles the string will help to answer this. Before doing so, let's convert these double words into normal ASCIIZ strings. (The "U" key converts double words into a chain of typeless bytes, and the "A" key converts them into strings.) Bring the cursor to the first cross-reference and follow it.

Listing 152: An Analysis of the Code That Manipulates Strings

main proc near ; CODE XREF: start+AF↓p
var_20 = byte ptr -20h
var_1C = dword ptr -1Ch
var_18 = dword ptr -18h
var_14 = word ptr -14h
var_12 = byte ptr -12h
var_10 = byte ptr -10h
var_C = dword ptr -0Ch
var_8 = dword ptr -8
var_4 = word ptr -4
; Where have so many variables come from?
push ebp
mov ebp, esp
; The stack frame is opened.

sub esp, 20h
; Memory is allocated for local variables.

mov eax, dword ptr aHelloWorld ; "Hello, World!"
; EAX is loaded - not with a pointer to the string
; "Hello, World!", but with the first 4 bytes of this string.
; Now it is obvious why IDA Pro has made a mistake -
; the original code (before it was converted into the string)
; looked like this:
; mov eax, dword_406030
; This is illustrative: If it were someone else's program under
; examination, this disassembler trick would be confusing.

mov dword ptr [ebp+var_10], eax
; The first 4 bytes of the string are copied
; into the var_10 local variable.

mov ecx, dword ptr aHelloWorld+4
; The 4th through 8th bytes of the string "Hello, World!"
; are loaded into ECX.

mov [ebp+var_C], ecx
; These bytes are copied into the var_C local variable.
; However, we already know that this is not a var_C variable,
; but a part of the string buffer.

mov edx, dword ptr aHelloWorld+8
; The 8th through 12th bytes of the string "Hello, World!"
; are loaded into EDX.

mov [ebp+var_8], edx
; These bytes are copied into the var_8 local variable or,
; to be more accurate, into the string buffer.

mov ax, word ptr aHelloWorld+0Ch
; The remaining two-byte tail of the string is loaded into AX.

mov [ebp+var_4], ax
; The tail is written into the var_4 local variable. Thus,
; fragments of the string are copied into the following local
; variables: int var_10; int var_0C; int var_8; short int var_4.
; Hence, this is actually one local variable:
; char var_10[14].

mov ecx, dword ptr aHelloSailor ; "Hello, Sailor!"
; The same copy operation is performed on the
; "Hello, Sailor!" string.

mov dword ptr [ebp+var_20], ecx
mov edx, dword ptr aHelloSailor+4
mov [ebp+var_1C], edx
mov eax, dword ptr aHelloSailor+8
mov [ebp+var_18], eax
mov cx, word ptr aHelloSailor+0Ch
mov [ebp+var_14], cx
mov dl, byte_40604E
mov [ebp+var_12], dl
; The "Hello, Sailor!" string is copied
; into the char var_20[15] local variable (14 characters and the terminating zero).

lea eax, [ebp+var_20]
; The register is loaded with the pointer to the var_20 local
; variable, which contains the "Hello, Sailor!" string.

push eax ; const char *
; It is passed to the strcmp function.
; From this, it can be inferred that var_20 actually stores
; a string, not a value of the int type.

lea ecx, [ebp+var_10]
; The pointer to the var_10 local variable, which
; stores the "Hello, World!" string, is loaded into ECX.

push ecx ; const char *
; It is passed to the strcmp function.

call _strcmp
add esp, 8
; strcmp("Hello, World!", "Hello, Sailor!")

test eax, eax
jz short loc_40107B
; Are the strings equal?

; Displaying the "Woozl" string
push offset aWoozl ; "Woozl\n"
call _printf
add esp, 4
jmp short loc_401088

; Displaying the "OK" string
loc_40107B: ; CODE XREF: sub_401000+6A↑j
push offset aOk ; "OK\n"
call _printf
add esp, 4

loc_401088: ; CODE XREF: sub_401000+79↑j
mov esp, ebp
pop ebp
; The stack frame is closed.
retn
main endp

Conditional IF-THEN-ELSE Statements
There are two kinds of algorithms, unconditional and conditional. The order of performing operations in unconditional algorithms is always invariable and does not depend on the input data — for example: a = b + c. The order of operations in conditional algorithms, on the contrary, depends on the input data. For example: IF c is not zero THEN a = b/c, ELSE send an error message.

Take note of the keywords IF, THEN, and ELSE. These are called branches. No program can manage without them. (Simple examples like "Hello, World!" do not count.) Branches are the heart of any programming language. Therefore, it is extremely important to identify them correctly.

Without going into syntactic details of particular programming languages, branch statements can be schematically represented in this general form:

IF (condition) THEN {statement1; statementN;} ELSE {statement11;
statement1M;}

The task of the compiler is to translate this statement into a sequence of machine instructions that execute statement1 and statementN if condition is true and, respectively, statement11 and statement1M if it is false. Microprocessors of the 80x86 family, however, support a rather modest set of conditional statements limited to conditional jumps. Programmers who are familiar only with PCs based on the 80x86 family will not consider such a limitation unnatural. However, there are plenty of processors supporting a prefix of conditional instruction execution: Instead of writing TEST ECX, ECX/JNZ xxx/MOV EAX, 0x666, it is possible to write TEST ECX, ECX/IFZ MOV EAX, 0x666. The IFZ is just the prefix of conditional execution; it allows the execution of the following instruction only if the zero flag is set.

In this sense, 80x86 microprocessors can be compared with the early dialects of BASIC, which did not admit any statement except GOTO into branches. Compare the following listings.

Listing 153: The New (First) and Old (Second) BASIC Dialects

IF A=B THEN PRINT "A=B"

10 IF A=B THEN GOTO 30
20 GOTO 40
30 PRINT "A=B"
40 ... // The rest of code

Anyone familiar with old dialects of BASIC will probably remember that it is better to execute GOTO if the condition is false, and to continue the normal execution of the program otherwise. (Contrary to popular opinion, knowledge of BASIC programming is not entirely useless, especially in disassembling.)

Most compilers (even nonoptimizing ones) invert the value of the condition, converting the statement IF (condition) THEN {statement1; statementN;} into the following pseudo-code:

Listing 154: The Pseudo-Code Produced from the IF-THEN Branch

IF (NOT condition) THEN continue
statement1;
...
statementN;
continue:

...

Hence, to reconstruct the source code of the program, the condition must be inverted, and the block of statements {statement1; statementN;} must be stuck to the THEN keyword. Suppose the compiled code looked like this:

Listing 155: A Reconstruction of the Source Code of a Program

10 IF A<>B THEN 30
20 PRINT "A=B"
30 ...// The rest of the code

In this case, the source code must have contained the following lines: IF A=B THEN PRINT "A=B". However, could the programmer check the variables A and B for an inequality (i.e., IF A<>B THEN PRINT "A<>B")? The compiler would invert the value of the condition and generate the following code:

Listing 156: The Compiler Inverts the Condition

10 IF A=B THEN 30
20 PRINT "A<>B"
30 ...// The rest of the code

Certainly, you might encounter compilers that suffer from verbosity. They are easy to recognize by the unconditional jump that immediately follows the branch.

Listing 157: A Verbose Compiler Is Recognized by the Unconditional Jump

IF (condition) THEN do
GOTO continue
do:
statement1;
...
statementN;
continue:

In cases like this, the conditional value does not need to be inverted. However, nothing terrible will happen if it is inverted; the code of the program merely may become less understandable.

Now let's consider how the complete statement IF (condition) THEN {statement1; statementN;} ELSE {statement11; statement1M;} may be converted. Some compilers act like this:

Listing 158: The Result of Converting the Complete IF-THEN-ELSE Statement

IF (condition) THEN do_it
// The ELSE branch is executed.
statement11;
...
statement1M;
GOTO continue

do_it:
// The IF branch is executed.
statement1;
...
statementN;
continue:

Others convert it like this:

Listing 159: An Alternate Result of Converting the IF-THEN-ELSE Statement

IF (NOT condition) THEN else
// The IF branch is executed.
statement1;
...
statementN;
GOTO continue

else:
// The ELSE branch is executed.
statement11;
...
statement1M;
continue:

The latter inverts the condition value; the former does not. Therefore, without knowing the compiler's preferences, it will be impossible to figure out what the original code of the program looked like. However, this does not create any problems, since it is always possible to write the condition in a convenient form. For example, if you don't like the statement IF (c<>0) THEN a=b/c ELSE PRINT "Error!", you can write IF (c==0) THEN PRINT "Error!" ELSE a=b/c.
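Both translation schemes can be imitated in C with goto statements. The sketch below uses the division example from the text; the function names divide_direct and divide_inverted are hypothetical, and the error message is replaced with an error flag so that the result can be checked.

```c
/* IF (c != 0) THEN a = b / c ELSE error, translated without
   inverting the condition: the ELSE branch comes first. */
static int divide_direct(int b, int c, int *error)
{
    int a = 0;
    *error = 0;

    if (c != 0) goto do_it;
    *error = 1;              /* The ELSE branch is executed. */
    goto done;
do_it:
    a = b / c;               /* The IF branch is executed.   */
done:
    return a;
}

/* The same statement translated with the condition inverted:
   the IF branch comes first. */
static int divide_inverted(int b, int c, int *error)
{
    int a = 0;
    *error = 0;

    if (!(c != 0)) goto else_branch;
    a = b / c;               /* The IF branch is executed.   */
    goto done;
else_branch:
    *error = 1;              /* The ELSE branch is executed. */
done:
    return a;
}
```

The two functions behave identically for every input, which is why the original form of the condition cannot be recovered from the machine code alone.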

Types of conditions Conditions can be simple (elementary) or complex (compound). An example of the former is if (a==b)…; an example of the latter is if ((a==b) && (a!=0)) …. Thus, any complex conditional expression can be decomposed into several simple conditional expressions. Let's start with simple conditions.

There are two types of elementary conditions: relational conditions (with the operators "less than", "equal", "greater than", "less than or equal", "not equal", and "greater than or equal", designated as <, ==, >, <=, !=, and >=, respectively) and logical conditions (with the operators AND, OR, NOT, and exclusive OR, designated in C notation as &, |, !, and ^, respectively). Well-known hacking authority Matt Pietrek adds testing the bits in here as well. In this book, we will cover this topic separately.

A true expression returns the Boolean value TRUE; a false one returns FALSE. The internal (physical) representation of the Boolean variables can vary, depending on a particular implementation. Generally, FALSE is represented by zero, and TRUE is represented by a nonzero value. TRUE is often represented by 1 or −1, but this is not always the case. For example, IF ((a>b) !=0) … is correct, and IF ((a>b)==1) … is bound to a particular implementation, which is undesirable.

Note that IF ((a>b) !=0)… does not check the a and b variables for an inequality to zero; rather, it checks the result of their comparison. Consider the following example: IF ((666==777)==0) printf("Woozl!"). What will be displayed on the screen by launching this example? "Woozl!", of course.

Neither 666 nor 777 is equal to zero, but 666!=777. Therefore, the condition (666==777) is false and equal to zero. Incidentally, writing IF ((a=b)==0)… would give a different result: The value of the variable b would be assigned to the variable a, and then checked for equality to zero.
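
The difference between the portable and the implementation-bound test can be sketched in C as follows. The function names are hypothetical. Note that relational operators in C themselves always yield 0 or 1; the risk arises with truth values that come from elsewhere (for instance, a flag stored as -1).

```c
/* Portable: any nonzero value counts as TRUE. */
int is_true_portable(int flag)
{
    return flag != 0;
}

/* Fragile: assumes TRUE is always stored as exactly 1. */
int is_true_fragile(int flag)
{
    return flag == 1;
}

/* Tests the result of the comparison, not the operands themselves. */
int comparison_is_false(int a, int b)
{
    return (a == b) == 0;
}
```

With a flag of -1, the portable test reports TRUE while the fragile one reports FALSE. Calling comparison_is_false(666, 777) corresponds to the "Woozl!" example: the comparison is false, so the outer test succeeds.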

Logical conditions are mostly used to bind two or more elementary relational conditions into a compound condition (for example, IF ((a==b) && (a!=0)) …). When compiling the program, the compiler always resolves compound conditions into simple ones (in this example, as IF a==b THEN IF a!=0 THEN…). At the second stage, the conditional statements are replaced by GOTO.

Listing 160: The Compiler Resolves Compound Conditions into Simple Ones

IF a!=b THEN continue
IF a==0 THEN continue
...// The code of the condition
continue:
... // The rest of code
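
The same transformation can be sketched in C with explicit goto statements. This is an illustration only: the function names and the executed flag are assumptions, and the label is called cont because continue is a reserved word in C.

```c
/* The compound condition as it appears in the source. */
int compound(int a, int b)
{
    int executed = 0;
    if ((a == b) && (a != 0))
        executed = 1;       /* the code of the condition */
    return executed;
}

/* The same logic after being resolved into simple tests and jumps. */
int resolved(int a, int b)
{
    int executed = 0;
    if (a != b) goto cont;  /* first test, inverted */
    if (a == 0) goto cont;  /* second test, inverted */
    executed = 1;           /* the code of the condition */
cont:
    return executed;        /* the rest of the code */
}
```

Both functions return the same value for any pair of arguments; only the shape of the control flow differs.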

The order of computing the elementary conditions in a complex expression is at the compiler's discretion. The only guarantee is that the conditions bound by the logical AND will be tested from left to right in the order they appear in the program. If the first condition is false, the next one will not be computed, which allows us to write code like if ((filename) && (f=fopen(&filename[0], "rw")))… If the filename pointer is null (i.e., contains zero, a logical FALSE), the fopen function is not called and the crash does not occur. These types of computations have been called fast Boolean operations (they are also known as short-circuit evaluation).
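
A small C sketch can make this guarantee visible. The call counter and both function names are hypothetical helpers introduced only to show that the right-hand operand is skipped when the left-hand one is false.

```c
#include <stddef.h>
#include <string.h>

/* Counts how many times the right-hand operand was actually evaluated. */
static int length_calls = 0;

static size_t counted_strlen(const char *s)
{
    length_calls++;
    return strlen(s);
}

/* The left operand guards the right one: a null pointer is never passed on. */
int is_nonempty(const char *name)
{
    return name != NULL && counted_strlen(name) > 0;
}
```

Calling is_nonempty(NULL) returns 0 without incrementing length_calls, because the second operand is never evaluated. Swapping the two operands would dereference the null pointer.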

Now, let's proceed to the problem of identifying logical conditions and analyzing complex expressions. Let's take the following expression: if ((a==b) &&(a!=0))… and see what happens when it is compiled.

The first variant is rather distinctive, comprising a series of tests (without inverting the condition), which pass control to the label preceding the code that is executed if the condition is true, with an unconditional jump at the end of the series, which passes control to the label that follows this code.

However, optimizing compilers eliminate the unconditional-jump instruction by inverting the test of the last condition in the chain and changing the jump address. Beginners often take this construction for a mix of OR and AND. Consider the result of compiling the if ((a==b) || (a==c) && (a!=0))… statement.

Listing 164: A Compilation of the if ((a==b) || (a==c) && (a!=0)) Statement

IF a==b THEN check_null
IF a!=c THEN continue
check_null:
IF a==0 THEN continue
...// The code is executed only if neither of
// the two jumps to continue is taken.
continue:
...// The rest of the code follows.

How can a single, readable compound condition be obtained from this impenetrable jungle of elementary conditions? Let's start from the beginning (i.e., from the first comparison operation). If the a==b condition is true, the a!=c test is skipped altogether. Such a construction is typical for the OR operation (i.e., if one of two conditions is true, it is enough for the code to work). Keeping if ((a==b) || …) in mind, let's move on. If the (a!=c) condition is true, all further tests cease and control is passed to the label located after the code that depends on the conditions. It is reasonable to assume that this is the last OR operation in a chain of comparisons, since the compiler typically inverts the last test in such a chain. Therefore, let's invert the condition and continue to write: if ((a==b) || (a==c)…). The last stage tests the a==0 condition. The code that depends on the conditions cannot be reached without passing this test. Hence, it is not OR; it is AND. But the compiler always inverts a condition joined by AND. Consequently, the original code should look like this: if ((a==b) || (a==c) && (a!=0)). Strictly speaking, && binds more tightly than || in C, so the grouping that matches the listing has to be written with explicit parentheses: if (((a==b) || (a==c)) && (a!=0)).
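
To check such a reconstruction, the listing can be transcribed into C with goto statements and compared against the recovered condition. The sketch below is an illustration under assumptions: the function names and the executed flag are hypothetical, the label cont stands in for continue (a reserved word in C), and the OR part is parenthesized explicitly.

```c
/* Listing 164 transcribed jump by jump; returns 1 if the guarded code runs. */
int lowered(int a, int b, int c)
{
    int executed = 0;
    if (a == b) goto check_null;  /* OR: the first condition is enough */
    if (a != c) goto cont;        /* last OR test, inverted */
check_null:
    if (a == 0) goto cont;        /* AND test, inverted */
    executed = 1;                 /* the guarded code */
cont:
    return executed;
}

/* The condition recovered by the analysis. */
int reconstructed(int a, int b, int c)
{
    return ((a == b) || (a == c)) && (a != 0);
}
```

For a = 0 and b = 0, both functions return 0: the a != 0 test applies to the whole OR group, not only to its second half.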

Do not be deluded: This example is elementary. In reality, optimizing compilers can produce far more tangled code.

Representing complex conditions as a tree A statement consisting of three or four elementary conditions can be analyzed mentally. However, patterns of five or more conditions form a labyrinth that is difficult to comprehend immediately. The ambiguity of translating complex conditions brings about an ambiguity of interpretation. This results in varying analyses, and progressively more information must be remembered with each step. It would be easy to go mad in such a case, or to get confused and obtain incorrect results.

The way out is to use a two-level system of translation. At the first stage, the elementary conditions are converted into an intermediate form that clearly and consistently represents the interrelation of the elementary operations. Then, the final translation is put into any suitable notation (for example, C, BASIC, or Pascal).

The only problem is how to choose the successful intermediate form. There are many options, but to save paper, let's consider just one: trees.

Let's represent each elementary condition as a node with two branches going to the appropriate states: the condition is true and the condition is false. For clarity, let's designate "false" as a triangle and "true" as a square. Let's agree always to place "false" on the left and "true" on the right. We will call the obtained design a nest.

Nests may be joined into trees. Each node can join only one nest, but each nest can join several nodes. Let's consider this in more detail.

Let's join two elementary conditions by a logical AND operation, taking the example ((a==b) && (a!=0)). Let's take the first condition from the left, (a==b), and place it in a nest with two branches, the left one corresponding to a!=b (i.e., the condition a==b is false), and the right one corresponding to the opposite case. Then, let's do the same with the second condition, (a!=0). This gives two nests; the only thing that remains is to join them with the logical AND operation. As you know, AND tests the second condition only when the first condition is true. Hence, the (a!=0) nest should be hitched to the right branch of (a==b). The right branch (a!=0) will correspond to the ((a==b) && (a!=0)) expression being true, and both left branches will correspond to this expression being false. Let's designate the first case with the do_it label, and the second case with the continue label. As a result, the tree should look like the one shown in Fig. 23.
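
One possible way to model such a tree in C is sketched below. The struct layout, the field names, and the walk function are assumptions made for illustration. Only the shape of the tree (the second nest hitched to the right branch of the first) and the do_it and continue labels come from the description above.

```c
#include <stddef.h>

/* The two labels a branch can finally lead to. */
enum outcome { CONTINUE = 0, DO_IT = 1 };

/* A nest: one elementary condition with a "false" and a "true" branch. */
struct nest {
    int (*test)(int a, int b);    /* the elementary condition */
    const struct nest *if_false;  /* left branch; NULL means a leaf */
    const struct nest *if_true;   /* right branch; NULL means a leaf */
    enum outcome leaf_false;      /* label used when if_false is NULL */
    enum outcome leaf_true;       /* label used when if_true is NULL */
};

static int a_eq_b(int a, int b)    { return a == b; }
static int a_nonzero(int a, int b) { (void)b; return a != 0; }

/* (a!=0): false leads to continue, true leads to do_it. */
static const struct nest second = { a_nonzero, NULL, NULL, CONTINUE, DO_IT };

/* (a==b): false leads to continue, true leads to the second nest. */
static const struct nest first = { a_eq_b, NULL, &second, CONTINUE, CONTINUE };

/* Follows the branches from the root until a leaf label is reached. */
enum outcome walk(const struct nest *n, int a, int b)
{
    for (;;) {
        int result = n->test(a, b);
        const struct nest *next = result ? n->if_true : n->if_false;
        if (next == NULL)
            return result ? n->leaf_true : n->leaf_false;
        n = next;
    }
}
```

Calling walk(&first, 3, 3) reaches do_it, while walk(&first, 0, 0) and walk(&first, 1, 2) both end at continue, matching the two left branches of the tree.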

Tuesday, September 22, 2009

Hackers Disassembling 1.1.7.9(Function Arguments)
