Sunday, November 22, 2009

Hacker Web Exploitation (Chapter 2: Vulnerabilities in Scripts)

Vulnerabilities in Scripts


Definition

A script is a small program written in an interpreted language such as PHP. A script can run on a server or on a client.

Data-Sending Methods

As I told you earlier, the cause of holes or vulnerabilities in Web applications is dynamic content, that is, different responses of scripts to different external conditions. By external conditions I mean data sent by a client or a browser using HTTP. Therefore, you should know how a client can send data, how HTTP regulates sending data to the server, and what the differences are among various data-sending methods.

The HTTP GET Method

The HTTP GET method is the most popular and the simplest method of sending data from a client to the server.


Definition

GET is a method for sending data using HTTP. With this method, the data follow the address of the requested page and a question mark.

GET parameters can be edited by editing the address of the requested page. They are delimited with ampersands, and the name of a parameter is separated from its value with an equal sign.

For example, in the http://localhost/2/1.php?test=hello&id=2 request, two GET parameters are sent to the http://localhost/2/1.php script. The first is test with the hello value, and the second is id with the 2 value.
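The way a request address breaks down into parameters can be sketched in PHP. This is only an illustration: parse_get_params() is a hypothetical helper name, while parse_url() and parse_str() are standard PHP functions.

```php
<?php
// Hypothetical helper: extract the GET parameters from a request address.
function parse_get_params(string $url): array
{
    // parse_url() returns the part after the question mark as the query.
    $query = parse_url($url, PHP_URL_QUERY);
    $params = [];
    if (is_string($query)) {
        // parse_str() splits on '&', separates names from values at '=',
        // and URL-decodes both.
        parse_str($query, $params);
    }
    return $params;
}

// Prints the two parameters: test => hello and id => 2.
print_r(parse_get_params('http://localhost/2/1.php?test=hello&id=2'));
```

Note that every value arrives as a string, even when it looks like a number.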

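HTTP itself doesn't fix a maximum length for data sent with GET, but browsers and servers limit the length of a URL, so only a small amount of data can be sent this way. You cannot use GET to send files.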

Here are a few examples of how data are sent as GET parameters.

The simplest way to create a request to a script involves using the HTML tag:



In the first request, two parameters, id = 21 and test = hello, are sent to the http://localhost/2/1.php script using the HTTP GET method.

In the second request, the same parameters are sent to the 1.php script located in the same directory on the same server as the current script.

In the third request, these parameters are sent to the /2/1.php script on the same server. The absolute path to the script from the root directory is specified.

Another example is the use of a form to send data:

If the action parameter isn't specified in the form header, the data will be sent to the current script. If the method parameter is omitted, the default method is GET.

In this example, two GET parameters are sent to the http://localhost/2/1.php script.

Look at data sent from a client to the Web server when the GET method is used. Here is an actual header sent by Mozilla 1.7.1 on the Windows 2000 operating system:

   GET /2/1.php?id=21&test=hello HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Windows NT 5.0; en-US; rv:1.7.1) Gecko/20040707
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 3000
Connection: keep-alive

The first line is the most interesting. The first word in the first line is the method used to send the data. It is followed by the script address (relative to the server root) and the protocol. In this example, HTTP/1.1 is used.

The second line specifies the name of the server whose script is requested.

In the third line, the browser identifies itself. You can see the type of the browser, the operating system, and the browser version.

The next lines contain information about the types of documents the browser "understands," the language and encoding it "prefers," and the allowed types of data compression.

The last two lines tell the server that it shouldn't disconnect after it sends the requested document but should keep the connection for the specified time.

Detailed knowledge of header fields sent during a GET request will allow you to simulate HTTP sessions, that is, write programs that can request documents on a server but cannot be differentiated from a common browser.

The HTTP POST Method

Another method for sending data using HTTP is POST.


Definition

POST is a method for sending data using HTTP. With this method, data are sent after all headers are sent from a client to the server.

From an HTML page, you can send data with the POST method only by using a form. The syntax of the form is identical to that of the form for the GET request, except that the POST method is specified:

In this example, two parameters, id and test, are sent to the specified script using the HTTP POST method.

If the action parameter isn't specified in the form header, the data will be sent to the current script. The value of the action parameter can be shortened. If no server is specified, the data will be sent to the current server. If you don't specify the server and the path, the data will be sent to the script in the same directory on the same server.

Look at data sent from a browser when the HTTP POST method is used. Here is an actual header sent by Mozilla:

   POST /2/1.php HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Windows NT 5.0; en-US; rv:1.7.1) Gecko/20040707
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 3000
Connection: keep-alive
Referer: http://localhost/2/1.php?id=21&test=hello
Content-Type: application/x-www-form-urlencoded
Content-Length: 16

id=53&test=hello

As you can see, this request differs from an HTTP GET request. It begins with the POST word, telling the server that it should wait for POST parameters.

As in the GET method, the method name is followed by the address of the requested script and the protocol (HTTP/1.1). The next lines name the server, identify the browser, list the accepted types of pages, and so on.

Both the GET and the POST method can include the Referer field. The value of this field is the address of the page from which the request was made, that is, the page containing the link or the form.

The Referer field is followed by two fields, Content-Type and Content-Length. Although the GET request to the server had only the header and didn't have a body (i.e., contents), the contents of the POST request are the data sent with the POST method. Therefore, two header fields are necessary.

Content-Type is the type of data sent within the body. In this example, the value of this field is application/x-www-form-urlencoded. It indicates that the body contains data that are uniform resource locator (URL) encoded from a World Wide Web (WWW) form.

Content-Length is the length of the data. This parameter is required so that the server can detect the end of the data.

The empty line indicates the end of the header. It is followed by the POST parameters.
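As a minimal sketch, the tail of such a POST request could be assembled in PHP as follows. The build_post_body() helper is a hypothetical name introduced for illustration; http_build_query() and strlen() are standard PHP functions.

```php
<?php
// Hypothetical helper: build a URL-encoded POST body together with the
// two header fields that describe it.
function build_post_body(array $params): array
{
    // http_build_query() URL-encodes names and values and joins the pairs.
    $body = http_build_query($params, '', '&');
    $headers = "Content-Type: application/x-www-form-urlencoded\r\n" .
               "Content-Length: " . strlen($body) . "\r\n";
    // The empty line (a bare CRLF) separates the header from the body.
    return [$headers . "\r\n" . $body, strlen($body)];
}

[$tail, $length] = build_post_body(['id' => 53, 'test' => 'hello']);
echo $tail, "\n";
echo $length, "\n"; // 16, matching the Content-Length field above
```

The length is counted in bytes of the encoded body, not in characters of the original values.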

URL encoding means that certain characters are encoded to avoid collisions. For example, suppose that you need to send the text variable with the "help&x=y" value using the POST method. What will happen if you send the data without URL encoding (i.e., text=help&x=y)? The script will parse this sequence as two parameters, text=help and x=y, rather than one variable text with the "help&x=y" value.

To avoid similar collisions when data are sent using the HTTP GET or POST method, certain characters are encoded in a special way. A character is substituted with a sequence of the %XX form. Here, XX is the two-character hexadecimal code of the character being encoded.

For example, the & character is encoded as %26, and the = character is encoded as %3D. The % character is substituted with %25, and so on, for many control characters. In general, you can encode all characters you're sending. However, it is common to encode only necessary characters because this operation increases the size of the data.

Therefore, the string in this example should be encoded as help%26x%3Dy. The text=help%26x%3Dy parameter will be sent to the server, and the script will parse it correctly.
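You can check this behavior with a few lines of PHP. The sketch below uses only the standard urlencode() and parse_str() functions; the variable names are arbitrary.

```php
<?php
$value = 'help&x=y';

// urlencode() replaces the reserved characters with %XX sequences.
$encoded = urlencode($value);
echo "text=$encoded\n"; // text=help%26x%3Dy

// Without encoding, the value is split into two parameters.
parse_str('text=' . $value, $unencoded);
// With encoding, the script receives one parameter with the full value.
parse_str('text=' . $encoded, $decoded);

var_dump($unencoded); // two entries: text and x
var_dump($decoded);   // one entry: text
```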

The POST method allows you to send files.

Combining the GET and POST Methods

Now that you know the format of requests sent to the server using the HTTP GET and POST methods, you might be asking, What if I combine these two methods?

What will happen if the following request is sent to the server?

   POST /2/1.php?id=88&test=tested HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Windows NT 5.0; en-US; rv:1.7.1) Gecko/20040707
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 3000
Connection: keep-alive
Referer: http://localhost/2/1.php?id=21&test=hello
Content-Type: application/x-www-form-urlencoded
Content-Length: 16



id=53&test=hello

On the one hand, the POST request type is specified; on the other hand, some parameters are sent to the script as if they were GET parameters.

It turns out that PHP and many other script interpreters handle such unusual requests sensibly. Parameters sent in the address are registered as GET parameters. For example, you can access these parameters in PHP with the $_GET superglobal array. The data sent in the body are registered as POST parameters, as you would expect: In PHP, you can access them with the $_POST superglobal array.
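To see how one request can carry both kinds of parameters, consider the following sketch, which splits a raw request roughly the way an interpreter does. The split_request() function is hypothetical and written only for illustration; it assumes a URL-encoded body and does no error handling.

```php
<?php
// Hypothetical helper: separate the GET parameters (from the request line)
// and the POST parameters (from the body) of a raw HTTP request.
function split_request(string $raw): array
{
    // The empty line separates the header from the body.
    [$header, $body] = array_pad(explode("\r\n\r\n", $raw, 2), 2, '');
    // The request line has the form "METHOD address PROTOCOL".
    $requestLine = strtok($header, "\r\n");
    [$method, $address] = explode(' ', $requestLine);
    $get = [];
    $post = [];
    parse_str((string) parse_url($address, PHP_URL_QUERY), $get);
    if ($method === 'POST') {
        parse_str($body, $post);
    }
    return ['get' => $get, 'post' => $post];
}

$raw = "POST /2/1.php?id=88&test=tested HTTP/1.1\r\n" .
       "Host: localhost\r\n" .
       "Content-Type: application/x-www-form-urlencoded\r\n" .
       "Content-Length: 16\r\n" .
       "\r\n" .
       "id=53&test=hello";
print_r(split_request($raw));
```

The same name, id, ends up with one value among the GET parameters and another among the POST parameters.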

This request to the server was made intentionally. Can you make your browser send such a request? As you know, only one data-sending method can be specified in a form.

A logical approach to this question would involve creating the following form:

This form is implemented in the http://localhost/2/1.php script. You can easily make sure that the browser displays and submits this form correctly. What's more, it creates a request almost identical to the one created manually.

You might be wondering about the purpose of these manipulations. Consider another example, http://localhost/2/2.php. Here is the source code of this script:



This is a modified script from Chapter 1. A person's ID is assumed to be a GET parameter, and the script checks whether the id parameter is an integer. Then the automatically registered variable $id is used. Suppose that the PHP interpreter is configured so that POST parameters have the highest priority. When the script receives a GET request, it works reliably.

Imagine a situation in which POST parameters are sent in addition to GET ones. For example, send the following form (http://localhost/2/form1.html):

In this example, the browser creates an HTTP POST request to the 2.php script with the id GET parameter equal to an integer, while the value of the id POST parameter is entered by the user.

As you can see from the code of the 2.php script, the id GET parameter is filtered properly, and control is passed to the block of code that makes a query to a database.

This piece of code uses the automatically-registered value of the id parameter, which is taken from POST parameters in accordance with the PHP interpreter configuration. So, GET parameters are filtered, but POST parameters are used in the query. In other words, you can bypass filtration and insert any data into the database query.
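The pattern can be sketched as follows. This is a hypothetical illustration, not the actual 2.php source: the function names, the people table, and the name column are assumptions, and automatic registration is imitated by merging the arrays so that POST values overwrite GET values.

```php
<?php
// Imitate automatic variable registration: later sources overwrite
// earlier ones, so POST wins over GET (the assumed configuration).
function registered_id(array $get, array $post)
{
    $registered = array_merge($get, $post);
    return $registered['id'] ?? null;
}

// Flawed pattern: the check looks at the GET value, but the query uses
// the registered value, which may come from POST.
function build_query_flawed(array $get, array $post): ?string
{
    if (!isset($get['id']) || !ctype_digit((string) $get['id'])) {
        return null; // the GET parameter is rejected
    }
    $id = registered_id($get, $post);
    return "SELECT name FROM people WHERE id=$id";
}

// More careful pattern: validate and use the same value from one source.
function build_query_checked(array $get): ?string
{
    if (!isset($get['id']) || !ctype_digit((string) $get['id'])) {
        return null;
    }
    $id = (int) $get['id'];
    return "SELECT name FROM people WHERE id=$id";
}

echo build_query_flawed(['id' => '1'], ['id' => '1 OR 1=1']), "\n";
echo build_query_checked(['id' => '1']), "\n";
```

In the first call, the GET value passes the check, yet the text taken from the POST parameter reaches the query unfiltered.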

To check this conjecture, send ID values that aren't integers. You'll receive the following message:

   Warning: mysql_fetch_object(): supplied argument is not a valid MySQL
result resource in x:\localhost\2\2.php on line 17

records not found

The text of this message allows you to infer that the script has a vulnerability that can be exploited by sending GET and POST parameters in the same request.

Methods for exploiting this vulnerability are described in Chapter 3, which is devoted to SQL injection.

This example demonstrates a rare situation in which different pieces of a script explicitly or implicitly use variables whose values were obtained from different types of requests. HTTP describes a few other types of requests. However, they aren't popular on the Web, and their description is beyond the scope of this book.

COOKIE Parameters

Another popular method of data exchange between the client and the server is the use of HTTP COOKIE parameters.


Definition

Cookies are data stored on the client in small files or in the computer memory.

COOKIE parameters are sent within the header. The server sends a cookie in the response header, and the client sends it in the request header.

Here is an example of a server response header, in which the server sets the test cookie variable to the hello value:

   HTTP/1.1 200 OK
Date: Thu, 01 Sep 2004 12:00:00 GMT
Server: Apache/1.3.12 (Win32)
X-Powered-By: PHP/4.3.3
Set-Cookie: test=hello
Set-Cookie: id=88
Keep-Alive: timeout=15, max=100
Connection: keep-alive
Transfer-Encoding: chunked
Content-Type: text/html

As in any server response, the first line contains the protocol version followed by a code and its explanation. Response codes and their explanations can be found in the HTTP specification. Here are a few of them:

  • 200 — The document is on the server. The server should return the document.

  • 301 — The document has been moved permanently. The Location field with a new path is expected. The browser should request the document from the new location without saving the current document in the history.

  • 302 — The document has been moved temporarily. The Location field with a new path is expected. The browser should request the document from the new location without saving the current document in the history.

  • 401 — Authorization is required. After receiving this response, most browsers display a form suggesting that the user enter a name and a password. These will be sent in the next request.

  • 403 — Access is denied. This doesn't imply that the document is missing from the server.

  • 404 — The document isn't found on the server.

  • 500 — An internal server error. It occurs when there is a collision between a common gateway interface (CGI) program and the server. For example, a Perl script didn't return the expected Content-Type: text/html header. This can happen when an error message is displayed before the header is output. This response code can be an indication of an error in the script.

The second line contains the date and time in Greenwich Mean Time (GMT).

The Server field identifies the server. This information can be interesting to an attacker who can try to find vulnerabilities in a particular version of the server. This is why the system administrator should bar the output of the full server version. He or she can output only the server name or even a random string. For example, in the Apache server configuration the administrator should edit or add the following string:

   ServerTokens ProductOnly

Configuring other types of HTTP servers is beyond the scope of this book.

The X-Powered-By: PHP/4.3.3 field indicates that the page is generated by a PHP script. The PHP interpreter version is output. This information can be useful to the attacker, so the system administrator should configure the PHP interpreter so that it doesn't generate this header. To do this, the administrator would add the expose_php = Off line to the php.ini configuration file.

In the next line, the Set-Cookie: test=hello field sets the cookie variable test to the hello value.

As you can see, this field can be repeated for other variables. After the cookie values are set, the browser will send them every time it requests a script from the current directory or its subdirectories during the current session.

The server can URL-encode cookies.

Here is an example, in which the browser sends the server two cookies set earlier:

   GET /2/3.php HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Windows NT 5.0; en-US; rv:1.7.1) Gecko/20040707
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 3000
Connection: keep-alive
Cookie: test=hello; id=88
Cache-Control: max-age=0

As you can see, the values are sent in one header field. They are separated with a semicolon, not with an ampersand as in a POST or GET request. The names and values of these parameters are URL-encoded.
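As a rough illustration of this format, the value of the Cookie field could be parsed as shown below. PHP does this automatically and fills the $_COOKIE array; the parse_cookie_header() function is a hypothetical helper written only to show the steps.

```php
<?php
// Hypothetical helper: parse the value of a Cookie request header field
// into name => value pairs.
function parse_cookie_header(string $value): array
{
    $cookies = [];
    // Pairs are separated with semicolons rather than ampersands.
    foreach (explode(';', $value) as $pair) {
        $pair = trim($pair);
        if ($pair === '' || strpos($pair, '=') === false) {
            continue; // skip empty or malformed fragments
        }
        [$name, $val] = explode('=', $pair, 2);
        // Names and values arrive URL-encoded.
        $cookies[urldecode($name)] = urldecode($val);
    }
    return $cookies;
}

print_r(parse_cookie_header('test=hello; id=88'));
```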

Each COOKIE parameter has a name and a value. In addition, it can include the server address and the path to scripts that require the cookie value. When these are specified, the browser should send the cookies only to documents located in the specified directory or its subdirectories.

Here are examples of cookies with the path and the domain specified:

   Set-Cookie: b=tested; path=/2/
Set-Cookie: c=tested; path=/2/; domain=localhost

The cookie set in the first line will be sent only to documents in the /2/ directory (in the current domain) or its subdirectories. In the second line, the domain is specified explicitly.

The server and the path can differ from the current ones. At present, cookies are seldom set to a third-party server. However, in some cases this option can be used by the attacker. The idea of such an attack would be the following: A user who has nothing in common with the attacker visits an intentionally-generated malicious HTML page. Then he or she visits a target server and sends the server certain COOKIE parameters fabricated by the attacker.

As a variant, the attacker can use JavaScript to redirect the visitor to the target server. With other vulnerabilities, this one can be useful to the attacker.

Another parameter that can be sent with a cookie is its lifetime. A cookie shouldn't be stored on the client forever. Its lifetime can be specified when the cookie is set. For example, the following header sets a cookie and specifies its expiration date:

   Set-Cookie: a=tested; expires=Thu, 01-Sep-06 00:00:00 GMT

If no lifetime is specified, the cookie lives within the current session until the user exits the browser. In most cases, such cookies are stored in the memory.

If the lifetime is specified, the cookie value is written onto the hard disk. Different browsers store cookies differently. For example, Mozilla stores all cookies in one file, COOKIES.TXT. Internet Explorer stores each cookie in an individual file.

You can draw two conclusions from this. First, a cookie can remain even after the computer is rebooted. It can be repeatedly sent to the server for years until the user deletes it. Second, the user can edit cookie files as he or she likes.

Therefore, I give you the following recommendation: For cookies, set only those parameters that are useless to the attacker. It is likely that the attacker analyzing your Web application for vulnerabilities will examine cookie files and decide whether he or she can benefit from changing their values.

Hidden Fields

Storing data in hidden fields is a simple and useful option for a programmer.


Definition

A hidden field is a field of an HTML form that isn't displayed on the HTML page containing the form. However, the contents of this field can be seen in the text representation of the HTML page.

If you use hidden fields when writing Web applications, you should be aware that the attacker can easily read their contents. The attacker is likely to analyze the names and values of the received hidden fields. As practice shows, this analysis can be quite fruitful.

In addition, the attacker can easily change the contents of hidden fields. A script often filters visible parameters but assumes the user cannot change hidden values.

Simulating an HTTP Session

Now that you know a lot about HTTP, you can try to write a small PHP script simulating an HTTP session.

The simplest way to simulate it involves using the fopen() or file_get_contents() function. If you pass either of these functions the name of a remote document with HTTP, and you have appropriate access rights, you'll open this document like a local file.

However, you won't be able to make a POST request, send COOKIE parameters, or send a request through a proxy server. So, it would be best to write a script that uses sockets to create connections, pretends to be a browser, and imitates all header fields, including GET and POST parameters, cookies, and the Referer.

Here is the code of such a script:

   01 <?php
02 $host="localhost"; // The host
03 $method="POST"; // GET or POST
04 $addr="/2/1.php?id=55"; // The path relative to the server root
05 $useragent=
06 "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT 5.0)";
07 // The browser identification: IE 5
08 $referer="http://any.com/";
09 $postvars="test=tested"; // POST parameters only for a POST request
10 $cookie="cookvar=hello"; // COOKIE parameters, if any
11 $target="127.0.0.1"; // The IP address of the server or a proxy server
12 $targetport=80; // The port of the server or a proxy server
13 $forwarded="127.0.0.2"; // The value of the X-FORWARDED-FOR header
14 $in = "$method $addr HTTP/1.1\r\n".
15 "Accept: */*\r\n".
16 "Accept-Language: en-us\r\n".
17 "Accept-Encoding: gzip, deflate\r\n".
18 "User-Agent: $useragent\r\n".
19 "Host: $host\r\n".
20 "Connection: Close\r\n";
21 if(!empty($forwarded)) $in.="X-FORWARDED-FOR: $forwarded\r\n";
22 if(!empty($referer)) $in.="Referer: $referer\r\n";
23 if(!empty($cookie)) $in.="Cookie: $cookie\r\n";
24 if($method=="POST")
25 {
26 $len=strlen($postvars);
27 $in.=
28 "Content-Type: application/x-www-form-urlencoded\r\n".
29 "Content-Length: $len\r\n\r\n".
30 $postvars;
31 } else $in.="\r\n"; // A request without a body ends with an empty line
32 $socket = socket_create (AF_INET, SOCK_STREAM, SOL_TCP);
33 $result = socket_connect ($socket, $target, $targetport);
34 socket_write($socket, $in, strlen($in));
35 $o="";
36 while ($out = socket_read ($socket, 2048)) {
37 $o.=$out;
38 }
39 echo $o;
40 ?>

This script is available on the accompanying CD-ROM in the http://localhost/2/http.php file. This script requests the http://localhost/2/1.php script already familiar to you and displays the received result with all headers in the browser window.

The line numbers are given for your convenience. Lines 02 to 13 define all parameters that will be used in the HTTP request. In this case, it is a POST request that simulates a request made by Microsoft Internet Explorer.

Note the X-FORWARDED-FOR header. If you connect using a proxy server, it can send the HTTP server the actual Internet protocol (IP) address of the client within this header. Therefore, if you send this header to the server yourself, you can deceive scripts into believing that the address you send is your IP address. Some popular forums store this address.

Although system log files on the server contain the actual IP address, sometimes this feature can be useful to the attacker.
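On the server side, the difference between a script that can be deceived and one that cannot might look like the following sketch. Both function names are hypothetical; the array keys are the ones PHP uses in $_SERVER for the address of the connecting host and for the X-FORWARDED-FOR header.

```php
<?php
// Naive: trusts a header that the client fully controls.
function client_ip_naive(array $server): string
{
    return $server['HTTP_X_FORWARDED_FOR'] ?? $server['REMOTE_ADDR'];
}

// More cautious: use the address of the connecting host and keep the
// header only as unverified, informational data.
function client_ip_cautious(array $server): array
{
    return [
        'ip' => $server['REMOTE_ADDR'],
        'claimed' => $server['HTTP_X_FORWARDED_FOR'] ?? null,
    ];
}

// The values a script would see for the simulated request above.
$server = [
    'REMOTE_ADDR' => '127.0.0.1',
    'HTTP_X_FORWARDED_FOR' => '127.0.0.2',
];
echo client_ip_naive($server), "\n";          // 127.0.0.2
echo client_ip_cautious($server)['ip'], "\n"; // 127.0.0.1
```

In a real script you would pass $_SERVER instead of the sample array.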

If server scripts, like some forms, check the value of the Referer header, you can send a desired Referer header. In addition, you can send cookie data.

The header of the HTTP request is created in lines 14 to 31. In lines 32 to 38, the request is sent to the server, and the output is accumulated in the $o variable. In the next line, the content of the variable is displayed to the browser.

Changing the Sent Data

Creating and editing such a script for each request would be a tedious job when you need to send many requests.

You might need to change parameters of the header and the pages. A solution to this problem could be the use of a proxy server, which would change information passing through it according to a certain algorithm.

An example of such a proxy server is Proxomitron. I'm not going to describe its settings here; I'll just mention that this application suits this purpose. The application comes with a comprehensive description.

Hacker Web Exploitation (Chapter 1: The Internet Is a Hostile Environment)

The Internet Is a Hostile Environment



Overview

Consider a system or a script. As with any other object in the world, its behavior depends on external and internal conditions. Among internal conditions are the server settings, the type of server, the type of database used in the system, the content of the environment variables, the information on the server's hard disk, and the content of the database.

External conditions are the data sent to the server using the HyperText Transfer Protocol (HTTP). Examples of such data are the GET, POST, and COOKIE parameters. In addition, some headers sent by a client to the server according to HTTP are examples of such data. These settings are specified and changed by the client, and the script will receive them as environment variables.

Fortunately, an external user, that is, a visitor to the Web site, cannot affect internal factors. However, he or she can change external factors.

Dynamics Causes "Holes"
Consider a complex system consisting of many interrelated components. For example, a Web system consists of a news system, a chat, a forum, and so on. It would be wise to assume that a system or a site has dynamic content.

Definition

In this context, content means the content of a HyperText Markup Language (HTML) page.


Dynamic content can be defined as a response of the system to changes in its external conditions. This response can be documented (i.e., explicitly described or logically implied) or not. In the latter case, it is a result of side effects in the system. These side effects are usually unpredictable, and they are called vulnerabilities or simply holes.

Dynamic content is fraught with threat. A site based completely on static content, that is, including only static HTML pages, will not be vulnerable to attacks on scripts because it has no scripts. By definition, a static system doesn't respond to changes in external conditions; therefore, it has a documented response.

However, you shouldn't think that a static site is invulnerable to all types of attacks. For example, it is possible for a malicious person to attack the site through other services, such as vulnerabilities in other Web sites that are physically located on the same server but are components of another system. In addition, attacks on the Web server are possible. In this book, I describe only Web attacks, that is, attacks on scripts and applications accessible using HTTP.

So, dynamic content is the origin of all holes in Web applications. One obvious solution to the problem could involve abandoning dynamics in the Web. However, the contemporary Web would be impossible without dynamics. Forums, guest books, newsgroups, and so on, would be missing from the Web. Therefore, you need to write secure Web applications and scripts and stable systems.

Stable Systems
Definition

A stable system is a system with a documented response to any change in external conditions.


It appears that this definition, which I learned as a student, is a clue to writing secure Web applications. A system can work well in normal conditions.

Messages will be added to a forum, a search in a database will return results, and so on. What's more, the system will pass all tests for functioning in normal conditions, that is, in conditions, in which a user doesn't interfere between the browser and the server but just clicks links and sends forms with valid data. In such conditions, the system will work well.

As you can see, interaction between a user and a system, or, in other words, changing the external conditions of the system, can be of two types.

The first type is valid HTTP requests that agree with common sense. For example, digits are used to specify an ID in a database, and letters are used to specify a person's name when searching in a database.

Definition

An HTTP request is a data set sent by a client to a Web server in accordance with HTTP. The data contain the address of the requested script, the server name, and, possibly, parameters such as GET, POST, and COOKIE. In addition, the client can send some secondary data as header fields.


The second type is abnormal requests: A user examines the HTML code received from the Web server and sends requests that the developer didn't anticipate, for example, by entering invalid data into form fields. It is recommended that you test the system's behavior in such situations.

Consider a few examples.

The script http://localhost/1/1.php returns a person's name stored in a database with an ID. The ID is sent as a GET parameter with the name id.

The system will normally respond to valid ID values:

http://localhost/1/1.php?id=1

http://localhost/1/1.php?id=2

http://localhost/1/1.php?id=100

Therefore, you could say the system works correctly. To be more precise, it correctly responds to correct requests.

In this example, you can see that a request without parameters causes a field to appear, into which you should enter a person's ID. If you enter an integer, you'll see either the person's name or the message telling you that no record was found.

This is an implementation of a simple procedure of retrieving information from the simplest database, a table with two columns: integer id (a person's ID) and string name (his or her name).

How will this script behave in other conditions? What will happen if somebody enters data other than an integer into the ID field? The documentation to the system doesn't describe the system's response. You could expect the system to detect the invalid ID and return an error message. However, you should test it.

Try http://localhost/1/1.php?id=a, and you'll see the following message:

Warning: mysql_fetch_object(): supplied argument is not a valid MySQL
result resource in x:\localhost\1\1.php on line 15

No records were found.


You might be wondering what this means, how an attacker can use this information, and how you should defend your system. I'll comprehensively explain these issues in subsequent chapters.

This warning message shows that the system improperly responds to an ID that isn't an integer.
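A documented response to such a value could be as simple as the following sketch. The valid_id() function is a hypothetical helper, and treating only positive integers as valid IDs is an assumption made for this illustration.

```php
<?php
// Hypothetical helper: return the ID as an integer, or null when the
// value is not a positive integer.
function valid_id($raw): ?int
{
    // ctype_digit() accepts only non-empty strings of decimal digits.
    if (!is_string($raw) || !ctype_digit($raw)) {
        return null;
    }
    $id = (int) $raw;
    return $id > 0 ? $id : null;
}

foreach (['1', '100', 'a', "1'", ''] as $raw) {
    $id = valid_id($raw);
    echo $id === null ? "Invalid ID.\n" : "Looking up ID $id.\n";
}
```

With such a check in place, the script answers an invalid ID with its own message instead of an interpreter warning.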

Consider another example.

The script http://localhost/1/2.php produces almost the same result as the first one, but it looks for a name in a file rather than in a database. The file name is an ID with the TXT extension.

Test this script by sending the following requests:

http://localhost/1/2.php?id=1

http://localhost/1/2.php?id=2

http://localhost/1/2.php?id=3

You'll see that the script normally responds to normal requests that contain IDs of people whose files are available on the disk.

Test the script's behavior in abnormal situations:

http://localhost/1/2.php?id=9999

http://localhost/1/2.php?id=a

You'll get messages like the following:

Warning: fopen(data/5.txt): failed to open stream: No such file or
directory in x:\localhost\1\2.php on line 12

Warning: fread(): supplied argument is not a valid stream resource in
x:\localhost\1\2.php on line 13

Warning: fclose(): supplied argument is not a valid stream resource
in x:\localhost\1\2.php on line 15

As you can see, the system responds improperly to a request containing an ID that isn't an integer or an ID that doesn't correspond to any record.

How can an attacker use the information contained in these messages? Again, I'll provide answers in subsequent chapters.

Both examples demonstrate unstable systems. You can explain and predict these scripts' behavior if you examine their code. However, this cannot be called a documented response.

If the scripts returned messages saying that the requests were invalid, this would be a documented response. Instead, you receive the interpreter's messages saying that the scripts contain errors.

You could see a lot of such examples in everyday life. People focus attention on how a system works in normal external conditions and almost always ignore that the external conditions can be illogical.

Filtration is most important when writing stable systems.

Filtration
The notion of filtration is often used when discussing vulnerabilities.

Definition

Filtration involves changing the contents of a parameter to avoid an undocumented response from the script.


Sometimes, the script performs filtration before it uses the parameter; in other cases, filtration is performed by auxiliary modules.

A character or a sequence of characters received from a user can be filtered in various ways. For example, quotation marks in a string can affect the processing of a Structured Query Language (SQL) request and cause a syntax error. To avoid this, you could simply remove them from the string before processing the request. In my opinion, however, it would be best to add a backslash (\) before each quotation mark. In this case, the database server wouldn't treat a quotation mark as a string-terminating character and wouldn't treat the backslash as a character of the string.

To demonstrate how SQL responds to the backslash character, I suggest that you make a few SQL requests:

mysql> select 'test - \'tested\' ';
+-----------------+
| test - 'tested' |
+-----------------+
| test - 'tested' |
+-----------------+
1 row in set (0.00 sec)
mysql>

As you can see, the quotation marks preceded by backslashes were displayed normally. In contrast, the following request will cause an error message:

mysql> select 'test - 'tested' ';

ERROR 1064: You have an error in your SQL syntax. Check the manual
that corresponds to your MySQL server version for the right syntax to
use near '' '' at line 1

mysql>
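In a PHP script, the two approaches to quotation marks (removing them or preceding them with a backslash) might look like the sketch below. The built-in addslashes() function is used here only to illustrate the idea. In real database code, the escaping function of the database driver or parameterized queries would normally be a safer choice.

```php
<?php
// Input containing quotation marks, as a user might submit it.
$input = "test - 'tested' ";

// Option 1: remove the quotation marks altogether (the value changes).
$stripped = str_replace("'", "", $input);

// Option 2: put a backslash before each quotation mark with addslashes()
// (it also escapes double quotes, backslashes, and NUL bytes).
$escaped = addslashes($input);

echo "select '$stripped';\n";  // select 'test - tested ';
echo "select '$escaped';\n";   // select 'test - \'tested\' ';
```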


Obviously, different parameters should be filtered differently. For example, an unmatched back quotation mark in a string can be crucial in some cases. In other cases, an improper parameter type can cause a system error. This happened in the first example. In yet another case, a parameter with a value outside a valid range can cause an error.

In essence, filtration can be of two types. These are filtration by barring suspicious parameter values and filtration by setting parameters to safe values.

Filtration by barring is a matter of halting the script execution when suspicious elements (such as a quotation mark or the < or > character delimiting HTML tags) are encountered in a parameter. In such cases, a user sees an error message. This type of filtration has disadvantages. For example, valid values can be barred. If a message in a forum contains a quotation mark, such filtration will prohibit its publication, even though messages containing a single quotation mark are quite likely.

This behavior of the protection would seem normal if you remember that a quotation mark makes an SQL request invalid. However, it cannot be justified by common sense.

In my opinion, filtration by setting to safe values is the best. However, it sets all suspicious parameters to a safe form, thus changing their values.
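The two types of filtration can be compared in a short sketch. The function names are hypothetical, and htmlspecialchars() is just one possible way of setting a value to a safe form (here, for output in an HTML page).

```php
<?php
// Filtration by barring: reject the whole value when a suspicious
// character (a quotation mark, < or >) is present.
function filter_by_barring($value)
{
    if (preg_match('/[\'"<>]/', $value)) {
        return null; // the caller should show an error and stop
    }
    return $value;
}

// Filtration by setting to safe values: keep the message but convert
// the suspicious characters to a harmless form for HTML output.
function filter_to_safe($value)
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

$message = "It's <b>fine</b>";
var_dump(filter_by_barring($message));   // NULL: the message is barred
echo filter_to_safe($message), "\n";     // It&#039;s &lt;b&gt;fine&lt;/b&gt;
```

The first function refuses a perfectly ordinary message, whereas the second accepts it at the price of changing its value.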

When Filtration Is Insufficient
You might think that filtration is the key to the problem of Web application safety. However, this is not the case.

Consider an example: http://localhost/1/3.php. A design specification for this script could be as follows:

Write a script that displays the name of a person whose ID is entered. The data are stored in files that have names identical to IDs and the TXT extensions. For example, the data of a person whose ID is 3 are stored in the file 3.TXT.

If no person with the specified ID is found, an appropriate message should be returned.

The ID is sent using the HTTP GET method. If an ID is missing, the script should display a form suggesting that the user enter his or her ID.

Here is the code of this script:

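<?php
// Read the ID from the GET parameters rather than relying on register_globals.
$id = isset($_GET['id']) ? $_GET['id'] : '';

// No ID supplied: display a form asking for it.
if(empty($id))
{
echo "
<form>
enter id (integer)
<input type='text' name='id'>
<input type='submit'>
</form>
";
exit;
}
// The ID is inserted into the file name without any checks.
if(file_exists("data/$id.txt"))
{
$f=fopen("data/$id.txt", "r");
$s=fread($f, 1024);
echo $s;
fclose($f);
}
else
echo "records not found";
?>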

Does this script conform to the design specification? It certainly does. The script is comprehensively described in the specification. In particular, its response to an abnormal situation is described: If no file is found, a message should be displayed.

However, the design specification doesn't say whether the ID should be an integer.

The script completely implements the design specification. For example, if the ID is omitted, an appropriate form is displayed. When the script receives the ID, it looks for a file with the corresponding name.

If the file isn't found, the "records not found" message is displayed, and the script doesn't try to read any data.

Finally, if the file is found, its contents are sent to the browser.

This behavior seems invulnerable. It seems impossible to imagine a situation that would cause an error. If the file isn't found or the name is invalid, the script sends a message to the browser. Note that this message is generated by the script rather than by the interpreter.

You should test this. Make the following requests:

http://localhost/1/3.php?id=1

http://localhost/1/3.php?id=2

http://localhost/1/3.php?id=3

As a result, you'll receive the corresponding records. Even if you send an ID that isn't an integer but a corresponding file exists (e.g., http://localhost/1/3.php?id=abc), you'll receive the record you would expect.

Now specify IDs that are missing from the database or contain characters that are invalid in a file name (in the file allocation table, or FAT, file system).

Try the following requests:

http://localhost/1/3.php?id=999

http://localhost/1/3.php?id=abcde

http://localhost/1/3.php?id=%3F

http://localhost/1/3.php?id=%3C

http://localhost/1/3.php?id=%7C

Note that the sequences %3F, %3C, and %7C encode the characters ?, <, and |, respectively. So, these characters are sent as IDs.
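If you want to check what a percent-encoded sequence stands for, you can decode it with the standard rawurldecode() function. A minimal sketch:

```php
<?php
// Decode the percent-encoded values used in the requests above.
foreach (array('%3F', '%3C', '%7C') as $encoded) {
    echo $encoded, ' => ', rawurldecode($encoded), "\n";
}
// %3F => ?
// %3C => <
// %7C => |
```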

As you can see, the system's responses are adequate. It returns an error message telling you that no record was found.

However, despite such a stable behavior, the script has a vulnerability related to how the file systems are designed.

Remember that some special character sequences are used to change the directory and that nothing prevents you from using them in file names. In a file name, such a sequence changes (or bypasses) a directory. The ../ sequence means "one level up." Look at how the script responds to it.

Suppose you know that the file TEST.TXT is located in the parent directory of the current subdirectory. You cannot access it using HTTP, but you're eager to get the contents of this file. Send the ../test sequence as a person's ID. Examine the code to find out what file will be checked for existence and the contents of which file will be sent to the browser. Obviously, this is DATA/../TEST.TXT. In other words, this is the desired file in the parent directory.

To test how this trick works, make the following request: http://localhost/1/3.php?id=../test. You'll see the contents of the file in the browser window. So, why did the protection let you read the file rather than return a message saying that the file hadn't been found? The reason is that the file is present in the system. What's more, this file name is valid for file functions such as file_exists() or fopen().
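The effect can be reproduced without a Web server. The sketch below builds a throwaway directory layout in the system's temporary directory (the directory and file names are assumptions chosen for the demo) and then composes the path the same way the script does.

```php
<?php
// Build a temporary layout: <base>/test.txt and <base>/data/1.txt.
$base = sys_get_temp_dir() . '/traversal_demo_' . uniqid();
mkdir($base . '/data', 0777, true);
file_put_contents($base . '/data/1.txt', 'record 1');
file_put_contents($base . '/test.txt', 'secret outside data/');

// Same path construction as in the script, with an attacker-chosen ID.
$id   = '../test';
$path = $base . "/data/$id.txt";

$exists   = file_exists($path);        // the file functions accept the name
$resolved = realpath($path);           // resolves to <base>/test.txt
$contents = file_get_contents($path);  // reads the file outside data/

var_dump($exists);
echo $resolved, "\n";
echo $contents, "\n";

// Clean up the demo files.
unlink($base . '/data/1.txt');
unlink($base . '/test.txt');
rmdir($base . '/data');
rmdir($base);
```

The check with file_exists() succeeds because, from the file system's point of view, the name is perfectly valid.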

This is a critical vulnerability, and I'll try to explain its cause. The system seems safe, all erroneous situations being excluded. Nevertheless, there is an obvious hole in the system.

The incorrect design specification is responsible for this hole. A correct one would be as follows:

Write a script that displays the name of a person whose ID is entered. The data are stored in files that have names identical to IDs and the TXT extensions. For example, the data of a person whose ID is 3 are stored in the file 3.TXT. The ID is a sequence of digits, uppercase or lowercase letters, underscores, minuses, or periods. If an invalid ID is received, the script should return an error message.

You could specify more valid characters.

A script complying with the second design specification will be invulnerable to this type of attack.
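A check that follows the second specification might look like the sketch below. The function names and the "invalid id" message are hypothetical, and the set of allowed characters is the one listed in the specification.

```php
<?php
// Hypothetical helper implementing the stricter specification: the ID may
// contain only digits, letters, underscores, minuses, and periods.
function is_valid_id($id)
{
    return is_string($id) && preg_match('/^[A-Za-z0-9_.-]+\z/', $id) === 1;
}

// Hypothetical lookup: validates the ID before it touches the file system.
function lookup($id, $dir)
{
    if (!is_valid_id($id)) {
        return "invalid id";
    }
    $path = "$dir/$id.txt";
    if (!is_file($path)) {
        return "records not found";
    }
    return file_get_contents($path);
}

var_dump(is_valid_id('3'));        // bool(true)
var_dump(is_valid_id('abc'));      // bool(true)
var_dump(is_valid_id('../test'));  // bool(false): the slash is not allowed
echo lookup('../test', sys_get_temp_dir()), "\n";  // invalid id
```

Because the slash is not among the allowed characters, the ../ sequence never reaches file_exists() or fopen().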

The Main Principles of Secure Programming
I will now summarize the main principles of writing secure code and the main causes of vulnerabilities.

In fact, there is only one cause. A user can interfere between the browser and the server, and he or she can send illogical values of parameters to the server.

The principle that follows from this is simple: Don't trust the data received from outside the server.

A design specification for a script should be brief, but it should take into account all dangerous situations. A script that complies with a correct design specification will be invulnerable to Web attacks.

If a programmer decides to write a script on his or her own, or if a design specification is written by a person incompetent in security issues who uses the wrong terms, the programmer should write or at least keep in mind a detailed design specification that takes into account all security aspects.

All this entails the following principle: The security of a Web application should be thought out at the stage of writing design specification, before the first line of code is written.

A person who writes the design specification should be competent in Web security. He or she should clearly understand what data should be filtered and how. In addition, he or she should understand why a particular filtration is required.

From the next chapters, you'll learn what data can be considered safe, in what cases you should set data to correct values or halt script execution, how the data should be changed to use hidden features of a script, and how you can benefit from other people's programming flaws.

There are a few types of vulnerabilities that are entirely programmers' fault.

These vulnerabilities cannot be foreseen in a design specification mainly because they are specific to the programming language. As a rule, every programming language has features or functions that should be used carefully. Provision for these nuances is the responsibility of a programmer, not of a manager writing a design specification.

For example, in C and C++, such a slippery issue is the use of printf(), strcpy(), and similar functions. Functions such as strcpy() copy blocks of memory without checking whether the copied data fit within the allocated space. However, this topic is beyond the scope of this book.

In PHP, a popular programming language for Web applications, a similar problem relates to automatic definition (registration) of global variables based on data received as GET, POST, and COOKIE parameters.
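The danger of automatic registration can be illustrated with a small sketch. The register_globals setting was removed in PHP 5.4, so the example imitates it with extract(). The handle_request() function, the parameter names, and the password are hypothetical.

```php
<?php
// Hypothetical request handler. $authorized is meant to be set only after
// a successful password check, but it is never initialized.
function handle_request(array $get, $register_globals)
{
    if ($register_globals) {
        // extract() imitates register_globals: every request parameter
        // becomes a variable in the current scope.
        extract($get);
    }
    $password = isset($get['password']) ? $get['password'] : '';
    if ($password === 'hypothetical-secret') {
        $authorized = true;
    }
    return !empty($authorized) ? "access granted" : "access denied";
}

// The attacker sends ?authorized=1 and no password at all.
echo handle_request(array('authorized' => '1'), true), "\n";   // access granted
echo handle_request(array('authorized' => '1'), false), "\n";  // access denied
```

Initializing the variable explicitly ($authorized = false;) before the check would close the hole even with automatic registration turned on.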

The next chapters describe how you can use vulnerabilities of this type, how you should eliminate them, and how you can write secure code.

Hacker Web Exploitation (Overview)

Introduction
Overview

These posts are about vulnerabilities in Web applications, that is, scripts and programs running on a server and available using HyperText Transfer Protocol (HTTP). I have tried to give you the most comprehensive information about common mistakes made by inexperienced Web programmers. Hackers can exploit these mistakes to obtain access to a system, gain higher privileges in it, or both.

Internet security is a vast topic. It would be impossible to cover it in one book.

These posts are only about Web applications, and they don't touch on the installation and configuration of server software, the use of firewalls and antiviruses, vulnerabilities in executable files, and other issues that relate to preventing hackers from obtaining privileges on a server without authentication. Therefore, this book is for Web programmers rather than system administrators responsible for the security of a server.

I demonstrate that improper Web programming results in vulnerable Web applications that can become the weakest components in server protection. "Holes" in these components can allow a hacker to bypass a complicated protection and obtain privileges on the server to investigate the server from the inside.

By protection I mean two types of protection: against changes to information and against unauthorized access to information.

Imagine a small Web site that contains only static data. You could say that the owner of this site has nothing to hide. There are no passwords or access rights. According to HTTP, the server sends data to a client without processing.

Leakage of information about the files located on the site or the server wouldn't be crucial. Even if an attacker accessed the files using File Transfer Protocol (FTP), rather than HTTP, he or she wouldn't benefit from it.

In this situation, the ability of an unauthorized user to change information is more dangerous than that person's ability to access it because the server doesn't store private data. The only exception might be directories protected with a password using the Web server tools.

Now imagine a more complicated system such as an e-shop. Server scripts are accessing a database that stores private data about clients, suppliers, and so on. In addition, this database can store confidential information such as users' credit card numbers.

Disclosure of the source code of the server scripts would also be dangerous. These scripts are likely to contain information sufficient for access to the database, that is, the login and the password. Even if they aren't stored unencrypted, the attacker would be able to disclose them. The source code of the scripts could be analyzed for vulnerabilities that would allow the attacker to obtain high privileges and control the server.

Therefore, leakage of information from this site would be more dangerous than from the static site. A hacker who has found a hole in this system is unlikely to change data in it. Rather, he or she would try to remain unnoticed and obtain commercial secrets to benefit from them.

So, the attacker would first decide whether he or she wants to change information on a server (deface the server, replenish his or her personal account, destroy a database, etc.) or collect information (dump the database, copy system files, etc.).

In any case, the attacker's goal is to obtain as much information about the server as possible and to obtain privileges on it.

A Web programmer should understand which type of attack he or she should protect the system against. In most cases, the programmer has to protect the data both from changes and from theft.

You, the programmer, should also be aware that a hacker can use holes in Web applications to gain control over the server. You shouldn't neglect protection even if the information on the server isn't valuable and its leakage or compromise wouldn't do harm. Be aware that a hacker's goal can be to control the server to use its computational resources. For example, a server can be used as a relay computer to send spam, scan other servers for vulnerabilities, or recover passwords from hashes.

So, the main principle of Web programming is that you should always write Web applications protected as well as possible. This isn't difficult. I hope this book will teach you how to write protected applications and turn vulnerabilities to your advantage.

Hacker Web Exploitation

Note: This is a new book. Again, I want to say that this is for the sake of knowledge. I will not be responsible for any illegal activity you engage in after reading this blog.