Examining HTTP Requests
This section examines the requests that pass between the web browser and the web server. The first step is to examine the HTTP requests for a single, typical web page, which is covered in the next section. Understanding how a single page is transmitted is key to seeing how that page fits into a typical surfing session.
A Typical Web Page
A typical web page is displayed in the browser by retrieving its text and images through HTTP requests. One of the first things to understand about HTTP requests is that at the heart of each request is a Uniform Resource Locator (URL). The URL tells the web server which file should be sent. The URL could point to a text file, such as a Hypertext Markup Language (HTML) file, or it could point to an image file, such as a GIF or JPEG.
URLs are what the web user types into a browser to access a web page. Chapter 3, “Simple HTTP Requests”, will explain what each part of the URL is for. For now, they simply identify a resource, somewhere on the Internet, that is being requested.
The “typical webpage” for this example is at the following URL:
http://www.httprecipes.com/1/1/typical.php
The actual contents of the “typical webpage” are shown in Figure 1.3.
Figure 1.3: A Typical Webpage

As can be seen in Figure 1.3, four pictures are displayed in the middle of the page. There are actually a total of five pictures, if the Heaton Research logo is counted. When a web page such as this is opened, the HTML, as well as all of the images, must be downloaded.
The first HTTP request is always for the URL that was typed into the browser. This URL usually points to an HTML page. The HTML page is retrieved with a single HTTP request, and the text of this HTML file is all that is transferred by that request.
It is important to note that only one physical file is transferred per HTTP request. The HTML page will be downloaded and examined for other embedded files that are required to display the web page. Listing 1.1 shows the HTML of the “typical web page”.
Listing 1.1: HTML for the Typical Web Page
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML>
<HEAD>
<TITLE>HTTP Recipes</TITLE>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta http-equiv="Cache-Control" content="no-cache">
</HEAD>
<BODY>
<table border="0"><tr><td>
<a href="http://www.httprecipes.com/"><img src="/images/logo.gif" alt="Heaton Research Logo" border="0"></a>
</td><td valign="top">Heaton Research, Inc.<br>
HTTP Recipes Test Site
</td></tr>
</table>
<hr><p><small>[<a href="/">Home</a>:<a href="/1/">First Edition</a>:<a href="/1/1/">Chaper 1</a>]</small></p>
<h1>Typical Web Page</h1>
<p>Vacation pictures.</p>
<table border=1>
<tr><td><img src="beach.jpg" height="240" width="320" alt="Beach">
</td><td><img src="ship.jpg" height="240" width="320" alt="Battleship"></td></tr>
<tr><td><img src="birds.jpg" height="240" width="320" alt="Birds">
</td><td><img src="flower.jpg" height="240" width="320" alt="Beach Flowers"></td></tr>
</table>
<hr>
<p>Copyright 2006 by <a href="http://www.heatonresearch.com/">Heaton Research, Inc.</a></p>
</BODY>
</HTML>
As can be seen from the above listing, there are a total of five <img> HTML tags. The following five tags are found:
- <img src="/images/logo.gif" alt="Heaton Research Logo" border="0">
- <img src="beach.jpg" height="240" width="320" alt="Beach">
- <img src="ship.jpg" height="240" width="320" alt="Battleship">
- <img src="birds.jpg" height="240" width="320" alt="Birds">
- <img src="flower.jpg" height="240" width="320" alt="Beach Flowers">
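The tags listed above can also be located programmatically. The following is a minimal sketch, assuming Python and its standard html.parser module, which is not necessarily how any particular browser scans a page. The class name ImgSrcCollector and the shortened copy of the markup are illustrative assumptions and do not come from the test site.

```python
from html.parser import HTMLParser


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it encounters.

    ImgSrcCollector is a hypothetical helper written for this example.
    """

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value is not None:
                    self.sources.append(value)


# A shortened copy of the markup from Listing 1.1 (only the image-bearing parts).
html = """
<a href="http://www.httprecipes.com/"><img src="/images/logo.gif" alt="Heaton Research Logo" border="0"></a>
<table border=1>
<tr><td><img src="beach.jpg" height="240" width="320" alt="Beach"></td>
<td><img src="ship.jpg" height="240" width="320" alt="Battleship"></td></tr>
<tr><td><img src="birds.jpg" height="240" width="320" alt="Birds"></td>
<td><img src="flower.jpg" height="240" width="320" alt="Beach Flowers"></td></tr>
</table>
"""

collector = ImgSrcCollector()
collector.feed(html)

# Each collected src value corresponds to one additional HTTP request.
for src in collector.sources:
    print(src)
```

Each value collected this way stands for one further HTTP request, so the page in this example needs one request for the HTML plus five for the images.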
Once the HTML has been downloaded, it is scanned for <img> tags. These <img> tags will cause other requests to be generated to download the images. The above tags are converted to the following five URLs:
- http://www.httprecipes.com/images/logo.gif
- http://www.httprecipes.com/1/1/beach.jpg
- http://www.httprecipes.com/1/1/ship.jpg
- http://www.httprecipes.com/1/1/birds.jpg
- http://www.httprecipes.com/1/1/flower.jpg
As can be seen from the above list, the requests are given in fully qualified form. The URL for the file beach.jpg is given in the form:
http://www.httprecipes.com/1/1/beach.jpg
The URL would not be in the form “beach.jpg” as it is represented in the HTML file. Since the web server has no idea what page is currently being browsed, a web browser must fully qualify every request that is sent.
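The conversion from the short src values to fully qualified URLs can be sketched with the urljoin function from Python's standard urllib.parse module. This is only an illustration of the resolution rule, not a description of how a specific browser implements it. The variable names page_url, sources, and resolved are assumptions made for this example.

```python
from urllib.parse import urljoin

# The address of the page that contains the <img> tags.
page_url = "http://www.httprecipes.com/1/1/typical.php"

# The src values exactly as they appear in the HTML.
sources = ["/images/logo.gif", "beach.jpg", "ship.jpg", "birds.jpg", "flower.jpg"]

# A value starting with "/" is resolved against the site root;
# a bare file name is resolved against the directory of the page.
resolved = [urljoin(page_url, src) for src in sources]

for url in resolved:
    print(url)
```

Note that the logo, which begins with a slash, resolves to the /images/ directory at the root of the site, while the four photographs resolve to the /1/1/ directory that holds the page itself.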
A Typical Surfing Session
Once the user is browsing the “typical web page,” examined in the last section, they will not likely stay there long. The typical web user will “surf,” and visit a large number of pages. The typical web page example contains five links that the user may choose to follow. All of these links are created with anchor tags <a>. The following anchor tags are found in Listing 1.1:
- <a href="http://www.httprecipes.com/">
- <a href="/">
- <a href="/1/">
- <a href="/1/1/">
- <a href="http://www.heatonresearch.com/">
Just as was done with the <img> tags, the above URLs must be converted into their fully qualified form. The above list, when converted into fully qualified form, will give the following five URLs:
- http://www.httprecipes.com/
- http://www.httprecipes.com/
- http://www.httprecipes.com/1/
- http://www.httprecipes.com/1/1/
- http://www.heatonresearch.com/
As can be seen, some of the <a> tags convert to the same target. For example, two of the tags will open the URL http://www.httprecipes.com/.
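The same resolution step can be applied to the href values of the anchor tags, which also makes the duplicate targets easy to spot. The sketch below again assumes Python's standard urllib.parse module. The names page_url, hrefs, resolved, and unique_targets are illustrative only.

```python
from urllib.parse import urljoin

# The address of the page that contains the <a> tags.
page_url = "http://www.httprecipes.com/1/1/typical.php"

# The href values exactly as they appear in Listing 1.1.
hrefs = [
    "http://www.httprecipes.com/",
    "/",
    "/1/",
    "/1/1/",
    "http://www.heatonresearch.com/",
]

# An href that is already fully qualified is returned unchanged by urljoin.
resolved = [urljoin(page_url, href) for href in hrefs]

# dict.fromkeys removes duplicates while keeping the original order.
unique_targets = list(dict.fromkeys(resolved))

for url in unique_targets:
    print(url)
```

Five anchor tags therefore lead to only four distinct targets, because the logo link and the "Home" link both resolve to the root of the site.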
When a user chooses one of the links, the URL is moved to the address line of the browser. The new URL will be handled as if the user had specifically requested the page by typing the URL into the address line of the browser. This process repeats as the user selects more and more pages.
So far, HTTP requests have only been discussed as abstract concepts. The actual makeup of an HTTP request has not yet been discussed. This will be covered in the next section, where the structure of an HTTP request will be examined.












