HTTP Headers
HTTP headers are additional information that is transferred along with both HTTP requests and responses. Requests and responses carry different headers. In the following sections both HTTP request and response headers will be examined.
HTTP headers provide many important elements of the browsing experience. For example, HTTP headers allow browsers to know the length and type of the data they are displaying. They also allow pages to be secure and can prompt for user and password information. Some of the specific types of information that can be found in the HTTP headers are summarized here:
- User authentication
- Data type
- Data length
- Cookies to maintain state
This book covers each of these uses for headers. HTTP request headers will be examined first.
HTTP Request Headers
HTTP request headers are sent as part of the HTTP request. These headers tell the web server nearly everything about the request. A GET request is made up entirely of headers. A POST request contains an extra block of data, immediately after the headers, that contains the posted arguments.
Listing 1.3 shows a typical HTTP request.
Listing 1.3: Typical Request Headers
GET /1/1/typical.php HTTP/1.1
Accept: */*
Accept-Language: en
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/418 (KHTML, like Gecko) Safari/417.9.3
Connection: keep-alive
Host: www.httprecipes.com
There are really two parts to the headers: the first line and the remaining header lines. The first line, which begins with the request type, is the most important line in the header block, and its format differs slightly from that of the other header lines. The request type can be GET, POST, HEAD, or one of the other, less frequently used request types. For ordinary page loads and form submissions, browsers use GET or POST. Following the request type is the file that is being requested. In the above request, the following URL is being requested:
http://www.httprecipes.com/1/1/typical.php
The above URL does not appear in complete URL form in the request. The "Host" header line names the web server that contains the file. The first line of the request shows the remainder of the URL, which in this case is /1/1/typical.php. Finally, the third thing that the first line provides is the version of the HTTP protocol being used. As of the writing of this book there are only two versions in widespread use:
- HTTP/1.1
- HTTP/1.0
This book deals only with HTTP 1.1. Because this book is about writing programs that connect to web servers, it assumes that HTTP 1.1 is being used, which is the version the Java HTTP classes use.
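To make the structure of the first line concrete, the following is a minimal sketch that splits a request line into its three space-delimited parts and rebuilds the full URL from a "Host" value. The class name RequestLine and its methods are hypothetical names chosen for this illustration, and the sketch assumes a plain http URL.

```java
// Hypothetical helper for illustration; not part of the Java HTTP classes.
final class RequestLine {
    final String method;   // request type, such as GET or POST
    final String target;   // the part of the URL after the host name
    final String version;  // protocol version, such as HTTP/1.1

    private RequestLine(String method, String target, String version) {
        this.method = method;
        this.target = target;
        this.version = version;
    }

    // Splits the first line of a request into its three space-delimited parts.
    static RequestLine parse(String line) {
        String[] parts = line.trim().split(" ");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Malformed request line: " + line);
        }
        return new RequestLine(parts[0], parts[1], parts[2]);
    }

    // Rebuilds the full URL from the value of the Host header.
    // Assumption: the request was made over plain http.
    String toUrl(String host) {
        return "http://" + host + target;
    }

    public static void main(String[] args) {
        RequestLine line = parse("GET /1/1/typical.php HTTP/1.1");
        System.out.println(line.method);
        System.out.println(line.toUrl("www.httprecipes.com"));
        System.out.println(line.version);
    }
}
```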
The lines after the first line make up the actual HTTP headers. Each header line is colon delimited: the header name is to the left of the colon and the header value is to the right. It is valid for the same header name to appear more than once in the same request.
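The colon-delimited format can be sketched in a few lines of Java. The example below is an illustration only: HeaderParser is a hypothetical class name, and the choice of a map from each name to a list of values is an assumption made here so that repeated header names are preserved. The line is split at the first colon only, because values such as dates contain colons of their own. Header names are compared without regard to case, since HTTP header names are case insensitive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of the Java HTTP classes.
final class HeaderParser {
    // Parses header lines (everything after the first line, up to but not
    // including the blank line) into a map of name -> list of values.
    static Map<String, List<String>> parse(List<String> lines) {
        // Case-insensitive keys, because header names are case insensitive.
        Map<String, List<String>> headers = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (String line : lines) {
            int colon = line.indexOf(':'); // split at the FIRST colon only
            if (colon <= 0) {
                throw new IllegalArgumentException("Malformed header line: " + line);
            }
            String name = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            // A repeated name adds another value rather than replacing the old one.
            headers.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
        return headers;
    }

    public static void main(String[] args) {
        Map<String, List<String>> headers = parse(Arrays.asList(
                "Accept: */*",
                "Accept-Language: en",
                "Host: www.httprecipes.com"));
        System.out.println(headers.get("host"));
    }
}
```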
The headers give a variety of information. Examining the User-Agent header, for example, shows the type of browser as well as the operating system. In the headers shown in Listing 1.3, the Safari browser was being used on the Macintosh platform. Safari is the built-in browser for the Macintosh platform.
The headers finally terminate with a blank line. If the request had been a POST, any posted data would follow the blank line. Even when there is no posted data, as is the case with a GET, the blank line is still required.
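As a rough sketch of what a program must send, the following builds the text of a GET request, including the terminating blank line. GetRequestBuilder is a hypothetical name used only for this illustration, and the particular headers included are an arbitrary choice. HTTP ends each line with a carriage return and line feed pair, so the blank line is simply one more such pair after the last header.

```java
// Hypothetical helper for illustration; not part of the Java HTTP classes.
final class GetRequestBuilder {
    // HTTP lines end with carriage return + line feed.
    private static final String CRLF = "\r\n";

    // Builds the complete text of a simple GET request for the given host and path.
    static String build(String host, String path) {
        StringBuilder request = new StringBuilder();
        request.append("GET ").append(path).append(" HTTP/1.1").append(CRLF); // first line
        request.append("Host: ").append(host).append(CRLF);
        request.append("Accept: */*").append(CRLF);
        request.append("Connection: close").append(CRLF);
        request.append(CRLF); // the required blank line that ends the headers
        return request.toString();
    }

    public static void main(String[] args) {
        System.out.print(build("www.httprecipes.com", "/1/1/typical.php"));
    }
}
```

A program would write this text to a socket connected to the web server. For a POST, the posted data would be written immediately after the final blank line.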
A web server should respond to every HTTP request from a web browser. The web server’s response is discussed in the next section.
HTTP Response Headers
When the web server responds to an HTTP request, HTTP response header lines are sent. The HTTP response headers look very similar to the HTTP request headers. Listing 1.4 shows the contents of typical HTTP response headers.
Listing 1.4: Typical Response Headers
HTTP/1.1 200 OK
Date: Sun, 02 Jul 2006 22:28:58 GMT
Server: Apache/2.0.40 (Red Hat Linux)
Last-Modified: Sat, 29 Jan 2005 04:13:19 GMT
ETag: "824319-509-c6d5c0"
Accept-Ranges: bytes
Content-Length: 1289
Connection: close
Content-Type: text/html
As can be seen from the above listing, at first glance, response headers look nearly the same as request headers. However, look at the first line.
Although the first line is space delimited as in the request, the information is different. The first line of the HTTP response headers contains the HTTP version and status information about the response. The HTTP version is reported as 1.1, and the status code, 200, means "OK": there was no error. This is also where the famous error code 404 (page not found) comes from.
Status codes can be grouped according to the digit in their hundreds position:
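The first line of a response can be taken apart in much the same way as the first line of a request. The sketch below uses a hypothetical class named StatusLine. It splits the line into at most three parts, because the text after the status code may itself contain spaces, as in "Not Found".

```java
// Hypothetical helper for illustration; not part of the Java HTTP classes.
final class StatusLine {
    final String version; // protocol version, such as HTTP/1.1
    final int code;       // numeric status code, such as 200 or 404
    final String reason;  // descriptive text, such as OK (may be empty)

    private StatusLine(String version, int code, String reason) {
        this.version = version;
        this.code = code;
        this.reason = reason;
    }

    // Parses the first line of an HTTP response.
    static StatusLine parse(String line) {
        // Limit of 3 keeps any spaces inside the descriptive text intact.
        String[] parts = line.trim().split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("Malformed status line: " + line);
        }
        int code = Integer.parseInt(parts[1]);
        String reason = parts.length == 3 ? parts[2] : "";
        return new StatusLine(parts[0], code, reason);
    }

    public static void main(String[] args) {
        StatusLine status = parse("HTTP/1.1 200 OK");
        System.out.println(status.version + " / " + status.code + " / " + status.reason);
    }
}
```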
- 1xx: Informational - Request received, continuing process
- 2xx: Success - The action was successfully received, understood, and accepted
- 3xx: Redirection - Further action must be taken in order to complete the request
- 4xx: Client Error - The request contains bad syntax or cannot be fulfilled
- 5xx: Server Error - The server failed to fulfill an apparently valid request
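Because the grouping depends only on the hundreds digit, a program can classify a status code with an integer division. The following is a small sketch of that idea; StatusCodes and describe are hypothetical names used for illustration.

```java
// Hypothetical helper for illustration; not part of the Java HTTP classes.
final class StatusCodes {
    // Returns the name of the group that a status code belongs to,
    // based on the digit in its hundreds position.
    static String describe(int code) {
        switch (code / 100) {
            case 1: return "Informational";
            case 2: return "Success";
            case 3: return "Redirection";
            case 4: return "Client Error";
            case 5: return "Server Error";
            default:
                throw new IllegalArgumentException("Not an HTTP status code: " + code);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(200));
        System.out.println(describe(404));
    }
}
```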
Immediately following the headers will be a blank line, just as was the case with HTTP requests. Following the blank line delimiter will be the data that was requested. It will be of the length specified in the Content-Length header. The Content-Length header in Listing 1.4 indicates a length of 1,289 bytes. For a list of HTTP codes, refer to Appendix G, “HTTP Response Codes.”
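The way the blank line and the Content-Length header work together can be sketched as follows. This is an illustration under stated assumptions, not a complete HTTP client: BodyReader is a hypothetical class, it assumes the response always carries a Content-Length header, and it ignores every other header. It reads lines until the blank line and then reads exactly the number of bytes the header announced.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the Java HTTP classes.
final class BodyReader {
    // Reads one line, dropping the line terminator. Returns "" at end of stream.
    private static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') break;          // end of line
            if (b != '\r') buffer.write(b); // skip the carriage return
        }
        return new String(buffer.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Skips the status line and headers, then returns the response data.
    // Assumption: a Content-Length header is present.
    static byte[] readBody(InputStream in) {
        try {
            readLine(in); // first line (status information), not needed here
            int length = -1;
            String line;
            while (!(line = readLine(in)).isEmpty()) { // blank line ends the headers
                int colon = line.indexOf(':');
                if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                    length = Integer.parseInt(line.substring(colon + 1).trim());
                }
            }
            if (length < 0) {
                throw new IOException("No Content-Length header found");
            }
            byte[] body = new byte[length];
            int offset = 0;
            while (offset < length) { // a single read may return fewer bytes than asked for
                int n = in.read(body, offset, length - offset);
                if (n == -1) throw new IOException("Stream ended before the full body was read");
                offset += n;
            }
            return body;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String response = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\nContent-Type: text/html\r\n\r\nhello";
        byte[] body = readBody(new ByteArrayInputStream(response.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println(new String(body, StandardCharsets.ISO_8859_1));
    }
}
```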




