    Setting the values of HTTP request headers is another common task of an HttpURLConnection object. The web browser sends browser headers, or HTTP request headers, to the web server. These headers are most commonly used for the following purposes:

  • Identifying the type of web browser
  • Transmitting any cookies
  • Facilitating HTTP authentication

    There are other things that can be accomplished with HTTP request headers; however, these are the most common. HTTP authentication will be covered in Chapter 5, “Secure HTTP Requests,” and cookies will be covered in Chapter 8, “Handling Sessions and Cookies”. Setting the type of browser will be covered later in this section.

Setting HTTP Request Headers

    The HttpURLConnection class provides several functions and methods that can be used to access the HTTP request headers. These functions and methods are shown in Table 4.1.

Table 4.1: HTTP Request Header Methods and Functions

Method or Function Name                         Purpose
addRequestProperty(String key, String value)    Adds a general request property specified by a key-value pair.
getRequestProperties()                          Returns an unmodifiable Map of general request properties for this connection.
getRequestProperty(String key)                  Returns the value of the named general request property for this connection.
setRequestProperty(String key, String value)    Sets the general request property.

    Usually, the only method from the above list that you will use is setRequestProperty. The others are useful when you need to query which values have already been set. If a header with the specified name already exists, setRequestProperty will overwrite it. The addRequestProperty method can be used to add more than one request header with the same name. Usually, you do not want to do this. Adding more than one header of the same name is useful when dealing with cookies, which are discussed in Chapter 8, “Handling Sessions and Cookies”.
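
    The difference between setting and adding a request property can be sketched as follows. This is a minimal illustration, not a complete bot: the class name RequestHeaderDemo, the example URL, and the choice of the Accept and Accept-Language headers are assumptions made for the example. No network traffic occurs, because the connection is never actually opened.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; the name and the header values are illustrative only.
final class RequestHeaderDemo {

    // Creates an unconnected HttpURLConnection and fills in request headers.
    static HttpURLConnection configure(String address) {
        try {
            HttpURLConnection http =
                (HttpURLConnection) URI.create(address).toURL().openConnection();

            // setRequestProperty replaces any earlier value for the same name.
            http.setRequestProperty("Accept", "text/plain");
            http.setRequestProperty("Accept", "text/html");

            // addRequestProperty keeps the earlier value and adds another one.
            http.addRequestProperty("Accept-Language", "en-US");
            http.addRequestProperty("Accept-Language", "fr");
            return http;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection http = configure("http://www.example.com/");

        // Query a single header by name.
        System.out.println("Accept: " + http.getRequestProperty("Accept"));

        // Query every header set so far; each name maps to a list of values.
        for (Map.Entry<String, List<String>> entry
                : http.getRequestProperties().entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

    After configure returns, the Accept header holds only the second value, while Accept-Language holds both values that were added.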

Identifying the Browser Type

    One of the HTTP request headers identifies the browser type that the user is using. Many web sites take this header into account. For example, some web sites are only designed to work with certain versions of Microsoft Internet Explorer. To make use of such sites, you need to change how HttpURLConnection reports the browser type.

    The browser type can be determined from the user-agent HTTP request header. You can easily set the value of this, or any other, HTTP request header using the setRequestProperty method. For example, to identify the bot as a browser of type “My Bot”, you would use the following command:

http.setRequestProperty("user-agent","My Bot");

    The user-agent header is often used to identify the bot. For example, each of the major search engines uses spiders to find pages to index. These search engine companies use the user-agent header to identify their spiders as search engine spiders, and not human users.

    When you write a bot of your own, you have some decisions to make with the user-agent header. You can either identify the bot, as seen above, or you can emulate one of the common browsers. If a web site requires a version of Internet Explorer, you will have to emulate Internet Explorer.
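
    The choice between the two approaches can be sketched as a small helper. This is only an illustration under stated assumptions: the class name BotIdentity, the emulateBrowser flag, and the example URL are hypothetical, and the browser string is an Internet Explorer 6 value of the kind listed in Table 4.2.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper; the class, constants, and flag are illustrative only.
final class BotIdentity {
    // Identifies the bot honestly, as in the command shown earlier.
    static final String BOT_AGENT = "My Bot";

    // Emulates Internet Explorer 6 on Windows.
    static final String IE6_AGENT =
        "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)";

    // Creates an unconnected HttpURLConnection with the chosen user-agent.
    static HttpURLConnection open(String address, boolean emulateBrowser) {
        try {
            HttpURLConnection http =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
            http.setRequestProperty("user-agent",
                emulateBrowser ? IE6_AGENT : BOT_AGENT);
            return http;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String address = "http://www.example.com/";
        System.out.println(open(address, false).getRequestProperty("user-agent"));
        System.out.println(open(address, true).getRequestProperty("user-agent"));
    }
}
```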

    Table 4.2 shows the header values that most major browsers use to identify themselves. As you can see, this header also communicates which operating system the user is running.

Table 4.2: Identities of Several Major Browsers

Browser                        User-Agent Header
Firefox v1.5 (PC)              Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4
Internet Explorer 6.0 (PC)     Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
Safari v2 (Mac)                Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/418.8 (KHTML, like Gecko) Safari/419.3
Firefox v1.5 (Mac)             Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4
Internet Explorer 5.1 (Mac)    Mozilla/4.0 (compatible; MSIE 5.14; Mac_PowerPC)
Java (PC/Mac)                  Java/1.5.0_06

    You will also notice that the above list includes Java as a browser. This is what Java reports to a web site if you do not override the user-agent header. It is usually better to override this value with something else.
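
    One way to see what a web site actually receives is to point HttpURLConnection at a throwaway local server and record the user-agent header that arrives. The sketch below assumes the JDK's built-in com.sun.net.httpserver package is available; the class name UserAgentEcho and the method observedUserAgent are hypothetical names chosen for the example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; it talks only to a server on the local machine.
final class UserAgentEcho {

    // Sends one GET request to a temporary local server and returns the
    // user-agent value the server received. Pass null to keep Java's default.
    static String observedUserAgent(String override) {
        AtomicReference<String> seen = new AtomicReference<>();
        try {
            // Port 0 asks the operating system for any free port.
            HttpServer server =
                HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                seen.set(exchange.getRequestHeaders().getFirst("User-Agent"));
                exchange.sendResponseHeaders(204, -1); // no response body
                exchange.close();
            });
            server.start();
            try {
                int port = server.getAddress().getPort();
                HttpURLConnection http = (HttpURLConnection) URI
                    .create("http://127.0.0.1:" + port + "/")
                    .toURL()
                    .openConnection(Proxy.NO_PROXY);
                if (override != null) {
                    http.setRequestProperty("user-agent", override);
                }
                http.getResponseCode(); // forces the request to be sent
                http.disconnect();
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println("Default:    " + observedUserAgent(null));
        System.out.println("Overridden: " + observedUserAgent("My Bot"));
    }
}
```

    Without an override, the reported value should contain Java/ followed by the version of the running JVM, which will differ from the Java/1.5.0_06 value shown in Table 4.2.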


Copyright 2005 - 2012 by Heaton Research, Inc. Heaton Research™ and Encog™ are trademarks of Heaton Research.