What is a URL
In the last section, I showed how to construct a URL object. You also saw that a MalformedURLException is thrown if an invalid URL is specified. To understand what makes a URL invalid, you should first understand the format of a valid one. URLs use the following format:
URL Scheme://Host/Path?Query
As seen above, the URL is made up of the following four components:
- Scheme
- Host
- Path
- Query
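As a minimal sketch of how these four components map onto Java's URL class, the following example splits a URL string into its parts. The class name UrlComponents and the helper method split are hypothetical names chosen for illustration. The sketch assumes the one-argument URL constructor, which newer JDKs mark as deprecated but still accept.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlComponents {
    // Hypothetical helper: returns {scheme, host, path, query} for a URL string.
    // The query element is null when the URL has no query.
    static String[] split(String spec) {
        try {
            URL url = new URL(spec);
            return new String[] {
                url.getProtocol(), // scheme, for example "http"
                url.getHost(),     // host, for example "www.httprecipes.com"
                url.getPath(),     // path, for example "/1/1/city.php"
                url.getQuery()     // query, for example "city=2"
            };
        } catch (MalformedURLException e) {
            // Rethrow as an unchecked exception to keep the sketch short.
            throw new IllegalArgumentException("Invalid URL: " + spec, e);
        }
    }

    public static void main(String[] args) {
        String[] parts = split("http://www.httprecipes.com/1/1/city.php?city=2");
        System.out.println("Scheme: " + parts[0]);
        System.out.println("Host:   " + parts[1]);
        System.out.println("Path:   " + parts[2]);
        System.out.println("Query:  " + parts[3]);
    }
}
```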
In the next few sections, each of these components will be discussed. We will start with the URL scheme.
URL Scheme
The scheme is the protocol that will be used to transfer data. For the purposes of this book, we will be dealing with the "http" and "https" schemes. Many of the more common schemes are listed in Table 3.1.
Table 3.1: Common URL Schemes
| Scheme | Name |
|---|---|
| http | HTTP resources |
| https | HTTP over SSL |
| ftp | File Transfer Protocol |
| mailto | E-mail address |
| ldap | Lightweight Directory Access Protocol lookups |
| file | Resources available on the local computer or over a local file sharing network |
| news | Usenet newsgroups |
| gopher | The Gopher protocol |
| telnet | The TELNET protocol |
| data | URL scheme for inserting small pieces of content in place |
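The scheme is also one of the things the URL class checks when it decides whether a URL is valid. The following sketch assumes a hypothetical helper named isAccepted that reports whether the URL constructor accepts a string. Which schemes are accepted depends on the protocol handlers available in the running JVM, so only "http" and a deliberately made-up scheme are shown here.

```java
import java.net.MalformedURLException;
import java.net.URL;

class SchemeCheck {
    // Hypothetical helper: true if the URL constructor accepts the string,
    // false if it throws MalformedURLException.
    static boolean isAccepted(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://www.httprecipes.com/",         // known scheme
            "nosuchscheme://www.httprecipes.com/", // made-up scheme, no handler
            "www.httprecipes.com"                  // no scheme at all
        };
        for (String sample : samples) {
            System.out.println(sample + " accepted: " + isAccepted(sample));
        }
    }
}
```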
Following the URL scheme, the URL’s host is specified. The URL host will be discussed in the next section.
URL Host
The host specifies to which server the HTTP request is to be directed. There are several different formats in which the URL host can be represented. First, it can be in the typical domain form, such as:
www.httprecipes.com
Second, it can be expressed as an IP address, such as:
127.0.0.1
Finally, it can be expressed as a symbolic name that is resolved through the “hosts” file on the computer, such as:
localhost
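Whichever form is used, the host must ultimately be resolved to an IP address before a request can be sent. The sketch below assumes two hypothetical helpers: host extracts the host portion of a URL, and resolve looks up its address with InetAddress.getByName. An IP address literal is returned without any name lookup. Resolving a domain name or “localhost” depends on the configuration of the machine running the code, so the example only resolves the literal address.

```java
import java.net.InetAddress;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.UnknownHostException;

class HostLookup {
    // Hypothetical helper: returns the host portion of a URL string.
    static String host(String spec) {
        try {
            return new URL(spec).getHost();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Invalid URL: " + spec, e);
        }
    }

    // Hypothetical helper: resolves the URL's host to a textual IP address.
    static String resolve(String spec) {
        try {
            return InetAddress.getByName(host(spec)).getHostAddress();
        } catch (UnknownHostException e) {
            throw new IllegalStateException("Cannot resolve host in: " + spec, e);
        }
    }

    public static void main(String[] args) {
        // The three host forms described above.
        System.out.println(host("http://www.httprecipes.com/"));
        System.out.println(host("http://127.0.0.1/"));
        System.out.println(host("http://localhost/"));

        // An IP literal resolves to itself without consulting DNS.
        System.out.println(resolve("http://127.0.0.1/"));
    }
}
```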
Following the URL host are the URL path and query, which will be discussed in the next section.
URL Path and Query
The path specifies which file to retrieve, or which script to run, on the server. The query specifies parameters to be passed to that file or script. The query immediately follows the path and is separated from it by a question mark (?). The following URL specifies only a path:
http://www.httprecipes.com/1/1/cities.php
The above URL specifies a path of “/1/1/cities.php”.
Parameters can be passed using the query portion of the URL. The following URL demonstrates this idea.
http://www.httprecipes.com/1/1/city.php?city=2
The above URL passes one parameter using the query string. The parameter is named “city” and has the value “2”. It is also possible to pass multiple parameters, in which case they are separated by the ampersand symbol (&). The following URL makes use of the ampersand to pass two parameters. These two parameters are named “city” and “zipcode”.
http://www.httprecipes.com/1/1/city.php?city=2&zipcode=63017
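To make the structure of the query concrete, the following sketch splits a query string into its individual parameters. The class name QueryParams and the method parse are hypothetical. The sketch assumes simple name=value pairs and does not decode special characters.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryParams {
    // Hypothetical helper: returns the query parameters of a URL in the
    // order they appear. Values are returned exactly as written in the URL.
    static Map<String, String> parse(String spec) {
        URL url;
        try {
            url = new URL(spec);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Invalid URL: " + spec, e);
        }

        Map<String, String> params = new LinkedHashMap<>();
        String query = url.getQuery(); // null when the URL has no query
        if (query == null || query.isEmpty()) {
            return params;
        }

        // Parameters are separated by '&'; name and value by the first '='.
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            String name = (eq < 0) ? pair : pair.substring(0, eq);
            String value = (eq < 0) ? "" : pair.substring(eq + 1);
            params.put(name, value);
        }
        return params;
    }

    public static void main(String[] args) {
        String spec = "http://www.httprecipes.com/1/1/city.php?city=2&zipcode=63017";
        for (Map.Entry<String, String> entry : parse(spec).entrySet()) {
            System.out.println(entry.getKey() + " = " + entry.getValue());
        }
    }
}
```

A LinkedHashMap is used so that the parameters keep the order in which they appear in the URL. If a name occurs more than once, this sketch keeps only the last value.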
So far the URLs we have examined all contain standard ASCII (American Standard Code for Information Interchange) letters and numbers. In the next section, you will see how to encode special characters into the URL.












