Recipes

    This chapter includes two recipes, which demonstrate the following:

  • Determining if a URL uses HTTPS
  • Using HTTP authentication

    The first recipe will introduce you to some of the things that can be done with the HttpsURLConnection class. The second recipe shows how to access a site that uses HTTP authentication.

Recipe #5.1: Is a URL HTTPS

    This recipe is a fairly simple example of using the HttpsURLConnection class. This example simply checks a URL connection to see if it is HTTPS or not. I use this code when I need to be sure that a connection is being made over a secure link. Unless you care about examining HTTPS certificate chains and cipher suites directly, this is probably the only use you will have for the HttpsURLConnection object.

    As previously mentioned, you will not normally use the HttpsURLConnection class directly. Because HttpsURLConnection is a subclass of HttpURLConnection, you will normally just use what the URL object returns, and not be concerned whether you are reading with HTTP or HTTPS. Only when you care about the specifics of HTTPS will you need to use HttpsURLConnection.

    This program is fairly short, so it is implemented entirely in the main method. This HTTPS checker program is shown below in Listing 5.1.

Listing 5.1: Is a Connection HTTPS (IsHTTPS.java)

package com.heatonresearch.httprecipes.ch5.recipe1;

import java.io.*;
import java.net.*;

import javax.net.ssl.*;

/**
 * Recipe #5.1: Is URL HTTPS?
 * Copyright 2007 by Jeff Heaton(jeff@jeffheaton.com)
 *
 * HTTP Programming Recipes for Java Bots
 * ISBN: 0-9773206-6-9
 * http://www.heatonresearch.com/articles/series/16/
 *
 * This recipe shows how to determine if a URL is using
 * the HTTPS protocol.
 *
 * This software is copyrighted. You may use it in programs
 * of your own, without restriction, but you may not
 * publish the source code without the author's permission.
 * For more information on distributing this code, please
 * visit:
 *    http://www.heatonresearch.com/hr_legal.php
 *
 * @author Jeff Heaton
 * @version 1.1
 */
public class IsHTTPS
{
  
  /**
   * Typical Java main method, create an object, and then
   * start the object passing arguments. If insufficient 
   * arguments are provided, then display startup 
   * instructions.
   * 
   * @param args Program arguments.
   */
  public static void main(String args[])
  {
    String strURL = "";
    
    // obtain a URL to use
    if (args.length < 1)
    {
      strURL = "https://www.httprecipes.com/1/5/secure/";
    } else
    {
      strURL = args[0];
    }

    URL url;
    try
    {
      url = new URL(strURL);
      URLConnection conn = url.openConnection();
      conn.connect();
      
      // see what type of object was returned
      if (conn instanceof HttpsURLConnection)
      {
        System.out.println("Valid HTTPS URL");
      } else if (conn instanceof HttpURLConnection)
      {
        System.out.println("Valid HTTP URL");
      } else
      {
        System.out.println("Unknown protocol URL");
      }

    } catch (MalformedURLException e)
    {
      System.out.println("Invalid URL");
    } catch (IOException e)
    {
      System.out.println("Error connecting to URL");
    }
  }

}

    This program can be passed a URL to access on the command line; or, if no URL is provided, the program will default to the following URL:

https://www.httprecipes.com/1/5/secure/

    You can also specify a URL. For example, to run this recipe with the URL of http://www.httprecipes.com/, you would use the following command:

IsHTTPS http://www.httprecipes.com

    The above command simply shows the abstract format to call this recipe, with the appropriate parameters. For exact information on how to run this recipe, refer to Appendix B, C, or D, depending on the operating system you are using.

    This program begins by obtaining a URL. The URL is either provided from the command line, or defaults to a page on the HTTP Recipes site. The following lines of code do this:

String strURL = "";

// Obtain a URL to use.
if (args.length < 1)
{
  strURL = "https://www.httprecipes.com/1/5/secure/";
} else
{
  strURL = args[0];
}

    Next, a URL object is created for the specified URL. A connection object is then obtained with a call to the openConnection method on the URL object.

URL url;
try
{
  url = new URL(strURL);
  URLConnection conn = url.openConnection();

    Once the connection object has been obtained, a connection is made. This will throw an IOException if the connection cannot be established, for example when the website is not accepting connections.

conn.connect();

    Next, the program will check to see what type of object was returned. If an HttpsURLConnection object is returned, the connection is using HTTPS. If an HttpURLConnection object is returned, the connection is using HTTP. If it is neither of these two, then the program reports that it does not know what sort of a connection is being used.

// See what type of object was returned.
if (conn instanceof HttpsURLConnection)
{
  System.out.println("Valid HTTPS URL");
} else if (conn instanceof HttpURLConnection)
{
  System.out.println("Valid HTTP URL");
} else
{
  System.out.println("Unknown protocol URL");
}

    If an invalid URL was provided, then a MalformedURLException will be thrown.

} catch (MalformedURLException e)
{
  System.out.println("Invalid URL");

    If any other sort of error occurs while connecting to the URL, an IOException will be thrown.

} catch (IOException e)
{
  System.out.println("Error connecting to URL");
}

    This recipe will be rarely used; however, it demonstrates one practical purpose of using the HttpsURLConnection object directly.
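
    As a variation on this recipe, the sketch below performs the same instanceof test without calling connect. It relies on openConnection only creating the connection object; no network traffic should occur until connect is called or data is read. The class name UrlKind and the method name classify are hypothetical names chosen for this illustration and are not part of the recipe's source code.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;

import javax.net.ssl.HttpsURLConnection;

// Hypothetical helper, not part of the recipe's source code.
class UrlKind
{
  /**
   * Classify a URL by the type of connection object that Java
   * creates for it. The server is never contacted.
   *
   * @param strURL The URL to classify.
   * @return "HTTPS", "HTTP", "OTHER", "INVALID" or "ERROR".
   */
  static String classify(String strURL)
  {
    try
    {
      URL url = new URL(strURL);
      URLConnection conn = url.openConnection();

      // HttpsURLConnection is tested first because it is a
      // subclass of HttpURLConnection.
      if (conn instanceof HttpsURLConnection)
        return "HTTPS";
      else if (conn instanceof HttpURLConnection)
        return "HTTP";
      else
        return "OTHER";
    } catch (MalformedURLException e)
    {
      // the string could not be parsed, or its protocol is unknown
      return "INVALID";
    } catch (IOException e)
    {
      return "ERROR";
    }
  }

  public static void main(String args[])
  {
    System.out.println(classify("https://www.httprecipes.com/1/5/secure/"));
    System.out.println(classify("http://www.httprecipes.com/"));
  }
}
```

    Because this check is made before connecting, it says nothing about whether the server is reachable. It only reports which protocol the URL asks for.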

Recipe #5.2: HTTP Authentication

    The next recipe is very useful if you need to retrieve data from a site that requires HTTP authentication. The code downloads the contents of a URL and saves them to a local file. This recipe can be used either in whole or in part.

    Used in its entirety, this recipe provides a method, named download. This method requires the URL to download, the local file to save to, and the user id and password. The download method will then download the contents. If the user id or password is invalid, an exception will be thrown.

    If your application doesn’t call for downloading directly to a file, you could use part of this recipe. Perhaps your application needs only to parse data at an HTTP authenticated site. If this is the case, you should copy the addAuthHeader method. This method requires an HttpURLConnection object, and a user id and password. The addAuthHeader method then adds the correct HTTP request header for the specified user id and password.

    I have used this recipe in both ways for several bots that required HTTP authenticated access. The program is shown in Listing 5.2.

Listing 5.2: Download Authenticated URL (AuthDownloadURL.java)

package com.heatonresearch.httprecipes.ch5.recipe2;

import java.net.*;
import java.io.*;

/**
 * Recipe #5.2: Downloading a URL(text or binary)
 * Copyright 2007 by Jeff Heaton(jeff@jeffheaton.com)
 *
 * HTTP Programming Recipes for Java Bots
 * ISBN: 0-9773206-6-9
 * http://www.heatonresearch.com/articles/series/16/
 *
 * This recipe shows how to download a text or binary file,
 * using HTTP authentication.
 *
 * This software is copyrighted. You may use it in programs
 * of your own, without restriction, but you may not
 * publish the source code without the author's permission.
 * For more information on distributing this code, please
 * visit:
 *    http://www.heatonresearch.com/hr_legal.php
 *
 * @author Jeff Heaton
 * @version 1.1
 */
public class AuthDownloadURL
{
  public static int BUFFER_SIZE = 8192;

  /**
   * Download either a text or binary file from a URL.
   * The URL's headers will be scanned to determine the
   * type of file.
   * 
   * @param remoteURL The URL to download from.
   * @param localFile The local file to save to.
   * @param uid The user id.
   * @param pwd The password.
   * @throws IOException Exception while downloading.
   */
  public void download(URL remoteURL, File localFile, String uid, String pwd)
      throws IOException
  {
    HttpURLConnection http = (HttpURLConnection) remoteURL.openConnection();
    addAuthHeader(http, uid, pwd);
    InputStream is = http.getInputStream();
    OutputStream os = new FileOutputStream(localFile);
    // A response without a Content-Type header is treated as binary.
    String type = http.getHeaderField("Content-Type");
    if (type != null && type.toLowerCase().trim().startsWith("text"))
      downloadText(is, os);
    else
      downloadBinary(is, os);
    is.close();
    os.close();
    http.disconnect();
  }

  /**
   * Add HTTP authentication headers to the specified HttpURLConnection
   * object.
   * @param http The HTTP connection.
   * @param uid The user id.
   * @param pwd The password.
   */
  private void addAuthHeader(HttpURLConnection http, String uid, String pwd)
  {
    String hdr = uid + ":" + pwd;
    String encode = base64Encode(hdr);
    http.addRequestProperty("Authorization", "Basic " + encode);
  }

  /**
   * Encodes a string in base64.
   *
   * @param s      The string to encode.
   * @return The encoded string.
   */
  static public String base64Encode(String s)
  {
    ByteArrayOutputStream bout = new ByteArrayOutputStream();

    Base64OutputStream out = new Base64OutputStream(bout);
    try
    {
      out.write(s.getBytes());
      out.flush();
    } catch (IOException e)
    {
    }

    return bout.toString();
  }

  /**
   * Overloaded version of download that accepts strings,
   * rather than URL objects.
   * 
   * @param remoteURL The URL to download from.
   * @param localFile The local file to save to.
   * @throws IOException Exception while downloading.
   */
  public void download(String remoteURL, String localFile, String uid,
      String pwd) throws IOException
  {
    download(new URL(remoteURL), new File(localFile), uid, pwd);
  }

  /**
   * Download a text file, which means convert the line
   * ending characters to the correct type for the 
   * operating system that is being used.
   * 
   * @param is The input stream, which is the URL.
   * @param os The output stream, a local file.
   * @throws IOException Exception while downloading.
   */
  private void downloadText(InputStream is, OutputStream os) throws IOException
  {
    byte lineSep[] = System.getProperty("line.separator").getBytes();
    int ch = 0;
    boolean inLineBreak = false;
    boolean hadLF = false;
    boolean hadCR = false;

    do
    {
      ch = is.read();
      if (ch != -1)
      {
        if ((ch == '\r') || (ch == '\n'))
        {
          inLineBreak = true;
          if (ch == '\r')
          {
            if (hadCR)
              os.write(lineSep);
            else
              hadCR = true;
          } else
          {
            if (hadLF)
              os.write(lineSep);
            else
              hadLF = true;
          }
        } else
        {
          if (inLineBreak)
          {
            os.write(lineSep);
            hadCR = hadLF = inLineBreak = false;
          }
          os.write(ch);
        }
      }
    } while (ch != -1);
  }

  /**
   * Download a binary file.  Which means make an exact 
   * copy of the incoming stream.
   * 
   * @param is The input stream, which is the URL.
   * @param os The output stream, a local file.
   * @throws IOException Exception while downloading.
   */
  private void downloadBinary(InputStream is, OutputStream os)
      throws IOException
  {
    byte buffer[] = new byte[BUFFER_SIZE];

    int size = 0;

    do
    {
      size = is.read(buffer);
      if (size != -1)
        os.write(buffer, 0, size);
    } while (size != -1);
  }

  /**
   * Typical Java main method, create an object, and then
   * start the object passing arguments. If insufficient 
   * arguments are provided, then display startup 
   * instructions.
   * 
   * @param args Program arguments.
   */
  public static void main(String args[])
  {
    try
    {
      AuthDownloadURL d = new AuthDownloadURL();

      if (args.length != 4)
      {
        d.download("https://www.httprecipes.com/1/5/secure/", 
            "./test.html",
            "testuser", 
            "testpassword");
      } else
      {

        d.download(args[0], args[1], args[2], args[3]);
      }
    } catch (Exception e)
    {
      e.printStackTrace();
    }
  }
}

    This program can be passed a URL, local filename, user id and password on the command line. If these four parameters are not all provided, the program will default to the following URL:

https://www.httprecipes.com/1/5/secure/

    The program will also default to a local file of test.html, a user id of testuser, and a password of testpassword.

    You can also specify a URL. For example, to run this recipe with the URL of http://www.httprecipes.com/, a local file of index.html, a user id of user and a password of password, you would use the following command:

AuthDownloadURL http://www.httprecipes.com ./index.html user password

    The above command simply shows the abstract format to call this recipe, with the appropriate parameters. For exact information on how to run this recipe, refer to Appendix B, C, or D, depending on the operating system you are using.

    This recipe is very similar to Recipe 4.3, except that it can download from an HTTP authenticated site as well as a regular site. Only the new code relating to HTTP authentication will be discussed here. If you would like to review how this recipe actually downloads a binary or text file, refer to Recipe 4.3. The downloading code is the same in both recipes.

    To access an HTTP authenticated site, the program makes use of a method called addAuthHeader. This method adds the Authorization header to the HttpURLConnection object. This process is shown here.

String hdr = uid + ":" + pwd;

    First, the addAuthHeader method builds the header. This header is very simple: it is just the user id, followed by a colon (:), followed by the password.

String encode = base64Encode(hdr);
http.addRequestProperty("Authorization", "Basic " + encode);

    The user id and password are not placed in the header as plain text; they are encoded into base-64. Base-64 is an encoding, not encryption: anyone who captures the header can decode it without a key, so it provides no real security. At most, it keeps the password from being readable at a glance by someone casually examining packets.
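
    To show how little protection base-64 offers, the following sketch reverses the value of an Authorization header using the java.util.Base64 class that ships with Java 8 and later. The names BasicAuthDecoder and decodeCredentials are hypothetical, the use of UTF-8 is an assumption, and the sample credentials are the example pair from the HTTP Basic authentication specification rather than anything used by the recipe site.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the recipe's source code.
class BasicAuthDecoder
{
  /**
   * Recover the "uid:pwd" string from the value of an
   * Authorization header that uses the Basic scheme.
   *
   * @param headerValue The header value, such as "Basic QWxh...".
   * @return The decoded user id and password, separated by a colon.
   */
  static String decodeCredentials(String headerValue)
  {
    String prefix = "Basic ";
    if (!headerValue.startsWith(prefix))
      throw new IllegalArgumentException("Not a Basic authorization header");

    // everything after the scheme name is the base-64 text
    String encoded = headerValue.substring(prefix.length()).trim();
    byte raw[] = Base64.getDecoder().decode(encoded);

    // assumption: the credentials were converted to bytes as UTF-8
    return new String(raw, StandardCharsets.UTF_8);
  }

  public static void main(String args[])
  {
    System.out.println(decodeCredentials("Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="));
  }
}
```

    No key or secret is needed at any point, which is why the header should only be sent over a connection that is itself encrypted.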

    When HTTP authentication is combined with HTTPS, it is quite effective. HTTP authentication requires an id and password, and HTTPS encrypts the entire request, password included, thus preventing someone from examining the password.

    Base-64 is just another number system, like decimal (base 10) or hexadecimal (base 16). To convert to base-64, this program uses the base64Encode method. This method will now be examined.

    The base64Encode method begins by creating a ByteArrayOutputStream that will hold the base-64 encoded string.

ByteArrayOutputStream bout = new ByteArrayOutputStream();

    Next, a Base64OutputStream object is created. This recipe also provides this class. The Base64OutputStream constructor receives an output stream argument, to which the object writes the base-64 encoded data. You then call the write method of the Base64OutputStream object, and give it the data you want encoded. This is shown below:

Base64OutputStream out = new Base64OutputStream(bout);
try
{
  out.write(s.getBytes());
  out.flush();

    As you can see, the above lines of code create the Base64OutputStream object, and then write the string to it. The stream is then flushed, ensuring all data is encoded and is not being buffered.

    It is unlikely that an exception will occur, since this is a completely “in memory” operation. However, if an exception does occur, the following line of code will catch it:

} catch (IOException e)
{
}

    Finally, we return the converted base-64 string.

return bout.toString();
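
    On Java 8 and later, the same header value can be produced without a custom stream class, because the standard library includes java.util.Base64. The sketch below is an alternative to the recipe's base64Encode method, not the author's code. The names BasicAuthHeader and headerValue are hypothetical, and the choice of UTF-8 when converting the string to bytes is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the recipe's source code.
class BasicAuthHeader
{
  /**
   * Build the value of the Authorization header for HTTP
   * Basic authentication.
   *
   * @param uid The user id.
   * @param pwd The password.
   * @return The header value, beginning with "Basic ".
   */
  static String headerValue(String uid, String pwd)
  {
    String hdr = uid + ":" + pwd;

    // assumption: UTF-8 is used to turn the string into bytes
    byte raw[] = hdr.getBytes(StandardCharsets.UTF_8);
    return "Basic " + Base64.getEncoder().encodeToString(raw);
  }

  public static void main(String args[])
  {
    System.out.println(headerValue("testuser", "testpassword"));
  }
}
```

    The returned string can be passed as the second argument of addRequestProperty, in place of the value that addAuthHeader builds.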

    The Base64OutputStream class described above is shown below in Listing 5.3.

Listing 5.3: Base-64 Output Stream (Base64OutputStream.java)

package com.heatonresearch.httprecipes.ch5.recipe2;

import java.io.*;
/**
 * Base64 Output Stream
 * Copyright 2007 by Jeff Heaton(jeff@jeffheaton.com)
 *
 * HTTP Programming Recipes for Java Bots
 * ISBN: 0-9773206-6-9
 * http://www.heatonresearch.com/articles/series/16/
 *
 * Class encodes data using Base64.  It is used for
 * HTTP authentication.
 *
 * This software is copyrighted. You may use it in programs
 * of your own, without restriction, but you may not
 * publish the source code without the author's permission.
 * For more information on distributing this code, please
 * visit:
 *    http://www.heatonresearch.com/hr_legal.php
 *
 * @author Jeff Heaton
 * @version 1.1
 */
class Base64OutputStream extends FilterOutputStream
{

  /**
   * The constructor.
   *
   * @param out The stream used to write to.
   */
  public Base64OutputStream(OutputStream out)
  {
    super(out);
  }

  /**
   * Write a byte to be encoded.
   *
   * @param c The character to be written.
   * @exception java.io.IOException
   */
  public void write(int c) throws IOException
  {
    buffer[index] = c;
    index++;
    if (index == 3)
    {
      super.write(toBase64[(buffer[0] & 0xfc) >> 2]);
      super.write(toBase64[((buffer[0] & 0x03) << 4)
          | ((buffer[1] & 0xf0) >> 4)]);
      super.write(toBase64[((buffer[1] & 0x0f) << 2)
          | ((buffer[2] & 0xc0) >> 6)]);
      super.write(toBase64[buffer[2] & 0x3f]);

      index = 0;
    }
  }

  /**
   * Ensure all bytes are written.
   *
   * @exception java.io.IOException
   */
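  public void flush() throws IOException
  {
    if (index == 1)
    {
      // one leftover byte: two digits, then two padding characters
      super.write(toBase64[(buffer[0] & 0xfc) >> 2]);
      super.write(toBase64[(buffer[0] & 0x03) << 4]);
      super.write('=');
      super.write('=');
    } else if (index == 2)
    {
      // two leftover bytes: three digits, then one padding character
      super.write(toBase64[(buffer[0] & 0xfc) >> 2]);
      super.write(toBase64[((buffer[0] & 0x03) << 4)
          | ((buffer[1] & 0xf0) >> 4)]);
      super.write(toBase64[(buffer[1] & 0x0f) << 2]);
      super.write('=');
    }
    // reset so that a second flush (close calls flush) writes nothing more
    index = 0;
    super.flush();
  }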

  /**
   * Allowable characters for base-64.
   */
  private static char[] toBase64 = { 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H',
      'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V',
      'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
      'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x',
      'y', 'z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '+', '/' };

  /**
   * Current index.
   */
  private int index = 0;

  /**
   * Outbound buffer.
   */
  private int buffer[] = new int[3];

}

    To convert to base-64, this program makes use of an array that holds the entire base-64 alphabet. For the normal number system, base 10, this alphabet would be “0”, “1”, “2”, “3”, “4”, “5”, “6”, “7”, “8”, and “9”. For hexadecimal (base 16) the alphabet would be “0”, “1”, “2”, “3”, “4”, “5”, “6”, “7”, “8”, “9”, “A”, “B”, “C”, “D”, “E” and “F”.

    Base-64’s alphabet has so many elements that both upper and lower case letters are needed to represent different numerical digits. In Listing 5.3 above, this alphabet is stored in the toBase64 array.

    The majority of the encoding work is done by the write method. Base-64 conversion requires the input data to be broken up into 3-byte chunks, so the write method collects bytes in a buffer until it has three of them.

buffer[index] = c;
index++;
if (index == 3)
{

    Once we have three bytes, then we write the base-64 version of the number. This is accomplished by taking the 24 bits gathered, 6 bits at a time. This results in four digits, as shown here:

  super.write(toBase64[(buffer[0] & 0xfc) >> 2]);
  super.write(toBase64[((buffer[0] & 0x03) << 4)
      | ((buffer[1] & 0xf0) >> 4)]);
  super.write(toBase64[((buffer[1] & 0x0f) << 2)
      | ((buffer[2] & 0xc0) >> 6)]);
  super.write(toBase64[buffer[2] & 0x3f]);
  index = 0;
}

    Each write statement isolates one 6-bit group with a mask and a shift, and then writes the base-64 digit for that value.
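
    As a worked example of this bit manipulation, the sketch below encodes a single 3-byte group using the same masks and shifts as the write method. The names GroupEncoder and encodeGroup are hypothetical, and the alphabet is held in a string rather than the char array used in Listing 5.3.

```java
// Hypothetical helper, not part of the recipe's source code.
class GroupEncoder
{
  /** The 64 digits of the base-64 alphabet, in order. */
  static final String ALPHABET =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

  /**
   * Encode exactly three bytes (24 bits) as four base-64 digits.
   *
   * @param b0 The first byte.
   * @param b1 The second byte.
   * @param b2 The third byte.
   * @return The four base-64 digits.
   */
  static String encodeGroup(int b0, int b1, int b2)
  {
    StringBuilder result = new StringBuilder();

    // top 6 bits of the first byte
    result.append(ALPHABET.charAt((b0 & 0xfc) >> 2));
    // bottom 2 bits of the first byte, top 4 bits of the second
    result.append(ALPHABET.charAt(((b0 & 0x03) << 4) | ((b1 & 0xf0) >> 4)));
    // bottom 4 bits of the second byte, top 2 bits of the third
    result.append(ALPHABET.charAt(((b1 & 0x0f) << 2) | ((b2 & 0xc0) >> 6)));
    // bottom 6 bits of the third byte
    result.append(ALPHABET.charAt(b2 & 0x3f));

    return result.toString();
  }

  public static void main(String args[])
  {
    // 'M' = 0x4d, 'a' = 0x61, 'n' = 0x6e
    // 24 bits: 010011 010110 000101 101110 = 19, 22, 5, 46
    System.out.println(encodeGroup('M', 'a', 'n')); // prints TWFu
  }
}
```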

    The flush method is needed because the data to be converted may not end exactly on a 3-byte boundary. When one or two bytes are left over, the flush method treats the missing bytes as zeroes, writes the final two or three digits of the conversion, and then writes one or two equal signs (=) as padding so that the output always ends on a 4-character boundary.
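
    The handling of leftover bytes can be sketched in isolation as follows. The names TailEncoder and encodeTail are hypothetical. In this sketch the first digit is always taken from the top six bits of the first leftover byte, and a missing second byte is treated as zero.

```java
// Hypothetical helper, not part of the recipe's source code.
class TailEncoder
{
  /** The 64 digits of the base-64 alphabet, in order. */
  static final String ALPHABET =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

  /**
   * Encode the one or two bytes left over after the last
   * complete 3-byte group, including the padding.
   *
   * @param tail The leftover bytes, one or two of them.
   * @return Four characters: base-64 digits followed by padding.
   */
  static String encodeTail(int tail[])
  {
    if (tail.length != 1 && tail.length != 2)
      throw new IllegalArgumentException("Expected one or two bytes");

    int b0 = tail[0];
    // a missing second byte contributes zero bits
    int b1 = (tail.length == 2) ? tail[1] : 0;

    StringBuilder result = new StringBuilder();
    result.append(ALPHABET.charAt((b0 & 0xfc) >> 2));
    result.append(ALPHABET.charAt(((b0 & 0x03) << 4) | ((b1 & 0xf0) >> 4)));

    if (tail.length == 2)
    {
      // three digits carry 16 bits of data; pad with one '='
      result.append(ALPHABET.charAt((b1 & 0x0f) << 2));
      result.append('=');
    } else
    {
      // two digits carry 8 bits of data; pad with two '='
      result.append("==");
    }
    return result.toString();
  }

  public static void main(String args[])
  {
    System.out.println(encodeTail(new int[] { 'M' }));      // prints TQ==
    System.out.println(encodeTail(new int[] { 'M', 'a' })); // prints TWE=
  }
}
```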

    This is a very useful recipe that can be applied anytime you must access an HTTP authenticated web page. The addAuthHeader method can be used to add HTTP authentication support to programs of your own.

Copyright 2005-2009 by Heaton Research, Inc.