Summary

    This chapter showed you how to use the two security mechanisms built into HTTP. First, HTTP supports encryption through HTTPS. Second, HTTP provides authentication, which requires users to identify themselves.

    HTTP encryption is supported through HTTPS. The URL of any website that uses HTTPS begins with the prefix https://. HTTPS encrypts the data so that a third party who intercepts the packets exchanged between your browser and the web server cannot read them. Support for HTTPS is built into C#: you only need to use an HTTPS URL where you would normally use an HTTP URL.
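    As an illustration of checking for HTTPS, the following is a minimal sketch that inspects the scheme of a URL. It is not the chapter's recipe, which may work differently; the class name HttpsCheck, the method name IsHttps, and the example.com URLs are assumptions made for this example.

```csharp
using System;

public static class HttpsCheck
{
    // Returns true when the URL's scheme is "https".
    // The comparison ignores case, although Uri already
    // reports the scheme in lowercase.
    public static bool IsHttps(string url)
    {
        Uri uri = new Uri(url);
        return string.Equals(uri.Scheme, Uri.UriSchemeHttps,
            StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        // Hypothetical URLs, used only for illustration.
        Console.WriteLine(IsHttps("https://www.example.com/")); // prints "True"
        Console.WriteLine(IsHttps("http://www.example.com/"));  // prints "False"
    }
}
```

    This sketch only looks at the text of the URL and makes no network connection, so it says nothing about whether the server actually answers over HTTPS.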

    HTTP authentication allows the web server to prompt the user for a user id and password. The web server can then determine the identity of the user accessing it. Most websites do not use HTTP authentication; instead, they use their own HTML forms to authenticate users. However, some sites do use HTTP authentication, and to access these sites with a bot, you will have to support it.

    C# contains support for HTTP authentication. To use HTTP authentication with C#, you must construct a NetworkCredential object. This object is then used when connecting to the site.
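    One way a NetworkCredential can be used is sketched below, assuming the request is made through the WebRequest class and the credential is assigned to its Credentials property. The class name AuthDownload, its method names, the URL, and the user id and password are all hypothetical. Newer .NET versions mark WebRequest as obsolete in favor of HttpClient, so treat this as a sketch in the style of the .NET Framework rather than a definitive implementation.

```csharp
using System;
using System.IO;
using System.Net;

public static class AuthDownload
{
    // Builds a request and attaches the user id and password to it.
    // No network traffic occurs until GetResponse is called.
    public static WebRequest CreateAuthenticatedRequest(
        string url, string userId, string password)
    {
        WebRequest request = WebRequest.Create(url);
        request.Credentials = new NetworkCredential(userId, password);
        return request;
    }

    // Sends the request and returns the response body as text.
    public static string Download(string url, string userId, string password)
    {
        WebRequest request = CreateAuthenticatedRequest(url, userId, password);
        using (WebResponse response = request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Hypothetical site and credentials; only the request is built here,
        // so this Main does not contact any server.
        WebRequest request = CreateAuthenticatedRequest(
            "https://www.example.com/secure/", "testuser", "testpassword");
        Console.WriteLine(request.RequestUri);
    }
}
```

    Calling Download with a real site and valid credentials would perform the actual transfer. Building the request is kept separate so that the credential handling can be examined without a network connection.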

    This chapter provided two recipes. The first determines whether a URL uses HTTPS. The second downloads data from a site that uses HTTP authentication.

    Up to this point, the chapters have shown you how to access data. In the next chapter, you will begin to learn what to do with HTTP data once you have retrieved it. Chapter 6 shows how to parse HTML and extract data from forms, lists, tables, and other HTML constructs.

Copyright 2005-2009 by Heaton Research, Inc.