Bots
Understanding Spider Configuration
Submitted by jeffheaton on Thu, 01/10/2008 - 06:06.
The Heaton Research Spider uses the SpiderOptions class for basic configuration information. There are two ways to configure the spider: you can set the properties of the SpiderOptions class directly, or you can instruct the spider to load the SpiderOptions properties from a file. This article shows how to do both and explains what the items in the configuration file mean.
Using a Memory Based Workload
Submitted by jeffheaton on Thu, 01/10/2008 - 06:04.
This article shows you how to use a memory-based workload. This approach works well when you are spidering a small to medium-sized website on a single host. A memory-based workload saves you the overhead of using SQL. For large sites or multiple sites, however, you should consider using a SQL-based workload.
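To make the idea of a memory-based workload concrete, here is a minimal sketch in Java. It is an illustration only, not the Heaton Research Spider's own implementation: the class name MemoryWorkload, its methods, and the example.com URLs are all hypothetical. It assumes a workload needs just two things, a queue of URLs waiting to be processed and a record of the URLs already seen.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

/**
 * Hypothetical in-memory workload. This is NOT the Heaton Research
 * Spider's class; it only illustrates the general idea.
 */
public class MemoryWorkload {
    // URLs that still have to be processed, in the order they were found.
    private final Queue<String> waiting = new ArrayDeque<>();
    // Every URL ever added, so the same page is not queued twice.
    private final Set<String> seen = new HashSet<>();

    /** Queues a URL unless it was already seen; returns true if it was queued. */
    public boolean add(String url) {
        if (!seen.add(url)) {
            return false;
        }
        waiting.add(url);
        return true;
    }

    /** Returns the next URL to process, or null when nothing is waiting. */
    public String next() {
        return waiting.poll();
    }

    /** Returns true when no URLs are waiting. */
    public boolean isEmpty() {
        return waiting.isEmpty();
    }

    public static void main(String[] args) {
        MemoryWorkload workload = new MemoryWorkload();
        workload.add("http://www.example.com/");
        workload.add("http://www.example.com/about");
        workload.add("http://www.example.com/"); // duplicate, not queued again

        String url;
        while ((url = workload.next()) != null) {
            System.out.println("Processing " + url);
        }
    }
}
```

Because both collections live in the Java heap, the workload is lost when the program exits and grows with every URL discovered, which is why this approach suits smaller single-host sites better than large or multi-site crawls.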
Heaton Research Spider DBMS Configuration
Submitted by jeffheaton on Thu, 01/10/2008 - 06:03.
The Heaton Research Spider can be configured through the spider.conf file. With this file you can have the spider store its workload in memory or in a variety of databases. This article presents sample configuration files for both memory-based and SQL-based workload management.
A Java Amazon.com Bot
Submitted by jeffheaton on Thu, 01/10/2008 - 06:02.
This article presents a simple bot that downloads data from Amazon.com. The provided class, named GetPrice, completely encapsulates this functionality and lets you obtain title and price information from Amazon.com for any ISBN.
HTTP Programming Recipes for C# Bots
Submitted by jeffheaton on Thu, 01/10/2008 - 04:32.
The Hypertext Transfer Protocol (HTTP) allows information to be exchanged between a web server and a web browser. C# allows you to program HTTP directly. HTTP programming allows you to create programs that access the web much like a human user would. These programs, which are called bots, can collect information or automate common web programming tasks. This book presents a collection of reusable recipes for C# bot programming.
HTTP Programming Recipes for Java Bots
Submitted by jeffheaton on Thu, 01/10/2008 - 04:11.
The Hypertext Transfer Protocol (HTTP) allows information to be exchanged between a web server and a web browser. Java allows you to program HTTP directly. HTTP programming allows you to create programs that access the web much like a human user would. These programs, which are called bots, can collect information or automate common web programming tasks. This book presents a collection of reusable recipes for Java bot programming.



