Bots | Heaton Research

Bots

Understanding Spider Configuration

The Heaton Research Spider uses the SpiderOptions class for basic configuration information. There are two ways to configure the spider: you can set the properties of the SpiderOptions class directly, or you can instruct the spider to load the SpiderOptions properties from a file. This article shows how to do both and explains what the items in the configuration file mean.
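
The two configuration styles described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: ExampleSpiderOptions is a hypothetical stand-in, not the real SpiderOptions class, and the option names (timeout, maxDepth, userAgent), their defaults, and the "name: value" file layout are invented for the sketch. The file is parsed with the standard java.util.Properties class rather than with the spider's own loader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical stand-in for illustration only; this is NOT the real SpiderOptions class.
class ExampleSpiderOptions {
    // Assumed option names and defaults, invented for this sketch.
    int timeout = 60000;      // milliseconds
    int maxDepth = -1;        // -1 means "no limit" in this sketch
    String userAgent = "";    // empty means "use the default"

    // Second style: read the same properties from "name: value" text.
    static ExampleSpiderOptions load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        ExampleSpiderOptions options = new ExampleSpiderOptions();
        options.timeout = intValue(props, "timeout", options.timeout);
        options.maxDepth = intValue(props, "maxDepth", options.maxDepth);
        options.userAgent = props.getProperty("userAgent", options.userAgent).trim();
        return options;
    }

    // Parse an integer property, keeping the fallback when the key is absent.
    private static int intValue(Properties props, String key, int fallback) {
        String raw = props.getProperty(key);
        return raw == null ? fallback : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        // First style: set the properties directly in code.
        ExampleSpiderOptions direct = new ExampleSpiderOptions();
        direct.timeout = 30000;
        direct.maxDepth = 5;

        // Second style: load them from configuration text (a file reader would work the same way).
        ExampleSpiderOptions loaded = load(new StringReader("timeout: 30000\nmaxDepth: 5\n"));

        System.out.println("Same settings: "
                + (direct.timeout == loaded.timeout && direct.maxDepth == loaded.maxDepth));
    }
}
```

Keeping the defaults in the fields means a configuration file only has to list the values that differ, which is one reasonable design choice among several.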

Using a Memory-Based Workload

This article shows you how to use a memory-based workload. This approach works when you are spidering a small to medium-sized website on a single host. A memory-based workload saves you the overhead of using SQL. However, for large sites or multiple sites, you should consider using an SQL-based workload.
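
To make the idea of a memory-based workload concrete, here is a minimal sketch, assuming a workload is simply a queue of URLs still to visit plus a set of URLs already seen. The class MemoryWorkloadSketch and its methods are hypothetical names chosen for illustration; they are not part of the Heaton Research Spider API, and the real workload manager may track more state than this.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical illustration of an in-memory workload; not the spider's real class.
class MemoryWorkloadSketch {
    private final Queue<String> waiting = new ArrayDeque<>(); // URLs still to be fetched
    private final Set<String> seen = new HashSet<>();         // every URL ever added

    // Queue a URL unless it was added before; returns true if it was new.
    boolean add(String url) {
        if (!seen.add(url)) {
            return false;
        }
        waiting.add(url);
        return true;
    }

    // Return the next URL to fetch, or null when nothing is waiting.
    String next() {
        return waiting.poll();
    }

    // Number of URLs still waiting to be fetched.
    int pending() {
        return waiting.size();
    }

    public static void main(String[] args) {
        MemoryWorkloadSketch workload = new MemoryWorkloadSketch();
        workload.add("http://www.example.com/");
        workload.add("http://www.example.com/about");
        workload.add("http://www.example.com/"); // duplicate, ignored

        String url;
        while ((url = workload.next()) != null) {
            System.out.println("Would fetch: " + url);
        }
    }
}
```

Everything lives in the process's heap, so nothing survives a restart and the memory used grows with the number of URLs found. That trade-off is why a database-backed workload becomes attractive for large or multi-site crawls.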

Heaton Research Spider DBMS Configuration

The Heaton Research Spider allows you to use the spider.conf file to configure the spider. Using this file, you can have the Heaton Research Spider store its workload in memory or in a variety of databases. This article presents sample configuration files for both memory-based and SQL-based workload management.

A Java Amazon.com Bot

This article presents a simple bot that can download data from Amazon.com. Using the provided class, you can easily obtain title and price information from Amazon.com for any ISBN. A simple class named GetPrice completely encapsulates this functionality.

HTTP Programming Recipes for C# Bots

The Hypertext Transfer Protocol (HTTP) allows information to be
exchanged between a web server and a web browser. C# allows you to
program HTTP directly. HTTP programming allows you to create programs
that access the web much like a human user would. These programs, which
are called bots, can collect information or automate common web
programming tasks. This book presents a collection of reusable
recipes for C# bot programming.

HTTP Programming Recipes for Java Bots

The Hypertext Transfer Protocol (HTTP) allows information to be
exchanged between a web server and a web browser. Java allows you to
program HTTP directly. HTTP programming allows you to create programs
that access the web much like a human user would. These programs, which
are called bots, can collect information or automate common web
programming tasks. This book presents a collection of reusable
recipes for Java bot programming.

Copyright 2005-2008 by Heaton Research, Inc.