Setting Up MySQL
MySQL is a free database that you can obtain from the following URL:
To set up a database in MySQL, use the DDL script in Listing F.1.
Listing F.1: MySQL DDL Script
SET NAMES latin1;
SET FOREIGN_KEY_CHECKS = 0;

CREATE TABLE `spider_workload` (
  `workload_id` int(10) unsigned NOT NULL auto_increment,
  `host` int(10) unsigned NOT NULL,
  `url` varchar(2083) NOT NULL default '',
  `status` varchar(1) NOT NULL default '',
  `depth` int(10) unsigned NOT NULL,
  `url_hash` int(11) NOT NULL,
  `source_id` int(11) NOT NULL,
  PRIMARY KEY (`workload_id`),
  KEY `status` (`status`),
  KEY `url_hash` (`url_hash`),
  KEY `host` (`host`)
) ENGINE=MyISAM AUTO_INCREMENT=189 DEFAULT CHARSET=latin1;

CREATE TABLE `spider_host` (
  `host_id` int(10) unsigned NOT NULL auto_increment,
  `host` varchar(255) NOT NULL default '',
  `status` varchar(1) NOT NULL default '',
  `urls_done` int(11) NOT NULL,
  `urls_error` int(11) NOT NULL,
  PRIMARY KEY (`host_id`)
) ENGINE=MyISAM AUTO_INCREMENT=19796 DEFAULT CHARSET=latin1;

SET FOREIGN_KEY_CHECKS = 1;
To use the MySQL database, make sure that the MySQL JAR file is on your Java classpath. The recipe.bat and recipe.sh scripts, which can be used to run any of the recipes, automatically include the MySQL JAR. The MySQL JAR is named as follows:
mysql-connector-java-3.1.13-bin.jar
This JAR filename will likely change slightly as new versions are released. You can obtain this driver (the JAR file) from the MySQL web site. The following configuration file should serve as a guide for using MySQL.
Listing F.2: Sample Spider Configuration for MySQL
timeout: 60000
maxDepth: -1
userAgent:
corePoolSize: 100
maximumPoolSize: 100
keepAliveTime: 60
dbURL: jdbc:mysql://127.0.0.1/spider
dbClass: com.mysql.jdbc.Driver
dbUID: user
dbPWD: password
workloadManager: com.heatonresearch.httprecipes.spider.workload.sql.SQLWorkloadManager
startup: clear
filter: com.heatonresearch.httprecipes.spider.filter.RobotsFilter
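Before starting the spider, it can help to confirm that the driver class named by dbClass is actually on the classpath. The following is a minimal sketch, not the spider's own configuration loader: the class SpiderConfigSketch and its methods parse and isOnClasspath are hypothetical names introduced here for illustration. It assumes each configuration entry is a single "key: value" line, and it splits on the first colon only, because values such as dbURL contain colons themselves.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the spider library.
final class SpiderConfigSketch {

    /** Parses "key: value" lines, splitting on the first colon only. */
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> config = new LinkedHashMap<>();
        for (String line : lines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // skip blank lines and lines without a key
            }
            String key = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (!key.isEmpty()) {
                config.put(key, value); // value may be empty, e.g. userAgent
            }
        }
        return config;
    }

    /** Returns true if the named class can be loaded from the classpath. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A few entries copied from Listing F.2.
        List<String> lines = Arrays.asList(
                "dbURL: jdbc:mysql://127.0.0.1/spider",
                "dbClass: com.mysql.jdbc.Driver",
                "dbUID: user",
                "dbPWD: password");
        Map<String, String> config = parse(lines);
        String driver = config.get("dbClass");
        if (isOnClasspath(driver)) {
            System.out.println("Driver found: " + driver);
        } else {
            System.out.println("Driver missing; add the MySQL JAR to the classpath: " + driver);
        }
    }
}
```

If the check reports the driver as missing, the most likely cause is that the MySQL JAR is not on the classpath used to launch the program.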




