The Encog Project

org.encog.bot.spider
Class SimpleReport

java.lang.Object
  extended by org.encog.bot.spider.SimpleReport
All Implemented Interfaces:
SpiderReportable

public class SimpleReport
extends java.lang.Object
implements SpiderReportable

SimpleReport is a minimal implementation of the SpiderReportable interface. It keeps the spider within a single host and does not process any of the data it retrieves.


Nested Class Summary
 
Nested classes/interfaces inherited from interface org.encog.bot.spider.SpiderReportable
SpiderReportable.URLType
 
Constructor Summary
SimpleReport()
           
 
Method Summary
 boolean beginHost(java.lang.String host)
          This function is called when the spider is ready to process a new host.
 void init(Spider spider)
          Called when the spider is starting up.
 boolean spiderFoundURL(java.net.URL url, java.net.URL source, SpiderReportable.URLType type)
          Called when the spider encounters a URL.
 void spiderProcessURL(java.net.URL url, java.io.InputStream stream)
          Called when the spider is about to process a NON-HTML URL.
 void spiderProcessURL(java.net.URL url, SpiderParseHTML parse)
          Called when the spider is ready to process an HTML URL.
 void spiderURLError(java.net.URL url)
          Called when the spider tries to process a URL but gets an error.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SimpleReport

public SimpleReport()

Method Detail

beginHost

public boolean beginHost(java.lang.String host)
This function is called when the spider is ready to process a new host. This implementation simply stores the value of the current host.

Specified by:
beginHost in interface SpiderReportable
Parameters:
host - The new host that is about to be processed.
Returns:
True if this host should be processed, false otherwise.
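
The contract described above can be sketched roughly as follows. HostRecorder is a hypothetical stand-in written for illustration, not the Encog source. This page does not say which hosts SimpleReport accepts, so the sketch assumes that only the first host offered is accepted and stored.

```java
/** Hypothetical stand-in for illustration; not the Encog SimpleReport source. */
class HostRecorder {

    /** The host being processed, or null before the first call. */
    private String host;

    /**
     * Mirrors the documented beginHost signature.
     * Assumption: only the first host offered is accepted and stored.
     */
    public boolean beginHost(String host) {
        if (this.host == null) {
            this.host = host;
            return true;
        }
        return false;
    }

    /** Returns the stored host (a helper added for this sketch only). */
    public String getHost() {
        return this.host;
    }
}
```

Rejecting later hosts is one simple way to keep a crawl on a single site. A reporter meant to cover several hosts would return true more liberally.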

init

public void init(Spider spider)
Called when the spider is starting up. This method provides the SpiderReportable class with the spider object.

Specified by:
init in interface SpiderReportable
Parameters:
spider - The spider that will be working with this object.

spiderFoundURL

public boolean spiderFoundURL(java.net.URL url,
                              java.net.URL source,
                              SpiderReportable.URLType type)
Called when the spider encounters a URL.

Specified by:
spiderFoundURL in interface SpiderReportable
Parameters:
url - The URL that the spider found.
source - The page that the URL was found on.
type - The type of link this URL is.
Returns:
True if the spider should scan for links on this page.
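
Since the class description says the spider stays within a single host, a filter of the following kind would be one plausible way to decide the return value. SingleHostFilter and its accept method are hypothetical names introduced here. The case-insensitive comparison and the "accept everything when no host is set" rule are assumptions, not documented Encog behavior.

```java
import java.net.URL;

/** Hypothetical helper for illustration; not part of Encog. */
class SingleHostFilter {

    /** The only host whose URLs should be followed; null means no restriction yet. */
    private final String host;

    SingleHostFilter(String host) {
        this.host = host;
    }

    /**
     * Returns true when the URL belongs to the stored host.
     * Host names are compared without regard to case (an assumption of this sketch).
     */
    boolean accept(URL url) {
        return this.host == null || this.host.equalsIgnoreCase(url.getHost());
    }
}
```

A spiderFoundURL implementation could return the result of such a check, ignoring the source and type arguments.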

spiderProcessURL

public void spiderProcessURL(java.net.URL url,
                             java.io.InputStream stream)
                      throws java.io.IOException
Called when the spider is about to process a NON-HTML URL. SimpleReport ignores this callback.

Specified by:
spiderProcessURL in interface SpiderReportable
Parameters:
url - The URL that the spider found.
stream - An InputStream to read the page contents from.
Throws:
java.io.IOException - Thrown if an IO error occurs while processing the page.
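
SimpleReport does nothing with the stream, but a reporter that wanted to handle non-HTML content could read it inside this callback. The sketch below is a hypothetical class, ByteCountingReport, that only tallies how many bytes were delivered. It mirrors the documented method signature but does not implement the real SpiderReportable interface.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

/** Hypothetical reporter fragment for illustration; not the Encog source. */
class ByteCountingReport {

    /** Running total of bytes read across all non-HTML URLs. */
    private long totalBytes;

    /** Mirrors the documented signature: drains the stream and adds its length to the total. */
    public void spiderProcessURL(URL url, InputStream stream) throws IOException {
        byte[] buffer = new byte[4096];
        int read;
        while ((read = stream.read(buffer)) != -1) {
            this.totalBytes += read;
        }
    }

    /** Returns the total number of bytes seen so far (a helper added for this sketch only). */
    public long getTotalBytes() {
        return this.totalBytes;
    }
}
```

The sketch does not close the stream, on the assumption that the caller owns it. This page does not say who is responsible for closing it.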

spiderProcessURL

public void spiderProcessURL(java.net.URL url,
                             SpiderParseHTML parse)
                      throws java.io.IOException
Called when the spider is ready to process an HTML URL. SimpleReport ignores this callback.

Specified by:
spiderProcessURL in interface SpiderReportable
Parameters:
url - The URL that the spider is about to process.
parse - An object that will allow you to parse the HTML on this page.
Throws:
java.io.IOException - Thrown if an IO error occurs while processing the page.

spiderURLError

public void spiderURLError(java.net.URL url)
Called when the spider tries to process a URL but gets an error. SimpleReport ignores this callback.

Specified by:
spiderURLError in interface SpiderReportable
Parameters:
url - The URL that generated an error.
