The Encog Project

org.encog.bot.spider.filter
Interface SpiderFilter

All Known Implementing Classes:
RobotsFilter

public interface SpiderFilter

SpiderFilter: A filter causes the spider to skip URLs.


Method Summary
 boolean isExcluded(java.net.URL url)
          Checks whether the specified URL should be excluded.
 void newHost(java.lang.String host, java.lang.String userAgent)
          Called when a new host is to be processed.
 

Method Detail

isExcluded

boolean isExcluded(java.net.URL url)
Checks whether the specified URL should be excluded.

Parameters:
url - The URL to be checked.
Returns:
true if the URL should be excluded, false otherwise.
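
As an illustration of the contract, here is a minimal sketch of one possible implementation. It is not part of Encog: PathPrefixFilter and its prefix rule are hypothetical, and the SpiderFilter interface is redeclared locally (mirroring the two signatures documented on this page) only so that the sketch compiles without the Encog jar.

```java
import java.io.IOException;
import java.net.URL;

// Local stand-in that mirrors the two documented method signatures.
// With Encog on the classpath, org.encog.bot.spider.filter.SpiderFilter
// would be imported instead.
interface SpiderFilter {
    boolean isExcluded(URL url);

    void newHost(String host, String userAgent) throws IOException;
}

// Hypothetical filter: skips every URL on the current host whose path
// starts with a fixed prefix.
class PathPrefixFilter implements SpiderFilter {
    private final String prefix;
    private String currentHost;

    PathPrefixFilter(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public void newHost(String host, String userAgent) {
        // This filter needs no I/O; it only remembers the active host.
        // The user agent is ignored in this sketch.
        this.currentHost = host;
    }

    @Override
    public boolean isExcluded(URL url) {
        // URLs that belong to a different host are left alone.
        if (currentHost == null || !currentHost.equalsIgnoreCase(url.getHost())) {
            return false;
        }
        return url.getPath().startsWith(prefix);
    }
}
```

Under these assumptions, after newHost("example.com", null) the filter excludes http://example.com/private/a.html when constructed with the prefix "/private/", and lets URLs under other paths or on other hosts through.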

newHost

void newHost(java.lang.String host,
             java.lang.String userAgent)
             throws java.io.IOException
Called when a new host is to be processed. Hosts are processed one at a time. A SpiderFilter instance cannot be shared among hosts.

Parameters:
host - The new host.
userAgent - The user agent being used by the spider. Pass null to use the default.
Throws:
java.io.IOException - Thrown if an I/O error occurs.
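
The call order described above (announce a host, then check its URLs) might be driven along the following lines. This is only a sketch of a caller, not the Encog spider itself: FilterDriver and keepAllowed are hypothetical names, the input URLs are assumed to be grouped by host, and SpiderFilter is redeclared locally so the example compiles on its own.

```java
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

// Local stand-in that mirrors the two documented method signatures.
interface SpiderFilter {
    boolean isExcluded(URL url);

    void newHost(String host, String userAgent) throws IOException;
}

// Hypothetical caller showing one way to honor the one-host-at-a-time rule.
class FilterDriver {
    // Returns the URLs the filter does not exclude. Each host is announced
    // through newHost before any of its URLs is checked.
    // Assumes urls is grouped by host, so each host is announced once.
    static List<URL> keepAllowed(SpiderFilter filter, List<URL> urls, String userAgent)
            throws IOException {
        List<URL> kept = new ArrayList<>();
        String activeHost = null;
        for (URL url : urls) {
            String host = url.getHost();
            if (!host.equalsIgnoreCase(activeHost)) {
                // Host changed: let the filter prepare for it first.
                filter.newHost(host, userAgent);
                activeHost = host;
            }
            if (!filter.isExcluded(url)) {
                kept.add(url);
            }
        }
        return kept;
    }
}
```

Passing null as userAgent corresponds to the documented default. An IOException raised by newHost is propagated to the caller rather than swallowed, which is one possible design choice among several.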
