CSE494
Class Webcrawl

java.lang.Object
  |
  +--CSE494.Webcrawl

public class Webcrawl
extends java.lang.Object
implements java.lang.Runnable

Crawls the web and stores the first 1000 URLs encountered. The files it opens are saved in the directory from which the crawler was invoked. Usage: java Webcrawl (http://SiteURL).


Field Summary
static java.lang.String DISALLOW
           
static java.lang.String SEARCH
           
static int SEARCH_LIMIT
           
static java.lang.String STOP
           
static int URL_OPENED
          Limit on the number of URLs opened.
 
Constructor Summary
Webcrawl()
           
 
Method Summary
 int crawler(int newdepth)
          Recursively extracts links from each page and follows them.
static void main(java.lang.String[] args)
          Invokes the crawler from the command line.
 void run()
           
 void setStatus(java.lang.String status)
          Prints a status message.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

SEARCH

public static final java.lang.String SEARCH

STOP

public static final java.lang.String STOP

DISALLOW

public static final java.lang.String DISALLOW

SEARCH_LIMIT

public static final int SEARCH_LIMIT

URL_OPENED

public static final int URL_OPENED
Limit on the number of URLs opened.

Constructor Detail

Webcrawl

public Webcrawl()

Method Detail

run

public void run()
Specified by:
run in interface java.lang.Runnable

crawler

public int crawler(int newdepth)
Recursively extracts links from each page and follows them.
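
The following is a minimal sketch of how a depth-limited recursive crawl of this kind might look. It is not the Webcrawl source. The class name CrawlSketch, the in-memory map that stands in for fetched pages, the href pattern, and the two-argument crawl(url, depth) signature are all assumptions made for illustration; the documented method takes only a depth and its return value is not described.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the actual CSE494.Webcrawl implementation.
final class CrawlSketch {
    // Assumed pattern: matches double-quoted href attributes only.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    private final Map<String, String> pages;  // URL -> HTML, replaces network fetches
    private final int limit;                  // assumed cap on the number of URLs opened
    private final Set<String> visited = new LinkedHashSet<>();

    CrawlSketch(Map<String, String> pages, int limit) {
        this.pages = pages;
        this.limit = limit;
    }

    // Returns every href target found in the given HTML, in document order.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    // Opens url, then recurses into its links until the depth budget or the
    // URL limit is exhausted. Returns the number of URLs opened so far.
    int crawl(String url, int depth) {
        if (depth < 0 || visited.size() >= limit || visited.contains(url)) {
            return visited.size();
        }
        String html = pages.get(url);
        if (html == null) {
            return visited.size();  // page not available: nothing to open
        }
        visited.add(url);
        for (String link : extractLinks(html)) {
            crawl(link, depth - 1);
        }
        return visited.size();
    }

    Set<String> visited() {
        return visited;
    }

    public static void main(String[] args) {
        Map<String, String> pages = new HashMap<>();
        pages.put("http://a", "<a href=\"http://b\">b</a>");
        pages.put("http://b", "<a href=\"http://a\">back</a>");
        CrawlSketch sketch = new CrawlSketch(pages, 1000);
        System.out.println("Opened " + sketch.crawl("http://a", 2) + " URLs: " + sketch.visited());
    }
}
```

The visited set keeps the recursion from looping when pages link back to each other, and the limit check plays the role that a constant such as URL_OPENED might play in the real class.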

setStatus

public void setStatus(java.lang.String status)
Prints a status message. Mimics System.out.println.

main

public static void main(java.lang.String[] args)
Invokes the crawler from the command line. To invoke, type: Webcrawl http://
The HTML files are stored in the directory from which the crawler is invoked.
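
As a rough illustration of how a command-line entry point could hand a start URL to a Runnable, consider the sketch below. It is not the code of Webcrawl.main. The class CrawlLauncher, the helper parseStartUrl, and the rule that the argument must begin with http:// are assumptions introduced here; run() is a placeholder that only prints the start URL.

```java
// Hypothetical sketch; not the actual Webcrawl.main or Webcrawl.run.
final class CrawlLauncher implements Runnable {
    private final String startUrl;

    CrawlLauncher(String startUrl) {
        this.startUrl = startUrl;
    }

    // Returns the start URL if exactly one http:// argument was given, else null.
    static String parseStartUrl(String[] args) {
        if (args.length != 1 || !args[0].startsWith("http://")) {
            return null;
        }
        return args[0];
    }

    // Placeholder body: a real crawler would fetch pages and follow links here.
    @Override
    public void run() {
        System.out.println("Crawling from " + startUrl);
    }

    public static void main(String[] args) throws InterruptedException {
        String url = parseStartUrl(args);
        if (url == null) {
            System.err.println("Usage: java CrawlLauncher http://SiteURL");
            return;
        }
        Thread worker = new Thread(new CrawlLauncher(url));
        worker.start();  // run() executes on the new thread
        worker.join();   // wait for the crawl to finish before exiting
    }
}
```

Running the crawl on its own thread is one plausible reason for a crawler class to implement java.lang.Runnable, although the documentation does not say how run() is actually used.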