java.lang.Object
  |
  +--CSE494pgRank.LinkGen
Generates the link matrix from the crawled files. The class considers link mappings only for the files/URLs present in the repository; any URL that was not crawled and stored is treated as not present. LinkGen first maps all the documents into a hash table. It then recursively goes through each document and extracts the URLs that the document points to. Each extracted URL is compared against the URLs in the hash table, and those that are not present are discarded. If document A has a link to document B, an entry A->B is made in the link table. This is done for all the documents. The resulting link matrix is stored in a file.
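The steps described above can be illustrated with a minimal sketch. It is not the LinkGen source: the class name `LinkMatrixSketch`, the method `buildLinkMatrix`, the in-memory map of page contents, and the simplified regular expression used to find `href` values are all assumptions made for this example. How LinkGen actually reads files, extracts URLs, or writes the matrix to disk is not specified on this page.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the actual LinkGen implementation.
class LinkMatrixSketch {
    // Assumption: links appear as href="..." with double quotes.
    // A real extractor would more likely use an HTML parser.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // urls: the crawled URLs; htmlByUrl: page content keyed by URL.
    // Returns links[i][j] == true when page i links to page j.
    static boolean[][] buildLinkMatrix(List<String> urls, Map<String, String> htmlByUrl) {
        // Step 1: map every crawled URL to an index (the "hash table").
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < urls.size(); i++) {
            index.put(urls.get(i), i);
        }

        boolean[][] links = new boolean[urls.size()][urls.size()];
        for (int i = 0; i < urls.size(); i++) {
            // Step 2: extract the URLs this document points to.
            String html = htmlByUrl.getOrDefault(urls.get(i), "");
            Matcher m = HREF.matcher(html);
            while (m.find()) {
                // Step 3: discard targets that were never crawled.
                Integer j = index.get(m.group(1));
                if (j != null) {
                    // Step 4: record the entry A->B.
                    links[i][j] = true;
                }
            }
        }
        return links;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("http://a.example/", "http://b.example/");
        Map<String, String> html = new HashMap<>();
        html.put("http://a.example/",
            "<a href=\"http://b.example/\">B</a> <a href=\"http://not-crawled.example/\">X</a>");
        html.put("http://b.example/", "<p>no links</p>");

        boolean[][] links = buildLinkMatrix(urls, html);
        for (boolean[] row : links) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

In this sketch the link to `http://not-crawled.example/` is dropped because that URL is not in the index, which mirrors the rule that any URL not crawled and stored is considered not present.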
Constructor Summary | |
LinkGen(java.lang.String repository)
Constructor that accepts the name of the directory where the crawled web pages are stored. |
Method Summary | |
void |
linker()
Method to generate the link matrix from the files stored in repository. |
static void |
main(java.lang.String[] args)
Should be called as: java LinkGen Crawled_Files_Directory (without a trailing '/'). |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
public LinkGen(java.lang.String repository)
Method Detail |
public void linker()
public static void main(java.lang.String[] args)