This is Part B of the project for CSE 494/598.
SUBMIT:
- Hardcopy showing the Top N authorities and hubs using the A/H computation for the sample queries given below.
- A report comparing and analyzing the A/H results with those given by pure Vector Space based search. How does varying the size of the "root set" affect the results of the A/H computation? Which results are more relevant: authorities or hubs? Comments?
- Hardcopy of your code with comments.
- Return the "Top 10" results for the query by combining PageRank and Vector Space similarity values for the results using the formula
w * (PageRank) + (1-w)* (Vector Space Similarity), where 0 < w < 1.
Make provisions for varying the value of 'w' at query time. (Hint: To make it easier to combine with the Vector Space rank, normalize the PageRank values so that they lie between 0 and 1.)
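The combination above can be sketched as follows. This is only a minimal illustration, not a required design: the class name `CombinedRanker` and the methods `normalize` and `topK` are hypothetical, documents are assumed to be identified by array index, and min-max scaling is just one possible way to bring PageRank into the range 0 to 1 (dividing by the maximum value would also work).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names and array-based layout are assumptions for illustration.
class CombinedRanker {
    /** Min-max scales values into [0, 1]; returns all zeros if every value is equal. */
    static double[] normalize(double[] values) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double range = max - min;
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (range == 0) ? 0.0 : (values[i] - min) / range;
        }
        return out;
    }

    /** Returns the indices of the k best documents by w * PageRank + (1 - w) * similarity. */
    static List<Integer> topK(double[] pageRank, double[] similarity, double w, int k) {
        if (w <= 0 || w >= 1) {
            throw new IllegalArgumentException("w must satisfy 0 < w < 1");
        }
        double[] pr = normalize(pageRank);
        double[] combined = new double[pr.length];
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < pr.length; i++) {
            combined[i] = w * pr[i] + (1 - w) * similarity[i];
            ids.add(i);
        }
        // Sort document indices by combined score, highest first.
        ids.sort((a, b) -> Double.compare(combined[b], combined[a]));
        return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
    }
}
```

Because `w` is an ordinary argument of `topK`, it can be read from the query interface each time a query is issued, for example `CombinedRanker.topK(pageRank, similarity, 0.4, 10)`.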
SUBMIT:
- Hardcopy showing the results for sample queries derived using a combination of PageRank and Vector Space ranking.
- A report comparing and analyzing the A/H results with those given by PageRank+VectorSpace. Comment on the effects of varying 'w' between 0 and 1. Comment on the effects of varying the value of "c" (see the formula for PageRank computation in the class notes). Does the PageRank computation converge?
- Hardcopy of your code with comments.
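To experiment with "c" and with convergence, a power-iteration loop along the following lines may help. This is a sketch under stated assumptions, not the formula from the class notes: it assumes the common formulation in which "c" weights link-following and 1 - c weights a uniform reset, it spreads the rank of pages without out-links evenly over all pages, and the names `PageRankSketch`, `pageRank`, `tol`, and `maxIter` are hypothetical. Check it against the class-notes formula before relying on it.

```java
import java.util.Arrays;

// Hypothetical sketch; the exact PageRank formula in the class notes may differ.
class PageRankSketch {
    /**
     * links[i] holds the indices of the pages that page i links to.
     * Iterates until the L1 change between successive rank vectors drops
     * below tol, or until maxIter iterations have run.
     */
    static double[] pageRank(int[][] links, double c, double tol, int maxIter) {
        int n = links.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);
        for (int iter = 0; iter < maxIter; iter++) {
            double[] next = new double[n];
            double sinkMass = 0.0;
            for (int i = 0; i < n; i++) {
                if (links[i].length == 0) {
                    // Assumption: a page with no out-links shares its rank with every page.
                    sinkMass += rank[i];
                    continue;
                }
                double share = rank[i] / links[i].length;
                for (int j : links[i]) {
                    next[j] += share;
                }
            }
            double delta = 0.0;
            for (int j = 0; j < n; j++) {
                next[j] = c * (next[j] + sinkMass / n) + (1 - c) / n;
                delta += Math.abs(next[j] - rank[j]);
            }
            rank = next;
            if (delta < tol) {
                break; // converged within the requested tolerance
            }
        }
        return rank;
    }
}
```

Printing `delta` at each iteration for several values of `c` is one simple way to observe whether, and how quickly, the computation converges on the crawled pages.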
Extra Credit:
- Implement a GUI for your search engine. Make provisions for selecting Vector Space, A/H, PageRank + Vector Space model for ranking your answers.
- The GUI should preferably be a standalone application or applet. The GUI can be servlet based only if you have access to a personal Web server that can be accessed by the TA.
- You are encouraged to make the output as "Googlish" as you can, i.e., a simple interface, links that open the source documents, and anything else you fancy.
- Additional extra credit tasks are certainly possible; consult the instructor and/or the TAs.
Note: An end-of-semester demo of all tasks will be required. A GUI could come in handy at that time, so you are highly encouraged to implement one.
Sample Output: Click here for a sample output for A/H and PageRank + Vector Space search.
Run
java CSE494pgRank.LinkExtract crawledpages Hashedlinks
to see a demo of the program.
Last Modified: 02.24.2004