----- Original Message -----
From: "Subbarao Kambhampati" <rao@asu.edu>
To: <cse494-f02@parichaalak.eas.asu.edu>
Sent: Monday, November 25, 2002 10:57 PM
Subject: Re: Google Linux Cluster talk by Urs Hoelzle--Media player archive...

> So not having much homework or project stuff hanging on my head, I just
> completed watching the Hoelzle talk.

Well, despite having much homework and project stuff hanging on my head ;-),
I found this pretty interesting talk by Soumen Chakrabarti, in the same
colloquium series as the link you sent (but from back in December 2000, so a
little outdated, I guess). It is mostly a recap of what you covered in class,
but he explains why HITS and PageRank aren't as good as we might think and why
people should really concentrate on focused crawling instead... (I just
thought this link could serve as part of the final exam review, for
instance...)

Thomas

Title: Beyond Hubs and Authorities: Web Resource Discovery and Segmentation
Link: http://www.researchchannel.org/program/displayevent.asp?rid=1002
Abstract:
After crawling and keyword indexing, the next wave that has made a
significant impact on Web search is topic distillation: analyzing properties of
the hyperlink graph for enhanced ranking of Web pages in response to a query.
Hyperlink induced topic search (HITS) and PageRank (used in Google) are two
examples. We discuss two enhancements to the graph selection process. First
we will describe a learning system called a "focused crawler". Second we will
discuss a fine-grained model for 'micro-hubs' and new algorithms based on the
Minimum Description Length principle.
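
For anyone reviewing the HITS algorithm the abstract mentions, here is a minimal Python sketch of the basic hub/authority iteration. It is only an illustration of the textbook update rule, not code from the talk: the function name `hits`, the dictionary-based graph format, and the toy link graph are all assumptions made for this example.

```python
def hits(links, iterations=50):
    """Basic HITS iteration (illustrative sketch).

    links: dict mapping each page to the list of pages it links to
           (hypothetical input format chosen for this example).
    Returns (hub, auth), two dicts of L2-normalized scores.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority update: sum the hub scores of the pages linking in.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # Hub update: sum the (new) authority scores of the pages linked to.
        new_hub = {p: sum(new_auth[d] for d in links.get(p, [])) for p in pages}

        # Normalize both vectors so the scores do not grow without bound.
        a_norm = sum(v * v for v in new_auth.values()) ** 0.5 or 1.0
        h_norm = sum(v * v for v in new_hub.values()) ** 0.5 or 1.0
        auth = {p: v / a_norm for p, v in new_auth.items()}
        hub = {p: v / h_norm for p, v in new_hub.items()}

    return hub, auth


# Toy graph: pages "a" and "b" both link to "c".
hub, auth = hits({"a": ["c"], "b": ["c"], "c": []})
```

In this toy graph "c" ends up as the only authority, while "a" and "b" share the hub score equally.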