Paper Reading assignment: Due 10/13 (Updated the homework page too)
- To: Rao Kambhampati <rao@asu.edu>
- Subject: Paper Reading assignment: Due 10/13 (Updated the homework page too)
- From: Subbarao Kambhampati <rao@asu.edu>
- Date: Thu, 6 Oct 2011 18:14:09 -0700
- Sender: subbarao2z2@gmail.com
- Homework 3 (or think of it as Homework 2, part b). Due 10/13 in class. (We might discuss your answers.)
("Reading Comprehension") Read the paper "The Anatomy of a Large-Scale Hypertextual Web Search Engine", which contains a description of the Google search engine circa 1998 (i.e., before it became a company). Answer the following questions:
- What are Fancy hits?
- Why are there two types of barrels--the short and the long?
- How is indexing parallelized?
- How does Google show that it doesn't quite care about recall?
- How does Google avoid crawling the same URL multiple times?
- What are some of the memory-saving techniques they use?
- Do they use TF/IDF directly or indirectly?
- Do they normalize the vectors? (If not, why not?)
- Can they support proximity queries?
- How are "page synopses" made?
- List some of the parameters that need to be initialized for their search engine. Are the default values they pick reasonable?