Thomas Schneidner

Liked the LSI and matrix stuff. Learned a lot compared to the linear algebra class. Is using JAMA for SVD.

Also liked the Semantic Web material (the Sci. Am. article).

Priyamvada Tripathi

Project was interesting since it brought everything together; it made her appreciate efficiency issues such as memory handling.

Thomas Hernandez

The theory behind PageRank, and how his page got a higher rank because some MIT guy points to it.[r1]

Luis Casian

Authorities/hubs and the discussion of the notion of importance; understanding the idea that Google can point to pages that it did not even crawl, based just on collective references to them.

Felt that a lack of database background made the second part harder to follow.

Kanav Kahol

Was impressed at how much IR could be done without requiring NLP. Was impressed that one single (primary) eigenvector is enough to get importance (while multiple eigenvectors are needed for face recognition using eigenfaces).

Wants to apply the techniques learnt to content-based image retrieval.

Database part was okay. He is “mesmerized” by the largely unsuccessful efforts at standard-enforcement in semantic web.

Liked the readings.

Jai Kannan

Liked the Anatomy of Google paper describing search engine technology.

Thinks all this will help him in developing a 3-D search engine.

Viet Hung

Google PageRank algorithm (and his own “hit-based” extensions to notions of importance). Also liked the content-based recommender systems such as NBC. Felt that data integration was still researchy.

Shane Calhoun

Felt that most of it went over his head.

Felt that the class as well as the project was very difficult.

Does feel that he now knows where to go to understand stuff learnt in the class.

Liked(?) that the class talks about current research directions, unlike other 300-level courses, and hence found it interesting though hard.

Satnil Lallian

Liked the slides and lectures. Liked efficiency issues; liked the A/H task in the project.



Google PageRank. The search engine jargon became easier to understand. Hope he can apply it one day.

The material was too difficult to understand, as was the project.

Ryan Wilson

Loved knowing about search stuff.

Hated having to learn[r2] the stuff.

Ryan Stephans

LSI is cool since it provides a different visualization of the documents compared to vector space. Naïve Bayes was good.

Class was close to the research edge, hence interesting.

On the negative side, the fact that not everything built on previous material (e.g. XML and vector space) caused difficulties.

Marc Chung

Discussion of social factors and how algorithms can be derived from them (e.g. A/H, content-based and collaborative filtering).

Data integration discussion was done at a somewhat higher level.

Liked the “graduate-course-ish[r3]” feel of it.

Class went too fast to take notes[r4].


Data integration part; the idea that quality of results is as important as completeness.

Felt that homework problems involved too many calculations.

Fenil Shah

Retrieval techniques. Surprised at how “hard” the Google PageRank idea was.

XML & databases was more interesting.

The missing XML project was disappointing.

Overall interesting, even though tough.


Hated that it took away his sleep.

Liked the database refresher, since that helped him make up for lost sleep in the database classes. Liked the idea of building retrieval techniques from vector space towards more complex ideas like LSI.


Liked the second half on data-integration.

Hated working on the project.

Hated LSI until she did nothing but LSI one weekend, and then learned to live with it.

Really hated the 1st question from the 3rd homework[r5] (collaborative and content-based filtering).


Real applications of databases and XML were discussed.

No other course talks about it. Figured out how XML is used on the Internet(?)

Sarbjot Cheira

Liked Authorities / Hubs part of the project.

Felt that the course aims should have been restricted to IR alone, without bringing in data integration.

Ravish Patel


PageRank, Google application. Algorithms.

Dennis Session

The average 17th century man knew less information than what is printed in one day’s New York Times. We are living in an era of unprecedented access to information, and this course went to great lengths to discuss how such information can be managed.

Jaya Bansal

IR was interesting.

Did not follow/understand much in the second half. The project was very hard. Sree (Grader-who-became-TA) was a great help.


 [r1]Tell us when you get Yahoo to point to you.

 [r2]sorry, this wasn’t meant to be search engines for dummies :-)

 [r3]He also said S*#%ford-ish. BLECH.

 [r4]You should always take notes, however fast things go! Staring at slides is a surefire way to not retain much unless you are much more disciplined than I.

 [r5]Imagine remembering such specific things 20 years from now.