CSE 494 Readings
The main papers are the un-indented entries. The indented,
numbered papers are optional readings.
General (non-technical) readings
A new textbook on information retrieval. Chapters from the book provide good textbook material for IR topics.
textbook (in development) on mining massive datasets. Has
good coverage of link analysis, map-reduce, advertising
on the web, etc.
Text Retrieval (a draft chapter from Wei
Meng, SUNY Binghamton. Used with Dr. Meng's permission).
Special readings for latent semantic indexing: Chapter on
LSI in the Manning et al. book
Engine Technology (a draft chapter from Wei
Meng, SUNY Binghamton. Used with Dr. Meng's permission).
cluster (read the first three pages to get an idea of
how Google processes a query on a parallel Linux cluster
with tons of replication)
Overall Search Engine
17 from the IR book draft.
Text Classification & Collaborative and Content-based Filtering
Database refresher readings
XML as a Semi-structured Language
based techniques in data integration. Alon Levy
On the need for Schema Mapping
Query Optimization/Processing in Data Integration
Collection Selection (Text data aggregation)
Combining Database and Information Retrieval
All of the following are short papers...
- Some basics of linear algebra (vectors, matrices, eigenvalues).
vector spaces and Information Retrieval, Berry et al. (a math view of linear algebra in IR)
- Here are refreshers on database background.
Last modified: Tue Oct 4 19:55:29 MST 2011