CSE 494/598 Lecture Notes
CSE 494/598 Information Retrieval, Mining and Integration on
the Internet
This course is geared towards exposing students to some of the core
technologies for controlling and using the content on the
Internet. The following are some of the questions we will
consider:
- How do search engines work? Why are some
better than others?
- Can we think of the web as a big database/knowledge base and support efficient
database style query processing?
- Can we find useful pearls and patterns in the mass of
accessible data on the Internet?
This course will be a breadth-oriented introduction to the issues
involved in answering these questions.
Prerequisites: CSE 310 is required. Other courses that will help include
CSE 471 (AI), CSE 412 (Databases), and CSE 450 (Algorithms). I
am hoping that students have had at least one of these 400-level
courses already, but won't insist on them. Students planning
to register for this course are encouraged to talk to the
instructor (via email at rao wholivesat asu dot edu).
Grading: The grading will be based on class participation, exams and
projects.
Textbooks: There is no prescribed textbook. We will read papers (see
the reading list).
Overview: The best overview is the list of topics and lecture notes
from the previous offering (shown below).
Additional pointers:
Lecture Notes from Fall 2005 (slides in ppt; lectures in .wav)
- Introduction (8/22)
  - Audio of the lecture [Aug 22, 2005]
- Text retrieval; vector space ranking
  - Audio of the lecture [Aug 24, 2005]
  - Audio of the lecture [Aug 29, 2005]
  - Audio of the lecture [Aug 31, 2005]
- Indexing/Retrieval issues
- Correlation analysis & Latent Semantic Indexing
- Search engine technology
- Anatomy of Google etc
- Clustering
- Text Classification
- Filtering/Personalization
- Web & Databases: Why do we even care?
- XML and handling semi-structured data
- Semantic web and its standards (RDF/RDF-S/OWL...)
- Information Extraction
- Data/Information Integration/aggregation
- Query Processing in Data Integration: Gathering and Using Source Statistics
- Bridging Information Retrieval and Databases
- Social Networks
- Interactive Review + a (corny) ending (Here are the notes by the TA of the student review comments)
11/21; 11/23: DB & IR; Collection Selection; Web services
11/28; 11/30: Social Network Analysis (Kevin Bacon game; Erdős number;
trust propagation, etc.)
12/5: Interactive Review
12/17:
Topics:
DB+IR --> Imprecise queries (1 week)
Collection Selection (1 class)
web services (1 class)
Social Network Analysis (1-2 classes)
Subbarao Kambhampati
Last modified: Wed Jan 24 15:02:49 MST 2007