Information Integration on the Web

(AAAI 2007 Tutorial)

Presenters: Subbarao Kambhampati & Craig Knoblock


The explosive growth of the web has resulted in thousands of structured, queryable information sources on the Internet, and in the promise of unprecedented information-gathering capabilities for lay users. Unfortunately, that promise has not yet been transformed into reality. While there are sources relevant to virtually any user query, the morass of sources presents a formidable hurdle to effectively accessing the information. One way of alleviating this problem is to develop web-based information integration agents, which take the user's query and access the relevant sources to answer it efficiently.

This tutorial will survey the research and systems for web-based information integration. There is a wide range of technical problems that must be addressed to integrate the diverse sources. We will describe approaches from both the Database and Artificial Intelligence communities that address these issues. Topics include the relationship to database integration and information retrieval, languages for mediation and exchange, machine learning techniques for generating wrappers, terminology alignment for combining data across sources, query optimization, and query execution. We will also describe the various integration systems and where they fit within the space of technical approaches. Finally, we will discuss important application areas, such as bioinformatics and geospatial data integration.

Schedule (Tentative)

  1. Motivation & Models for Information Integration (30m)
  2. Getting the data into structured format (30m)
  3. Getting sources into alignment (30m)
  4. Getting data into alignment (30m)
  5. Processing queries (45m)
  6. Applications and Wrap-up (15m)



Subbarao Kambhampati

Last modified: Sat Jul 28 08:33:50 MST 2007