Efficiently gathering information on the Internet using AI & DB techniques: The EMERAC PROJECT
Motivation
Data representation
Data representation (2)
Data location
Tricky issues
Desirable Properties of information gathering plans
Source Access Limitations
Representing Source Access Limitations
Computing source-complete plans
Building Source Complete Plans
Complexity of finding maximally-contained plans (Certain answers)
Practical Problems with Plans derived from source inversion rules
Optimization challenges in EMERAC
Minimizing information gathering plans
Greedily Minimizing Information Gathering Plans
LCW vs. Naïve [Artificial Sources]
Issues in ordering source calls
Our Approach: Assumptions
Our Approach: Overview
The HTBP Table
The Algorithm
Example
More capable sources
More realistic overlap statistics
Optimizing for First n-tuples
XML ….
Current directions
The EMERAC Crowd
Email: rao@asu.edu
Home Page: http://rakaposhi.eas.asu.edu