Yochan Information Integration Group

We are broadly interested in developing flexible frameworks for data and information integration. Our projects include QBase, QUIC and QPIAD, BibFinder, Havasu, and Emerac. We are part of ET-I3, a university-wide initiative on Intelligent Information Integration. Our work is supported in part by the Office of Naval Research (grant N000140910032) and three Google research awards (2007, 2010 and 2013).


  • Lydia Manikonda is working on crowd-sourced computation and Instagram analysis.
  • Vamsi Meduri is working on extensions of BayesWipe.
  • Professor Subbarao Kambhampati is keeping himself busy being a copy-editor for all their papers...


  • Yuheng Hu worked on the SocSent framework for social media analytics. He is currently at IBM Almaden, and will be joining the University of Illinois, Chicago.
  • Sushovan De worked on the BayesWipe system for probabilistic data rectification. He is off at Google.
  • Anirudh Acharya extended ET-LDA to support online processing of Tweet-Transcript alignment. He is off to Yahoo!
  • Tejas Mallapura worked on context recovery for orphan Tweets. She is off at Mathworks.
  • Manikandan Vijayakumar worked on context recovery for Tweets. He is off at American Express.
  • Srijith Ravikumar worked on trust sensitive ranking of tweets. He is off at Amazon.
  • Preeth Inder Singh worked on storing
  • Raju Balakrishnan worked on Ad Ranking, Source reputation assessment, uncertain databases, etc. He is currently minting money at Groupon.
  • Manish Kumar extended Source Rank to be sensitive to source topics. Currently at Amazon.
  • Rohit Raghunathan looked at the relative advantages of AFDs vs. Graphical Models in handling incomplete data and imprecise queries. Currently at Amazon.
  • Sanil Salvi did follow-up and extension work on SMARTInt.
  • Ravi Gummadi (co-advised with Pat Langley) worked on joining autonomous data-sources in the absence of PK/FK dependencies (Currently at .
  • Anupam Khulbe (co-advised with Pat Langley) worked on joining autonomous data-sources in the absence of PK/FK dependencies (Currently at Amazon)
  • Garrett Wolf worked on the QUIC and QPIAD systems, which deal with imprecision and incompleteness in autonomous databases.
  • Aravind Krishna worked on an efficient single-pass approach for generating approximate functional dependencies. (Currently at Yahoo!)
  • Bhaumik Chokshi worked on novelty and redundancy analysis in collection selection, and developed the ROSCO system. (Currently at Microsoft Search)
  • Jianchun Fan worked on flexible approaches for web-service composition, multi-objective query optimization and reasoning with incomplete data. (Currently at Amazon)
  • Hemal Khatri worked on reasoning with incomplete data in web-data sources. He is currently at MSN Search.
  • Wes Dyer did an honors thesis on handling relevance and overlap together in collection selection. (Currently at Microsoft Live Search).
  • Thomas Hernandez worked on integrating bio-informatics data sources, and on statistics-oriented approaches for meta-search on text databases. He is currently gainfully employed at Amazon.
  • Zaiqing Nie developed BibFinder, a meta-search engine for bibliographic entries that uses automatically gathered coverage and overlap statistics. Earlier, he had done work on multi-objective query optimization for data integration scenarios. Currently at Microsoft Research (Asia). He is one of the principals on Libra.
  • Ullas Nambiar worked on supporting imprecise queries in information aggregation scenarios. He is currently a researcher at IBM India Research Labs.
  • Eric Lambrecht developed the Emerac data integration system, which uses a novel framework for supporting recursive information gathering queries.
  • Sreelakshmi Vaddi worked on adaptive execution techniques for data integration scenarios.