A tool for supporting integration across multiple flat-file datasets

Zhang Xuan, Gagan Agrawal

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Scopus citations

Abstract

Traditionally, biologists focused on a single research subject. New high-throughput experimental and analytical technologies, such as microarray and BLAST programs, have changed this. An important functionality required now is the ability to process queries about multiple data entries with little user intervention. This paper presents the design, implementation, and evaluation of a data integration tool that supports database-like query operations across flat-file biological datasets. Compared with existing solutions, our system has several advantages: no database management system is required, users can still use declarative languages to communicate with the system, and no data parsing, loading, or indexing utility programs need to be written. We have used the system on three biological queries, each of which was inspired by an actual study from the bioinformatics research literature. These case studies have demonstrated the functionality and scalability of our tool. Overall, our approach provides a lightweight and scalable solution for data integration over flat-file datasets.
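To make "database-like query operations across flat-file datasets" concrete, the following is a minimal sketch of an equi-join plus filter evaluated directly over two tab-delimited files, with no database management system involved. It is an illustration only, not the tool described in the paper: the file contents, the column names (`gene_id`, `symbol`, `level`), and the helper names (`read_flat_file`, `hash_join`, `high_expression`) are all hypothetical, and the paper's declarative interface and metadata handling are not modeled here.

```python
import csv
import io


def read_flat_file(text, delimiter="\t"):
    """Parse delimited flat-file text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def hash_join(left_rows, right_rows, key):
    """Equi-join two lists of records on `key` using an in-memory hash table."""
    # Build phase: index the right-hand records by join key.
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)
    # Probe phase: emit one merged record per matching pair.
    joined = []
    for row in left_rows:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


# Hypothetical sample data standing in for two flat-file datasets.
GENES = "gene_id\tsymbol\nG1\tTP53\nG2\tBRCA1\nG3\tMYC\n"
EXPRESSION = "gene_id\tlevel\nG1\t2.5\nG3\t0.7\nG3\t1.1\n"


def high_expression(threshold):
    """Roughly: SELECT symbol, level FROM genes JOIN expression
    USING (gene_id) WHERE level > threshold."""
    genes = read_flat_file(GENES)
    expression = read_flat_file(EXPRESSION)
    rows = hash_join(genes, expression, "gene_id")
    return [
        (r["symbol"], float(r["level"]))
        for r in rows
        if float(r["level"]) > threshold
    ]


if __name__ == "__main__":
    for symbol, level in high_expression(1.0):
        print(symbol, level)
```

In this sketch the join is hand-coded. The point of a tool like the one described is that users would state such a query declaratively rather than write the parsing and join logic themselves.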

Original language: English (US)
Title of host publication: Proceedings - Sixth IEEE Symposium on BioInformatics and BioEngineering, BIBE 2006
Pages: 141-148
Number of pages: 8
State: Published - 2006
Externally published: Yes
Event: 6th IEEE Symposium on BioInformatics and BioEngineering, BIBE 2006 - Arlington, VA, United States
Duration: Oct 16 2006 to Oct 18 2006

Publication series

Name: Proceedings - Sixth IEEE Symposium on BioInformatics and BioEngineering, BIBE 2006

Conference

Conference: 6th IEEE Symposium on BioInformatics and BioEngineering, BIBE 2006
Country: United States
City: Arlington, VA
Period: 10/16/06 to 10/18/06

ASJC Scopus subject areas

  • Biotechnology
  • Computer Science Applications
  • Information Systems
