Exploiting parallelism to accelerate keyword search on deep-web sources

Tantan Liu, Fan Wang, Gagan Agrawal

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Increasingly, biological data is being shared over the deep web. Many biological queries can only be answered by successively searching a number of distinct web-sites. This paper introduces a system that exploits parallelization for accelerating search over multiple deep web data sources. An interactive, two-stage multi-threading system is developed to achieve task parallelization, thread parallelization, and pipelined parallelization. We show the effectiveness of our system by considering a number of queries involving SNP datasets. We show that most of the queries can be accelerated significantly by exploiting these three forms of parallelism.
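The abstract names three forms of parallelism but does not define them here, so the following is only a minimal Python sketch of one possible reading, not the authors' implementation. The source functions `search_snp_source` and `search_detail_source`, the `run_pipeline` helper, and all returned values are hypothetical stand-ins; a real deep-web source would be queried through its web form over HTTP. In this sketch, independent first-stage queries run concurrently in a thread pool, and a second stage starts issuing dependent queries as soon as first-stage results arrive, so the two stages overlap.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for deep-web sources (illustrative only).
def search_snp_source(gene):
    """Pretend to query a source that lists SNP ids for a gene."""
    return [f"{gene}-snp{i}" for i in range(3)]


def search_detail_source(snp_id):
    """Pretend to query a second source that depends on a SNP id."""
    return {"snp": snp_id, "detail": f"info({snp_id})"}


_DONE = object()  # sentinel marking the end of the stage-one stream


def run_pipeline(genes, workers=4):
    handoff = queue.Queue()  # carries stage-one output to stage two
    results = []

    def stage_one():
        # Independent gene queries run concurrently in a thread pool.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            for snp_ids in pool.map(search_snp_source, genes):
                for snp_id in snp_ids:
                    handoff.put(snp_id)
        handoff.put(_DONE)

    def stage_two():
        # Dependent queries are submitted as items arrive, so this stage
        # overlaps with stage one instead of waiting for it to finish.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            futures = []
            while True:
                item = handoff.get()
                if item is _DONE:
                    break
                futures.append(pool.submit(search_detail_source, item))
            results.extend(f.result() for f in futures)

    threads = [threading.Thread(target=stage_one),
               threading.Thread(target=stage_two)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    for row in run_pipeline(["geneA", "geneB"]):
        print(row)
```

Because the simulated sources return instantly, the sketch shows only the structure; any speedup in practice would depend on network latency and on how many queries are truly independent.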

Original language: English (US)
Title of host publication: Data Integration in the Life Sciences - 6th International Workshop, DILS 2009, Proceedings
Pages: 141-156
Number of pages: 16
State: Published - 2009
Externally published: Yes
Event: 6th International Workshop on Data Integration in the Life Sciences, DILS 2009 - Manchester, United Kingdom
Duration: Jul 20 2009 → Jul 22 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5647 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 6th International Workshop on Data Integration in the Life Sciences, DILS 2009
Country: United Kingdom
City: Manchester
Period: 7/20/09 → 7/22/09

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
