Exploiting parallelism to accelerate keyword search on deep-web sources

Tantan Liu; Fan Wang; Gagan Agrawal

doi:10.1007/978-3-642-02879-3_12

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Increasingly, biological data is being shared over the deep web. Many biological queries can only be answered by successively searching a number of distinct web-sites. This paper introduces a system that exploits parallelization for accelerating search over multiple deep web data sources. An interactive, two-stage multi-threading system is developed to achieve task parallelization, thread parallelization, and pipelined parallelization. We show the effectiveness of our system by considering a number of queries involving SNP datasets. We show that most of the queries can be accelerated significantly by exploiting these three forms of parallelism.

Original language	English (US)
Title of host publication	Data Integration in the Life Sciences - 6th International Workshop, DILS 2009, Proceedings
Pages	141-156
Number of pages	16
DOIs	https://doi.org/10.1007/978-3-642-02879-3_12
State	Published - 2009
Externally published	Yes
Event	6th International Workshop on Data Integration in the Life Sciences, DILS 2009 - Manchester, United Kingdom; Duration: Jul 20 2009 → Jul 22 2009

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	5647 LNBI
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	6th International Workshop on Data Integration in the Life Sciences, DILS 2009
Country/Territory	United Kingdom
City	Manchester
Period	7/20/09 → 7/22/09

ASJC Scopus subject areas

Theoretical Computer Science
General Computer Science

Access to Document

10.1007/978-3-642-02879-3_12

Cite this

Liu, T., Wang, F., & Agrawal, G. (2009). Exploiting parallelism to accelerate keyword search on deep-web sources. In Data Integration in the Life Sciences - 6th International Workshop, DILS 2009, Proceedings (pp. 141-156). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5647 LNBI). https://doi.org/10.1007/978-3-642-02879-3_12

Exploiting parallelism to accelerate keyword search on deep-web sources. / Liu, Tantan; Wang, Fan; Agrawal, Gagan.
Data Integration in the Life Sciences - 6th International Workshop, DILS 2009, Proceedings. 2009. p. 141-156 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 5647 LNBI).

Liu, T, Wang, F & Agrawal, G 2009, Exploiting parallelism to accelerate keyword search on deep-web sources. in Data Integration in the Life Sciences - 6th International Workshop, DILS 2009, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5647 LNBI, pp. 141-156, 6th International Workshop on Data Integration in the Life Sciences, DILS 2009, Manchester, United Kingdom, 7/20/09. https://doi.org/10.1007/978-3-642-02879-3_12

Liu T, Wang F, Agrawal G. Exploiting parallelism to accelerate keyword search on deep-web sources. In Data Integration in the Life Sciences - 6th International Workshop, DILS 2009, Proceedings. 2009. p. 141-156. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-642-02879-3_12

@inproceedings{c8533b88c05e443eaac5f317cdac16db,
title = "Exploiting parallelism to accelerate keyword search on deep-web sources",
abstract = "Increasingly, biological data is being shared over the deep web. Many biological queries can only be answered by successively searching a number of distinct web-sites. This paper introduces a system that exploits parallelization for accelerating search over multiple deep web data sources. An interactive, two-stage multi-threading system is developed to achieve task parallelization, thread parallelization, and pipelined parallelization. We show the effectiveness of our system by considering a number of queries involving SNP datasets. We show that most of the queries can be accelerated significantly by exploiting these three forms of parallelism.",
author = "Tantan Liu and Fan Wang and Gagan Agrawal",
year = "2009",
doi = "10.1007/978-3-642-02879-3_12",
language = "English (US)",
isbn = "3642028780",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "141--156",
booktitle = "Data Integration in the Life Sciences - 6th International Workshop, DILS 2009, Proceedings",
note = "6th International Workshop on Data Integration in the Life Sciences, DILS 2009 ; Conference date: 20-07-2009 Through 22-07-2009",
}

TY - GEN
T1 - Exploiting parallelism to accelerate keyword search on deep-web sources
AU - Liu, Tantan
AU - Wang, Fan
AU - Agrawal, Gagan
PY - 2009
Y1 - 2009
N2 - Increasingly, biological data is being shared over the deep web. Many biological queries can only be answered by successively searching a number of distinct web-sites. This paper introduces a system that exploits parallelization for accelerating search over multiple deep web data sources. An interactive, two-stage multi-threading system is developed to achieve task parallelization, thread parallelization, and pipelined parallelization. We show the effectiveness of our system by considering a number of queries involving SNP datasets. We show that most of the queries can be accelerated significantly by exploiting these three forms of parallelism.
AB - Increasingly, biological data is being shared over the deep web. Many biological queries can only be answered by successively searching a number of distinct web-sites. This paper introduces a system that exploits parallelization for accelerating search over multiple deep web data sources. An interactive, two-stage multi-threading system is developed to achieve task parallelization, thread parallelization, and pipelined parallelization. We show the effectiveness of our system by considering a number of queries involving SNP datasets. We show that most of the queries can be accelerated significantly by exploiting these three forms of parallelism.
UR - http://www.scopus.com/inward/record.url?scp=70350350693&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70350350693&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-02879-3_12
DO - 10.1007/978-3-642-02879-3_12
M3 - Conference contribution
AN - SCOPUS:70350350693
SN - 3642028780
SN - 9783642028786
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 141
EP - 156
BT - Data Integration in the Life Sciences - 6th International Workshop, DILS 2009, Proceedings
T2 - 6th International Workshop on Data Integration in the Life Sciences, DILS 2009
Y2 - 20 July 2009 through 22 July 2009
ER -
