Multiple genome alignment by clustering pairwise matches

Jeong Hyeon Choi, Kwangmin Choi, Hwan Gue Cho, Sun Kim

Research output: Contribution to journal › Conference article

4 Citations (Scopus)

Abstract

We have developed a multiple genome alignment algorithm that uses a sequence clustering algorithm to combine local pairwise genome sequence matches produced by pairwise genome alignment tools, e.g., BLASTZ. Sequence clustering algorithms often generate clusters of sequences such that there exists a common shared region among all sequences in each cluster. To use a sequence clustering algorithm for genome alignment, it is necessary to handle numerous local alignments between a pair of genomes. We propose a multiple genome alignment method that converts the multiple genome alignment problem to the sequence clustering problem. This method does not need to make a guide tree to determine the order of multiple alignment, and it accurately detects multiple homologous regions. As a result, our multiple genome alignment algorithm performs competitively with existing algorithms. This is shown by an experiment that compares the performance of TBA, MultiPipMaker (MPM), and our algorithm in aligning 12 groups of 56 microbial genomes and evaluates the number of common COGs detected.
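The idea of turning pairwise matches into clusters without a guide tree can be illustrated with a small Python sketch. This is not the authors' algorithm: it assumes each local match is a pair of half-open intervals `(genome, start, end)`, and it groups intervals with a plain union-find whenever they are matched to each other or overlap on the same genome. The names `DisjointSet` and `cluster_matches` and the sample coordinates are hypothetical.

```python
from collections import defaultdict


class DisjointSet:
    """Minimal union-find over hashable items (illustrative helper)."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving keeps the trees shallow.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def cluster_matches(matches):
    """Group intervals linked by pairwise matches or by same-genome overlap.

    matches: iterable of ((genome, start, end), (genome, start, end)),
    with half-open coordinates. Returns a list of sets of intervals.
    """
    ds = DisjointSet()
    by_genome = defaultdict(list)
    for a, b in matches:
        ds.union(a, b)  # the two sides of one local match belong together
        by_genome[a[0]].append(a)
        by_genome[b[0]].append(b)

    # Sweep each genome left to right and merge overlapping intervals.
    for intervals in by_genome.values():
        intervals.sort(key=lambda iv: (iv[1], iv[2]))
        current, reach = intervals[0], intervals[0][2]
        for iv in intervals[1:]:
            if iv[1] < reach:  # overlaps the running chain
                ds.union(current, iv)
                reach = max(reach, iv[2])
            else:  # gap: start a new chain
                current, reach = iv, iv[2]

    clusters = defaultdict(set)
    for intervals in by_genome.values():
        for iv in intervals:
            clusters[ds.find(iv)].add(iv)
    return list(clusters.values())


if __name__ == "__main__":
    # Hypothetical matches among three genomes A, B, C.
    sample = [
        (("A", 100, 200), ("B", 50, 150)),
        (("B", 120, 220), ("C", 10, 110)),
        (("A", 500, 600), ("C", 400, 500)),
    ]
    for cluster in cluster_matches(sample):
        print(sorted(cluster))
```

In this toy input the two matches that overlap on genome B end up in one cluster spanning A, B, and C, while the third match forms its own cluster. A real method would also need to handle strand, partial overlaps, and repeats, which this sketch ignores.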

Original language: English (US)
Pages (from-to): 30-41
Number of pages: 12
Journal: Lecture Notes in Bioinformatics (Subseries of Lecture Notes in Computer Science)
Volume: 3388
State: Published - Oct 17 2005
Event: RECOMB 2004 International Workshop, RRCG 2004 - Comparative Genomics - Bertinoro, Italy
Duration: Oct 16 2004 - Oct 19 2004

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Multiple genome alignment by clustering pairwise matches. / Choi, Jeong Hyeon; Choi, Kwangmin; Cho, Hwan Gue; Kim, Sun.

In: Lecture Notes in Bioinformatics (Subseries of Lecture Notes in Computer Science), Vol. 3388, 17.10.2005, p. 30-41.

@article{b5d595ce7fb6489cabe3f22f51661399,
title = "Multiple genome alignment by clustering pairwise matches",
abstract = "We have developed a multiple genome alignment algorithm that uses a sequence clustering algorithm to combine local pairwise genome sequence matches produced by pairwise genome alignment tools, e.g., BLASTZ. Sequence clustering algorithms often generate clusters of sequences such that there exists a common shared region among all sequences in each cluster. To use a sequence clustering algorithm for genome alignment, it is necessary to handle numerous local alignments between a pair of genomes. We propose a multiple genome alignment method that converts the multiple genome alignment problem to the sequence clustering problem. This method does not need to make a guide tree to determine the order of multiple alignment, and it accurately detects multiple homologous regions. As a result, our multiple genome alignment algorithm performs competitively with existing algorithms. This is shown by an experiment that compares the performance of TBA, MultiPipMaker (MPM), and our algorithm in aligning 12 groups of 56 microbial genomes and evaluates the number of common COGs detected.",
author = "Choi, {Jeong Hyeon} and Kwangmin Choi and Cho, {Hwan Gue} and Sun Kim",
year = "2005",
month = "10",
day = "17",
language = "English (US)",
volume = "3388",
pages = "30--41",
journal = "Lecture Notes in Computer Science",
issn = "0302-9743",
publisher = "Springer Verlag",

}

TY - JOUR

T1 - Multiple genome alignment by clustering pairwise matches

AU - Choi, Jeong Hyeon

AU - Choi, Kwangmin

AU - Cho, Hwan Gue

AU - Kim, Sun

PY - 2005/10/17

Y1 - 2005/10/17

N2 - We have developed a multiple genome alignment algorithm that uses a sequence clustering algorithm to combine local pairwise genome sequence matches produced by pairwise genome alignment tools, e.g., BLASTZ. Sequence clustering algorithms often generate clusters of sequences such that there exists a common shared region among all sequences in each cluster. To use a sequence clustering algorithm for genome alignment, it is necessary to handle numerous local alignments between a pair of genomes. We propose a multiple genome alignment method that converts the multiple genome alignment problem to the sequence clustering problem. This method does not need to make a guide tree to determine the order of multiple alignment, and it accurately detects multiple homologous regions. As a result, our multiple genome alignment algorithm performs competitively with existing algorithms. This is shown by an experiment that compares the performance of TBA, MultiPipMaker (MPM), and our algorithm in aligning 12 groups of 56 microbial genomes and evaluates the number of common COGs detected.

UR - http://www.scopus.com/inward/record.url?scp=26444589766&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=26444589766&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:26444589766

VL - 3388

SP - 30

EP - 41

JO - Lecture Notes in Computer Science

JF - Lecture Notes in Computer Science

SN - 0302-9743

ER -