Statistical methods for detecting differentially methylated regions based on MethylCap-seq data

Deepak Nag Ayyala, David E. Frankhouser, Javkhlan Ochir Ganbat, Guido Marcucci, Ralf Bundschuh, Pearlly Yan, Shili Lin

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

DNA methylation is a well-established epigenetic mark, whose pattern throughout the genome, especially in the promoter or CpG islands, may be modified in a cell at a disease stage. Recently developed probabilistic approaches allow distributing methylation signals at nucleotide resolution from MethylCap-seq data. Standard statistical methods for detecting differential methylation suffer from the 'curse of dimensionality' and sparsity in signals, resulting in high false-positive rates. Strong correlation of signals between CG sites also yields spurious results. In this article, we review the applicability of high-dimensional mean vector tests for detection of differentially methylated regions (DMRs) and compare and contrast such tests with other methods for detecting DMRs. Comprehensive simulation studies are conducted to highlight the performance of these tests under different settings. Based on our observations, we make recommendations on the optimal test to use. We illustrate the superiority of mean vector tests in detecting cancer-related canonical gene pathways, which are significantly enriched for acute myeloid leukemia and ovarian cancer.
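
As a rough illustration of what a high-dimensional mean vector test does, the sketch below compares two groups of samples over one region using the squared Euclidean distance between the group mean vectors, with a permutation p-value. This is not the specific procedure evaluated in the article; it is a generic stand-in chosen because, unlike Hotelling's T-squared, it does not require inverting a sample covariance matrix, which is singular when the number of CG sites exceeds the number of samples. The function names, the simulated data, and the parameter values are all assumptions made for illustration.

```python
import numpy as np


def mean_diff_statistic(x, y):
    """Squared Euclidean distance between the two group mean vectors.

    x, y: arrays of shape (n_samples, n_sites); columns are CG sites in a region.
    """
    d = x.mean(axis=0) - y.mean(axis=0)
    return float(d @ d)


def permutation_mean_vector_test(x, y, n_perm=999, seed=0):
    """Permutation test of equal mean vectors (hypothetical helper, not from the article)."""
    rng = np.random.default_rng(seed)
    observed = mean_diff_statistic(x, y)
    pooled = np.vstack([x, y])
    n_x = x.shape[0]
    exceed = 0
    for _ in range(n_perm):
        # Shuffle group labels by permuting the rows of the pooled data.
        idx = rng.permutation(pooled.shape[0])
        stat = mean_diff_statistic(pooled[idx[:n_x]], pooled[idx[n_x:]])
        if stat >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (1 + exceed) / (1 + n_perm)
    return observed, p_value


if __name__ == "__main__":
    # Simulated data only: 8 samples per group, 200 sites, so sites far outnumber samples.
    rng = np.random.default_rng(1)
    control = rng.normal(0.0, 1.0, size=(8, 200))
    disease = rng.normal(0.5, 1.0, size=(8, 200))
    stat, p = permutation_mean_vector_test(control, disease)
    print(f"statistic={stat:.3f}, permutation p-value={p:.4f}")
```

A region-level test of this kind produces one p-value per candidate region rather than one per CG site, which is the sense in which mean vector tests sidestep site-by-site testing. Correlation between neighbouring sites is handled here only through the permutation scheme, which keeps each sample's row intact.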

Original language: English (US)
Pages (from-to): 926-937
Number of pages: 12
Journal: Briefings in Bioinformatics
Volume: 17
Issue number: 6
DOI: https://doi.org/10.1093/BIB/BBV089
State: Published - Jan 1 2016
Externally published: Yes

Fingerprint

Methylation
Statistical methods
Genes
CpG Islands
DNA Methylation
Nucleotides
Acute Myeloid Leukemia
Epigenomics
Ovarian Neoplasms
Observation
Genome
Neoplasms

Keywords

  • Differentially methylated regions
  • High dimensionality
  • Mean vector test
  • MethylCap-seq

ASJC Scopus subject areas

  • Information Systems
  • Molecular Biology

Cite this

Ayyala, D. N., Frankhouser, D. E., Ganbat, J. O., Marcucci, G., Bundschuh, R., Yan, P., & Lin, S. (2016). Statistical methods for detecting differentially methylated regions based on MethylCap-seq data. Briefings in Bioinformatics, 17(6), 926-937. https://doi.org/10.1093/BIB/BBV089

Statistical methods for detecting differentially methylated regions based on MethylCap-seq data. / Ayyala, Deepak Nag; Frankhouser, David E.; Ganbat, Javkhlan Ochir; Marcucci, Guido; Bundschuh, Ralf; Yan, Pearlly; Lin, Shili.

In: Briefings in Bioinformatics, Vol. 17, No. 6, 01.01.2016, p. 926-937.

Ayyala, DN, Frankhouser, DE, Ganbat, JO, Marcucci, G, Bundschuh, R, Yan, P & Lin, S 2016, 'Statistical methods for detecting differentially methylated regions based on MethylCap-seq data', Briefings in Bioinformatics, vol. 17, no. 6, pp. 926-937. https://doi.org/10.1093/BIB/BBV089
Ayyala, Deepak Nag ; Frankhouser, David E. ; Ganbat, Javkhlan Ochir ; Marcucci, Guido ; Bundschuh, Ralf ; Yan, Pearlly ; Lin, Shili. / Statistical methods for detecting differentially methylated regions based on MethylCap-seq data. In: Briefings in Bioinformatics. 2016 ; Vol. 17, No. 6. pp. 926-937.
@article{e01e080e96c0469ea6484614e20d325c,
title = "Statistical methods for detecting differentially methylated regions based on MethylCap-seq data",
abstract = "DNA methylation is a well-established epigenetic mark, whose pattern throughout the genome, especially in the promoter or CpG islands, may be modified in a cell at a disease stage. Recently developed probabilistic approaches allow distributing methylation signals at nucleotide resolution from MethylCap-seq data. Standard statistical methods for detecting differential methylation suffer from the 'curse of dimensionality' and sparsity in signals, resulting in high false-positive rates. Strong correlation of signals between CG sites also yields spurious results. In this article, we review the applicability of high-dimensional mean vector tests for detection of differentially methylated regions (DMRs) and compare and contrast such tests with other methods for detecting DMRs. Comprehensive simulation studies are conducted to highlight the performance of these tests under different settings. Based on our observations, we make recommendations on the optimal test to use. We illustrate the superiority of mean vector tests in detecting cancer-related canonical gene pathways, which are significantly enriched for acute myeloid leukemia and ovarian cancer.",
keywords = "Differentially methylated regions, High dimensionality, Mean vector test, MethylCap-seq",
author = "Ayyala, {Deepak Nag} and Frankhouser, {David E.} and Ganbat, {Javkhlan Ochir} and Guido Marcucci and Ralf Bundschuh and Pearlly Yan and Shili Lin",
year = "2016",
month = "1",
day = "1",
doi = "10.1093/BIB/BBV089",
language = "English (US)",
volume = "17",
pages = "926--937",
journal = "Briefings in Bioinformatics",
issn = "1467-5463",
publisher = "Oxford University Press",
number = "6",

}

TY - JOUR

T1 - Statistical methods for detecting differentially methylated regions based on MethylCap-seq data

AU - Ayyala, Deepak Nag

AU - Frankhouser, David E.

AU - Ganbat, Javkhlan Ochir

AU - Marcucci, Guido

AU - Bundschuh, Ralf

AU - Yan, Pearlly

AU - Lin, Shili

PY - 2016/1/1

Y1 - 2016/1/1

AB - DNA methylation is a well-established epigenetic mark, whose pattern throughout the genome, especially in the promoter or CpG islands, may be modified in a cell at a disease stage. Recently developed probabilistic approaches allow distributing methylation signals at nucleotide resolution from MethylCap-seq data. Standard statistical methods for detecting differential methylation suffer from the 'curse of dimensionality' and sparsity in signals, resulting in high false-positive rates. Strong correlation of signals between CG sites also yields spurious results. In this article, we review the applicability of high-dimensional mean vector tests for detection of differentially methylated regions (DMRs) and compare and contrast such tests with other methods for detecting DMRs. Comprehensive simulation studies are conducted to highlight the performance of these tests under different settings. Based on our observations, we make recommendations on the optimal test to use. We illustrate the superiority of mean vector tests in detecting cancer-related canonical gene pathways, which are significantly enriched for acute myeloid leukemia and ovarian cancer.

KW - Differentially methylated regions

KW - High dimensionality

KW - Mean vector test

KW - MethylCap-seq

UR - http://www.scopus.com/inward/record.url?scp=85038129988&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85038129988&partnerID=8YFLogxK

U2 - 10.1093/BIB/BBV089

DO - 10.1093/BIB/BBV089

M3 - Article

VL - 17

SP - 926

EP - 937

JO - Briefings in Bioinformatics

JF - Briefings in Bioinformatics

SN - 1467-5463

IS - 6

ER -