Computational methods for detection of differentially methylated regions using kernel distance and scan statistics

Faith Dunbar, Hongyan Xu, Duchwan Ryu, Santu Ghosh, Huidong Shi, Varghese George

Research output: Contribution to journal › Article

Abstract

Motivation: Researchers in genomics are increasingly interested in epigenetic factors such as DNA methylation because they play an important role in regulating gene expression without changes in the sequence of DNA. Abnormal DNA methylation is associated with many human diseases.

Results: We propose two different approaches to test for differentially methylated regions (DMRs) associated with complex traits, while accounting for correlations among CpG sites in the DMRs. The first approach is a nonparametric method using a kernel distance statistic and the second one is a likelihood-based method using a binomial spatial scan statistic. The kernel distance method uses the kernel function, while the binomial scan statistic approach uses a mixed-effects model to incorporate correlations among CpG sites. Extensive simulations show that both approaches have excellent control of type I error, and both have reasonable statistical power. The binomial scan statistic approach appears to have higher power, while the kernel distance method is computationally faster. The proposed methods are demonstrated using data from a chronic lymphocytic leukemia (CLL) study.
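The abstract does not give the exact form of either statistic, so the following is only a rough sketch of the two general ideas, not the authors' implementation. It assumes a Gaussian kernel with a fixed bandwidth and a permutation p-value for the kernel distance, and a plain two-group binomial likelihood ratio scanned over contiguous windows of CpG sites for the scan statistic. The mixed-effects modelling of between-site correlation described in the abstract is omitted. All function names, parameters, and defaults here are hypothetical.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    # Pairwise Gaussian kernel between rows of a (n x p) and rows of b (m x p).
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * bandwidth ** 2))


def kernel_distance(x, y, bandwidth=1.0):
    # Squared kernel distance between two samples (biased estimator).
    # Rows are subjects; columns are methylation levels at CpG sites in a region.
    return (gaussian_kernel(x, x, bandwidth).mean()
            + gaussian_kernel(y, y, bandwidth).mean()
            - 2.0 * gaussian_kernel(x, y, bandwidth).mean())


def kernel_distance_test(x, y, n_perm=999, bandwidth=1.0, seed=0):
    # Permutation test (an assumption of this sketch): shuffle group labels
    # and count how often the permuted distance reaches the observed one.
    rng = np.random.default_rng(seed)
    observed = kernel_distance(x, y, bandwidth)
    pooled = np.vstack([x, y])
    n_x = x.shape[0]
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        stat = kernel_distance(pooled[idx[:n_x]], pooled[idx[n_x:]], bandwidth)
        exceed += int(stat >= observed)
    return observed, (exceed + 1) / (n_perm + 1)


def binom_loglik(k, n):
    # Binomial log-likelihood at the MLE p = k / n (constant term dropped),
    # using the convention 0 * log(0) = 0.
    k, n = float(k), float(n)
    if n == 0:
        return 0.0
    p = k / n
    out = 0.0
    if k > 0:
        out += k * np.log(p)
    if n - k > 0:
        out += (n - k) * np.log(1.0 - p)
    return out


def binomial_scan(meth_case, tot_case, meth_ctrl, tot_ctrl, max_width=5):
    # Scan contiguous windows of CpG sites; for each window compare
    # "separate methylation rates per group" against "one pooled rate".
    # Inputs are per-site methylated counts and total counts for each group.
    best_llr, best_window = 0.0, None
    n_sites = len(meth_case)
    for start in range(n_sites):
        for end in range(start + 1, min(start + max_width, n_sites) + 1):
            kc, nc = meth_case[start:end].sum(), tot_case[start:end].sum()
            kt, nt = meth_ctrl[start:end].sum(), tot_ctrl[start:end].sum()
            llr = (binom_loglik(kc, nc) + binom_loglik(kt, nt)
                   - binom_loglik(kc + kt, nc + nt))
            if llr > best_llr:
                best_llr, best_window = llr, (start, end)
    return best_llr, best_window
```

In this sketch, `kernel_distance_test` returns the observed distance and a permutation p-value, and `binomial_scan` returns the largest log-likelihood ratio together with the half-open window `(start, end)` of site indices that attains it. Assessing the significance of the scan maximum would need a further step, such as permuting group labels, which is not shown.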

Original language: English (US)
Article number: 298
Journal: Genes
Volume: 10
Issue number: 4
DOI: 10.3390/genes10040298
State: Published - Apr 1 2019

Fingerprint

DNA Methylation
B-Cell Chronic Lymphocytic Leukemia
Genomics
Epigenomics
Research Personnel
Gene Expression

Keywords

  • Binomial scan statistic
  • CpG sites
  • DNA methylation
  • Kernel distance statistic
  • Mixed-effects model

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)

Cite this

Computational methods for detection of differentially methylated regions using kernel distance and scan statistics. / Dunbar, Faith; Xu, Hongyan; Ryu, Duchwan; Ghosh, Santu; Shi, Huidong; George, Varghese.

In: Genes, Vol. 10, No. 4, 298, 01.04.2019.

@article{a985cd0ff8154762a93565028c0ef5a0,
title = "Computational methods for detection of differentially methylated regions using kernel distance and scan statistics",
abstract = "Motivation: Researchers in genomics are increasingly interested in epigenetic factors such as DNA methylation because they play an important role in regulating gene expression without changes in the sequence of DNA. Abnormal DNA methylation is associated with many human diseases. Results: We propose two different approaches to test for differentially methylated regions (DMRs) associated with complex traits, while accounting for correlations among CpG sites in the DMRs. The first approach is a nonparametric method using a kernel distance statistic and the second one is a likelihood-based method using a binomial spatial scan statistic. The kernel distance method uses the kernel function, while the binomial scan statistic approach uses a mixed-effects model to incorporate correlations among CpG sites. Extensive simulations show that both approaches have excellent control of type I error, and both have reasonable statistical power. The binomial scan statistic approach appears to have higher power, while the kernel distance method is computationally faster. The proposed methods are demonstrated using data from a chronic lymphocytic leukemia (CLL) study.",
keywords = "Binomial scan statistic, CpG sites, DNA methylation, Kernel distance statistic, Mixed-effects model",
author = "Faith Dunbar and Hongyan Xu and Duchwan Ryu and Santu Ghosh and Huidong Shi and Varghese George",
year = "2019",
month = "4",
day = "1",
doi = "10.3390/genes10040298",
language = "English (US)",
volume = "10",
journal = "Genes",
issn = "2073-4425",
publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",
number = "4",
}

TY - JOUR
T1 - Computational methods for detection of differentially methylated regions using kernel distance and scan statistics
AU - Dunbar, Faith
AU - Xu, Hongyan
AU - Ryu, Duchwan
AU - Ghosh, Santu
AU - Shi, Huidong
AU - George, Varghese
PY - 2019/4/1
Y1 - 2019/4/1
AB - Motivation: Researchers in genomics are increasingly interested in epigenetic factors such as DNA methylation because they play an important role in regulating gene expression without changes in the sequence of DNA. Abnormal DNA methylation is associated with many human diseases. Results: We propose two different approaches to test for differentially methylated regions (DMRs) associated with complex traits, while accounting for correlations among CpG sites in the DMRs. The first approach is a nonparametric method using a kernel distance statistic and the second one is a likelihood-based method using a binomial spatial scan statistic. The kernel distance method uses the kernel function, while the binomial scan statistic approach uses a mixed-effects model to incorporate correlations among CpG sites. Extensive simulations show that both approaches have excellent control of type I error, and both have reasonable statistical power. The binomial scan statistic approach appears to have higher power, while the kernel distance method is computationally faster. The proposed methods are demonstrated using data from a chronic lymphocytic leukemia (CLL) study.
KW - Binomial scan statistic
KW - CpG sites
KW - DNA methylation
KW - Kernel distance statistic
KW - Mixed-effects model
UR - http://www.scopus.com/inward/record.url?scp=85068414796&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85068414796&partnerID=8YFLogxK
U2 - 10.3390/genes10040298
DO - 10.3390/genes10040298
M3 - Article
VL - 10
JO - Genes
JF - Genes
SN - 2073-4425
IS - 4
M1 - 298
ER -