A statistical change point model approach for the detection of DNA copy number variations in array CGH data

Jie Chen, Yu Ping Wang

Research output: Contribution to journal › Article

33 Citations (Scopus)

Abstract

Array comparative genomic hybridization (aCGH) provides a high-resolution and high-throughput technique for screening of copy number variations (CNVs) within the entire genome. This technique, compared to the conventional CGH, significantly improves the identification of chromosomal abnormalities. However, due to the random noise inherited in the imaging and hybridization process, identifying statistically significant DNA copy number changes in aCGH data is challenging. We propose a novel approach that uses the mean and variance change point model (MVCM) to detect CNVs or breakpoints in aCGH data sets. We derive an approximate p-value for the test statistic and also give the estimate of the locus of the DNA copy number change. We carry out simulation studies to evaluate the accuracy of the estimate and the p-value formulation. These simulation results show that the approach is effective in identifying copy number changes. The approach is also tested on fibroblast cancer cell line data, breast tumor cell line data, and breast cancer cell line aCGH data sets that are publicly available. Changes that have not been identified by the circular binary segmentation (CBS) method but are biologically verified are detected by our approach on these cell lines with higher sensitivity and specificity than CBS.
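
The abstract names a mean and variance change point model but does not spell out its test statistic. As a rough illustration only, the sketch below scans a sequence of log-ratio values for a single point at which both the mean and the variance of a Gaussian model change, using a plain likelihood-ratio statistic. It is not the authors' MVCM statistic or their approximate p-value; the function name `mean_var_change_point`, the `min_seg` parameter, and the Gaussian assumption are all choices made here for illustration.

```python
import math


def _mle_variance(segment):
    """Maximum-likelihood (divide-by-n) variance of a list of numbers."""
    mean = sum(segment) / len(segment)
    return sum((v - mean) ** 2 for v in segment) / len(segment)


def mean_var_change_point(x, min_seg=2):
    """Scan for one change in mean and variance (hypothetical helper).

    Returns (k, stat), where k is the number of observations before the
    estimated change and stat is -2 * log(likelihood ratio) at that k.
    Returns (None, 0.0) if no admissible split has positive variances.
    """
    n = len(x)
    if n < 2 * min_seg:
        raise ValueError("sequence too short for the requested min_seg")

    var_all = _mle_variance(x)
    if var_all <= 0:
        return None, 0.0  # constant sequence: nothing to detect

    best_k, best_stat = None, 0.0
    for k in range(min_seg, n - min_seg + 1):
        v1 = _mle_variance(x[:k])
        v2 = _mle_variance(x[k:])
        if v1 <= 0 or v2 <= 0:
            continue  # skip degenerate segments to keep log() defined
        # -2 log LR for "no change" versus "change after position k"
        stat = n * math.log(var_all) - k * math.log(v1) - (n - k) * math.log(v2)
        if best_k is None or stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat
```

Calling `mean_var_change_point(values)` on one chromosome's log-ratios returns the most likely split index together with the statistic. Judging significance would still require a null distribution for the maximum, which the paper addresses with its approximate p-value and which this sketch leaves out.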

Original language: English (US)
Article number: 4695823
Pages (from-to): 529-541
Number of pages: 13
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 6
Issue number: 4
DOI: 10.1109/TCBB.2008.129
State: Published - Oct 1 2009

Fingerprint

DNA Copy Number Variations
Change-point Model
Comparative Genomic Hybridization
Comparative Genomics
Statistical Model
DNA
Cells
Cell Line
Line
Cell
p-Value
Breast Neoplasms
Segmentation
Binary
Fibroblasts
Tumor Cell Line
Chromosome Aberrations
Random Noise
Tumors
Screening

Keywords

  • ACGH microarray data
  • CNVs
  • DNA copy numbers
  • Gene expression
  • Statistical hypothesis testing

ASJC Scopus subject areas

  • Biotechnology
  • Genetics
  • Applied Mathematics

Cite this

A statistical change point model approach for the detection of DNA copy number variations in array CGH data. / Chen, Jie; Wang, Yu Ping.

In: IEEE/ACM Transactions on Computational Biology and Bioinformatics, Vol. 6, No. 4, 4695823, 01.10.2009, p. 529-541.

@article{abee8279df1e487fa3b162cb0c72ba4a,
title = "A statistical change point model approach for the detection of DNA copy number variations in array CGH data",
abstract = "Array comparative genomic hybridization (aCGH) provides a high-resolution and high-throughput technique for screening of copy number variations (CNVs) within the entire genome. This technique, compared to the conventional CGH, significantly improves the identification of chromosomal abnormalities. However, due to the random noise inherited in the imaging and hybridization process, identifying statistically significant DNA copy number changes in aCGH data is challenging. We propose a novel approach that uses the mean and variance change point model (MVCM) to detect CNVs or breakpoints in aCGH data sets. We derive an approximate p-value for the test statistic and also give the estimate of the locus of the DNA copy number change. We carry out simulation studies to evaluate the accuracy of the estimate and the p-value formulation. These simulation results show that the approach is effective in identifying copy number changes. The approach is also tested on fibroblast cancer cell line data, breast tumor cell line data, and breast cancer cell line aCGH data sets that are publicly available. Changes that have not been identified by the circular binary segmentation (CBS) method but are biologically verified are detected by our approach on these cell lines with higher sensitivity and specificity than CBS.",
keywords = "ACGH microarray data, CNVs, DNA copy numbers, Gene expression, Statistical hypothesis testing",
author = "Chen, Jie and Wang, {Yu Ping}",
year = "2009",
month = "10",
day = "1",
doi = "10.1109/TCBB.2008.129",
language = "English (US)",
volume = "6",
pages = "529--541",
journal = "IEEE/ACM Transactions on Computational Biology and Bioinformatics",
issn = "1545-5963",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "4",
}

TY - JOUR
T1 - A statistical change point model approach for the detection of DNA copy number variations in array CGH data
AU - Chen, Jie
AU - Wang, Yu Ping
PY - 2009/10/1
Y1 - 2009/10/1
AB - Array comparative genomic hybridization (aCGH) provides a high-resolution and high-throughput technique for screening of copy number variations (CNVs) within the entire genome. This technique, compared to the conventional CGH, significantly improves the identification of chromosomal abnormalities. However, due to the random noise inherited in the imaging and hybridization process, identifying statistically significant DNA copy number changes in aCGH data is challenging. We propose a novel approach that uses the mean and variance change point model (MVCM) to detect CNVs or breakpoints in aCGH data sets. We derive an approximate p-value for the test statistic and also give the estimate of the locus of the DNA copy number change. We carry out simulation studies to evaluate the accuracy of the estimate and the p-value formulation. These simulation results show that the approach is effective in identifying copy number changes. The approach is also tested on fibroblast cancer cell line data, breast tumor cell line data, and breast cancer cell line aCGH data sets that are publicly available. Changes that have not been identified by the circular binary segmentation (CBS) method but are biologically verified are detected by our approach on these cell lines with higher sensitivity and specificity than CBS.

KW - ACGH microarray data
KW - CNVs
KW - DNA copy numbers
KW - Gene expression
KW - Statistical hypothesis testing
UR - http://www.scopus.com/inward/record.url?scp=75449088061&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=75449088061&partnerID=8YFLogxK
U2 - 10.1109/TCBB.2008.129
DO - 10.1109/TCBB.2008.129
M3 - Article
VL - 6
SP - 529
EP - 541
JO - IEEE/ACM Transactions on Computational Biology and Bioinformatics
JF - IEEE/ACM Transactions on Computational Biology and Bioinformatics
SN - 1545-5963
IS - 4
M1 - 4695823
ER -