Statistical Considerations on NGS Data for Inferring Copy Number Variations

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Next-generation sequencing (NGS) technology has revolutionized research in genetics and genomics, producing massive amounts of data and opening new avenues for answering unresolved questions in genetics. NGS data are usually stored at three levels: image files, sequence tags, and aligned reads. These data typically range in size from several hundred gigabytes to several terabytes. Biostatisticians and bioinformaticians typically work with the aligned NGS read count data (the last of these levels) for data modeling and interpretation. To home in on one use of NGS technology, researchers apply it to profile the whole genome and study DNA copy number variations (CNVs) in an individual subject (or patient) as well as in groups of subjects (or patients). The resulting aligned NGS read count data are then modeled with appropriate mathematical and statistical approaches so that the loci of CNVs can be accurately detected. This book chapter summarizes the most widely used statistical methods for detecting CNVs using NGS data. The goal is to provide readers with a comprehensive resource on available statistical approaches for inferring DNA copy number variations from NGS data.

Original language: English (US)
Title of host publication: Methods in Molecular Biology
Publisher: Humana Press Inc.
Pages: 27-58
Number of pages: 32
State: Published - 2021

Publication series

Name: Methods in Molecular Biology
Volume: 2243
ISSN (Print): 1064-3745
ISSN (Electronic): 1940-6029

Keywords

  • Bayesian analysis
  • CNVs
  • Information criterion
  • Likelihood ratio test
  • NGS reads
  • Read counts
  • Read depth
  • Statistical change point analysis

ASJC Scopus subject areas

  • Molecular Biology
  • Genetics
