Discovering statistically significant periodic gene expression

Jie Chen, Kuang Chao Chang

Research output: Contribution to journal › Review article

4 Citations (Scopus)

Abstract

A frequent application of microarray experiments is monitoring gene activity in a cell during the cell cycle or cell division. Such experiments produce high-throughput gene expression time series data. A new computational and statistical challenge in analyzing these time course data is to discover genes that are expressed periodically, with statistical significance, during the cell cycle. The challenge arises from the large number of genes measured simultaneously, the small to moderate number of measurements per gene taken at different time points, and the high levels of non-normal random noise inherent in the data. Computational and statistical approaches to the discovery and validation of periodic patterns of gene expression are, however, very limited. A good method of analysis should be able to search for significantly periodic genes while controlling the family-wise error (FWE) rate, the false discovery rate (FDR), or a variant of FDR when all gene expression profiles are compared simultaneously. This review paper gives a brief summary of the methods currently used to search for periodic genes and surveys two of them in detail. The first is a novel statistical inference approach, the C & G Procedure, which can be used to detect statistically significant periodically expressed genes when gene expression is measured at evenly spaced time points. The second is Lomb-Scargle periodogram analysis, which can be used to discover periodic genes when the gene profiles are not measured at evenly spaced time points or when the profiles contain missing values. The ultimate goal of this review paper is to give an exposition of the two surveyed methods for researchers in related fields.
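As a rough illustration of the second surveyed idea, the sketch below computes a normalized Lomb-Scargle periodogram for one unevenly sampled profile and then applies the Benjamini-Hochberg step-up rule to a list of per-gene p-values. It is not the review's C & G Procedure and not the authors' code. The function names, the frequency grid, and the peak p-value approximation (which assumes Gaussian white noise and independent frequencies, unlike the non-normal noise the abstract describes) are all assumptions made for this example.

```python
import math


def lomb_scargle(times, values, omegas):
    """Normalized Lomb-Scargle power at each angular frequency in omegas."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    powers = []
    for w in omegas:
        # The offset tau makes the sine and cosine terms orthogonal
        # on the actual (possibly uneven) sampling times.
        s2 = sum(math.sin(2 * w * t) for t in times)
        c2 = sum(math.cos(2 * w * t) for t in times)
        tau = math.atan2(s2, c2) / (2 * w)
        cos_terms = [math.cos(w * (t - tau)) for t in times]
        sin_terms = [math.sin(w * (t - tau)) for t in times]
        yc = sum((v - mean) * c for v, c in zip(values, cos_terms))
        ys = sum((v - mean) * s for v, s in zip(values, sin_terms))
        cc = sum(c * c for c in cos_terms)
        ss = sum(s * s for s in sin_terms)
        # Guard against a degenerate frequency where one basis vanishes.
        part_c = yc * yc / cc if cc > 1e-12 else 0.0
        part_s = ys * ys / ss if ss > 1e-12 else 0.0
        powers.append((part_c + part_s) / (2 * var))
    return powers


def peak_pvalue(peak_power, n_freqs):
    """Approximate p-value of the largest peak among n_freqs frequencies.

    Assumption: Gaussian white noise and independent frequencies.
    """
    return 1.0 - (1.0 - math.exp(-peak_power)) ** n_freqs


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return booleans marking which hypotheses are rejected at FDR alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose sorted p-value is <= k * alpha / m.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical profile: a period-4 sinusoid sampled at uneven times.
    times = [0, 0.7, 1.1, 2.4, 3.0, 4.3, 5.1, 5.8, 7.2, 8.0, 9.1, 9.9, 11.3, 12.0]
    true_omega = 2 * math.pi / 4
    values = [math.sin(true_omega * t) for t in times]
    omegas = [0.5, 1.0, true_omega, 2.0, 2.5]
    powers = lomb_scargle(times, values, omegas)
    print(powers, peak_pvalue(max(powers), len(omegas)))
```

In a genome-wide setting one would compute a peak p-value per gene and pass the whole list to `benjamini_hochberg`, so that the expected proportion of falsely declared periodic genes stays near the chosen level.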

Original language: English (US)
Pages (from-to): 228-246
Number of pages: 19
Journal: International Statistical Review
Volume: 76
Issue number: 2
DOIs: 10.1111/j.1751-5823.2008.00048.x
State: Published - Aug 1 2008
Externally published: Yes

Fingerprint

Gene Expression
Gene
Cell Cycle
Microarray
Familywise Error Rate
Experiment
Gene expression
Periodogram
Gene Expression Profile
Random Noise
Cell Division
Missing Values
Gene Expression Data
Time Series Data
Statistical Inference
High Throughput
Monitoring
Cell

Keywords

  • Classical periodogram
  • FDR
  • Gene expression
  • Lomb-Scargle periodogram
  • Periodic signals

ASJC Scopus subject areas

  • Mathematics (all)
  • Statistics and Probability

Cite this

Discovering statistically significant periodic gene expression. / Chen, Jie; Chang, Kuang Chao.

In: International Statistical Review, Vol. 76, No. 2, 01.08.2008, p. 228-246.

@article{6d7d0ee824f34e148c19c7856a2a036a,
title = "Discovering statistically significant periodic gene expression",
abstract = "A frequent application of microarray experiments is monitoring gene activity in a cell during the cell cycle or cell division. Such experiments produce high-throughput gene expression time series data. A new computational and statistical challenge in analyzing these time course data is to discover genes that are expressed periodically, with statistical significance, during the cell cycle. The challenge arises from the large number of genes measured simultaneously, the small to moderate number of measurements per gene taken at different time points, and the high levels of non-normal random noise inherent in the data. Computational and statistical approaches to the discovery and validation of periodic patterns of gene expression are, however, very limited. A good method of analysis should be able to search for significantly periodic genes while controlling the family-wise error (FWE) rate, the false discovery rate (FDR), or a variant of FDR when all gene expression profiles are compared simultaneously. This review paper gives a brief summary of the methods currently used to search for periodic genes and surveys two of them in detail. The first is a novel statistical inference approach, the C \& G Procedure, which can be used to detect statistically significant periodically expressed genes when gene expression is measured at evenly spaced time points. The second is Lomb-Scargle periodogram analysis, which can be used to discover periodic genes when the gene profiles are not measured at evenly spaced time points or when the profiles contain missing values. The ultimate goal of this review paper is to give an exposition of the two surveyed methods for researchers in related fields.",
keywords = "Classical periodogram, FDR, Gene expression, Lomb-Scargle periodogram, Periodic signals",
author = "Chen, Jie and Chang, {Kuang Chao}",
year = "2008",
month = "8",
day = "1",
doi = "10.1111/j.1751-5823.2008.00048.x",
language = "English (US)",
volume = "76",
pages = "228--246",
journal = "International Statistical Review",
issn = "0306-7734",
publisher = "International Statistical Institute",
number = "2",

}

TY - JOUR

T1 - Discovering statistically significant periodic gene expression

AU - Chen, Jie

AU - Chang, Kuang Chao

PY - 2008/8/1

Y1 - 2008/8/1

N2 - A frequent application of microarray experiments is monitoring gene activity in a cell during the cell cycle or cell division. Such experiments produce high-throughput gene expression time series data. A new computational and statistical challenge in analyzing these time course data is to discover genes that are expressed periodically, with statistical significance, during the cell cycle. The challenge arises from the large number of genes measured simultaneously, the small to moderate number of measurements per gene taken at different time points, and the high levels of non-normal random noise inherent in the data. Computational and statistical approaches to the discovery and validation of periodic patterns of gene expression are, however, very limited. A good method of analysis should be able to search for significantly periodic genes while controlling the family-wise error (FWE) rate, the false discovery rate (FDR), or a variant of FDR when all gene expression profiles are compared simultaneously. This review paper gives a brief summary of the methods currently used to search for periodic genes and surveys two of them in detail. The first is a novel statistical inference approach, the C & G Procedure, which can be used to detect statistically significant periodically expressed genes when gene expression is measured at evenly spaced time points. The second is Lomb-Scargle periodogram analysis, which can be used to discover periodic genes when the gene profiles are not measured at evenly spaced time points or when the profiles contain missing values. The ultimate goal of this review paper is to give an exposition of the two surveyed methods for researchers in related fields.

KW - Classical periodogram

KW - FDR

KW - Gene expression

KW - Lomb-Scargle periodogram

KW - Periodic signals

UR - http://www.scopus.com/inward/record.url?scp=50049123258&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=50049123258&partnerID=8YFLogxK

U2 - 10.1111/j.1751-5823.2008.00048.x

DO - 10.1111/j.1751-5823.2008.00048.x

M3 - Review article

AN - SCOPUS:50049123258

VL - 76

SP - 228

EP - 246

JO - International Statistical Review

JF - International Statistical Review

SN - 0306-7734

IS - 2

ER -