A statistical framework for integrating two microarray data sets in differential expression analysis

Yinglei Lai, Sarah E. Eckenrode, Jin-Xiong She

Research output: Contribution to journal › Article

8 Citations (Scopus)

Abstract

Background: Different microarray data sets can be collected for studying the same or similar diseases. We expect to achieve a more efficient analysis of differential expression if an efficient statistical method can be developed for integrating different microarray data sets. Although many statistical methods have been proposed for data integration, the genome-wide concordance of different data sets has not been well considered in the analysis.

Results: Before considering data integration, it is necessary to evaluate the genome-wide concordance so that misleading results can be avoided. Based on the test results, different subsequent actions are suggested. The evaluation of genome-wide concordance and the data integration can be achieved based on normal-distribution-based mixture models.

Conclusion: The results from our simulation study suggest that misleading results can be generated if the genome-wide concordance issue is not appropriately considered. Our method provides a rigorous parametric solution. The results also show that our method is robust to certain model misspecification and is practically useful for the integrative analysis of differential expression.

Original language: English (US)
Article number: S23
Journal: BMC Bioinformatics
Volume: 10
Issue number: SUPPL. 1
DOI: 10.1186/1471-2105-10-S1-S23
State: Published - Jan 30 2009

Fingerprint

Differential Expression
Microarrays
Microarray Data
Data integration
Genes
Concordance
Genome
Statistical methods
Statistical method
Normal Distribution
Parametric Solutions
Model Misspecification
Mixture Model
Gaussian distribution
Framework
Datasets
Simulation Study
Necessary

ASJC Scopus subject areas

  • Biochemistry
  • Molecular Biology
  • Computer Science Applications
  • Structural Biology
  • Applied Mathematics

Cite this

A statistical framework for integrating two microarray data sets in differential expression analysis. / Lai, Yinglei; Eckenrode, Sarah E.; She, Jin-Xiong.

In: BMC Bioinformatics, Vol. 10, No. SUPPL. 1, S23, 30.01.2009.

@article{619f9b2a0da045fbb3975eb12d00bf51,
title = "A statistical framework for integrating two microarray data sets in differential expression analysis",
abstract = "Background: Different microarray data sets can be collected for studying the same or similar diseases. We expect to achieve a more efficient analysis of differential expression if an efficient statistical method can be developed for integrating different microarray data sets. Although many statistical methods have been proposed for data integration, the genome-wide concordance of different data sets has not been well considered in the analysis. Results: Before considering data integration, it is necessary to evaluate the genome-wide concordance so that misleading results can be avoided. Based on the test results, different subsequent actions are suggested. The evaluation of genome-wide concordance and the data integration can be achieved based on the normal distribution based mixture models. Conclusion: The results from our simulation study suggest that misleading results can be generated if the genome-wide concordance issue is not appropriately considered. Our method provides a rigorous parametric solution. The results also show that our method is robust to certain model misspecification and is practically useful for the integrative analysis of differential expression.",
author = "Lai, Yinglei and Eckenrode, Sarah E. and She, Jin-Xiong",
year = "2009",
month = "1",
day = "30",
doi = "10.1186/1471-2105-10-S1-S23",
language = "English (US)",
volume = "10",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
number = "SUPPL. 1",

}

TY - JOUR

T1 - A statistical framework for integrating two microarray data sets in differential expression analysis

AU - Lai, Yinglei

AU - Eckenrode, Sarah E.

AU - She, Jin-Xiong

PY - 2009/1/30

Y1 - 2009/1/30

N2 - Background: Different microarray data sets can be collected for studying the same or similar diseases. We expect to achieve a more efficient analysis of differential expression if an efficient statistical method can be developed for integrating different microarray data sets. Although many statistical methods have been proposed for data integration, the genome-wide concordance of different data sets has not been well considered in the analysis. Results: Before considering data integration, it is necessary to evaluate the genome-wide concordance so that misleading results can be avoided. Based on the test results, different subsequent actions are suggested. The evaluation of genome-wide concordance and the data integration can be achieved based on the normal distribution based mixture models. Conclusion: The results from our simulation study suggest that misleading results can be generated if the genome-wide concordance issue is not appropriately considered. Our method provides a rigorous parametric solution. The results also show that our method is robust to certain model misspecification and is practically useful for the integrative analysis of differential expression.

AB - Background: Different microarray data sets can be collected for studying the same or similar diseases. We expect to achieve a more efficient analysis of differential expression if an efficient statistical method can be developed for integrating different microarray data sets. Although many statistical methods have been proposed for data integration, the genome-wide concordance of different data sets has not been well considered in the analysis. Results: Before considering data integration, it is necessary to evaluate the genome-wide concordance so that misleading results can be avoided. Based on the test results, different subsequent actions are suggested. The evaluation of genome-wide concordance and the data integration can be achieved based on the normal distribution based mixture models. Conclusion: The results from our simulation study suggest that misleading results can be generated if the genome-wide concordance issue is not appropriately considered. Our method provides a rigorous parametric solution. The results also show that our method is robust to certain model misspecification and is practically useful for the integrative analysis of differential expression.

UR - http://www.scopus.com/inward/record.url?scp=60849090744&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=60849090744&partnerID=8YFLogxK

U2 - 10.1186/1471-2105-10-S1-S23

DO - 10.1186/1471-2105-10-S1-S23

M3 - Article

VL - 10

JO - BMC Bioinformatics

JF - BMC Bioinformatics

SN - 1471-2105

IS - SUPPL. 1

M1 - S23

ER -