Assessment of population structure and its effects on genome-wide association studies

Research output: Contribution to journal › Article

Abstract

Large-scale genome-wide association studies are promising for unraveling the genetic basis of complex diseases. However, population structure is a potential problem, and its effects on genetic association studies are controversial. Quantifying these effects is needed for valid data analysis and correct interpretation of results. In this study, we performed an extensive coalescent-based simulation study with varying levels of population structure to investigate its effects on large-scale genetic association studies. The effects of population structure are measured by the multiplicative change in the probability of Type I error, which is then correlated with the level of population structure. At each nominal level of the association tests, there is a positive relationship between the level of population structure and its effects, which can be summarized well by a regression function. At a given level of population structure, its effect on an association study increases drastically as the significance level of the test decreases. The Type I error is inflated by an amount approximately equal to Wright's FST, a measure used to quantify the magnitude of population structure. Therefore, in genome-wide association studies, the effects of population structure cannot be safely ignored and must be accounted for with proper methods. This study provides quantitative guidelines for accounting for the effects of population structure on genome-wide association studies in admixed populations.
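The two quantities the abstract relates, Wright's FST and the multiplicative change in Type I error, can be illustrated with a small toy simulation. The sketch below is not the coalescent-based simulation used in the study. It assumes a simpler two-subpopulation Balding-Nichols model, an allelic chi-square test, and arbitrary sample sizes and sampling proportions, and every function name in it is hypothetical. It only shows how an empirical false-positive rate at null SNPs can be compared with the nominal level.

```python
import math
import random


def wright_fst(subpop_freqs):
    """Wright's FST at one biallelic locus, equally weighted subpopulations.

    FST = Var(p_k) / (p_bar * (1 - p_bar)), where p_k are the subpopulation
    allele frequencies and p_bar is their mean.
    """
    k = len(subpop_freqs)
    p_bar = sum(subpop_freqs) / k
    if p_bar in (0.0, 1.0):
        return 0.0  # monomorphic locus: no variation to partition
    var_p = sum((p - p_bar) ** 2 for p in subpop_freqs) / k
    return var_p / (p_bar * (1.0 - p_bar))


def allelic_chi2_pvalue(case_alt, case_total, ctrl_alt, ctrl_total):
    """P-value of the 1-df chi-square test on a 2x2 allele-count table."""
    alt = case_alt + ctrl_alt
    ref = (case_total - case_alt) + (ctrl_total - ctrl_alt)
    if alt == 0 or ref == 0:
        return 1.0  # no allelic variation in the sample
    n = case_total + ctrl_total
    det = case_alt * (ctrl_total - ctrl_alt) - ctrl_alt * (case_total - case_alt)
    chi2 = n * det ** 2 / (case_total * ctrl_total * alt * ref)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def draw_alt_alleles(rng, n_people, frac_from_a, p_a, p_b):
    """Count alternate alleles in a sample mixing subpopulations A and B."""
    n_a = round(n_people * frac_from_a)
    chrom_a, chrom_b = 2 * n_a, 2 * (n_people - n_a)
    return (sum(rng.random() < p_a for _ in range(chrom_a))
            + sum(rng.random() < p_b for _ in range(chrom_b)))


def empirical_type1_rate(fst, frac_cases_a, frac_ctrls_a, alpha=0.05,
                         n_snps=200, n_per_group=500, seed=1):
    """Fraction of null SNPs (no disease effect) rejected at level alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_snps):
        p = rng.uniform(0.1, 0.9)  # assumed ancestral allele frequency
        # Balding-Nichols model (an assumption of this sketch): subpopulation
        # frequencies are Beta-distributed around p with divergence fst.
        a = p * (1.0 - fst) / fst
        b = (1.0 - p) * (1.0 - fst) / fst
        p_a, p_b = rng.betavariate(a, b), rng.betavariate(a, b)
        case_alt = draw_alt_alleles(rng, n_per_group, frac_cases_a, p_a, p_b)
        ctrl_alt = draw_alt_alleles(rng, n_per_group, frac_ctrls_a, p_a, p_b)
        pval = allelic_chi2_pvalue(case_alt, 2 * n_per_group,
                                   ctrl_alt, 2 * n_per_group)
        hits += pval < alpha
    return hits / n_snps


alpha = 0.05
# Cases drawn mostly from subpopulation A, controls mostly from B.
structured = empirical_type1_rate(0.05, 0.8, 0.2, alpha)
# Cases and controls drawn in the same proportions: no confounding.
balanced = empirical_type1_rate(0.05, 0.5, 0.5, alpha)

print("FST for frequencies 0.2 and 0.4:", wright_fst([0.2, 0.4]))
print("Type I error inflation, structured sampling:", structured / alpha)
print("Type I error inflation, balanced sampling:", balanced / alpha)
```

The sampling imbalance here is deliberately extreme, so the inflation factor this toy model produces should not be read as reproducing the paper's regression relationship. It illustrates only the direction of the effect.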

Original language: English (US)
Pages (from-to): 2843-2855
Number of pages: 13
Journal: Communications in Statistics - Theory and Methods
Volume: 38
Issue number: 16-17
State: Published - Jan 1 2009

Keywords

  • Complex diseases
  • False positives
  • Genetic variation
  • Genome-wide association
  • Heterozygosity
  • Population structure
  • SNP

ASJC Scopus subject areas

  • Statistics and Probability
