A new measure of population structure using multiple single nucleotide polymorphisms and its relationship with FST

Hongyan Xu, Bayazid Sarkar, Varghese George

Research output: Contribution to journal, Article

4 Citations (Scopus)

Abstract

Background. Large-scale genome-wide association studies are promising for unraveling the genetic basis of complex diseases. Population structure is a potential problem, and its effects on genetic association studies are controversial. The first step in systematically quantifying those effects is to choose an appropriate measure of population structure for human data. The commonly used measure is Wright's FST. A set of subpopulations is generally assumed to have a single value of FST; however, the estimates can differ between loci. Since population structure is a concept at the population level, a measure that uses information across loci would be desirable.

Findings. In this study we propose a C parameter adjusted according to the sample size from each subpopulation. The new measure C is based on the c parameter proposed for SNP data, which was assumed to be subpopulation-specific and common to all loci. We performed extensive simulations of samples with varying levels of population structure to investigate the properties and relationships of both measures, and found that the two measures generally agree well.

Conclusion. The new measure simultaneously uses the marker information across the genome. It has the advantage of easy interpretation as a single measure of population structure, yet it can also assess population differentiation.
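The point that FST estimates can differ between loci can be illustrated with a minimal sketch. It uses the textbook heterozygosity form of Wright's FST for a biallelic SNP, FST = (H_T - H_S) / H_T. It is not the C measure proposed in the article. The function name `fst_per_locus`, the weighting of subpopulations by sample size, and the allele frequencies below are all assumptions made for illustration.

```python
def fst_per_locus(freqs, sizes):
    """Wright's FST for one biallelic SNP (illustrative sketch only).

    freqs: allele frequency of the SNP in each subpopulation (hypothetical input)
    sizes: sample size of each subpopulation, used here as weights (an assumption)
    """
    total = sum(sizes)
    weights = [n / total for n in sizes]
    # Weighted mean allele frequency in the pooled population
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    # Expected heterozygosity of the pooled population
    h_t = 2 * p_bar * (1 - p_bar)
    # Weighted mean expected heterozygosity within subpopulations
    h_s = sum(w * 2 * p * (1 - p) for w, p in zip(weights, freqs))
    if h_t == 0:
        return 0.0  # monomorphic locus: no variation to partition
    return (h_t - h_s) / h_t


# Three made-up loci observed in the same two equally sized subpopulations
loci = [[0.5, 0.5], [0.2, 0.8], [0.0, 1.0]]
per_locus = [fst_per_locus(freqs, [100, 100]) for freqs in loci]
print(per_locus)
```

Although the subpopulations are the same for every locus, the per-locus values differ, which is the motivation given above for a measure that pools information across loci.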

Original language: English (US)
Article number: 21
Journal: BMC Research Notes
Volume: 2
DOI: 10.1186/1756-0500-2-21
State: Published - Dec 1 2009

Fingerprint

Polymorphism
Single Nucleotide Polymorphism
Nucleotides
Genes
Population
Genome-Wide Association Study
Genetic Association Studies
Sample Size
Genome

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology(all)

Cite this

@article{02b52cbb8a7f4aafb4a03b43cbab8549,
title = "A new measure of population structure using multiple single nucleotide polymorphisms and its relationship with FST",
abstract = "Background. Large-scale genome-wide association studies are promising for unraveling the genetic basis of complex diseases. Population structure is a potential problem, the effects of which on genetic association studies are controversial. The first step to systematically quantify the effects of population structure is to choose an appropriate measure of population structure for human data. The commonly used measure is Wright's FST. For a set of subpopulations it is generally assumed to be one value of FST. However, the estimates could be different for distinct loci. Since population structure is a concept at the population level, a measure of population structure that utilized the information across loci would be desirable. Findings. In this study we propose an adjusted C parameter according to the sample size from each sub-population. The new measure C is based on the c parameter proposed for SNP data, which was assumed to be subpopulation-specific and common for all loci. In this study, we performed extensive simulations of samples with varying levels of population structure to investigate the properties and relationships of both measures. It is found that the two measures generally agree well. Conclusion. The new measure simultaneously uses the marker information across the genome. It has the advantage of easy interpretation as one measure of population structure and yet can also assess population differentiation.",
author = "Hongyan Xu and Bayazid Sarkar and Varghese George",
year = "2009",
month = "12",
day = "1",
doi = "10.1186/1756-0500-2-21",
language = "English (US)",
volume = "2",
journal = "BMC Research Notes",
issn = "1756-0500",
publisher = "BioMed Central",
}

TY - JOUR

T1 - A new measure of population structure using multiple single nucleotide polymorphisms and its relationship with FST

AU - Xu, Hongyan

AU - Sarkar, Bayazid

AU - George, Varghese

PY - 2009/12/1

Y1 - 2009/12/1

AB - Background. Large-scale genome-wide association studies are promising for unraveling the genetic basis of complex diseases. Population structure is a potential problem, the effects of which on genetic association studies are controversial. The first step to systematically quantify the effects of population structure is to choose an appropriate measure of population structure for human data. The commonly used measure is Wright's FST. For a set of subpopulations it is generally assumed to be one value of FST. However, the estimates could be different for distinct loci. Since population structure is a concept at the population level, a measure of population structure that utilized the information across loci would be desirable. Findings. In this study we propose an adjusted C parameter according to the sample size from each sub-population. The new measure C is based on the c parameter proposed for SNP data, which was assumed to be subpopulation-specific and common for all loci. In this study, we performed extensive simulations of samples with varying levels of population structure to investigate the properties and relationships of both measures. It is found that the two measures generally agree well. Conclusion. The new measure simultaneously uses the marker information across the genome. It has the advantage of easy interpretation as one measure of population structure and yet can also assess population differentiation.

UR - http://www.scopus.com/inward/record.url?scp=77049105150&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77049105150&partnerID=8YFLogxK

U2 - 10.1186/1756-0500-2-21

DO - 10.1186/1756-0500-2-21

M3 - Article

C2 - 19284702

AN - SCOPUS:77049105150

VL - 2

JO - BMC Research Notes

JF - BMC Research Notes

SN - 1756-0500

M1 - 21

ER -