Mutation is the ultimate source of genetic variation, and the mutation rate is thus an important parameter governing the extent of that variation. Microsatellites are highly informative genetic markers that have been widely used in genetic studies. While previous studies showed that the mutation rate differs among di-, tri-, and tetranucleotide repeats, how the mutation rate is distributed within each class of repeat is poorly understood. This study is the first to reveal the pattern of mutation rate variation within dinucleotide repeats. Two data sets were used. The first consists of allele frequency data from 115 dinucleotide microsatellites distributed along the human genome, typed in 10 worldwide populations. The second data set is much larger, consisting of the allele frequencies of 5252 dinucleotide repeats from the Genome Database. The mutation rate at each locus is estimated with a new homozygosity-based estimator, which has been shown to be unbiased and highly efficient and is reasonably robust against deviations from the single-step model. The mutation rates among loci are well approximated by a gamma distribution, and the shape parameter of that distribution can be accurately estimated with this approach. These results provide basic guidelines for analyzing large-scale genomic data from microsatellite loci.
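As a rough illustration of the kind of analysis described above, the following Python sketch estimates a scaled mutation rate per locus from homozygosity and then fits a gamma distribution across loci. It is not the study's estimator, which the abstract does not specify. It assumes instead the classical single-step (stepwise mutation) expectation F ≈ 1/sqrt(1 + 2θ), with θ = 4Nμ, inverted as a simple moment estimator, followed by a method-of-moments gamma fit. All function names are hypothetical.

```python
from statistics import mean, variance


def sample_homozygosity(allele_counts):
    """Unbiased sample homozygosity from allele counts at one locus."""
    n = sum(allele_counts)
    if n < 2:
        raise ValueError("need at least two sampled alleles")
    sum_sq = sum((c / n) ** 2 for c in allele_counts)
    # Small-sample correction: (n * sum(p_i^2) - 1) / (n - 1)
    return (n * sum_sq - 1) / (n - 1)


def theta_from_homozygosity(f):
    """Invert F = 1 / sqrt(1 + 2*theta), the single-step model expectation.

    Assumption: this classical moment estimator stands in for the
    study's own homozygosity-based estimator, which is not given here.
    """
    if not 0 < f <= 1:
        raise ValueError("homozygosity must lie in (0, 1]")
    return (1.0 / f ** 2 - 1.0) / 2.0


def gamma_moments(values):
    """Method-of-moments gamma fit: returns (shape, scale)."""
    m = mean(values)
    v = variance(values)  # sample variance, n - 1 denominator
    return m * m / v, v / m


if __name__ == "__main__":
    # Made-up allele counts for three loci, for illustration only.
    loci = [[12, 5, 3], [8, 8, 4], [15, 3, 2]]
    thetas = [theta_from_homozygosity(sample_homozygosity(c)) for c in loci]
    print(gamma_moments(thetas))
```

If all loci share the same effective population size, θ is proportional to the per-locus mutation rate, so the fitted shape parameter describes the relative variation in mutation rate among loci; the scale parameter absorbs the unknown population size.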