216W Poster - Population Genetics
Wednesday June 08, 8:30 PM - 9:15 PM

Allelic and array size variation at human centromeres


Authors:
Carl Veller 1; Sasha Langley 2; Graham Coop 1; Charles Langley 1

Affiliations:
1) University of California, Davis; 2) University of California, Berkeley

Keywords:
Molecular Evolution

Progress in understanding the evolutionary genetics of centromeres has been hindered by their repetitive sequence content, which has historically precluded their assembly and thus analysis of the genetic variation present at these functionally important genomic sites. Recently, the substantial linkage disequilibrium caused by low recombination in and around centromeres has been exploited to characterize their allelic diversity, without the need to assemble the repetitive centromere sequences themselves. Here, we characterize population genetic variation at human centromeres in a large, pedigree-based genomic dataset. We identify major centromere haplotype (cenhap) clades for each chromosome, and confirm that cenhaps typed on this basis show consistent transmission within pedigrees. Next, we develop and validate a read-count-based proxy for the total centromere array size per chromosome per individual, and apply a maximum-likelihood method to estimate individual centromere array sizes. We find substantial size variation, both within and (especially) between cenhap clades. Examining transmission within family pedigrees, we find no evidence of strong segregation distortion in favor of one cenhap allele over another, or of larger cenhaps over smaller ones. Finally, we estimate full genealogical trees from single-nucleotide variation within cenhaps, and use these trees to evaluate various models for the long-term evolution of centromere array size.
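
To illustrate the kind of transmission test mentioned above, here is a minimal sketch. It is not the authors' method, which the abstract does not specify. It runs an exact two-sided binomial test of whether heterozygous parents transmit one cenhap more often than the Mendelian expectation of 1/2. The function name `transmission_test` and the example counts are hypothetical.

```python
from math import comb


def transmission_test(n_transmitted, n_total, p_null=0.5):
    """Exact two-sided binomial p-value for transmission counts.

    n_transmitted: number of informative transmissions in which the focal
                   cenhap was passed on (hypothetical input).
    n_total:       number of informative transmissions from heterozygous parents.
    p_null:        transmission probability under Mendelian segregation.
    """
    if not 0 <= n_transmitted <= n_total:
        raise ValueError("n_transmitted must lie between 0 and n_total")

    # Binomial probability of every possible transmission count under the null.
    pmf = [
        comb(n_total, k) * p_null**k * (1 - p_null) ** (n_total - k)
        for k in range(n_total + 1)
    ]

    # Two-sided p-value: total probability of outcomes that are no more
    # likely than the observed one (small tolerance for float rounding).
    observed = pmf[n_transmitted]
    p_value = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical counts, for illustration only (not data from the study).
p = transmission_test(n_transmitted=60, n_total=100)
print(f"two-sided p-value: {p:.4f}")
```

A test of this form treats each transmission as independent. A pedigree-based analysis would likely need to account for relatedness among transmissions and for multiple testing across chromosomes and cenhap clades.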