243T Poster - Population Genetics
Thursday June 09, 9:15 PM - 10:00 PM

Large-scale comparative population genetics identifies repeated targets of natural selection in birds


Authors:
Allison Shultz 1; Cade Mirchandani 2; Sara Wuitchik 3; Brian Arnold 4; Erik Enbody 2; Russell Corbett-Detig 2; Timothy Sackton 3

Affiliations:
1) Natural History Museum of Los Angeles County, Los Angeles, CA; 2) University of California, Santa Cruz, Santa Cruz, CA; 3) Harvard University, Cambridge, MA; 4) Princeton University, Princeton, NJ

Keywords:
Natural selection

In the past decade, increasingly large quantities of sequencing data from a diverse range of non-model organisms have been submitted to public databases, but reanalysis and reuse of these rich data have been difficult due to computational batch effects and a lack of infrastructure. To facilitate large-scale comparative population genetics research, we have implemented snpArcher, an easy-to-use Snakemake pipeline that generates variant calls using bwa/GATK and is optimized for use in non-model organisms. Using snpArcher, we reanalyzed public resequencing datasets covering nearly 5000 individuals from over 100 species of non-mammalian vertebrates. This collection of analyzed datasets, all generated with consistent methods and filtering, is publicly available via Globus as the “Comparative Population Genomics Collection.” From this full collection, we selected for additional analysis a subset of species for which an annotated reference genome and appropriate outgroup resequencing data were available. For these species, we used McDonald-Kreitman tests to identify protein-coding genes with evidence for recent positive selection. By comparing these targets of recent selection across species, we were able to identify a set of commonly adapting proteins. We highlight examples of repeatedly selected proteins, including a number with functions in the immune system, and discuss the implications for convergent molecular adaptation.
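
For readers unfamiliar with the per-gene test mentioned above, the following is a minimal sketch of a standard McDonald-Kreitman calculation, not the authors' implementation. It compares nonsynonymous and synonymous counts for fixed differences (Dn, Ds) and polymorphisms (Pn, Ps) with a two-sided Fisher's exact test and reports the common estimator alpha = 1 - (Ds * Pn) / (Dn * Ps). The function names and the example counts are hypothetical and are not taken from this study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more likely than the observed one
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def mk_test(dn, ds, pn, ps):
    """Return (alpha, p_value) for one gene; alpha is None if undefined."""
    alpha = None if dn == 0 or ps == 0 else 1 - (ds * pn) / (dn * ps)
    return alpha, fisher_exact_two_sided(dn, ds, pn, ps)


# Hypothetical counts for a single gene (illustrative only)
alpha, p_value = mk_test(dn=7, ds=17, pn=2, ps=42)
print(f"alpha = {alpha:.3f}, p = {p_value:.4f}")
```

An excess of nonsynonymous fixed differences relative to polymorphism (alpha above zero together with a small p-value) is the pattern usually read as evidence for positive selection. A comparative analysis across many genes and species would also need multiple-testing correction, which this sketch omits.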