240W Poster - Population Genetics
Wednesday June 08, 8:30 PM - 9:15 PM

A heterogeneous landscape of selection and interactions in genes revealed by two-locus statistics


Author:
Aaron Ragsdale

Affiliation: University of Wisconsin-Madison, Madison, WI

Keywords:
Natural selection

Selection impacts patterns of genetic diversity over large regions of the genome. How mutations combine to affect individual fitness is actively studied using theoretical, simulation, and empirical approaches. In addition to distorting patterns of linked neutral variation (for example, through background selection and hitchhiking), selected mutations are known to interfere and interact with each other, affecting probabilities of fixation, allele frequency trajectories, and correlations between pairs of mutations. Linkage disequilibrium (LD) is particularly sensitive to interference and epistasis between selected mutations, and a number of recent studies have used patterns of LD to test for epistatic interactions, with some disagreement over how to interpret the observed patterns.
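
For readers less familiar with two-locus statistics, the signed LD coefficient discussed here can be illustrated with a minimal sketch. It uses the standard definitions D = f_AB - f_A f_B and r^2 = D^2 / (f_A (1 - f_A) f_B (1 - f_B)). The function name two_locus_ld and the convention that A and B label the derived (or rarer) alleles are assumptions made for illustration, not taken from the abstract, and this is not the numerical method developed in this work.

```python
def two_locus_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r2) from counts of the four two-locus haplotypes.

    Assumption: A and B label the derived (or rarer) allele at each locus,
    so a positive D means those alleles co-occur more often than expected
    under independence.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("at least one haplotype is required")

    f_AB = n_AB / n            # frequency of the AB haplotype
    p = (n_AB + n_Ab) / n      # frequency of allele A at the first locus
    q = (n_AB + n_aB) / n      # frequency of allele B at the second locus

    D = f_AB - p * q           # signed LD coefficient
    denom = p * (1 - p) * q * (1 - q)
    # r2 is undefined when either locus is monomorphic in the sample
    r2 = D * D / denom if denom > 0 else float("nan")
    return D, r2


if __name__ == "__main__":
    # Perfectly coupled alleles: only AB and ab haplotypes are observed.
    print(two_locus_ld(5, 0, 0, 5))
    # Independent loci: all four haplotypes are equally common.
    print(two_locus_ld(25, 25, 25, 25))
```

In this sketch the sign of D depends entirely on which alleles are labeled A and B, which is why a polarization convention has to be fixed before asking whether LD between selected mutations is positive or negative.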

Despite ongoing interest in learning about selective interactions from observed patterns of LD, analytic and even numerical approaches are lacking for computing expected patterns of variation between pairs of loci under the combined effects of selection, dominance, epistasis, and demography. Here, we develop a numerical approach to compute the two-locus sampling distribution under diploid selection with arbitrary dominance and epistasis, recombination, and variable population size. We use this approach to explore how epistasis and dominance affect expected patterns of LD, including under the non-steady-state demography relevant to human populations. Using whole-genome sequencing data from humans, we find that selection strengths and interactions vary across protein-coding genes in a way that correlates with annotated domains and conserved genic elements. Observed positive LD between missense mutations within genes is largely driven by positive correlations between pairs of mutations that fall within the same conserved domain, pointing to compensatory or antagonistic epistatic effects in those domains. This heterogeneous landscape of both mutational fitness effects and selective interactions within protein-coding genes calls for more refined inferences of the joint distribution of fitness and interactive effects.