285W Poster - Population Genetics
Wednesday June 08, 9:15 PM - 10:00 PM

Robust supervised machine learning for population genetic inference with domain adaptation


Authors:
Ziyi Mo 1, 2; Adam Siepel 2

Affiliations:
1) Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY; 2) School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY

Keywords:
Theory & Method Development

A number of supervised machine learning methods have recently been proposed to address a range of problems in population genetics. Despite their much-improved performance over traditional statistical methods, model mis-specification remains the Achilles’ heel of this new paradigm. Here, we propose domain adaptation techniques as a powerful tool for mitigating the effect of model mis-specification on the performance of these methods. We demonstrate that the Deep Reconstruction-Classification Network (DRCN) helps our previously proposed SIA model identify selective sweeps more accurately when the parameters used to simulate the training data do not match those underlying the data encountered at test time. We anticipate that this approach will be widely applicable and can be an important tool for building confidence in results obtained with supervised machine learning methods for inference.
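
As a rough illustration of the idea behind a reconstruction-classification objective, the sketch below combines a classification loss on labeled (simulated) source examples with a reconstruction loss on unlabeled target examples. It is only a minimal sketch under stated assumptions, not the SIA or DRCN implementation: the function names, the weight `lam`, and the choice of cross-entropy and mean squared error are illustrative, and the shared encoder that would produce the predictions and reconstructions is omitted.

```python
import math


def cross_entropy(probs, label):
    """Negative log-probability of the true class for one labeled source example."""
    return -math.log(probs[label])


def mse(x, x_hat):
    """Mean squared reconstruction error for one unlabeled target example."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def combined_loss(source_probs, source_labels, target_inputs, target_recons, lam=0.5):
    """Weighted sum of source classification loss and target reconstruction loss.

    lam (hypothetical name) trades off the two tasks: lam=1.0 uses only the
    classification term, lam=0.0 only the reconstruction term.
    """
    l_cls = sum(
        cross_entropy(p, y) for p, y in zip(source_probs, source_labels)
    ) / len(source_labels)
    l_rec = sum(
        mse(x, r) for x, r in zip(target_inputs, target_recons)
    ) / len(target_inputs)
    return lam * l_cls + (1.0 - lam) * l_rec


# Toy values standing in for network outputs (assumed, not real data).
loss = combined_loss(
    source_probs=[[0.9, 0.1], [0.2, 0.8]],   # predicted class probabilities
    source_labels=[0, 1],                    # e.g. 0 = neutral, 1 = sweep
    target_inputs=[[1.0, 0.0, 1.0]],         # unlabeled target features
    target_recons=[[0.8, 0.1, 0.9]],         # decoder reconstructions
    lam=0.7,
)
print(loss)
```

In a full model, both terms would be computed from the same encoder, so minimizing the reconstruction term on target data shapes the features that the classifier also relies on.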