282T Poster - Population Genetics
Thursday June 09, 8:30 PM - 9:15 PM

Digital image processing using alpha-molecules to detect selective sweeps


Authors:
Mahmudul Hasan; Md Ruhul Amin; Michael DeGiorgio

Affiliation: Florida Atlantic University

Keywords:
Theory & Method Development

In recent years, advances in image processing and machine learning have fueled a paradigm shift in detecting genomic regions under natural selection. Early machine learning techniques used population-genetic summary statistics as features, each of which focuses on specific genomic patterns expected under adaptive and neutral processes. Though such engineered features are important when training data are limited, the ease with which synthetic data can now be generated has led to the recent development of approaches that take images of haplotype alignments as input and automatically extract important features with deep learning techniques. Alpha-molecules are a class of techniques for multi-scale representation of objects that can extract a diverse set of features from images. One such method, wavelet decomposition, lends greater control over the high-frequency components of images. Another method, curvelet decomposition, extends the wavelet concept to capture events occurring along curves within images. We show that applying these alpha-molecule techniques to extract features from images of haplotypes yields high power and accuracy for detecting selective sweep signatures in genomic data with both linear and non-linear machine learning classifiers. Moreover, we find that such models are easy to visualize and interpret, with performance rivaling that of contemporary deep learning approaches for detecting selective sweeps.
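
To make the idea of wavelet-based feature extraction concrete, the following is a minimal sketch, not the authors' implementation. It assumes the simplest wavelet (Haar), a single decomposition level, and a toy 4x4 binary haplotype image in which rows are haplotypes and columns are sites. The function names `haar_1d` and `haar_2d` and the toy matrix are hypothetical, chosen only for illustration. The abstract does not state which wavelet family, number of levels, or image dimensions were used.

```python
def haar_1d(values):
    """One level of the orthonormal Haar transform on an even-length sequence.

    Returns (approximation, detail) coefficient lists, each half as long.
    """
    s = 2 ** 0.5
    approx = [(values[i] + values[i + 1]) / s for i in range(0, len(values), 2)]
    detail = [(values[i] - values[i + 1]) / s for i in range(0, len(values), 2)]
    return approx, detail


def haar_2d(image):
    """One level of a separable 2D Haar decomposition (rows, then columns).

    Returns four subbands: ll (coarse approximation) and lh, hl, hh
    (detail subbands). Subband naming conventions vary between libraries.
    """
    # Transform every row into a low-pass half and a high-pass half.
    low_rows, high_rows = [], []
    for row in image:
        a, d = haar_1d(row)
        low_rows.append(a)
        high_rows.append(d)

    def transform_columns(matrix):
        # Apply the 1D transform down each column, then transpose back.
        lows, highs = [], []
        for col in zip(*matrix):
            a, d = haar_1d(col)
            lows.append(a)
            highs.append(d)
        return [list(r) for r in zip(*lows)], [list(r) for r in zip(*highs)]

    ll, lh = transform_columns(low_rows)
    hl, hh = transform_columns(high_rows)
    return ll, lh, hl, hh


# Toy haplotype image (assumption): 1 = derived allele, 0 = ancestral allele.
image = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
]

ll, lh, hl, hh = haar_2d(image)

# Flatten all coefficients into one feature vector for a downstream classifier.
features = [v for band in (ll, lh, hl, hh) for row in band for v in row]
```

In this sketch the `ll` subband summarizes coarse structure, while the three detail subbands capture the high-frequency changes between neighboring haplotypes and sites. The flattened `features` list is the kind of input a linear or non-linear classifier could be trained on.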