288B Poster - 03. Evolution
Friday April 08, 2:00 PM - 4:00 PM

Predicting Gene Essentiality in Non-Model Drosophila Species to Understand Phenotypic Evolution of New Genes


Authors:
Dylan Sosa; Manyuan Long

Affiliation: Department of Ecology & Evolution, The University of Chicago, Chicago, USA

Keywords:
a. genome evolution; t. bioinformatic and genome tools

Essential genes, whose loss of function is lethal, are conventionally presumed to be conserved and ancient, whereas young genes are considered dispensable and not critical to organismal survival. However, new genes, which are evolutionarily young, have been found to have essential functions in diverse processes such as centromere targeting, gametogenesis, human brain development, reproduction, and protein diversification. Because new genes retain signatures of the evolutionary forces that engendered them, their structures, and their functions, they provide a unique opportunity to gain insight into the evolution of essential functions.

Empirical investigations of essential genes on a whole-genome scale via functional genomic experiments such as RNAi are laborious and expensive, are typically available only for model species, and are hardly suitable for complex organisms such as humans or mice. This has resulted in a deficit of functional and phenotypic data for understudied non-model species, limiting our understanding of the evolution of essential gene functions. Because creating knock-out lines is intractable for non-model species and infeasible for complex organisms, there is a need to accurately and precisely predict essential genes in silico as candidates for functional interrogation and evolutionary analysis.

In this work, I will develop machine learning methods that use features extracted and patterns learned from D. melanogaster third-generation sequencing-based assemblies and phenomic data to predict essential new genes in non-model Drosophila species as targets for experimental and evolutionary analyses. Specifically, I will develop deep learning and ensemble learning algorithms to predict essential genes in non-model species, followed by gene age dating to identify new genes and their putative origination mechanisms. I will then conduct CRISPR/Cas9-based knock-outs of candidate essential new genes and validate their essentiality by measuring lethality and fertility effects in the resulting knock-out lines. This work will yield insight into the evolution and development of gene essentiality in both model and non-model species, and will provide open-source software for predicting essential new genes in Drosophila species, with potential for application in other non-model organisms.
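
As a rough illustration of the ensemble-learning idea described above, the sketch below trains a small bagged ensemble of decision stumps on labeled genes from a model species and scores unlabeled genes from a non-model species. It is not the project's actual pipeline: the two features (imagined here as expression breadth and normalized interaction degree), the toy values, the gene names, and all function names are assumptions made purely for illustration.

```python
import random

def fit_stump(X, y):
    """Pick the one-feature threshold rule with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                preds = [1 if polarity * (row[j] - t) >= 0 else 0 for row in X]
                errors = sum(p != label for p, label in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, polarity)
    return best[1:]  # (feature index, threshold, polarity)

def predict_stump(stump, row):
    j, t, polarity = stump
    return 1 if polarity * (row[j] - t) >= 0 else 0

def fit_bagged_stumps(X, y, n_estimators=25, seed=0):
    """Bagging: fit each stump on a bootstrap resample of the training genes."""
    rng = random.Random(seed)
    n = len(X)
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return ensemble

def essentiality_score(ensemble, row):
    """Fraction of stumps voting 'essential' (1) for this gene."""
    return sum(predict_stump(s, row) for s in ensemble) / len(ensemble)

# Hypothetical training set from a model species.
# Columns (assumed): expression breadth, normalized interaction degree.
train_X = [
    [0.90, 0.85], [0.80, 0.95], [0.95, 0.80], [0.85, 0.90],  # essential
    [0.10, 0.20], [0.30, 0.15], [0.20, 0.25], [0.15, 0.10],  # non-essential
]
train_y = [1, 1, 1, 1, 0, 0, 0, 0]

ensemble = fit_bagged_stumps(train_X, train_y)

# Hypothetical unlabeled genes from a non-model species.
candidates = {"gene_A": [0.99, 0.90], "gene_B": [0.05, 0.10]}
scores = {name: essentiality_score(ensemble, feats)
          for name, feats in candidates.items()}

for name, score in scores.items():
    print(f"{name}: essentiality score = {score:.2f}")
```

In practice, genes with high scores would be the candidates passed on to gene age dating and knock-out validation; real feature sets and models would be far richer than this toy example.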