288W Poster - Population Genetics
Wednesday June 08, 8:30 PM - 9:15 PM

Insights into D. melanogaster and D. simulans transcriptome evolution and complexity using transcript distance (TranD)


Authors:
Adalena Nanni 1,2; James Titus-McQuillan 3; Oleksandr Moskalenko 4; Francisco Pardo-Palacios 5; Sarah Signor 6; Srna Vlaho 7; Zihao Liu 1,2; Ana Conesa 2,5,8; Rebekah Rogers 3; Lauren McIntyre 1,2

Affiliations:
1) Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL; 2) University of Florida Genetics Institute, University of Florida, Gainesville, FL; 3) University of North Carolina Department of Bioinformatics, Charlotte, NC; 4) University of Florida Research Computing, University of Florida, Gainesville, FL; 5) Dept. of Applied Statistics and Operational Research, and Quality, Polytechnical University of Valencia, Spain; 6) Department of Biological Sciences, North Dakota State University, Fargo, ND; 7) Department of Biological Sciences, University of Southern California, Los Angeles, CA; 8) Institute for Integrative Systems Biology, Spanish National Research Council (CSIC), Paterna, Spain

Keywords:
Theory & Method Development

Alternative splicing is a critical component of evolution. As the branches separating species deepen, tracing transcript orthologs becomes increasingly complex. Yet comparing transcripts between and within species is an important first step toward understanding how transcript structure evolves between species and contributes to sub-functionalization. These questions are complicated by data quality and availability, as the amount and quality of data differ widely among species and between tissues. The recent explosion of affordable long-read sequencing of mRNA has enabled the study of transcriptional variation, and there is a clear need for additional straightforward, reproducible metrics for comparing transcripts. In addition to total transcript length, we propose structural phenotypes (intron retention, donor/acceptor variation, alternative exon cassettes, alternative 5’ and 3’ UTRs) and nucleotide-level distance metrics for comparing transcripts. These are implemented in TranD, a PyPI package that is also available on the open-source, web-based Galaxy platform. We use TranD to enumerate variation in long reads relative to the annotation and to compare methods for estimating transcripts from long reads (FLAIR and IsoSeq3). We further illustrate the utility of this approach by comparing competing annotations and by identifying isoform variation between male and female D. melanogaster.
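
To make the idea of a nucleotide-level distance and a structural phenotype concrete, the following is a minimal Python sketch. It is not TranD's implementation or API: the function names, the 1-based inclusive (start, end) exon representation, and the particular distance (proportion of nucleotides not shared) are all assumptions chosen for illustration.

```python
# Illustrative sketch only; names and definitions are hypothetical, not TranD's API.
# A transcript is assumed to be a list of (start, end) exons on one strand of one
# chromosome, with 1-based inclusive coordinates.

def exon_positions(exons):
    """Return the set of genomic positions covered by the exons."""
    covered = set()
    for start, end in exons:
        covered.update(range(start, end + 1))
    return covered


def nucleotide_distance(exons_a, exons_b):
    """Count shared and transcript-specific nucleotides for two transcripts."""
    a = exon_positions(exons_a)
    b = exon_positions(exons_b)
    only_a = len(a - b)
    only_b = len(b - a)
    total = len(a | b)
    # Assumed distance: fraction of all covered nucleotides found in only one transcript.
    prop_different = (only_a + only_b) / total if total else 0.0
    return {
        "shared": len(a & b),
        "only_a": only_a,
        "only_b": only_b,
        "prop_different": prop_different,
    }


def introns(exons):
    """Return the introns implied by consecutive exons, as (start, end) pairs."""
    ordered = sorted(exons)
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])]


def has_intron_retention(exons_a, exons_b):
    """True if an exon of one transcript fully spans an intron of the other."""
    def spans(exon_list, intron_list):
        return any(e_start <= i_start and i_end <= e_end
                   for e_start, e_end in exon_list
                   for i_start, i_end in intron_list)
    return spans(exons_a, introns(exons_b)) or spans(exons_b, introns(exons_a))


# Two hypothetical transcripts of the same gene: the second retains the intron.
spliced = [(100, 200), (300, 400)]
retained = [(100, 400)]
result = nucleotide_distance(spliced, retained)
retention = has_intron_retention(spliced, retained)
```

In this toy pair, the 99 intronic positions (201 to 299) are present only in the second transcript, so they are the only nucleotides contributing to the distance, and the pair is flagged as an intron retention event. A set-of-positions representation is simple to read but memory-hungry for long transcripts; interval arithmetic would be the more scalable choice.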