189V Poster Online - Virtual Posters
Tuesday June 07, 11:00 AM - 3:00 PM

In-silico cross-contamination affects inference of genetic relationships in Saccharomyces cerevisiae


Authors:
Audrey Ward 1; Eduardo Scopel 2; Brent Shuman 3; Michelle Momany 3; Douda Bensasson 1,2,3

Affiliations:
1) University of Georgia; 2) Institute of Bioinformatics, University of Georgia, Athens, GA; 3) Department of Plant Biology, University of Georgia, Athens, GA

Keywords:
Phylogenetics, Macroevolution, and Biogeography

Population genetic analysis depends on the quality of whole-genome sequences. Contamination of sequence data may occur in vitro prior to sequencing, or in silico during multiplex sequencing as a result of cross-contamination or barcoding issues. Testing for interspecies contamination is common practice. In contrast, identifying and preventing within-species contamination is more difficult. To test the effects of contamination on genome analyses, we contaminated short-read genome data of Saccharomyces cerevisiae in silico with genome data from another S. cerevisiae strain. We repeated the contamination experiment using strains of varying relatedness and ploidy, and varied the level of cross-contamination from 0 to 50%. Using a standard base-calling pipeline, we found that cross-contaminated genomes appeared to produce good-quality genome-wide data. Past studies estimated relationships among S. cerevisiae lineages using maximum likelihood trees inferred from whole-genome data after excluding strains showing recent admixture. We similarly estimated trees that each include a single simulated cross-contaminated genome to assess whether within-species contamination affects the inference of genetic relationships. We found that between 5 and 10% contamination is enough to significantly change tree topologies, making contaminated strains look like hybrids in maximum likelihood trees. These results suggest that even low levels of contamination significantly alter trees and may lead to misunderstanding of evolutionary relationships within species.
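
The in silico contamination step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not say how reads were mixed, so the function name `contaminate`, the choice to hold the total read count constant, and the representation of reads as simple list items are all assumptions made here for illustration.

```python
import random


def contaminate(target_reads, contaminant_reads, fraction, seed=0):
    """Mix reads from two strains (hypothetical helper, not from the abstract).

    Returns a read set the same size as target_reads in which `fraction`
    of the reads are drawn from contaminant_reads. Keeping the total
    constant is an assumption, so that depth does not change with the
    contamination level.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    total = len(target_reads)
    n_contaminant = round(total * fraction)
    if n_contaminant > len(contaminant_reads):
        raise ValueError("not enough contaminant reads for this fraction")

    rng = random.Random(seed)  # fixed seed so a mixture can be reproduced
    mixed = rng.sample(target_reads, total - n_contaminant)
    mixed += rng.sample(contaminant_reads, n_contaminant)
    rng.shuffle(mixed)  # avoid all contaminant reads sitting at the end
    return mixed
```

Calling such a helper once per contamination level (for example 0, 0.05, 0.10 and so on up to 0.50) would give one simulated read set per level, each of which could then be passed to a variant-calling and tree-building workflow.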