391T Poster - Quantitative Genetics
Thursday June 09, 9:15 PM - 10:00 PM

Ped_slim: a family pedigree toolkit to investigate distant relative misidentification in long-range familial searching


Authors:
Joaquin Magana 1; Miguel Guardado 1,2; Shalom Gutierrez 1; Sthen Campana 1; Kaela Syas 1; Emily Samperio 1; Cynthia Perez 1; Berenice Chavez 1; Selena Hernandez 1; Rori Rohlfs 1

Affiliations:
1) San Francisco State University; 2) University of California San Francisco

Keywords:
Theory & Method Development

Investigative Genetic Genealogy (IGG) is a forensic technique used to identify a criminal suspect through their distant relatives, such as a third cousin. IGG uses autosomal DNA segments that are shared identical by descent (IBD) to infer genetic relationships. Studies of European-ancestry data have shown the power of this technique, with an estimated sixty percent of individuals in the United States identifiable through a third-cousin-or-closer match. However, less is known about IGG accuracy in other ancestral population groups. One reason is the lack of software for identifying complex family relationships and efficiently simulating genomes on family pedigrees. We developed ped_slim, a family pedigree software package that can be used to investigate the misidentification rate of IGG in different populations. Ped_slim is a Python command-line tool for interrogating family pedigrees that provides a simple interface for running complex pedigree-based genetic simulations with SLiM, a forward-time evolutionary simulation software. Ped_slim has three main features: (i) simulating family pedigree structures, (ii) simulating genomes on pedigree structures, and (iii) identifying the familial relationship between all pairs of individuals. For all three features, we represent family pedigrees as directed graphs in which nodes represent individuals and edges define parent-child relationships. To perform a genetic simulation, ped_slim converts the directed graph of the family pedigree into a file that SLiM can read and use to simulate the family's genomes. We validate our software by estimating kinship between pairs of genetic relatives in our simulations, initializing the family founders with individuals from the 1000 Genomes Project. We show that the kinship estimated from our simulations matches the value expected for each known genetic relationship.
Additionally, we showcase ped_slim's relationship identification feature by showing that estimated kinship decreases as the identified genetic relationship between a pair of individuals grows more distant. Finally, we use the pedigree simulation feature to compare standard familial statistics between simulated pedigrees and commonly used nuclear family pedigrees, showing the benefit of simulating non-nuclear families. Overall, ped_slim provides an open-source solution not only for investigating the accuracy of IGG, but also for genome simulation in medical, forensic, and evolutionary genetic analyses.
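
To illustrate the directed-graph view of a pedigree and the expected kinship values that simulated relatives can be checked against, the following is a minimal sketch. It is not ped_slim's code or interface: the pedigree, the individual names, and the functions `depth` and `kinship` are hypothetical and chosen only for illustration. The graph is stored as a child-to-parents mapping (parent-child edges read in reverse), and the expected kinship coefficient is computed with the standard recursive pedigree formula, assuming unrelated, non-inbred founders.

```python
from functools import lru_cache

# Hypothetical pedigree: each individual maps to its two parents,
# or to None if the individual is a founder (no parents in the pedigree).
PEDIGREE = {
    "gp1": None, "gp2": None,            # founding couple
    "s1": None, "s2": None,              # founders marrying in
    "s3": None, "s4": None,
    "p1": ("gp1", "gp2"), "p2": ("gp1", "gp2"),  # full siblings
    "c1": ("p1", "s1"), "c2": ("p2", "s2"),      # first cousins
    "g1": ("c1", "s3"), "g2": ("c2", "s4"),      # second cousins
}


@lru_cache(maxsize=None)
def depth(ind):
    """Generations separating `ind` from its most distant founder ancestor."""
    parents = PEDIGREE[ind]
    return 0 if parents is None else 1 + max(depth(p) for p in parents)


@lru_cache(maxsize=None)
def kinship(a, b):
    """Expected kinship coefficient between a and b given the pedigree.

    Assumes founders are unrelated to each other and not inbred.
    """
    if a == b:
        parents = PEDIGREE[a]
        if parents is None:
            return 0.5
        return 0.5 * (1.0 + kinship(*parents))
    # Recurse through the individual from the later generation, so that
    # we never expand an ancestor of the other individual first.
    if depth(a) < depth(b):
        a, b = b, a
    parents = PEDIGREE[a]
    if parents is None:  # two distinct founders
        return 0.0
    return 0.5 * (kinship(parents[0], b) + kinship(parents[1], b))


if __name__ == "__main__":
    for pair in [("gp1", "p1"), ("p1", "p2"), ("c1", "c2"), ("g1", "g2")]:
        print(pair, kinship(*pair))
```

In this toy pedigree the expected kinship falls from 1/4 for parent-child and full-sibling pairs to 1/16 for first cousins and 1/64 for second cousins, which is the kind of decline with relationship distance that the validation described above relies on.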