Title: Adventures with large biomedical datasets: diseases, medical records, environment and genetics
Speaker: Prof. Andrey Rzhetsky, University of Chicago
Time: 2018-12-11 14:00-2018-12-11 15:00
Venue: FIT 1-222

Abstract:

I will attempt to cover several interrelated analysis topics, spending more time on parts that resonate with the audience.

First, I will introduce our recent study analyzing phenotypic data harvested from over 150 million unique patients. Curiously, these non-genetic large-scale data can be used for genetic inferences. We discovered that complex diseases are associated with unique sets of rare Mendelian variants, referred to as the “Mendelian code.” We found that the genetic loci indicated by this code were enriched for common risk alleles. Moreover, we used probabilistic modeling to demonstrate for the first time that deleterious Mendelian variants likely contribute to complex disease risk in a non-additive fashion.

The second topic that I hope to cover is analysis of apparent clusters of neurodevelopmental disorders. Disease clusters are defined as geographically compact areas where a particular disease, such as a cancer, shows a significantly increased rate. It is presently unclear how common such clusters are for neurodevelopmental maladies, such as autism spectrum disorders (ASD) and intellectual disability (ID). As in the first story, examining data for one third of the whole US population, we demonstrated that (1) ASD and ID show strong clustering across US counties; (2) counties with high ASD rates also appear to have high ID rates; and (3) the spatial variation of both phenotypes appears to be driven by environment and, to a lesser extent, by economic incentives at the state level.

The third topic is about using electronic medical record data to (1) estimate the heritability and familial environmental patterns of diseases, and (2) infer the genetic and environmental correlations between disease pairs from a set of complex diseases. I am particularly interested in inferring objective classifications of diseases (based on a formal optimization criterion), separately from environmental and genetic factors.



Short Bio:

Andrey Rzhetsky is an Edna K. Papazian Professor of Medicine and Human Genetics at the University of Chicago. He is also a Pritzker Scholar and a Senior Fellow of both the Computation Institute and the Institute for Genomics and Systems Biology at the University of Chicago.

His research is focused on computational analysis of complex human phenotypes in the context of changes and perturbations of underlying molecular networks. The input data for these studies are supplied by large-scale mining of free text, computation over clinical records, and high-throughput systems biology experiments.