TRIPODS (Transdisciplinary Research in Principles of Data Science) Seminar Series

Predictability, Stability, and Causality with a Case Study to Find Genetic Drivers of a Heart Disease

April 15, 2022, 4:00 PM - 5:00 PM

Location:

Online Event

Bin Yu, University of California, Berkeley

“A.I. is like nuclear energy — both promising and dangerous” — Bill Gates, 2019.
Data science is a pillar of A.I. and has driven most of the recent cutting-edge discoveries in biomedical research and beyond. Human judgment calls are ubiquitous at every step of the data science life cycle, e.g., in choosing data cleaning methods, predictive algorithms, and data perturbations. Such judgment calls are often responsible for the “dangers” of A.I. To mitigate these dangers as far as possible, we developed a framework based on three core principles: Predictability, Computability, and Stability (PCS). The PCS framework unifies and expands on the best practices of machine learning and statistics. It consists of a workflow and documentation and is supported by our software package v-flow.
In this talk, we first illustrate the PCS framework through the development of iterative random forests (iRF) for predictable and stable non-linear interaction discovery (in collaboration with the Brown Lab at LBNL and Berkeley Statistics). We then turn to the search for genetic drivers of a heart disease called hypertrophic cardiomyopathy, a CZ Biohub project in collaboration with the Ashley Lab at Stanford Medical School and others, in which we use iRF and UK Biobank data to recommend gene-gene interaction targets for knock-down experiments. Finally, we analyze the experimental data to show promising findings.

Bio:
Bin Yu is Chancellor’s Distinguished Professor and Class of 1936 Second Chair in the departments of Statistics and EECS at UC Berkeley. She leads the Yu Group, which consists of students and postdocs from Statistics and EECS. She was formally trained as a statistician, but her research extends beyond the realm of statistics. Together with her group, she has leveraged new computational developments to solve important scientific problems by combining novel statistical machine learning approaches with the domain expertise of her many collaborators in neuroscience, genomics, and precision medicine. She and her team also develop theory to understand random forests and deep learning, for insight into and guidance for practice.
She is a member of the U.S. National Academy of Sciences and of the American Academy of Arts and Sciences. She is a Past President of the Institute of Mathematical Statistics (IMS), a Guggenheim Fellow, a Tukey Memorial Lecturer of the Bernoulli Society, a Rietz Lecturer of IMS, and a COPSS E. L. Scott prize winner. She holds an Honorary Doctorate from the University of Lausanne (UNIL), Faculty of Business and Economics, in Switzerland. She recently served on the inaugural scientific advisory committee of the UK Turing Institute for Data Science and AI, and is serving on the editorial board of the Proceedings of the National Academy of Sciences (PNAS).

Link to video: https://youtu.be/mY1dDreIG9w

 

Presented via Zoom: https://rutgers.zoom.us/j/94327509072

Meeting ID: 943 2750 9072

Password: 926448