Machine Learning for Healthcare 2016

Saban Research Institute

Program:

8:00 Breakfast

8:45 Welcome

9:00 Machine Learning Opportunities in the Explosion of Personalized Precision Medicine

Larry Smarr, PhD

We have reached the takeoff point in the generation of massive datasets from individuals and across populations, both of which are necessary for personalized precision medicine. I will give an example of my N=1 self-study, in which I have my human genome as well as multi-year time series of my gut microbiome genomics and over one hundred blood biomarkers. This is now being augmented with time series of my metabolome and immunome. These are then compared with hundreds of healthy people's gut microbiomes, revealing major shifts between health and disease. Multiple companies and organizations will soon be carrying out similar levels of analysis on hundreds of thousands of individuals. Machine learning techniques will be essential to bring the patterns out of these exponentially growing datasets.

9:45 Machine learning that matters in healthcare: breaking down the silos

Leo Celi, MD

Quality of care, as would be reflected by the universal provision of standardized, evidence-based, and truly indicated care, has not improved to the degree one would have hoped. Similarly, while patient safety and medical errors have come into public awareness, advances in these areas have been slow, hard-won, and unsupported by the kinds of smart, data-driven engineering designs that have gone into other domains. Interest in applying machine learning to clinical practice is increasing, yet the practical application of these techniques has been less than desirable. Clinicians continue to make determinations in a technically unsupported and unmonitored manner because they lack high-quality evidence or tools to support most day-to-day decisions. There is a persistent gap between the clinicians required to understand the context of the data and the engineers who are critical to extracting usable information from the increasing amount of healthcare data that is being generated. This talk focuses on the divide between the data science and healthcare silos, and posits that the lack of integration is the primary barrier to a data revolution in healthcare. I first discuss literature that supports the existence of this divide, and then I present recommendations on how to bridge the gap between practicing clinicians and data scientists.

10:30 Coffee Break and Discussion

10:45 Image-based Biomarkers and Prediction in Large Clinical Cohorts

Polina Golland, PhD

To take full advantage of clinically relevant information implicitly captured in medical images, we develop robust algorithms for quantifying disease burden from patient scans. We then demonstrate how genetic and clinical variables can be used to predict anatomy and anatomical change through a semi-parametric generative model. Joint modeling of image and genetic data promises to provide insights into genetic factors and anatomical effects of the disease. We demonstrate the promise of this approach on large collections of brain scans of different patient cohorts.

11:30 Comprehensive predictive modeling at the bedside

Randall Moorman, MD

Early Warning Scores and other forms of predictive modeling present clinicians with real-time estimates of the risks of imminent untoward events based on statistical models trained on legacy data sets. Nearly all such tools are based on static and intermittent data elements such as demographics, diagnoses, notes, vital sign measurements, and lab test results. Continuous physiological monitoring such as EKG telemetry is another potential source, and has the potential advantage of higher data coverage. It introduces a new step in the modeling process, though: time series analysis of cardiorespiratory dynamics to detect signatures of illness. The University of Virginia group has investigated comprehensive approaches to predictive modeling that use static, intermittent, and continuous data streams for early detection of subacute, potentially catastrophic illness in infants and adults, in ICUs and on hospital floors.

Saban Research Patio

12:15 LUNCH

13:30 Spotlight Talks A

14:20 Posters A

15:20 Spotlight Talks B

16:10 Posters B

17:15 Improving the design and discovery of dynamic treatment strategies using reinforcement learning

Joelle Pineau, PhD

Reinforcement learning offers a powerful paradigm for automatically discovering and optimizing sequential treatments for chronic and life-threatening diseases. This talk will introduce basics of reinforcement learning and then discuss several aspects of this work, including: How should we collect data to learn good sequential treatment strategies? How can we learn a representation of the data that allows generalization across patients? How can we use the data collected to discover sequential treatment strategies that are tailored to patient characteristics and time-dependent outcomes? The methods presented will be illustrated using results of our work on learning adaptive neurostimulation policies for the treatment of epilepsy.
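As a rough illustration of the reinforcement learning basics mentioned in this abstract, the sketch below runs tabular Q-learning on a toy treatment problem. It is not the speaker's method or the epilepsy model from the talk: the three severity states, the two actions (wait or treat), the transition probabilities, and the costs are all invented for illustration.

```python
import random

# Hypothetical toy problem (assumption, not from the talk):
# states 0..2 index increasing symptom severity; actions are 0 = wait, 1 = treat.
N_STATES, N_ACTIONS = 3, 2

def step(state, action, rng):
    """Return (next_state, reward) under the invented toy dynamics."""
    if action == 1:
        # Treating has a small cost but usually lowers severity.
        next_state = max(state - 1, 0) if rng.random() < 0.9 else state
        cost = 0.2
    else:
        # Waiting is free but severity usually rises.
        next_state = min(state + 1, N_STATES - 1) if rng.random() < 0.7 else state
        cost = 0.0
    # Reward penalizes severity of the resulting state plus the treatment cost.
    return next_state, -float(next_state) - cost

def q_learning(episodes=2000, horizon=20, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES)
        for _ in range(horizon):
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)  # explore
            else:
                action = max(range(N_ACTIONS), key=lambda a: q[state][a])  # exploit
            next_state, reward = step(state, action, rng)
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = q_learning()
# Greedy policy: the best-valued action in each severity state.
policy = [max(range(N_ACTIONS), key=lambda a: q_table[s][a]) for s in range(N_STATES)]
```

Under these invented dynamics the greedy policy is expected to choose treatment in every state, because waiting lets severity climb. A realistic treatment problem would need a learned state representation and careful data collection, which are the questions the abstract raises.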

Saban Research Patio

18:00 Dinner and Discussion

Day 2

Saban Research Institute

8:30 Breakfast

9:15 Processed data to derive clinically useful information

Michael Pinsky, MD and Artur Dubrawski, PhD

It is often difficult to accurately predict which patients will develop shock, when, and why, because signs of shock often appear late, when organ injury is already present. Three levels of aggregation of information can be used to aid the bedside clinician in this task: analyzing derived parameters of existing measured physiologic variables using simple bedside calculations (Functional Hemodynamic Monitoring); using prior physiologic data of similar subjects during periods of stability and disease to define quantitative metrics of severity; and using libraries of responses across large and comprehensive collections of records of diverse subjects whose diagnoses, therapies, and courses of treatment are already known, to predict not only disease severity but also the subsequent behavior of the subject if left untreated or treated with one of the many therapeutic options. A major pre-analysis problem is the cleaning of data to remove non-physiologic artifacts due to technical errors, which account for >70% of all clinical alerts. We have been developing algorithms that effectively isolate ~85% of all artifacts among alerts generated from physiologic time series of vital sign data. The next problem is to define the minimal monitoring data set needed to initially identify patients at risk across all possible processes and then specifically monitor their response to targeted therapies known to improve outcomes. To address these issues, we represented the vital sign data with highly multivariate feature sets and used machine learning algorithms to infer parsimonious predictive models for cardiorespiratory insufficiency. We describe the nature of the required data sets and modeling approaches used to detect, forecast, and track evolution of risk for this severe condition. These approaches jointly enable earlier identification of cardiorespiratory insufficiency and direct focused patient-specific management.

To validate our methodology, we used both a porcine model of hemorrhage and human vital sign data collected in a trauma step-down unit. Our results show the value of a truly multivariate fused approach over more traditional single vital sign thresholding for detection, and how it can also allow for reliable forecasting of cardiorespiratory insufficiency before its overt signs become apparent. Also, increasing the resolution of signal processing from mean data collected at regular intervals to beat-to-beat and waveform analysis progressively improves the predictive value of the fused parameters. In addition, we show that using personalized reference data can further improve detectability and predictability of cardiorespiratory insufficiency, if such data is available. Finally, we demonstrate that temporal evolution of risk for cardiorespiratory insufficiency is a heterogeneous yet systematic process. Most patients who develop this condition follow one of only a handful of typical risk evolution trajectories, and they can be assigned to their most likely trajectory type well ahead of the onset, thereby enabling further gains in predictability.

10:00 Clinical Abstract Talks and Software Demos

10:45 Clinical Abstract Posters

11:45 Culture Trumps Data

Bassam Kadry, MD

Why is it so hard to drive change in healthcare? The data, technology, and insights exist, yet it remains hard to move the needle in the right direction. What's the point of developing a technology if it is never going to be used because of business, cultural, or human-behavior challenges? Understanding these issues can help you have greater impact. Learn how to ask the right questions that will yield the greatest impact.

12:30 LUNCH

14:00 Panel Discussion

  • Randall Wetzel

  • Lee Hartsell

  • Suchi Saria

  • John Guttag

  • Nigam Shah

14:45 Electronic Health Record Analysis via Deep Poisson Factor Models

Lawrence Carin, PhD

Electronic Health Record (EHR) phenotyping utilizes patient data captured through normal medical practice to identify features that may represent computational medical phenotypes. These features may be used to identify at-risk patients and improve prediction of patient morbidity and mortality. We present a novel deep multi-modality architecture for EHR analysis (applicable to joint analysis of multiple forms of EHR data), based on Poisson Factor Analysis (PFA) modules. Each modality, composed of observed counts, is represented as a Poisson distribution, parameterized in terms of hidden binary units. Information from different modalities is shared via a deep hierarchy of common hidden units. To explore the utility of these models, we apply them to a subset of patients from the Duke-Durham patient cohort. We identified a cohort of over 12,000 patients with Type 2 Diabetes Mellitus (T2DM) based on diagnosis codes and laboratory tests out of our patient population of over 240,000. Examining the common hidden units uniting the PFA modules, we identify patient features that represent medical concepts. Experiments indicate that our learned features are better able to predict mortality and morbidity than clinical features identified previously in a large-scale clinical trial.
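To make the idea of Poisson-distributed counts driven by hidden binary units more concrete, here is a minimal single-layer sketch. It is not the authors' architecture: the rate formula (loadings times weight times binary unit), the two example modalities, and every number below are assumptions chosen for illustration, and the deep hierarchy described in the abstract is omitted.

```python
import math
import random

def sample_poisson(rate, rng):
    """Draw one Poisson sample with Knuth's multiplication method (suits small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def pfa_rates(loadings, weights, hidden):
    """Assumed parameterization: rate[v] = sum_k loadings[v][k] * weights[k] * hidden[k]."""
    return [sum(l * w * h for l, w, h in zip(row, weights, hidden)) for row in loadings]

# Hypothetical shared hidden state: factor 0 is switched on, factor 1 is off.
hidden = [1, 0]
weights = [1.5, 2.0]  # invented non-negative factor scores

# Two invented modalities (rows = observed count variables, columns = factors).
loadings_dx = [[2.0, 0.0], [0.5, 1.0], [0.0, 3.0]]   # e.g. diagnosis-code counts
loadings_lab = [[1.0, 0.0], [0.0, 4.0]]              # e.g. lab-test counts

rng = random.Random(0)
rates_dx = pfa_rates(loadings_dx, weights, hidden)
rates_lab = pfa_rates(loadings_lab, weights, hidden)

# Both modalities are generated from the same binary units, which is how
# information is shared between them in this simplified picture.
counts_dx = [sample_poisson(r, rng) for r in rates_dx]
counts_lab = [sample_poisson(r, rng) for r in rates_lab]
```

Because factor 1 is switched off, any count variable that loads only on factor 1 has rate zero and is always sampled as zero. Fitting such a model means inferring the hidden units and loadings from observed counts, which this sketch does not attempt.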

15:30 A perspective on Machine Learning in Pediatric Intensive Care

The Laura P. and Leland K. Whittier Virtual PICU

The Laura P. and Leland K. Whittier Virtual Pediatric Intensive Care Unit (VPICU) is a team of doctors, machine learners, and engineers committed to developing real-time clinical decision support for the pediatric ICU. We will discuss our perspective on what needs exist in the ICU and how machine learning can meet these needs. We will highlight some of our recent machine learning work that aims to enable solutions to those needs.

16:00 Closing Remarks

16:15 Feedback Discussion Session

Invited Speakers

Bassam Kadry, MD Clinical Assistant Professor, Anesthesiology, Perioperative and Pain Medicine Stanford Medicine

Larry Smarr, PhD Professor of Computer Science and Information Technologies University of California, San Diego

Lawrence Carin, PhD Professor of Electrical & Computer Engineering Duke University

Polina Golland, PhD Professor of Electrical Engineering and Computer Science Massachusetts Institute of Technology

Joelle Pineau, PhD Associate Professor, School of Computer Science McGill University

Randall Moorman, MD Professor of Medicine, Biomedical Engineering and Molecular Physiology and Biological Physics University of Virginia

Leo Celi, MD Assistant Professor of Medicine Beth Israel Deaconess Medical Center

Michael Pinsky, MD Professor of Critical Care Medicine University of Pittsburgh

Artur Dubrawski, PhD Senior Systems Scientist, Robotics Institute Carnegie Mellon University

Program Chairs

Finale Doshi, PhD Assistant Professor in Computer Science, Harvard School of Engineering and Applied Sciences

James C. Fackler, MD Associate Professor Departments of Anesthesiology/Critical Care Medicine and Pediatrics Johns Hopkins University School of Medicine

David Kale PhD Student, Computer Science, Viterbi Dean's Doctoral Fellow, and Alfred E. Mann Innovation in Engineering Fellow at the University of Southern California

Byron Wallace, PhD Assistant Professor at the University of Texas at Austin

Jenna Wiens, PhD Assistant Professor of Computer Science and Engineering (CSE) at the University of Michigan

Senior Advisory Committee:

Carla Brodley, PhD Dean of the College of Computer and Information Science, Northeastern University

Michael Brudno, PhD Associate Professor and Canada Research Chair in Computational Biology, University of Toronto

Gari D. Clifford, PhD Associate Professor, Biomedical Informatics Emory University

Noémie Elhadad, PhD Associate Professor of Biomedical Informatics, Affiliated with Computer Science, Columbia University

Deborah Estrin, PhD Professor of Computer Science at Cornell Tech in New York City and a Professor of Public Health at Weill Cornell Medical College

Joydeep Ghosh, PhD Schlumberger Centennial Chair Professor of Electrical and Computer Engineering at The University of Texas at Austin

Russell Greiner, PhD Professor of Computer Science at the University of Alberta

John Guttag, PhD Dugald C. Jackson Professor MIT Department of Electrical Engineering and Computer Science

Milos Hauskrecht, PhD Professor of Computer Science, University of Pittsburgh

Eric Horvitz, PhD Technical Fellow and Managing Director, Microsoft Research

Isaac Kohane, MD, PhD Lawrence J. Henderson Professor of Pediatrics, Boston Children's Hospital

Roger Mark, MD, PhD HST Faculty, Distinguished Professor in Health Sciences and Technology and Electrical Engineering and Computer Science, Massachusetts Institute of Technology

J. Randall Moorman, MD Professor of Medicine, Biomedical Engineering and Molecular Physiology and Biological Physics

Raymond T. Ng, PhD Professor of Computer Science at the University of British Columbia

John Quinn, PhD Senior Lecturer in Computer Science at Makerere University

Christian Shelton, PhD Associate Professor at UC Riverside's Computer Science Department

Peter Szolovits, PhD Professor of Computer Science and Engineering in the MIT Department of Electrical Engineering and Computer Science

Nigam H. Shah, MBBS, PhD Associate Professor, Medicine - Biomedical Informatics Research, Stanford University

Mark S Wainwright, MD, PhD Founder’s Board Chair of Neurocritical Care, Professor in Pediatrics-Neurology, Neurology - Ken and Ruth Davee Department and Pharmacology, Northwestern

Randall Wetzel, MD Chairman, Department of Anesthesiology Critical Care Medicine - Children's Hospital Los Angeles

Chris Williams, PhD Professor of Machine Learning, School of Informatics, University of Edinburgh

Accepted Papers

Input-Output Non-Linear Dynamical Systems applied to Physiological Condition Monitoring

Konstantinos Georgatzis, Chris Williams, and Christopher Hawthorne, University of Edinburgh

Preterm Birth Prediction: Stable Selection of Interpretable Rules from High Dimensional Data

Truyen Tran, Wei Luo, and Dinh Phung, Deakin University; Jonathan Morris and Kristen Rickard, University of Sydney; Svetha Venkatesh, Deakin University

Mitochondria-based Renal Cell Carcinoma Subtyping: Learning from Deep vs. Flat Feature Representations

Peter Schüffler and Judy Sarungbam, Memorial Sloan Kettering Cancer Center; Hassan Muhammad, Weill Cornell Medical College; Ed Reznik, Satish Tickoo, and Thomas Fuchs, Memorial Sloan Kettering Cancer Center

Multi-task Learning with Weak Class Labels: Leveraging iEEG to Detect Cortical Lesions in Cryptogenic Epilepsy

Bilal Ahmed, Tufts; Thomas Thesen and Karen Blackmon, NYU; Carla Brodley, Northeastern

Doctor AI: Predicting Clinical Events via Recurrent Neural Networks

Edward Choi and Mohammad Taha Bahadori, Georgia Tech; Andy Schuetz and Walter Stewart, Sutter Health; Jimeng Sun, Georgia Tech

Diagnostic Prediction Using Discomfort Drawing with IBTM

Cheng Zhang, KTH Royal Institute of Technology; Hedvig Kjellström, KTH Sweden; Carl Henrik Ek, Bristol University; Bo Bertilson, KI Karolinska Institutet

Learning Robust Features using Deep Learning for Automatic Seizure Detection

Pierre Thodoroff and Joelle Pineau, McGill University

Using Kernel Methods and Model Selection for Prediction of Preterm Birth

Ansaf Salleb-Aouissi, Columbia University; Anita Raja, Cooper Union; Ronald Wapner, Columbia Medical Center

gLOP: the global and Local Penalty for Capturing Predictive Heterogeneity

Rhiannon Rose and Daniel Lizotte, Western University

Uncovering Voice Misuse Using Symbolic Mismatch

Marzyeh Ghassemi, MIT; Zeeshan Syed, University of Michigan; Daryush Mehta, Jarrad Van Stan, and Robert Hillman, Massachusetts General; John Guttag, MIT

Identifiable Phenotyping using Constrained Non-Negative Matrix Factorization

Shalmali Joshi, Suriya Gunasekar, and Joydeep Ghosh, UT Austin; David Sontag, NYU

Transferring Knowledge from Text to Predict Disease Onset

Yun Liu, MIT; Kun-Ta Chuang, Fu-Wen Liang, and Huey-Jen Su, National Cheng Kung University; Collin Stultz and John Guttag, MIT

Scalable Modeling of Multivariate Longitudinal Data for Prediction of Chronic Kidney Disease Progression

Joseph Futoma, Blake Cameron, Mark Sendak, and Katherine Heller, Duke University

Directly Modeling Missing Data in Sequences with RNNs: Improved Classification of Clinical Time Series

Zachary Lipton, UC San Diego; David Kale, USC Information Sciences Institute; Randall Wetzel, Children's Hospital LA

Deep Survival Analysis

Rajesh Ranganath, Princeton University; Adler Perotte, Noémie Elhadad, and David Blei, Columbia University

Deep Convolutional Neural Networks for Microscopy-Based Point of Care Diagnostics

Alfred Adama, Pius Mugagga, Rose Nakasi, and John Quinn, Makerere University

Clinical Tagging with Joint Probabilistic Models

Yoni Halpern, NYU; Steven Horng, Beth Israel Deaconess Medical Center; David Sontag, NYU

Multi-task Prediction of Disease Onsets from Longitudinal Laboratory Tests

Narges Razavian, Jake Marcus, and David Sontag, NYU

A Non-parametric Bayesian Approach for Estimating Treatment-Response Curves from Sparse Time Series

Yanbo Xu, Suchi Saria, and Yanxun Xu, Johns Hopkins University

Accepted Clinical Podium Abstracts

Demonstration of a Chronic Kidney Disease Population Rounding Tool

Mark Sendak, Duke Institute for Health Innovation; Faraz Yashar and Lance Co Ting Keh, Ephori LLC; Blake Cameron, Joseph Futoma, Katherine Heller, and Uptal Patel, Duke

Precision Medicine in Point-of-Care Management of Surgical Complications

Zhifei Sun, Elizabeth Lorenzi, Ouwen Huang, Thomas Li, Christopher Mantyh, Katherine Heller, and Erich Huang, Duke

Performing an informatics consult

Nigam Shah, Stanford Center for Biomedical Informatics Research

MS Mosaic: Mobile technology and machine learning for multiple sclerosis research and patient care

Lee Hartsell and Katherine Heller, Duke University

Care Coordination using practice based evidences

Adrish Sannyasi, Splunk; Daniella Meeker, USC Keck School of Medicine

Same Decision Probability in Neurocritical Care

Fabien Scalzo, Arthur Choi, and Adnan Darwiche, UCLA

Patient Identification Using Plethysmography Structure Analysis

Jennifer Laine, Yale University

Real-time Detection and Exploratory Discovery of Anomalies for Pediatric Ventilator Management

Tanachat Nilanon and Yan Liu, USC; Justin Hotz and Robinder Khemani, Children's Hospital LA

