Today, we’re using machine learning across health care and the life sciences to analyze large amounts of information and to make predictions. The models we create are trained on existing data with any number of fields, from a person’s age, race, and health symptoms to a product’s features and price. After we build models and algorithms and crunch the data, we’re left with our machines’ best guesses. But how do we determine whether our data, algorithms, and results are correct, or whether they’re biased? And if we believe what we have is incorrect, what responsibility do we have to “correct” our models? How good is good enough?
UPMC Enterprises recently hosted a Pittsburgh Entrepreneurs Forum panel discussion that answered these questions and more. The discussion was moderated by the new Director of Business Development and Marketing at Accion Labs, Christopher Evans. Panelists included:
Keith Callenberg, PhD: Director, Machine Learning at UPMC Enterprises
Norma Nieto: Principal/Owner at N-Squared Strategic Consulting
Srinivasan Suresh, MD: Chief Medical Information Officer, Children’s Hospital of Pittsburgh (UPMC)
Don Taylor, PhD: Assistant Vice Chancellor, Health Sciences Translation, University of Pittsburgh
Mr. Evans kicked off the event by asking what each panelist’s organization was currently doing with machine learning. Answers ranged from optimizing the performance of athletes, to predicting patient readmission rates, to risk detection in electronic medical records (EMRs). All of the panelists agreed that machine learning is playing a significant role in health care and that algorithms are changing the day-to-day lives of engineers, researchers, physicians, and patients.
The conversation then turned to the event’s title: How do we hold machines accountable? How do we keep machine learning from being too much of a black box? Dr. Suresh took the question head-on and remarked, “We cannot hold machines accountable. But we need to hold humans accountable.” He went on to speak about the “black box” and how important it is to know what information we are feeding these models, how the models perform their analysis, and how clinicians can apply the information these advanced techniques generate.
The panel expanded on the topic and put the spotlight on data completeness: whether the data sets used to train a model actually cover the populations and conditions the model will be applied to. A data set can be incomplete in many ways, such as including only certain types of diseases, only certain races or sexes, or only certain age groups.
Ms. Nieto noted that if an algorithm is trained on data from only certain patients, its output won’t necessarily be relevant to patients who don’t resemble that original data set. All of the panelists agreed that relying on the outputs of incomplete data sets can have drastic effects, and that gathering the best possible data before applying algorithms is therefore crucial.
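One simple way to act on the panel’s point is to audit a data set for subgroup representation before training on it. The sketch below is a toy illustration, not anything the panelists prescribed: the field name and the 5% threshold are made-up assumptions, and a real audit would cover many fields and clinically meaningful cutoffs.

```python
from collections import Counter

def audit_completeness(records, field, min_share=0.05):
    """For one field, return each group's share of the data set and
    whether it falls below a chosen representation threshold."""
    counts = Counter(r[field] for r in records)
    total = sum(counts.values())
    return {group: (n / total, n / total < min_share)
            for group, n in counts.items()}

# Hypothetical patient records: 90% F, 8% M, 2% unknown.
records = ([{"sex": "F"}] * 90 + [{"sex": "M"}] * 8
           + [{"sex": "unknown"}] * 2)
audit = audit_completeness(records, "sex")
# → {'F': (0.9, False), 'M': (0.08, False), 'unknown': (0.02, True)}
```

A flagged group (here, the 2% “unknown” slice) signals that the model’s predictions for that group rest on very little evidence, which is exactly the relevance problem Ms. Nieto described.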
Dr. Callenberg brought up a common practice for hedging against bias: acquiring more data. He referenced a paper by David Sontag arguing that collecting additional data is superior to constraining models. Dr. Callenberg added that acquiring extra data, especially in a health care setting, is neither easy nor inexpensive, but that it is the most ethical route to avoiding prediction error.
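The intuition behind “collect more data rather than constrain the model” can be shown with a deliberately tiny example. Everything below is synthetic: the numbers are made up and the “model” is just a group mean, not anything from the cited paper or the panel.

```python
def mean(xs):
    return sum(xs) / len(xs)

# Synthetic resting heart rates (made-up values, not clinical data).
adult_hr = [68, 72, 70, 75, 65]
child_hr = [95, 100, 105, 98, 102]

# A trivial "model": predict the mean of whatever data it was trained on.
model_adult_only = mean(adult_hr)            # trained on adults only
model_combined = mean(adult_hr + child_hr)   # trained after adding pediatric data

def error(prediction, xs):
    """Mean absolute error of a constant prediction on a group."""
    return mean([abs(prediction - x) for x in xs])

# The adult-only model errs badly on children; adding their data helps.
adult_only_err = error(model_adult_only, child_hr)
combined_err = error(model_combined, child_hr)
```

Here the adult-only model misses pediatric patients by about 30 beats per minute, while the model refit on the broader data set roughly halves that error. No amount of tweaking the adult-only model fixes the problem; the missing group’s data does.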
The panel went further into the consequences of machine learning, as well as data collection with wearables, from watches to sensors. One of Mr. Evans’ last questions asked about the future of machine learning in health care. Dr. Taylor felt that the future is here and now, especially given what he has helped advance at the University of Pittsburgh and with the Pittsburgh Health Data Alliance (a collaboration across Pitt, CMU, and UPMC). “Machine learning is essential to interrogating biological data to uncover new molecular mechanisms of intractable diseases such as Alzheimer’s or congestive heart failure. Through our relationship with UPMC, our basic and translational researchers at Pitt and CMU have access to vast clinical datasets enriched across genomics, metabolomics, radiomics, and more. These unique data coupled with our hypothesis-driven translational science will transform the practice of medicine with Pittsburgh leading the way.”
Dr. Suresh concluded the panel by reiterating the complexity of machine learning. “In order for algorithms to be trusted by physicians and other clinical providers so that they can be incorporated into direct patient care, the data and processes need to be transparent. Engineers and technologists must continue to convince and provide evidence to physicians that their machine learning models are unbiased, accurate, timely and relevant. While machine learning and AI certainly have a bright future in advancing health care, it is key that physicians are engaged early and are part of the solutions team.”