Summary: Recent advances in artificial intelligence and data science highlight innovative machine learning models that predict patient outcomes in multiple myeloma and sepsis.
Takeaways:
- Precision Medicine for Multiple Myeloma: Researchers from the NIH’s All of Us Research Program developed machine learning models using the Synthetic Minority Over-Sampling Technique (SMOTE) to improve prediction accuracy for minority groups with multiple myeloma, addressing disparities in healthcare outcomes across diverse demographic groups.
- Sepsis Risk Prediction: A machine learning model uses routine blood test data to predict sepsis risk with 99% accuracy up to one week before hospital admission, providing a critical tool for early detection and intervention to prevent sepsis-related deaths.
- Broader Applications in Healthcare: The methodologies used in these models, particularly in balancing imbalanced datasets and integrating diverse health data, demonstrate potential applications across various health conditions, including Alzheimer’s, cardiovascular disease, and mental health, advancing the field of precision medicine.
New research in artificial intelligence and data science in laboratory medicine was presented at ADLM 2024. One study leveraged a National Institutes of Health (NIH) research cohort and a machine learning model to predict outcomes for patients with multiple myeloma, and another introduced a model that could help to lower worldwide mortality rates from sepsis.
Advancing Precision Medicine for Multiple Myeloma
Diagnosing and monitoring the progression of the blood cancer multiple myeloma involves many factors and is complicated by disparities across demographic groups, as well as imbalanced datasets. To better predict outcomes for patients with multiple myeloma, a team of researchers from the NIH’s All of Us Research Program, led by Thomas Houze, PhD, MSIE, MBA, developed machine learning models tailored for different demographic groups diagnosed with multiple myeloma.
The All of Us Research Program is an NIH initiative that seeks to collect and study the health data of 1 million or more people living in the U.S. Because the All of Us database contains a wide range of participants from diverse backgrounds, it is an extremely valuable resource for training a machine learning model to make precise, individualized predictions for patients with a complex disease like multiple myeloma.
When developing their model, the NIH researchers employed the Synthetic Minority Over-Sampling Technique (SMOTE), a machine learning technique that addresses imbalanced datasets by generating synthetic examples for underrepresented groups. This helps the model make useful and accurate predictions for smaller groups in the database. Without SMOTE, “the larger datasets would dominate the signal, so you get very good predictions for people of European genetic ancestry, but very poor predictions for people of Asian or African genetic ancestry,” Houze says. “This was something that I thought needed to be done and it’s a very recent capability in this field.”
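To make the idea concrete, here is a minimal sketch of the core SMOTE step: each synthetic record is placed at a random point on the line between an underrepresented sample and one of its nearest neighbors from the same group. This is an illustration only, not the researchers' implementation. The function name `smote`, its parameters, and the two-feature toy data are all assumptions made for this example.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbors."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority-class neighbors of base, excluding base itself.
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy, made-up data (not from All of Us): 8 majority rows, 3 minority rows.
majority = [(float(i), float(i % 3)) for i in range(8)]
minority = [(10.0, 10.0), (11.0, 10.5), (10.5, 11.5)]

# Generate enough synthetic minority rows to match the majority count.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice, a maintained library implementation would normally be used rather than a hand-written one, and oversampling would be applied only to training data so that evaluation reflects real records.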
Applying SMOTE to the data resulted in significant improvements in prediction accuracy for minority groups within the multiple myeloma patient population, the researchers found, which in turn could improve care for these groups. The technique may also enable precision medicine in areas beyond oncology, according to Houze.
“Once you get this methodology working with our data, you can apply it to Alzheimer’s, cardiovascular disease, mental health, and other areas,” Houze says.
Predicting Sepsis Risk with Machine Learning
Sepsis is a major global health concern. It is responsible for approximately 11 million deaths annually and is the leading cause of hospital readmissions and mortality worldwide. Early diagnosis and appropriate treatment could prevent 80% of sepsis-related deaths, but the majority of sepsis cases occur outside the hospital, making timely detection challenging.
Using data from more than 25,000 sepsis and non-sepsis cases, Raj Gopalan, MD, MSIS, of BSRM Consulting created a machine learning model to identify a patient’s risk of developing sepsis up to one week before hospital admission. The model’s input parameters include age, gender, and data from routine blood tests, such as complete blood counts, differential counts, comprehensive metabolic panels, and lipid panels, that were recorded up to one week before sepsis diagnosis. The model demonstrated 99% accuracy in predicting sepsis risk and identified calcium, protein, liver enzymes, hematocrit, white blood cells, and cholesterol as key contributors to sepsis risk prediction.
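For readers less familiar with the metric, accuracy is the share of all cases, sepsis and non-sepsis, that a model classifies correctly. The sketch below shows how accuracy and two related measures are computed from a confusion matrix. The function name `classification_metrics` and the counts are hypothetical and chosen only for illustration; they are not results from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute basic metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total,
        # Share of true sepsis cases the model flags.
        "sensitivity": tp / (tp + fn),
        # Share of non-sepsis cases the model correctly clears.
        "specificity": tn / (tn + fp),
    }


# Made-up counts for illustration only; not data from the study.
metrics = classification_metrics(tp=90, fp=5, tn=895, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```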
“Sepsis is difficult to diagnose because the symptoms develop rapidly and it often mimics other infectious conditions,” Gopalan says. “The model can synthesize a vast amount of patient lab test results to make predictions.”
Going forward, this model might be used alongside other similar models to make accurate predictions across various health conditions, according to Gopalan.
“As soon as you receive blood test results, they can be processed through various cancer and chronic disease models — not just one, but 40 or 50 — providing insights into a patient’s risk levels. This allows for additional, specific testing to confirm or rule out any risks associated with these conditions,” he says.
Having spent 52 years in the Laboratory, 44 of them at Director level, I can see a big role for AI in the Lab. We have manually applied predictive and rule-out models for many years. AI will take this to the next level. My hope is that it will help alleviate some of the pressure on personnel due to staff shortages and associated burnout.
It will also benefit Hospitalists and other providers, some of whom are challenged with interpreting laboratory data in terms of diagnostic and prognostic indications.