One-sentence summary:
Combining artificial intelligence (AI) with mathematical modeling and ethical data sharing practices is essential for accurate, reproducible, and patient-specific cancer treatment in predictive medicine.

Three brief takeaways:

  1. AI alone isn’t enough — Mathematical modeling adds critical biological insights that AI may miss, especially when data is sparse.
  2. Reproducibility matters — Open science and transparent methodologies are vital to ensure scientific findings can be trusted and verified.
  3. Ethical data sharing is key — Protecting patient privacy while enabling broad, secure access to data supports better research and healthcare outcomes.

With the advent of artificial intelligence (AI), predictive medicine is becoming an important part of healthcare, especially in cancer treatment. Predictive medicine uses algorithms and data to help doctors understand how a cancer might continue to grow or react to specific drugs, making it easier to tailor precise treatment to individual patients.

AI Should Not Be Relied on Exclusively

While AI is important in this work, researchers from the University of Maryland School of Medicine (UMSOM) say that it should not be relied on exclusively. Instead, AI should be combined with other methods, such as traditional mathematical modeling, for the best outcomes.

In a commentary published in Nature Biotechnology, Elana Fertig, PhD, Director of the Institute for Genome Sciences (IGS) and Professor of Medicine at UMSOM, and Daniel Bergman, PhD, an IGS scientist, argue that mathematical modeling has been underestimated and underused in precision medicine to date.

All computational health models need three key components to work: datasets, equations, and software. Once data has been generated, the next step is to use it to improve early diagnosis, discover new treatments, and better understand disease.

In a second commentary, in Cell Reports Medicine, Fertig and IGS colleagues Dmitrijs Lvovs, PhD, Anup Mahurkar, PhD, and Owen White, PhD, address how to ethically share health data and methods to create reproducible science.

Taken together, the two commentaries set a foundational approach to generating, analyzing, and ethically sharing data to benefit both patients and science.

Explaining the argument of the Nature Biotechnology commentary, Fertig says: “AI and mathematical models differ dramatically in how they arrive at an outcome. AI models first must be trained with existing data to make an outcome prediction, while mathematical models are directed to answer a specific question using both data and biological knowledge.”

That means that when data is sparse, as it often is in newer cancer treatments such as immunotherapy, AI can overgeneralize, resulting in biased or inaccurate outcomes that cannot be reproduced by other scientists. Mathematical modeling, on the other hand, uses known biological mechanisms, learned from scientific experiments, to explain how it arrived at an outcome.

“For example, with a mathematical model, we could create virtual cancer cells and healthy cells and write a program that would mimic how those cells interact and evolve inside of a tumor with different types of treatments,” says Bergman, assistant professor at IGS and UMSOM’s Department of Pharmacology, Physiology, and Drug Development. “At this time, AI cannot give us that type of specificity.”
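As a rough illustration of the kind of mechanistic model Bergman describes, the sketch below simulates tumor and healthy cell populations competing for shared space, with a treatment that kills tumor cells at a chosen rate. It is a toy example, not the authors' model: the equations, the function name `simulate`, and every parameter value are assumptions made for illustration.

```python
def simulate(days, kill_rate, dt=0.01):
    """Euler-integrate a toy two-population competition model.

    Populations are fractions of a shared carrying capacity.
    All rates and starting values are illustrative assumptions,
    not values fitted to any real dataset.
    """
    tumor, healthy = 0.1, 0.6                 # assumed starting fractions
    growth_tumor, growth_healthy = 0.5, 0.2   # assumed per-day growth rates
    steps = int(round(days / dt))
    for _ in range(steps):
        # Both cell types slow down as the shared space fills up.
        crowding = 1.0 - (tumor + healthy)
        d_tumor = growth_tumor * tumor * crowding - kill_rate * tumor
        d_healthy = growth_healthy * healthy * crowding
        tumor = max(tumor + dt * d_tumor, 0.0)
        healthy = max(healthy + dt * d_healthy, 0.0)
    return tumor, healthy


# Compare an untreated tumor with a hypothetical treatment.
untreated = simulate(days=60, kill_rate=0.0)
treated = simulate(days=60, kill_rate=0.4)
print("untreated (tumor, healthy):", untreated)
print("treated   (tumor, healthy):", treated)
```

Because each term in the equations stands for a named biological assumption (growth, crowding, drug-induced death), a reader can trace why the treated and untreated runs diverge, which is the explanatory property the commentary attributes to mathematical models.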

Broader Data Ups Accuracy

The authors state that, in addition to using both types of models in “computational immunotherapy,” drawing on a breadth of populations and making datasets publicly available are critical for the most accurate outcomes.

“Data breadth and accuracy are key. Artifacts in the dataset, or even a simple typo in computer code, can throw off the accuracy of either type of model,” adds Fertig. “Therefore, for any analysis pipeline to work correctly, it must be reproducible and that can only be assured by open science—giving access to other researchers whose work can confirm the models will get the right treatment to the right patient.”

However, reproducibility remains a critical challenge in science. In a 2016 article in Nature surveying more than 1,500 scientists, more than 70% of researchers said they had tried and failed to reproduce another scientist’s experiments, and more than half had failed to reproduce their own experiments.

“Reproducible research enables investigators to verify that the findings are accurate, reduce biases, promote scientific integrity, and build trust,” explains Dmitrijs Lvovs, PhD, research associate at IGS and first author on the Cell Reports Medicine commentary. “Because data science is computationally driven, all results should be transparent and automatically reproducible from the same dataset if the analysis code is readily available through open science.”
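One way to picture what “automatically reproducible from the same dataset” can mean in practice is sketched below. It is a minimal example under stated assumptions, not a method from the commentary: the function names, the toy records, and the choice of a bootstrap estimate are all hypothetical. The idea is to record a fingerprint of the input data and the random seed alongside the result, so that anyone with the same data and code can confirm they get the same answer.

```python
import hashlib
import json
import random


def dataset_fingerprint(records):
    """Return a SHA-256 hash of a canonical JSON encoding of the records."""
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def bootstrap_mean(values, n_resamples, seed):
    """Average of bootstrap resample means, using a private seeded generator."""
    rng = random.Random(seed)  # local generator, so global state is untouched
    means = []
    for _ in range(n_resamples):
        sample = [rng.choice(values) for _ in values]
        means.append(sum(sample) / len(sample))
    return sum(means) / len(means)


def run_analysis(records, seed=2024):
    """Run the toy analysis and report what is needed to repeat it."""
    values = [r["response"] for r in records]
    return {
        "data_sha256": dataset_fingerprint(records),
        "seed": seed,
        "estimate": bootstrap_mean(values, n_resamples=200, seed=seed),
    }


# Hypothetical toy records; not real patient data.
records = [{"id": i, "response": v}
           for i, v in enumerate([0.2, 0.5, 0.9, 0.4, 0.7])]
first = run_analysis(records)
second = run_analysis(records)
print(first == second)
```

If the data changes by even one value, the fingerprint changes too, which makes it easy to tell whether a differing result comes from the data or from the analysis code.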

While that sounds simple enough, and there are best practices in place, the challenge, the authors argue, is how to share data while protecting patient privacy and blocking unauthorized data breaches. Genomic data, when combined with personal health information (PHI), could lead to re-identification of patients, a privacy violation.

The authors say that creating ethical open science data sharing means: 1. Getting detailed informed consent from patients; 2. Ensuring data quality when collecting and processing data by mitigating errors; 3. Harmonizing and standardizing data collected from disparate sources; 4. Using and creating resources and platforms, such as multiomic, clinical, public health, and drug discovery repositories; and 5. Working with vetted pipelines, such as open source analysis tools and software platforms.

“Ethical and responsible data sharing democratizes research, supports the advancement of AI, and informs public health policies,” says Lvovs. “With ethical and responsible data sharing, the biomedical research community can maximize the benefits of shared data, accelerate discovery, and improve human health.”

Featured Image: In two commentaries published this week, Elana Fertig, PhD, says AI should be combined with other methods to find cancer treatments, and that the data should be shared ethically and reproducibly. Image: University of Maryland School of Medicine