Mount Sinai researchers develop machine learning approach that analyzes common clinical data to determine penetrance of rare genetic variants.


Researchers at the Icahn School of Medicine at Mount Sinai have developed an artificial intelligence model that uses routine laboratory tests to predict whether patients with rare genetic variants will actually develop disease, addressing a longstanding challenge in clinical genetics.

The new approach, detailed in the Aug 28 online issue of Science, combines machine learning with electronic health record data from more than 1 million patients to generate “ML penetrance” scores for genetic variants. The method relies on common lab tests, such as cholesterol levels, blood counts, and kidney function markers, that are already part of most medical records.

“We wanted to move beyond black-and-white answers that often leave patients and providers uncertain about what a genetic test result actually means,” says Ron Do, PhD, senior study author and the Charles Bronfman Professor in Personalized Medicine at the Icahn School of Medicine at Mount Sinai, in a release. “By using artificial intelligence and real-world lab data, such as cholesterol levels or blood counts that are already part of most medical records, we can now better estimate how likely disease will develop in an individual with a specific genetic variant.”

Addressing Clinical Uncertainty in Genetic Testing

The research addresses a gap in how genetic test results are interpreted. When testing reveals a rare DNA mutation, clinicians and patients often lack clear guidance on what the finding means clinically. Traditional genetic studies typically rely on binary disease classifications, in which a person either has a disease or does not, but many conditions exist on a spectrum.

The Mount Sinai team built AI models for 10 common diseases and applied them to people with known rare genetic variants. The system generates scores between 0 and 1, with higher scores indicating variants more likely to contribute to disease development and lower scores suggesting minimal risk. The researchers calculated ML penetrance scores for more than 1,600 genetic variants.
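The article does not describe how the scores are computed. As a rough illustration only, the Python sketch below assumes each carrier of a variant already has a model-predicted disease probability between 0 and 1, and it summarizes a variant by averaging those probabilities across its carriers. The function name, variant labels, and numbers are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from statistics import mean


def variant_scores(carriers):
    """Average per-carrier disease probabilities for each variant.

    carriers: iterable of (variant_id, predicted_probability) pairs.
    Averaging is an assumption made for this sketch, not the
    published method.
    """
    by_variant = defaultdict(list)
    for variant_id, prob in carriers:
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"probability out of range: {prob}")
        by_variant[variant_id].append(prob)
    # The mean of values in [0, 1] also falls in [0, 1].
    return {v: mean(probs) for v, probs in by_variant.items()}


# Made-up carriers and probabilities, for illustration only.
example = [
    ("variant_A", 0.75),
    ("variant_A", 0.25),
    ("variant_B", 0.5),
]
scores = variant_scores(example)
```

Under this assumption, a variant whose carriers mostly receive high predicted probabilities would score near 1, and one whose carriers look healthy in their lab data would score near 0.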

“While our AI model is not meant to replace clinical judgment, it can potentially serve as an important guide, especially when test results are unclear,” says lead study author Iain S Forrest, MD, PhD, in the lab of Dr Do at the Icahn School of Medicine at Mount Sinai, in a release. “Doctors could in the future use the ML penetrance score to decide whether patients should receive earlier screenings or take preventive steps, or to avoid unnecessary worry or intervention if the variant is low-risk.”

Unexpected Findings Challenge Current Classifications

The study revealed surprising results that challenge existing genetic variant classifications. Some variants previously labeled as “uncertain” showed clear disease signals in the AI analysis, while others thought to cause disease had little effect in real-world data.

For clinical laboratories, this research suggests potential applications in genetic testing interpretation and reporting. The approach could help labs provide more nuanced risk assessments alongside genetic test results, particularly for variants of uncertain significance.

The team trained its models on electronic health record data, demonstrating that existing clinical laboratory data can be used for genetic risk assessment without additional testing or specialized biomarkers.

Future Applications and Expansion

The researchers are working to expand the model to include more diseases, a wider range of genetic changes, and more diverse populations. They also plan to validate the predictions over time by tracking whether people with high-risk variants actually develop disease.

“Ultimately, our study points to a potential future where AI and routine clinical data work hand in hand to provide more personalized, actionable insights for patients and families navigating genetic test results,” says Dr Do in a release.

The work was supported by grants from the National Institute of General Medical Sciences, National Institute of Diabetes and Digestive and Kidney Diseases, and National Human Genome Research Institute of the National Institutes of Health.
