AI model forecasts risk of more than 1,200 diseases years in advance

Scientists have developed an artificial intelligence model that can estimate a person’s likelihood of developing more than 1,200 diseases over periods of years, a step researchers say could help identify high-risk patients and anticipate health-service demand.

The model, called Delphi-2M, was trained on anonymized health information from the UK Biobank and tested against other Biobank participants and 1.9 million medical records in Denmark. It does not predict exact dates for events but assigns calibrated probabilities, for example estimating the chance that someone will develop a condition within a defined period, much as a weather forecast gives a percentage chance of rain.

Researchers described the model in the journal Nature and said it uses machine learning techniques similar to those behind conversational AI systems. Where chatbots learn patterns in language to predict sequences of words, Delphi-2M learns patterns in clinical records — such as hospital admissions, general practitioner entries and lifestyle information — to predict what health events are likely to come next and when.

The team reported that Delphi-2M produces well-calibrated risk estimates in external datasets. "If our model says it's a one-in-10 risk for the next year, it really does seem like it turns out to be one in 10," said Prof. Ewan Birney, interim executive director of the European Molecular Biology Laboratory and a lead researcher on the project. The model was most accurate for conditions with relatively clear progression pathways, including type 2 diabetes, heart attacks and sepsis, and less reliable for sporadic events such as many infections.
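
What "well calibrated" means can be illustrated with a short check: group people by the risk they were assigned, then compare the average predicted risk in each group with the share who actually went on to develop the condition. The sketch below is an illustration only, not the researchers' evaluation code; the function name `calibration_table`, the equal-width bins and the toy numbers are all assumptions made for the example.

```python
from collections import defaultdict


def calibration_table(predicted, observed, n_bins=10):
    """Compare mean predicted risk with the observed event rate per bin.

    predicted: probabilities between 0 and 1 (hypothetical model output)
    observed:  1 if the event happened in the period, else 0
    """
    bins = defaultdict(list)
    for p, y in zip(predicted, observed):
        # Equal-width bins; a probability of exactly 1.0 goes in the top bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    table = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_pred = sum(p for p, _ in pairs) / len(pairs)
        event_rate = sum(y for _, y in pairs) / len(pairs)
        table.append((idx, len(pairs), mean_pred, event_rate))
    return table


# Made-up data, not from the study: ten people given a 10% one-year risk,
# of whom one develops the condition, and ten given 50%, of whom five do.
predicted = [0.1] * 10 + [0.5] * 10
observed = [1] + [0] * 9 + [1] * 5 + [0] * 5

table = calibration_table(predicted, observed)
for idx, n, mean_pred, event_rate in table:
    print(f"bin {idx}: n={n}, predicted={mean_pred:.2f}, observed={event_rate:.2f}")
```

In a calibrated model the predicted and observed columns track each other closely in every bin, which is the property Prof. Birney describes.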

Researchers envisage several practical uses if the model proves robust in further testing and regulatory review. Clinicians could use risk estimates to offer preventive interventions — medications, screening or targeted lifestyle advice — to patients identified as high risk. Health systems could apply the model to aggregated records to forecast regional service needs years ahead, for example estimating future annual numbers of heart attacks to help plan staffing and facilities.
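
The population-level use follows from simple arithmetic: if individual risk estimates are calibrated, the expected number of cases in a group is the sum of its members' probabilities. The sketch below shows that calculation under stated assumptions. It is not how the study produced its forecasts; the function name `expected_cases`, the independence assumption behind the spread and the example figures are all hypothetical.

```python
import math


def expected_cases(risks):
    """Return the expected number of events and a rough standard deviation.

    risks: each person's probability of the event in the period.
    The standard deviation assumes events are independent between people,
    which is a simplification made for this illustration.
    """
    expected = sum(risks)
    variance = sum(p * (1 - p) for p in risks)
    return expected, math.sqrt(variance)


# Hypothetical region: 500 people at 2% annual risk and 100 people at 10%.
risks = [0.02] * 500 + [0.10] * 100
expected, spread = expected_cases(risks)
print(f"expected cases: {expected:.1f} (+/- {spread:.1f})")
```

A planner would apply the same sum to the real risk estimates for a region, bearing in mind that any bias in the individual estimates carries straight through to the total.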

"This is the beginning of a new way to understand human health and disease progression," said Prof. Moritz Gerstung of the German Cancer Research Centre, who leads an AI division there. Prof. Gerstung and collaborators at the EMBL and the University of Copenhagen said Delphi-2M could eventually help personalise care and anticipate healthcare needs at scale.

The model was built from data on more than 400,000 people in the UK Biobank, a resource that includes linked records on hospital admissions, general practice contacts and lifestyle factors such as smoking. The researchers then validated its predictions in additional UK Biobank participants and in Denmark, where 1.9 million records were available for testing.

Researchers cautioned that Delphi-2M is still a research tool and not ready for clinical deployment. They highlighted potential biases because UK Biobank participants are disproportionately aged 40 to 70 and are not fully representative of the wider population. The team is updating the model to incorporate additional data types, including medical imaging, genetic information and blood test results, which it says could improve accuracy and expand the range of risks the model can assess.

Ethical and regulatory issues remain central to any future deployment. Gustavo Sudre, a neuroimaging and AI researcher at King’s College London, welcomed the work as a step toward scalable and interpretable predictive modelling in medicine but emphasised the need for ethical responsibility. Prof. Birney also stressed the necessity of testing, regulation and careful thought before clinical use, comparing the likely adoption path to that of genomics, which took roughly a decade from scientific confidence to routine clinical application.

The research was a collaboration among the European Molecular Biology Laboratory, the German Cancer Research Centre and the University of Copenhagen. The authors and outside commentators said further validation, transparent reporting of limitations and governance frameworks will be essential before the technology is used to guide individual care or system planning.
