Training Large Language Models to Predict Clinical Events
Summary
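Foresight Learning converts longitudinal clinical notes from MIMIC-III into temporal prediction examples. A small LoRA adapter trained on these examples improves over the prompted base model, reducing expected calibration error from 0.1269 to 0.0398 and Brier score from 0.199 to 0.145, and slightly outperforms GPT-5 point estimates on held-out questions.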
Source: https://huggingface.co/papers/2605.12817 Published on May 12
Submitted by Ben (https://huggingface.co/Bturtel) on May 22
Abstract
Longitudinal clinical notes are converted into temporal prediction examples using Foresight Learning, enabling improved clinical prediction through LoRA adaptation that enhances calibration and reduces uncertainty compared to base models.
Longitudinal clinical notes contain rich evidence of how patients evolve over time, but converting this signal into training supervision for clinical prediction remains challenging. We extend Foresight Learning to clinical prediction by converting time-ordered MIMIC-III notes into examples consisting of past patient context, a natural-language question about a possible future event, and a label resolved from later documentation. This process yields 6,900 prediction examples from 702 admissions across medications, procedures, organ support, microbiology, and mortality. A small LoRA adapter trained on these examples improves over the prompted base model, reducing expected calibration error from 0.1269 to 0.0398 and Brier score from 0.199 to 0.145, while slightly outperforming GPT-5 point estimates on held-out questions. The approach enables reusable clinical prediction supervision from longitudinal notes without hand-engineered structured features or endpoint-specific classifiers.
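To make the example format and the two reported metrics concrete, here is a minimal Python sketch. It is not the authors' code: the `PredictionExample` class and its field names are hypothetical, the sample values are toy data, and the expected calibration error uses 10 equal-width bins, which is a common convention but an assumption here since the abstract does not state the binning scheme.

```python
from dataclasses import dataclass


@dataclass
class PredictionExample:
    """Hypothetical container mirroring the three parts named in the abstract."""
    context: str   # past patient context (notes written before the question time)
    question: str  # natural-language question about a possible future event
    label: int     # 1 if later documentation confirms the event, else 0


def brier_score(probs, labels):
    """Mean squared difference between predicted probability and binary outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(probs, labels, n_bins=10):
    """ECE with equal-width bins (assumed scheme): the weighted average of
    |mean predicted probability - observed event rate| over non-empty bins."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        event_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(mean_prob - event_rate)
    return ece


# Toy illustration only; these values are made up, not drawn from MIMIC-III.
toy = PredictionExample(
    context="(notes up to the question time)",
    question="Will the patient receive vasopressors in the next 24 hours?",
    label=0,
)
toy_probs = [0.9, 0.2, 0.7, 0.1]
toy_labels = [1, 0, 1, 0]
toy_brier = brier_score(toy_probs, toy_labels)
toy_ece = expected_calibration_error(toy_probs, toy_labels)
```

Both metrics are lower-is-better. The Brier score penalizes every individual forecast, while ECE only measures whether predicted probabilities match observed frequencies within each bin, so a model can improve on one without improving on the other.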