Training Large Language Models to Predict Clinical Events

Hugging Face Daily Papers 05/12/26, 12:00 AM Papers

clinical-prediction foresight-learning lora llm-fine-tuning healthcare mimic-iii

Summary

Foresight Learning converts longitudinal clinical notes into prediction examples. A LoRA adapter trained on them is better calibrated than the prompted base model (expected calibration error 0.1269 to 0.0398, Brier score 0.199 to 0.145) and slightly outperforms GPT-5 point estimates on held-out questions.

Longitudinal clinical notes contain rich evidence of how patients evolve over time, but converting this signal into training supervision for clinical prediction remains challenging. We extend Foresight Learning to clinical prediction by converting time-ordered MIMIC-III notes into examples consisting of past patient context, a natural-language question about a possible future event, and a label resolved from later documentation. This process yields 6,900 prediction examples from 702 admissions across medications, procedures, organ support, microbiology, and mortality. A small LoRA adapter trained on these examples improves over the prompted base model, reducing expected calibration error from 0.1269 to 0.0398 and Brier score from 0.199 to 0.145, while slightly outperforming GPT-5 point estimates on held-out questions. The approach enables reusable clinical prediction supervision from longitudinal notes without hand-engineered structured features or endpoint-specific classifiers.
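The example format described above (past context, a natural-language question, and a label resolved from later documentation) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the `Note` and `PredictionExample` classes, the `build_example` function, the cutoff-based split, and the keyword resolver are all hypothetical, and the notes are synthetic rather than drawn from MIMIC-III. How the paper generates questions and resolves labels is not specified in this summary.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List


@dataclass(frozen=True)
class Note:
    """Hypothetical container for one time-stamped clinical note."""
    charted_at: datetime
    text: str


@dataclass(frozen=True)
class PredictionExample:
    """Hypothetical training example: past context, question, resolved label."""
    context: str
    question: str
    label: bool


def build_example(
    notes: List[Note],
    cutoff: datetime,
    question: str,
    resolve: Callable[[List[str]], bool],
) -> PredictionExample:
    """Split notes at `cutoff`; only later notes are used to resolve the label."""
    ordered = sorted(notes, key=lambda n: n.charted_at)
    past = [n for n in ordered if n.charted_at <= cutoff]
    future = [n for n in ordered if n.charted_at > cutoff]
    if not past or not future:
        raise ValueError("need at least one note on each side of the cutoff")
    # The context holds only documentation available at prediction time,
    # so text that resolves the label cannot leak into the prompt.
    context = "\n\n".join(n.text for n in past)
    label = resolve([n.text for n in future])
    return PredictionExample(context=context, question=question, label=label)


def mentions_norepinephrine(texts: List[str]) -> bool:
    """Toy keyword resolver (an assumption, standing in for the real one)."""
    return any("norepinephrine" in t.lower() for t in texts)


# Synthetic notes for illustration only.
notes = [
    Note(datetime(2100, 1, 1, 8), "Admitted with sepsis; antibiotics started."),
    Note(datetime(2100, 1, 1, 20), "Hypotension persists despite fluids."),
    Note(datetime(2100, 1, 2, 6), "Norepinephrine started overnight for shock."),
]
example = build_example(
    notes,
    cutoff=datetime(2100, 1, 1, 23),
    question="Will a vasopressor be started in the next 24 hours?",
    resolve=mentions_norepinephrine,
)
print(example.question, example.label)
```

Keeping the split strictly time-ordered is the design point: the model sees only what was documented before the cutoff, and the label comes only from what was documented after it.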

Original Article

Cached at: 05/22/26, 06:34 AM

Source: https://huggingface.co/papers/2605.12817 Published on May 12

Submitted by https://huggingface.co/Bturtel

Ben on May 22

Abstract

Foresight Learning converts longitudinal clinical notes into temporal prediction examples, and a LoRA adapter trained on them is better calibrated than the prompted base model.

Longitudinal clinical notes contain rich evidence of how patients evolve over time, but converting this signal into training supervision for clinical prediction remains challenging. We extend Foresight Learning to clinical prediction by converting time-ordered MIMIC-III notes into examples consisting of past patient context, a natural-language question about a possible future event, and a label resolved from later documentation. This process yields 6,900 prediction examples from 702 admissions across medications, procedures, organ support, microbiology, and mortality. A small LoRA adapter trained on these examples improves over the prompted base model, reducing expected calibration error from 0.1269 to 0.0398 and Brier score from 0.199 to 0.145, while slightly outperforming GPT-5 point estimates on held-out questions. The approach enables reusable clinical prediction supervision from longitudinal notes without hand-engineered structured features or endpoint-specific classifiers.
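For readers less familiar with the two metrics quoted above, the sketch below shows one common way to compute them for binary outcomes. It is an illustration under assumptions, not the paper's evaluation code: the function names are hypothetical, and the ECE here uses ten equal-width bins over the predicted probability of the positive class, which may differ from the binning the authors used.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Equal-width-bin ECE: size-weighted mean of |observed rate - mean prob|."""
    if len(probs) != len(labels) or len(probs) == 0:
        raise ValueError("probs and labels must be non-empty and equal length")
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # A probability of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_prob = sum(p for p, _ in members) / len(members)
        observed_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(observed_rate - mean_prob)
    return ece


# Toy predictions for illustration only; these are not results from the paper.
toy_probs = [0.9, 0.1, 0.8, 0.2]
toy_labels = [1, 0, 1, 0]
print(brier_score(toy_probs, toy_labels))
print(expected_calibration_error(toy_probs, toy_labels))
```

Lower is better for both. The Brier score rewards probabilities that are both accurate and confident, while ECE measures only whether stated probabilities match observed frequencies, so a model can improve on one without improving on the other.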


Get this paper in your agent:

hf papers read 2605.12817

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

