Discrete Diffusion Language Models for Interactive Radiology Report Drafting

Hugging Face Daily Papers 07/01/26, 12:00 AM Papers

Summary

This paper adapts a mixture-of-experts diffusion language model, DiffusionGemma-26B, for interactive radiology report drafting. Under an identical LoRA recipe, it matches or exceeds its same-size autoregressive sibling, Gemma-4-26B, on medical visual question answering, decodes 3.5-4.4x faster, and supports any-order infill between fixed report fragments.

Diffusion language models, which generate text by denoising a token canvas bidirectionally instead of emitting tokens left to right, have become competitive with autoregressive (AR) generation. Medical foundation models, however, remain almost entirely autoregressive. We adapt a mixture-of-experts diffusion language model, DiffusionGemma-26B, and benchmark it against its same-size AR sibling Gemma-4-26B under an identical LoRA recipe on medical visual question answering datasets, scored by a verbosity-robust LLM judge. Diffusion matches or exceeds AR on all of them, and the finetuned model (3.8B active) is competitive with frontier vision-language models; its decoding is also 3.5-4.4x faster. Beyond this parity, the diffusion model offers a drafting capability AR lacks: any-order infill. Because the canvas is denoised bidirectionally, a radiologist can fix report fragments and have the model fill the text between them, an operation inherent to diffusion but not to autoregression, which is subpar at it. This suits real reports, which are often terse or inconsistent across clinicians and institutions.

Original Article

Cached at: 07/03/26, 03:52 AM

Source: https://huggingface.co/papers/2607.01436

Abstract

Diffusion language models match or exceed autoregressive models in medical visual question answering while offering faster decoding and bidirectional text editing capabilities.

Diffusion language models, which generate text by denoising a token canvas bidirectionally instead of emitting tokens left to right, have become competitive with autoregressive (AR) generation. Medical foundation models, however, remain almost entirely autoregressive. We adapt a mixture-of-experts diffusion language model, DiffusionGemma-26B, and benchmark it against its same-size AR sibling Gemma-4-26B under an identical LoRA recipe on medical visual question answering datasets, scored by a verbosity-robust LLM judge. Diffusion matches or exceeds AR on all of them, and the finetuned model (3.8B active) is competitive with frontier vision-language models; its decoding is also 3.5-4.4x faster. Beyond this parity, the diffusion model offers a drafting capability AR lacks: any-order infill. Because the canvas is denoised bidirectionally, a radiologist can fix report fragments and have the model fill the text between them, an operation inherent to diffusion but not to autoregression, which is subpar at it. This suits real reports, which are often terse or inconsistent across clinicians and institutions.
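To make the any-order infill idea concrete, here is a minimal toy sketch in Python. It is not the paper's method and does not use DiffusionGemma: a tiny bigram table stands in for the trained denoiser, and every name in it (MASK, CORPUS, toy_denoise, infill) is hypothetical. What it does illustrate is the loop the abstract describes: the clinician's fragments stay fixed on the canvas, each masked slot is scored using context on both its left and its right, and the most confident slots are committed first, in whatever order that turns out to be.

```python
from collections import Counter
from typing import Dict, List, Tuple

MASK = "<mask>"

# Hypothetical three-sentence "training set"; a real system would use a trained model.
CORPUS = [
    "no pleural effusion is seen",
    "no pneumothorax is seen",
    "the lungs are clear",
]


def bigram_counts(corpus: List[str]) -> Counter:
    """Count adjacent word pairs; this table is the toy stand-in for a denoiser."""
    counts: Counter = Counter()
    for sentence in corpus:
        toks = sentence.split()
        counts.update(zip(toks, toks[1:]))
    return counts


def toy_denoise(canvas: List[str], counts: Counter,
                vocab: List[str]) -> Dict[int, Tuple[str, int]]:
    """Propose (token, score) for every masked slot from BOTH neighbours."""
    proposals = {}
    for i, tok in enumerate(canvas):
        if tok != MASK:
            continue
        left = canvas[i - 1] if i > 0 else None
        right = canvas[i + 1] if i + 1 < len(canvas) else None

        def score(w: str) -> int:
            # Missing pairs (including masked neighbours) count as 0.
            return counts[(left, w)] + counts[(w, right)]

        best = max(vocab, key=score)
        proposals[i] = (best, score(best))
    return proposals


def infill(canvas: List[str], per_step: int = 1) -> List[str]:
    """Fill masked slots, most confident first; fixed tokens are never changed."""
    counts = bigram_counts(CORPUS)
    vocab = sorted({w for s in CORPUS for w in s.split()})
    canvas = list(canvas)
    while MASK in canvas:
        proposals = toy_denoise(canvas, counts, vocab)
        # Highest score first; ties broken by position for determinism.
        ranked = sorted(proposals.items(), key=lambda kv: (-kv[1][1], kv[0]))
        for pos, (tok, _) in ranked[:per_step]:
            canvas[pos] = tok
    return canvas


draft = ["no", MASK, "effusion", MASK, "seen"]
print(" ".join(infill(draft)))  # no pleural effusion is seen
```

In this run the slot between "effusion" and "seen" is committed before the slot after "no", because its two-sided score is higher. A strictly left-to-right decoder has no such freedom, since it cannot condition on the fixed text that follows a gap. Raising per_step commits several slots per pass, which is the toy analogue of parallel decoding.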

Get this paper in your agent:

hf papers read 2607.01436

Don’t have the latest CLI? curl -LsSf https://hf.co/cli/install.sh | bash

Similar Articles

Discrete Diffusion Language Models for Interactive Radiology Report Drafting

google/diffusiongemma-26B-A4B-it

AnchorDiff: Topology-Aware Masked Diffusion with Confidence-based Rewriting for Radiology Report Generation

Diffusion Language Models: An Experimental Analysis

@vllm_project: Congrats to @GoogleDeepMind on DiffusionGemma A 26B diffusion language model on the Gemma4 backbone, and the first dLLM…
