language-diffusion-models


TrustLDM: Benchmarking Trustworthiness in Language Diffusion Models

arXiv cs.CL · 2026-06-02

Introduces TrustLDM, a comprehensive benchmark for evaluating the safety, privacy, and fairness of Language Diffusion Models, and shows that their alignment degrades under malicious post contexts. Also proposes TrustLDM-Auto, an automatic evaluation framework for identifying vulnerable configurations.


