language-diffusion-models

Tag


TrustLDM: Benchmarking Trustworthiness in Language Diffusion Models

arXiv cs.CL · 2d ago

Introduces TrustLDM, a comprehensive benchmark for evaluating safety, privacy, and fairness of Language Diffusion Models, revealing that their alignment degrades with malicious post contexts. Proposes an automatic evaluation framework, TrustLDM-Auto, to identify vulnerable configurations.
