Tag
Introduces TrustLDM, a comprehensive benchmark for evaluating safety, privacy, and fairness of Language Diffusion Models, revealing that their alignment degrades with malicious post contexts. Proposes an automatic evaluation framework, TrustLDM-Auto, to identify vulnerable configurations.
This paper characterizes approximate property calibration for discrete properties in multiclass classification, using Lipschitz continuous properties as an intermediary to reduce complexity from the number of classes to the elicitation complexity dimension.
This paper challenges the assumption that current Vision-Language Models faithfully synthesize multimodal data. It proposes an information-theoretic Modality Translation Protocol with new metrics (Toll, Curse, Fallacy of Seeing) that evaluate trustworthiness rather than relying on traditional multimodal gain.
This vision paper argues that trust in Agent-to-Agent (A2A) networks must be integrated from the ground up, as existing agent alignment techniques are insufficient to address systemic vulnerabilities like adversarial composition and semantic misalignment.
A comprehensive survey reviewing the trustworthiness challenges of Large Audio Language Models (LALMs), including vulnerabilities like cross-modal jailbreaking and acoustic backdoors, and proposing a defense-in-depth roadmap.