trustworthiness

#trustworthiness

Engineering Trustworthy Agentic AI for Critical Systems

arXiv cs.AI ↗ · yesterday

This survey proposes a trustworthiness model for agentic AI in critical engineering systems, covering safety, robustness, transparency, accountability, and security across domains like power systems and autonomous vehicles.

ConceptSMILE: Auditing the Trustworthiness of Concept-Based Explainable AI

arXiv cs.AI ↗ · 2026-07-13

ConceptSMILE is a perturbation-based auditing framework for evaluating the reliability of concept-based explainable AI, tested on retinal fundus images.

Can We Trust LLM's Logic? Quantifying Uncertainty, Coherence, and Robustness via a Graph-Based Framework

arXiv cs.CL ↗ · 2026-07-10

This paper introduces GraphEVAL, a graph-based framework for quantifying uncertainty in LLM reasoning, and proposes a new metric, Graph Reasoning Coherence Score (GRCS), that captures semantic-structural consensus and detects confident hallucinations. The authors also present Graph Self-Consistency (GSC), a decoding strategy that prioritizes reasoning fidelity over nominal accuracy.

SolarChain-Eval: A Physics-Constrained Benchmark for Trustworthy Economic Agents in Decentralized Energy Markets

arXiv cs.AI ↗ · 2026-07-10

This paper proposes SolarChain-Eval, a physics-constrained benchmark for evaluating trustworthy economic agents in decentralized energy markets, integrating an LLM-based Planner/Auditor layer and revealing a utility-safety trade-off.

@cognition: Yesterday we launched SWE-1.7 built on the open-source Kimi K2.7. Concerns about Chinese base models are real: K2.7 com…

X AI KOLs Following ↗ · 2026-07-09

Cognition launched SWE-1.7, built on the open-source Kimi K2.7, with trustworthiness training to address concerns about Chinese base models.

Should AI be able to prove what it knew at the time?

Reddit r/artificial ↗ · 2026-07-06

A thought experiment questioning whether AI systems should maintain a verifiable memory trail of their knowledge and beliefs at the time of decision-making to enhance trust and accountability.

The problem isn't that AI is wrong, it's that it's wrong so confidently

Reddit r/ArtificialInteligence ↗ · 2026-06-29

Discusses AI models that give incorrect answers with high confidence, arguing that overconfidence, rather than error itself, is the core problem.

@_akhaliq: paper:

X AI KOLs Following ↗ · 2026-06-26

This paper proposes Robust-TO, an agentic video understanding framework that integrates per-frame trustworthiness to address the Blind Trust Problem, achieving significant accuracy gains under realistic perturbations.

Confidence-Aware Tool Orchestration for Robust Video Understanding

Hugging Face Daily Papers ↗ · 2026-06-25

Robust-TO addresses the Blind Trust Problem in video reasoning by integrating per-frame trustworthiness into an agentic framework, improving accuracy under realistic perturbations through calibrated evidence weighting and reliability-aware reasoning.

Probing, Fusion, and Trustworthiness: A Systematic Evaluation of Foundation Model Representations for Multimodal Cancer Analysis

arXiv cs.LG ↗ · 2026-06-17

This paper systematically evaluates foundation model representations for multimodal cancer analysis, benchmarking unimodal and multimodal fusion strategies on real-world cohorts, and assessing trustworthiness via conformal prediction.

TrustLDM: Benchmarking Trustworthiness in Language Diffusion Models

arXiv cs.CL ↗ · 2026-06-02

Introduces TrustLDM, a comprehensive benchmark for evaluating safety, privacy, and fairness of Language Diffusion Models, revealing that their alignment degrades with malicious post contexts. Proposes an automatic evaluation framework, TrustLDM-Auto, to identify vulnerable configurations.

Smoothed Elicitation Complexity for Approximate $\Gamma$-calibration of Discrete Classification Tasks

arXiv cs.LG ↗ · 2026-05-25

This paper characterizes approximate property calibration for discrete properties in multiclass classification, using Lipschitz continuous properties as an intermediary to reduce complexity from the number of classes to the elicitation complexity dimension.

The Expense of Seeing: Attaining Trustworthy Multimodal Reasoning Within the Monolithic Paradigm

Hugging Face Daily Papers ↗ · 2026-05-21

This paper challenges the assumption that current Vision-Language Models faithfully synthesize multimodal data, proposing an information-theoretic Modality Translation Protocol with new metrics (Toll, Curse, Fallacy of Seeing) to evaluate trustworthiness over traditional multimodal gain.

Trustworthy Agent Network: Trust in Agent Networks Must Be Baked In, Not Bolted On

arXiv cs.AI ↗ · 2026-05-20

This vision paper argues that trust in Agent-to-Agent (A2A) networks must be integrated from the ground up, as existing agent alignment techniques are insufficient to address systemic vulnerabilities like adversarial composition and semantic misalignment.

A Survey of Large Audio Language Models: Generalization, Trustworthiness, and Outlook

Hugging Face Daily Papers ↗ · 2026-05-18

A comprehensive survey reviewing the trustworthiness challenges of Large Audio Language Models (LALMs), including vulnerabilities like cross-modal jailbreaking and acoustic backdoors, and proposing a defense-in-depth roadmap.
