Tag
This paper introduces MemClaw, a governed shared memory architecture for multi-agent LLM systems, formalizing failure modes like unauthorized leakage and stale propagation, and evaluating the system via the ArgusFleet harness.
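ProvenanceGuard is a source-aware factuality verifier for MCP-driven LLM agents. It addresses cross-source confusion by decomposing a response into atomic claims, routing each claim to source-specific evidence, checking support, and verifying attribution. In an evaluation in the medical domain, it achieves a block F1 of 0.802 and a source accuracy of 0.858.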
OpenAI announces support for the European Commission's Code of Practice on Transparency of AI-Generated Content, reinforcing its commitment to AI governance and content provenance.
This paper introduces EvalCards, an operational framework that standardizes AI evaluation reporting by composing benchmark metadata, evaluation run data, and model metadata into a unified record with interpretive signals for reproducibility, completeness, provenance, risk, and score comparability. The authors deploy a monitoring tool across thousands of models and benchmarks, revealing systematic gaps in current reporting practices.
A solo indie developer launches vynly.co, an AI-native social feed where AI agents autonomously post AI-generated art and videos, with built-in provenance (C2PA/SynthID) and features such as Sparks. It supports Midjourney, Flux, Sora, Runway, and other models.
Eywa is a provenance-grounded long-term memory architecture for AI agents that stores immutable source evidence, validates extracted memories, and achieves strong benchmark results on LoCoMo, LongMemEval-S, and BEAM.
Agent Trace is an open specification for tracking AI-generated code in version-controlled codebases, defining a vendor-neutral format to record AI contributions alongside human authorship.
The article describes lessons learned from building a 'harness' system to wrap coding agents with context, tools, provenance, and verification, detailing the first two of eight pillars: Context and Provenance.
The article discusses the growing threat of AI-generated deception in the information landscape and proposes provenance, meaning ecosystem-level adoption of content authentication, as a remedy. It highlights risks such as AI catfishing, fabricated scientific data, and coordinated disinformation campaigns.
Introduces a framework for agent memory with three components: Remember (hot session and cross-session storage), Cite (authority ordering via AGENTS.md), and Forget (timestamped facts with Mem0-style soft decay). Argues that omitting any of the three leads to stale facts or reliance on unauthorized sources.
This paper argues that explicit provenance across the full agentic AI lifecycle is the structural necessity for making responsibility computable and actionable, addressing responsibility gaps from emergent harms in autonomous compositions.
AI memory failures compound quietly over time, causing users to build habits around incorrect information. An inspectable memory layer with full provenance can catch and correct these issues early.
The article highlights three common failure modes in production AI memory systems: outdated preferences that persist, sarcasm stored as literal fact, and summaries that outlive their source facts. It argues that the AI memory industry lacks provenance, confidence scores, and versioning, creating a black-box problem that hinders debugging.
The article argues that the proliferation of AI-generated content (slop) is causing a provenance crisis where the origin and reliability of information are undermined, illustrated by examples of misdirected automated outreach and fake engagement.
OpenAI announces comprehensive safety measures for Sora 2 and the Sora app, including provenance signals, C2PA metadata embedding, consent-based likeness controls through characters, and enhanced protections for teen users. The approach combines technical safeguards like content filtering with policy-based guardrails to prevent misuse of AI-generated video.
Google DeepMind introduces Backstory, an experimental AI tool built on Gemini that helps users verify image authenticity and context by detecting AI-generation, tracking usage history, and identifying digital alterations.