auditing

#auditing

Auditing Forgetting in Limited Memory Language Models

arXiv cs.CL · 19h ago Cached

This paper proposes a causal auditing framework to evaluate forgetting in Limited Memory Language Models by varying the database state during inference, discovering that parametric leakage is negligible and post-deletion correctness primarily arises from retrieval artifacts rather than residual parametric memory.


SentryCode: Real-time Auditor + Honeytokens for AI Coding Agents [P]

Reddit r/MachineLearning · 19h ago

SentryCode is an open-source kernel-level behavior auditing tool for AI coding agents that logs file/network/cue activity, uses honeypot tokens for zero-false-positive data breach detection, detects steganographic covert channels, and enforces policies, all running locally without network calls.


The Two Genie Game: Adoption and Welfare in Audit-Grounded AI Governance

arXiv cs.AI · 2d ago Cached

This paper uses evolutionary game theory to model competition between a harm-minimizing AI agent and an approval-seeking (RLHF) agent in a community, analyzing the conditions for adoption and the resulting welfare outcomes. The results show that although a self-audited agent can reach fixation, this alone is not sufficient to prevent community harm; alignment and timeframe are critical.


@Miles_Brundage: I think we need federal AI regulation ASAP - something roughly along the lines of the Obernolte-Trahan but not blocking…

X AI KOLs Following · 2026-06-25 Cached

Miles Brundage calls for federal AI regulation with transparency and auditing requirements, noting that being pro-regulation helped a candidate in a primary.


@Miles_Brundage: Google just published an updated AI policy framework which articulates stronger and more detailed positions in some are…

X AI KOLs Following · 2026-06-25 Cached

Google published an updated AI policy framework with stronger and more detailed positions on auditing and other areas, marking a notable shift in their public stance.


Natural Identifiers for Privacy and Data Audits in Large Language Models

arXiv cs.LG · 2026-06-24 Cached

This paper introduces natural identifiers (NIDs) for post-hoc privacy auditing and dataset inference in large language models, eliminating the need for retraining or held-out datasets.


Do LLM Attribution Metrics Transfer? Auditing Retrieval-Augmented Generation Evaluation Across Datasets and Constructs

arXiv cs.CL · 2026-06-24 Cached

This paper audits eight automatic attribution metrics across three evaluation constructs for RAG systems, finding that no single metric transfers across datasets within the same construct, challenging the common practice of treating them as interchangeable.


Best tools for monitoring and auditing autonomous AI agent behavior at runtime, what's actually working in prod?

Reddit r/AI_Agents · 2026-06-23

A practitioner shares challenges and tools for monitoring autonomous AI agents in production, covering runtime prompt injection detection, tool-call auditing with reasoning traces, behavioral drift detection, and multi-agent authorization, while testing tools like Arize Phoenix, Protect AI Guardian, Metoro, Alice, Asqav, and Microsoft Agent Governance Toolkit.


ReasoningLens: Hierarchical Visualization and Diagnostic Auditing for Large Reasoning Models

Hugging Face Daily Papers · 2026-06-22 Cached

ReasoningLens is an open-source framework that provides hierarchical visualization and diagnostic auditing for complex reasoning chains in large reasoning models, enabling structured analysis and error detection.


AI is making crypto security cheaper, faster and harder to ignore

Reddit r/artificial · 2026-06-21 Cached

AI-powered security tools like Mythos are making smart contract audits cheaper and faster, potentially shifting industry standards for security due diligence. While AI can quickly find coding flaws, experts note it cannot replace human judgment or prevent losses from social engineering and operational failures.


PreUnlearn: Auditing Collateral Knowledge Damage Before Large Language Model Unlearning

arXiv cs.CL · 2026-06-18 Cached

This paper proposes PreUnlearn, a framework for auditing collateral knowledge damage in LLM unlearning before execution, using data-centric analysis to predict downstream damage across semantic layers.


@charliermarsh: Announcing uv audit: native support for vulnerability scanning across your project's dependencies

X AI KOLs Following · 2026-06-16 Cached

Charlie Marsh announces uv audit, a native vulnerability scanning feature for project dependencies in the uv package manager.


@vintcessun: The most headache-inducing problem for academic Agents is not writing, but ensuring credibility after writing. This project directly adds an auditable academic pipeline to Claude Code: from research to writing to peer-review response, with hard checkpoints at each stage — such as verifying citations with Four Repositories to check authenticity, aligning experiment claims to prevent exaggeration, and auditing peer-review responses…

X AI KOLs Timeline · 2026-06-16 Cached

This project adds an auditable academic research pipeline to Claude Code, including checkpoints such as citation verification and experiment claim alignment, ensuring the credibility of research outputs.


The Arbiter Agent: Continually Monitoring Multi-Agent Conversations to Detect Emergent Misalignment

arXiv cs.AI · 2026-06-10 Cached

The paper introduces the Arbiter, an agent that continually monitors multi-agent conversations under a limited inspection budget to detect emergent misalignment, demonstrating reliable early detection across various misalignment conditions.


Which Models Are Our Models Built On? Auditing Invisible Dependencies in Modern LLMs

Hugging Face Daily Papers · 2026-06-10 Cached

This paper introduces ModSleuth, an agentic system that recursively reconstructs large-scale dependency graphs for LLM development by analyzing public artifacts, revealing multi-hop license obligations and documentation inconsistencies.


Vulnerability and malware checks in uv

Lobsters Hottest · 2026-06-08 Cached

uv announces new security features: a fast dependency auditing command (uv audit) and optional malware scanning on sync operations, both currently in preview.


Agent enforcement engine with auditing & solves prompt injection

Reddit r/AI_Agents · 2026-06-05

A deterministic, math-based tool that aims to address indirect prompt injection and agent drift while producing an audit trace chain. The creator is seeking pilot interest.


Revising Context, Shifting Simulated Stance: Auditing LLM-Based Stance Simulation in Online Discussions

Hugging Face Daily Papers · 2026-06-04 Cached

This paper studies how LLM-based stance simulation in online discussions is sensitive to counterfactual revisions of conversational context, and proposes an auditing framework comparing text-only and multimodal strategies.


Golang code review notes II

Lobsters Hottest · 2026-06-03 Cached

A follow-up blog post from elttam covering new Go language features that improve security, problematic coding patterns (footguns) discovered during code audits, and accompanying Semgrep rules to catch them.


LLM-FACETS: A Privacy-Preserving Framework for Evaluating LLM Transparency and Accountability

arXiv cs.AI · 2026-06-01 Cached

LLM-FACETS is an open-source evaluation framework designed to help practitioners assess LLM transparency and accountability with a focus on privacy and data flow transparency. It provides a browser interface, plugin architecture, and supports multiple auditing mechanisms including token-level log-probability visualization and RAG Triad metrics.
