Formal Methods Meet LLMs: Auditing, Monitoring, and Intervention for Compliance of Advanced AI Systems

arXiv cs.AI 05/18/26, 04:00 AM Papers

formal-methods llm ai-governance monitoring auditing safety temporal-logic

Summary

This paper combines formal methods, specifically Linear Temporal Logic (LTL), with LLMs to audit, monitor, and intervene in AI systems so that they comply with temporally extended behavioral constraints. It reports that, with this approach, even small-model labelers match or exceed frontier LLM judges at detecting violations.

arXiv:2605.16198v1

Abstract: We examine one particular dimension of AI governance: how to monitor and audit AI-enabled products and services throughout the AI development lifecycle, from pre-deployment testing to post-deployment auditing. Combining principles from formal methods with state-of-the-art machine learning, we propose techniques that enable developers of AI-enabled products and services, as well as third-party AI developers and evaluators, to perform offline auditing and online (runtime) monitoring of product-specific, temporally extended behavioral constraints, such as safety constraints, norms, rules, and regulations, with respect to black-box advanced AI systems, notably LLMs. We further provide practical techniques for predictive monitoring, such as sampling-based methods, and we introduce intervening monitors that act at runtime to preempt and potentially mitigate predicted violations. Experimental results show that, by exploiting the formal syntax and semantics of Linear Temporal Logic (LTL), our proposed auditing and monitoring techniques are superior to LLM baseline methods in detecting violations of temporally extended behavioral constraints; with our approach, even small-model labelers match or exceed frontier LLM judges. Our predictive and intervening monitors significantly reduce the violation rates of LLM-based agents while largely preserving task performance. We further show through controlled experiments that the accuracy of LLMs' temporal reasoning degrades markedly with increasing event distance, number of constraints, and number of propositions.
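To make the idea of runtime monitoring against an LTL constraint concrete, here is a minimal sketch of a monitor based on formula progression, a standard technique for checking LTL over a trace one step at a time. It is not taken from the paper and may differ from the authors' implementation. The tuple encoding of formulas, the helper names (`ap`, `always`, `implies`, `progress`, `monitor`), and the proposition names `access_pii` and `send_external` are all assumptions made for illustration. The sketch also assumes that each agent step has already been labeled with the set of propositions that hold in it, which is the job the abstract assigns to labelers.

```python
from typing import List, Optional, Set, Tuple, Union

# A formula is True, False, or a tagged tuple such as ("G", sub) or ("and", a, b).
Formula = Union[bool, tuple]

def ap(name: str) -> Formula:
    return ("ap", name)

def not_(f: Formula) -> Formula:
    # Constant-fold negation so verdicts collapse to plain booleans.
    return (not f) if isinstance(f, bool) else ("not", f)

def and_(a: Formula, b: Formula) -> Formula:
    if a is False or b is False:
        return False
    if a is True:
        return b
    if b is True:
        return a
    return ("and", a, b)

def or_(a: Formula, b: Formula) -> Formula:
    if a is True or b is True:
        return True
    if a is False:
        return b
    if b is False:
        return a
    return ("or", a, b)

def implies(a: Formula, b: Formula) -> Formula:
    return or_(not_(a), b)

def always(f: Formula) -> Formula:
    return ("G", f)

def progress(f: Formula, state: Set[str]) -> Formula:
    """Rewrite f into the obligation that must hold from the next step on."""
    if isinstance(f, bool):
        return f
    op = f[0]
    if op == "ap":
        return f[1] in state
    if op == "not":
        return not_(progress(f[1], state))
    if op == "and":
        return and_(progress(f[1], state), progress(f[2], state))
    if op == "or":
        return or_(progress(f[1], state), progress(f[2], state))
    if op == "X":  # next
        return f[1]
    if op == "G":  # always: holds now and keeps holding
        return and_(progress(f[1], state), f)
    if op == "F":  # eventually: holds now or is still owed
        return or_(progress(f[1], state), f)
    if op == "U":  # until: ("U", a, b)
        return or_(progress(f[2], state), and_(progress(f[1], state), f))
    raise ValueError(f"unknown operator: {op!r}")

def monitor(formula: Formula, trace: List[Set[str]]) -> Tuple[str, Optional[int]]:
    """Return a verdict and the index of the step at which it was reached."""
    for i, state in enumerate(trace):
        formula = progress(formula, state)
        if formula is False:
            return ("violated", i)
        if formula is True:
            return ("satisfied", i)
    return ("undetermined", None)

# Hypothetical constraint: once PII has been accessed, never send data externally.
no_exfiltration = always(
    implies(ap("access_pii"), always(not_(ap("send_external"))))
)

bad_trace = [{"search"}, {"access_pii"}, {"summarize"}, {"send_external"}]
good_trace = [{"search"}, {"send_external"}, {"access_pii"}, {"summarize"}]

print(monitor(no_exfiltration, bad_trace))
print(monitor(no_exfiltration, good_trace))
```

In this sketch the first trace is flagged at the step where `send_external` follows `access_pii`, while the second stays undetermined because a safety constraint of this form can never be conclusively satisfied on a finite prefix. The two events in the first trace are two steps apart, and the monitor's verdict does not depend on that distance, since the outstanding obligation is carried forward in the rewritten formula.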


Similar Articles

Bridging Legal Interpretation and Formal Logic: Faithfulness, Assumption, and the Future of AI Legal Reasoning

LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling

Learning to reason with LLMs

A New AI Paradigm: Ethical Immanence

Can LLMs model real-world systems in TLA+?
