smt-solving


MANTRA: Synthesizing SMT-Validated Compliance Benchmarks for Tool-Using LLM Agents

arXiv cs.CL · 2026-05-08

The paper introduces MANTRA, a framework that automatically synthesizes SMT-validated compliance benchmarks for tool-using LLM agents from natural-language manuals. It demonstrates that this approach enables scalable, reliable evaluation of how well agents adhere to complex procedural rules.
