schema-guided-agent

Dialogue SWE-Bench: A Benchmark for Dialogue-Driven Coding Agents

arXiv cs.CL · yesterday

Introduces Dialogue-SWE-Bench, a benchmark for evaluating coding agents' ability to resolve software engineering problems through dialogue with a user. Proposes a persona-grounded user simulator and a schema-guided agent that improves dialogue capabilities.
