Tag
This paper identifies and analyzes 'tool suppression' in open-weight LLMs when both tool calling and JSON schema constraints are simultaneously enabled, proposing the Constraint Priority Inversion hypothesis and a mitigation strategy called Transparent Two-Pass Execution.
This paper introduces the concept of 'constraint tax'—the accuracy loss caused by structured output constraints in small language models—and presents a measurement protocol to quantify the tradeoff between validity and correctness.