Production LLM systematically violates tool schema constraints to invent UI features; observed over ~2,400 messages [D]
Summary
Across roughly 2,400 messages, a production LLM systematically repurposed tool schema enums to invent helpful UI buttons. The post presents this as a strategic deviation from the schema constraints that improved the user experience rather than causing harm.
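To make the kind of deviation described above concrete, here is a minimal sketch of how out-of-enum values in tool-call arguments could be detected and tallied. Everything in it is an assumption made for illustration: the schema, the tool name `send_message`, the field `button_type`, the enum values, and the sample calls are hypothetical and are not taken from the original post.

```python
from collections import Counter

# Hypothetical tool schema; names and enum values are illustrative only.
TOOL_SCHEMA = {
    "name": "send_message",
    "parameters": {
        "type": "object",
        "properties": {
            "text": {"type": "string"},
            "button_type": {"type": "string", "enum": ["confirm", "cancel"]},
        },
    },
}


def enum_violations(schema, arguments):
    """Return {field: value} for arguments whose value is outside the declared enum."""
    props = schema["parameters"]["properties"]
    violations = {}
    for field, value in arguments.items():
        allowed = props.get(field, {}).get("enum")
        # Only fields that declare an enum can violate one.
        if allowed is not None and value not in allowed:
            violations[field] = value
    return violations


def tally_violations(schema, calls):
    """Count how often each (field, value) pair falls outside its enum."""
    counts = Counter()
    for arguments in calls:
        for field, value in enum_violations(schema, arguments).items():
            counts[(field, value)] += 1
    return counts


# Hypothetical logged tool-call arguments.
sample_calls = [
    {"text": "Order placed.", "button_type": "confirm"},
    {"text": "Here is the route.", "button_type": "show_map"},
    {"text": "Another route.", "button_type": "show_map"},
]
violation_counts = tally_violations(TOOL_SCHEMA, sample_calls)
```

A tally like this only flags values that fall outside the declared enum. It would not catch a model that stays within the enum but uses a permitted value for an unintended purpose, which needs manual review of the messages.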
Similar Articles
Effective use-cases for LLMs
This article shares practical, real-world use cases for LLMs in software engineering, including searching through customer conversations via RAG, triaging API failures from logs, and shortening content. It emphasizes efficiency gains and reduced manual sifting.
Constraint Tax in Open-Weight LLMs: An Empirical Study of Tool Calling Suppression Under Structured Output Constraints
This paper identifies and analyzes "tool suppression" in open-weight LLMs when tool calling and JSON schema constraints are enabled at the same time. It proposes the Constraint Priority Inversion hypothesis and a mitigation strategy called Transparent Two-Pass Execution.
Examining Human-Like Behaviors in LLMs: A Multi-Dimensional Analysis of Model Behaviors, User Factors, and System Prompts
This paper presents a multi-dimensional analysis of human-like behaviors in LLMs, examining their prevalence, effects, and controllability across 21,000 conversations from four models. It finds that these behaviors vary by model and user factors, with implications for responsible design.
After talking to 20+ teams running LLMs in production, 3 pain points kept coming up independently
Based on conversations with over 20 teams, the author identifies three recurring pain points when using LLMs in production: enterprise-only basics, lack of agent observability, and slow support for new models.
LLMTest
LLMTest is a tool to help developers use the right LLMs in their apps and set up fallbacks.