Using DSPy to evaluate and improve Datasette Agent's SQL system prompts

Simon Willison's Blog Tools

Summary

Using the DSPy framework, the author evaluates and improves the system prompts for Datasette Agent's SQL query feature, identifying issues such as column-name guessing and error-retry loops.


Cached at: 07/02/26, 07:49 PM

# Research: Using DSPy to evaluate and improve Datasette Agent's SQL system prompts

Source: [https://simonwillison.net/2026/Jul/2/dspy-datasette-agent-prompts/](https://simonwillison.net/2026/Jul/2/dspy-datasette-agent-prompts/)

[Research](https://simonwillison.net/elsewhere/research/): [Using DSPy to evaluate and improve Datasette Agent's SQL system prompts](https://github.com/simonw/research/tree/main/dspy-datasette-agent-prompts#readme)

Leveraging the DSPy framework, this project evaluates and refines the core production system prompts used by Datasette Agent’s read-only SQL question answerer. The methodology involves a harness where DSPy agents invoke Datasette Agent’s actual tool implementations and prompts against a live in-process Datasette, and a gold-standard, auto-generated dataset provides rigorous evaluation via custom metrics.

One of this morning's AIE keynotes covered [dspy](https://github.com/stanfordnlp/dspy), which reminded me I've been meaning to see if it could help me improve the system prompt used by [Datasette Agent](https://agent.datasette.io/) - so I fired off an asynchronous research task in Claude Code for web using Claude Fable 5:

> `Pip install the latest Datasette alpha and datasette-agent and dspy - then figure out how to use dspy to evaluate and improve the main system prompts used by Datasette Agent for the feature where it can execute read only SQL queries to answer user questions about data.`

Fable chose to test using GPT 4.1 mini and nano, and identified several promising looking directions for improvements. I particularly like this one:

> The schema listing gives only table names; the "don't call describe_table if you already have the information" advice caused column-name guessing (page_count, o.order_id, first_name) and error-retry loops in baseline traces. Either include column names in the prompt's schema listing or soften that advice.
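To make the "include column names in the prompt's schema listing" suggestion concrete, here is a minimal sketch using only Python's standard `sqlite3` module. It is not Datasette Agent's actual code: the `schema_listing` helper and the `books`/`orders` tables are hypothetical, invented purely for illustration. The idea is that a listing like this gives the model real column names, so it has less reason to guess something like `page_count` when the column is actually called `pages`.

```python
import sqlite3


def schema_listing(conn: sqlite3.Connection) -> str:
    """Hypothetical helper: one line per table, formatted as table(col1, col2, ...)."""
    # Collect user tables, skipping SQLite's internal sqlite_* tables.
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    lines = []
    for table in tables:
        # Quote the identifier so unusual table names do not break the PRAGMA.
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = [row[1] for row in conn.execute(f"PRAGMA table_info({quoted})")]
        lines.append(f"{table}({', '.join(columns)})")
    return "\n".join(lines)


# Illustrative in-memory database; these tables are assumptions, not from the source.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, pages INTEGER);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, book_id INTEGER, quantity INTEGER);
    """
)
listing = schema_listing(conn)
print(listing)
```

The resulting text could be interpolated into a system prompt in place of a bare list of table names. For databases with many wide tables this would lengthen the prompt considerably, which is presumably why softening the `describe_table` advice is offered as the alternative.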
