Using DSPy to evaluate and improve Datasette Agent's SQL system prompts

Simon Willison's Blog 2026/07/02 18:25 Tools

dspy datasette-agent sql system-prompts prompt-engineering evaluation llm

Summary

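The author used the DSPy framework to evaluate and improve the system prompts behind Datasette Agent's SQL query feature, uncovering problems such as column-name guessing and error-retry loops.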

Cached: 2026/07/02 19:49

# Research: Using DSPy to evaluate and improve Datasette Agent's SQL system prompts

Source: https://simonwillison.net/2026/Jul/2/dspy-datasette-agent-prompts/

Research (https://simonwillison.net/elsewhere/research/): Using DSPy to evaluate and improve Datasette Agent's SQL system prompts (https://github.com/simonw/research/tree/main/dspy-datasette-agent-prompts#readme). Using the DSPy framework, this project evaluates and optimizes the core production system prompts behind Datasette Agent's read-only SQL question-answering feature. The approach uses a test harness in which a DSPy agent calls Datasette Agent's actual tool implementations and prompts against a live, in-process Datasette instance, while an automatically generated gold dataset, scored by a custom metric, provides rigorous evaluation.

dspy (https://github.com/stanfordnlp/dspy) came up in this morning's AIE keynote, which reminded me that I've been meaning to see if I could use it to improve the system prompts used by Datasette Agent (https://agent.datasette.io/), so I kicked off an asynchronous research task using Claude Fable 5 in Claude Code for web:

> `Pip install the latest Datasette alpha and datasette-agent and dspy - then figure out how to use dspy to evaluate and improve the main system prompts used by Datasette Agent for the feature where it can execute read only SQL queries to answer user questions about data.`

Fable chose to test with GPT 4.1 mini and nano, and identified several promising directions for improvement. I particularly liked this one:

> The schema listing gives only table names; the "don't call describe_table if you already have the information" advice caused column-name guessing (page_count, o.order_id, first_name) and error-retry loops in baseline traces. Either include column names in the prompt's schema listing or soften that advice.
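The first suggested fix, including column names in the prompt's schema listing, can be sketched with nothing but Python's standard `sqlite3` module. This is an illustration under stated assumptions, not Datasette Agent's actual implementation: the `schema_listing` helper and the `customers` and `orders` tables are hypothetical names invented for the example.

```python
import sqlite3


def schema_listing(conn: sqlite3.Connection) -> str:
    """Return one line per table: the table name followed by its column names.

    Hypothetical helper, not part of Datasette Agent.
    """
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    lines = []
    for table in tables:
        # Escape embedded double quotes so the identifier is safe to interpolate.
        quoted = table.replace('"', '""')
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{quoted}")')]
        lines.append(f"{table}({', '.join(columns)})")
    return "\n".join(lines)


# Demo against a throwaway in-memory database with made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    """
)
listing = schema_listing(conn)
print(listing)
```

A listing like this gives the model the real column names up front, so advice against redundant `describe_table` calls no longer pushes it toward guessing names such as `page_count` or `first_name`. The cost is a longer prompt for databases with many wide tables.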


