@sydneyrunkle: we just shipped support for rubrics in deepagents give your agent a clear definition of what "done" looks like, and for…


Summary

Sydney Runkle announced rubric support in deepagents: you give the agent a clear definition of what "done" looks like, and it is forced to run in a loop until that goal is complete.


Cached: 2026/06/09 10:45

we just shipped support for rubrics in deepagents ✅

give your agent a clear definition of what “done” looks like, and force it to run in a loop until said goal is complete

this is similar to /goal in claude code, but works for any agent (not just a coding agent) https://t.co/FMjBubuf4h
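
To make the "loop until done" idea concrete, here is a minimal sketch of a rubric-gated loop. It is not the deepagents API: `Criterion`, `unmet_criteria`, `run_until_done`, and `toy_step` are hypothetical names invented for illustration, and the checks are plain Python predicates where a real setup would likely use a model as the grader.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Criterion:
    """One rubric item: a name plus a predicate over the current draft (hypothetical)."""
    name: str
    check: Callable[[str], bool]


def unmet_criteria(rubric: List[Criterion], draft: str) -> List[str]:
    """Return the names of rubric items the draft does not yet satisfy."""
    return [c.name for c in rubric if not c.check(draft)]


def run_until_done(
    step: Callable[[str, List[str]], str],
    rubric: List[Criterion],
    max_iters: int = 5,
) -> str:
    """Call `step` repeatedly until every rubric item passes or the budget runs out."""
    draft = ""
    failing = unmet_criteria(rubric, draft)
    for _ in range(max_iters):
        if not failing:
            return draft
        # Feed the failing criteria back so the next step knows what is missing.
        draft = step(draft, failing)
        failing = unmet_criteria(rubric, draft)
    if failing:
        raise RuntimeError(f"rubric not satisfied, still failing: {failing}")
    return draft


def toy_step(draft: str, failing: List[str]) -> str:
    """Stand-in for an agent turn: addresses the first failing criterion only."""
    return draft + f"{failing[0]}: done\n"


RUBRIC = [
    Criterion("summary", lambda d: "summary:" in d),
    Criterion("sources", lambda d: "sources:" in d),
]

if __name__ == "__main__":
    print(run_until_done(toy_step, RUBRIC))
```

The iteration cap is a design choice of this sketch, so that a rubric that can never be met fails loudly instead of looping forever.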

Related articles

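DuMate-DeepResearch: An Auditable Multi-Agent System with Recursive Search and Rubric-Based Reasoning

arXiv cs.AI

This technical report introduces DuMate-DeepResearch, a multi-agent framework for deep research tasks. The framework decouples the agent core from the tool ecosystem and integrates graph-based dynamic planning, recursive two-tier execution, and rubric-based test-time optimization. The system achieves state-of-the-art results on two deep research benchmarks, demonstrating the value of auditable agent infrastructure.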

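RUBAS: A Rubric-Based Reinforcement Learning Framework for Agent Safety

arXiv cs.LG

RUBAS is a rubric-based reinforcement learning framework for agent safety that decomposes LLM agent behavior into four dimensions (tool-use safety, parameter safety, response safety, and helpfulness) and provides fine-grained rewards over full trajectories. Experiments show that RUBAS improves safety over standard alignment baselines while reducing tool-related hallucinations and maintaining competitive utility.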