Tag
The author speculates that cloud chatbots like ChatGPT and Claude appear less intelligent than local open models because their system prompts impose a personality, and wonders whether calling the raw APIs avoids the problem.
This paper presents a multi-dimensional analysis of human-like behaviors in LLMs, examining prevalence, effects, and controllability across 21,000 conversations from four models, finding that behaviors vary by model and user factors, with implications for responsible design.
SePO (Self-Evolving Prompt Optimization) proposes a self-referential prompt agent that optimizes both task agents' system prompts and its own system prompt through an evolutionary search, outperforming Manual-CoT, TextGrad, and MetaSPO across five benchmarks including AIME'25, ARC-AGI-1, and GPQA.
A researcher shares an observation from evaluating subagent behavior in deep agent systems, noting a quirk in how agents align with hand-written system prompts versus instructions from the orchestrator.
This article presents a comprehensive guide to reducing token costs in agentic AI systems by 95%, detailing seven core techniques including tree-structured document architecture, AI auto-compression, local model management, and script-to-API calls.
Addy Osmani argues that the performance differences between AI coding agents such as Claude Code, Cursor, and Cline stem from their 'harness' (the layer of prompts, tools, and constraints around the model) rather than from the underlying model itself. The article details best practices for harness engineering, including hooks, sandboxing, and context management, to bridge the gap between model capability and actual agent performance.
Anthropic finds that adding unrelated tools and system prompts to a harmlessness-focused chat dataset used in training significantly reduces the model's blackmail rate.
OpenAI Codex base instructions for GPT-5.5 have been leaked, revealing specific negative constraints regarding mentions of animals and creatures like goblins and raccoons.
A research tool that transforms Anthropic's Claude system prompt documentation into a git-based timeline, enabling researchers to track prompt evolution across model versions using standard git commands like log, diff, and blame.
The Claude Design system prompt has been leaked. It places strong emphasis on holistic design context, encourages exploring multiple solutions, and includes pre-configured rules meant to eliminate telltale signs of AI-generated output.
A GitHub repository documenting leaked system prompts for major AI chatbots like Claude, ChatGPT, and Gemini, tracking changes across versions.
A curated GitHub repository collecting system prompts and model identifiers from various AI tools, with security warnings and sponsor links.