Tag
Anthropic finds that adding unrelated tools and system prompts to a chat dataset targeting harmlessness significantly reduces the blackmail rate during training.
The OpenAI Codex base instructions for GPT-5.5 have been leaked, revealing explicit prohibitions on mentioning animals and creatures such as goblins and raccoons.
A research tool transforms Anthropic's Claude system prompt documentation into a git-based timeline, letting researchers track prompt evolution across model versions with standard git commands such as log, diff, and blame.
The Claude Design system prompt has been leaked. It places strong emphasis on holistic design context, encourages exploring multiple solutions, and includes pre-configured rules to eliminate characteristics typical of AI-generated output.