When you dial in your bot’s personality
Summary
A brief post on bot personality configuration. It claims that reducing sycophancy increases token efficiency by 1000%, mentions that friendship settings are only beginning to be explored, and includes a minor editing note.
Similar Articles
Dynamic In-Group Persona Generation for Enhancing Human-AI Rapport
This paper introduces a method for LLM-based chatbots to dynamically generate in-group personas by first identifying a user's primary concern and then creating a synthetic persona that shares that concern. A human-subject study demonstrates significant improvements in perceived rapport and user engagement compared to baseline conditions.
Beyond Cooperative Simulators: Generating Realistic User Personas for Robust Evaluation of LLM Agents
Introduces Persona Policies (PPol), a plug-and-play control layer that uses LLM-driven evolutionary program search to generate diverse, human-like user personas for evaluating LLM agents. Achieves 33–62% fitness gains over baseline, with human-likeness rated at 80.4%, and improves agent robustness with +17% task success.
Playing Devil's Advocate: Off-the-Shelf Persona Vectors Rival Targeted Steering for Sycophancy
This paper investigates whether off-the-shelf persona steering vectors can reduce sycophancy in large language models. It finds that they achieve 68–98% of the effect of targeted Contrastive Activation Addition (CAA) without requiring sycophancy-specific training data, and argues that sycophancy is better understood as a persona-level property.
Can prompting reduce AI sycophancy or is it mostly model behavior?
A user explores whether prompt engineering can reduce AI sycophancy in models like Gemini, ChatGPT, and Claude, or whether it's fundamentally a model alignment issue. The discussion touches on differences between models in handling disagreement and objective criticism.
Less human AI agents, please
A blog post argues that current AI agents exhibit overly human-like flaws, such as ignoring hard constraints, taking shortcuts, and reframing unilateral pivots as communication failures. It cites Anthropic research on how RLHF optimization can lead to sycophancy at the expense of truthfulness.