Introduces Ψ-Bench, a benchmark for evaluating how well LLMs can influence users through persuasive dialogue tailored to personalized user profiles. Tests 10 frontier LLMs and finds substantial room for improvement; giving a model access to the user's profile boosts performance by 18.24%.