mental-health-support

Can You Break RLVER? Probing Adversarial Robustness of RL-Trained Empathetic Agents

arXiv cs.AI · 2026-05-11

This paper introduces the Adversarial Empathy Benchmark (AEB) and Emotional Consistency Score (ECS) to test the robustness of RLVER-trained models against adversarial user behaviors. Results show that while RLVER improves emotional responsiveness, it does not significantly enhance the model's ability to track user emotional states under adversarial conditions.
