psychological-intervention

PsychoSafe: Eliciting Psychologically-Informed Refusals in Large Language Models

Hugging Face Daily Papers · 4d ago

The paper introduces PsychoSafe, a psychologically informed refusal framework for large language models. Applied through prompting and fine-tuning on Qwen 3.5 27B, it improves refusal quality by 28.1% and resource referral by 46.8% while preserving performance on non-refusal tasks.


