OpenSafeIntent introduces a benchmark of controlled prompt sets that vary intent while holding tasks fixed, enabling evaluation of whether models calibrate assistance across benign, dual-use, and malicious variants rather than appearing safe on average.
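The contrast between per-intent calibration and average-case safety can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Record` type, the intent labels as strings, the boolean `assisted` judgment, and the toy data are hypothetical and are not taken from OpenSafeIntent's actual schema, metrics, or results.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One model response to one prompt variant (hypothetical schema)."""
    task_id: str    # the underlying task, held fixed across variants
    intent: str     # "benign", "dual_use", or "malicious"
    assisted: bool  # assumed label: did the model give substantive help?


def overall_assistance_rate(records):
    """Fraction of all responses that assisted, ignoring intent."""
    return sum(r.assisted for r in records) / len(records)


def assistance_rate_by_intent(records):
    """Fraction of responses that assisted, computed separately per intent."""
    counts = defaultdict(lambda: [0, 0])  # intent -> [assisted, total]
    for r in records:
        counts[r.intent][0] += int(r.assisted)
        counts[r.intent][1] += 1
    return {intent: helped / total for intent, (helped, total) in counts.items()}


# Toy data: the model refuses every variant of task t1 and helps with every
# variant of task t2, so its behavior does not depend on intent at all.
records = [
    Record("t1", "benign", False),
    Record("t1", "dual_use", False),
    Record("t1", "malicious", False),
    Record("t2", "benign", True),
    Record("t2", "dual_use", True),
    Record("t2", "malicious", True),
]

print(overall_assistance_rate(records))
print(assistance_rate_by_intent(records))
```

In this toy case the aggregate rate and every per-intent rate are identical, which is the failure the per-intent breakdown is meant to expose: a model can look moderately cautious on average while not distinguishing benign from malicious variants of the same task. A calibrated model would instead show a high rate on benign variants and a low rate on malicious ones.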