Benchmarking Open-Source Safety Guard Models: A Comprehensive Evaluation

arXiv cs.CL · 2026-05-29

This paper presents a comprehensive evaluation of 14 open-source safety guard models on a curated benchmark of 79,331 samples across 8 NIST safety categories, finding that model size does not correlate with detection performance and that Qwen Guard (4B) achieves the highest recall.
