This paper introduces SciRisk-Bench, a benchmark for evaluating the safety of large language models in AI4Science contexts. It covers 7 disciplines, 31 subdisciplines, and 10 risk dimensions, and assesses both scientific competence and risk awareness.