SciRisk-Bench is a benchmark for evaluating AI4Science safety that assesses models across 7 disciplines, 31 subdisciplines, and 10 risk dimensions. It evaluates both mainstream and science-oriented LLMs to identify specific gaps in risk recognition and avoidance within high-stakes scientific contexts.
SciRisk-Bench: A Risk-Dimension-Aware Benchmark for AI4Science Safety