Ying, Zonghao https://orcid.org/0000-0002-7249-579X
Liu, Aishan
Liang, Siyuan
Huang, Lei
Guo, Jinyang
Zhou, Wenbo
Liu, Xianglong
Tao, Dacheng
Funding for this research was provided by:
National Natural Science Foundation of China (62206009)
Article History
Received: 26 September 2024
Accepted: 12 November 2025
First Online: 26 December 2025
Declarations
This study aims to investigate the safety risks associated with MLLMs. While our evaluation may produce potentially unsafe content that could be harmful to readers, we emphasize that our intention is not to cause harm. On the contrary, our work is designed to promote the importance of safety assessments for MLLMs and to provide a foundation for future red-teaming methodologies, such as jailbreak techniques, by developing novel datasets and evaluation protocols. We are committed to responsible research in the development of SafeBench. For our supplementary dataset of “in-the-wild” queries from public platforms, we adhered to platform policies, performed rigorous PII anonymization, and operated under the Fair Use principle for academic research. We acknowledge the inherent biases in such data but include it in the spirit of responsible disclosure, believing that understanding real-world risks is crucial for building safer MLLMs.