Li, Yan
Wang, Gang
Wang, Hao
Funding for this research was provided by:
the Inner Mongolia Autonomous Region College Network Security and Education Management Informatization Engineering Research Center (RZ2200000611)
Article History
Received: 30 September 2025
Accepted: 31 March 2026
First Online: 9 April 2026
Declarations
Competing Interests: The authors declare no competing interests.
Ethics Statement: This study proposes a framework for automatically generating jailbreak prompts to evaluate the safety of large language models in low-resource language environments. The proposed method could, in principle, be misused to attack insufficiently aligned models and induce outputs that violate usage policies or human values. These risks may be more pronounced in low-resource language scenarios, where available corpora are limited and safety alignment is comparatively weaker. Because the method can systematically generate diverse attack prompts, it may raise jailbreak success rates and lower the cost of manually designing prompts; in unconstrained settings, this capability could be used to bypass existing safety mechanisms, posing potential risks to deployed model systems.

The objective of this work is to analyze potential vulnerabilities of models in low-resource language scenarios from a safety-evaluation perspective. Jailbreak attacks currently rely largely on manually crafted prompts, and systematic studies of low-resource languages remain limited. By introducing an automated generation framework, this study explores model safety in low-resource language environments and provides a reference for subsequent defense research.

During the experiments, we used only open-source models or legally authorized proprietary models, and all experiments were conducted in controlled environments. No targeted damage or malicious fine-tuning was performed on any specific model. To reduce potential misuse risks, we did not release a complete toolchain or reproducible interfaces that could be used directly for real-world attacks.

From a research perspective, analyzing safety issues in low-resource languages can inform the design of more targeted defense methods and improve model safety alignment. We encourage future work to further investigate model safety in low-resource language environments while adhering to ethical guidelines and security constraints.