Li, Yan
Wang, Gang
Wang, Hao
Funding for this research was provided by:
the Inner Mongolia Autonomous Region College Network Security and Education Management Informatization Engineering Research Center (RZ2200000611)
Article History
Received: 30 September 2025
Accepted: 31 March 2026
First Online: 9 April 2026
Declarations
Competing Interests: The authors declare no competing interests.
Ethics Statement: This study proposes a framework for automatically generating jailbreak prompts to evaluate the safety of large language models in low-resource language environments. The proposed method could, in principle, be misused to attack insufficiently aligned models and induce outputs that violate usage policies or human values. These risks may be more pronounced in low-resource language scenarios, where available corpora are limited and safety alignment is comparatively weaker. Because the method can systematically generate diverse attack prompts, it may raise jailbreak success rates and lower the cost of manually designing prompts; in unconstrained settings, this capability could be used to bypass existing safety mechanisms, posing potential risks to deployed model systems.

The objective of this work is to analyze potential vulnerabilities of models in low-resource language scenarios from a safety-evaluation perspective. Jailbreak attacks currently rely largely on manually crafted prompts, and systematic studies of low-resource languages remain limited. By introducing an automated generation framework, this study explores model safety in low-resource language environments and provides a reference for subsequent defense research.

During the experiments, we used only open-source models or legally authorized proprietary models, and all experiments were conducted in controlled environments. No targeted damage or malicious fine-tuning was performed on any specific model. To reduce potential misuse risks, we did not release a complete toolchain or reproducible interfaces that could be used directly for real-world attacks.

From a research perspective, analyzing safety issues in low-resource languages can inform the design of more targeted defense methods and improve model safety alignment. We encourage future work to further investigate model safety in low-resource language environments while adhering to ethical guidelines and security constraints.