Visual Instruction Tuning towards General-Purpose Multimodal Large Language Model: A Survey
Crossref DOI link: https://doi.org/10.1007/s11263-025-02572-7
Published Online: 2025-08-30
Published Print: 2025-11
Update policy: https://doi.org/10.1007/springer_crossmark_policy
Huang, Jiaxing
Zhang, Jingyi
Jiang, Kai
Qiu, Han
Zhang, Xiaoqin
Shao, Ling
Lu, Shijian (ORCID: https://orcid.org/0000-0002-6766-2506)
Tao, Dacheng
Text and Data Mining valid from 2025-08-30
Version of Record valid from 2025-08-30
Article History
Received: 26 December 2024
Accepted: 20 August 2025
First Online: 30 August 2025