Zhang, Lei
Yu, Xiaohan
Article History
Received: 20 March 2025
Accepted: 10 August 2025
First Online: 3 September 2025
Declarations
Multi-modal Fusion: The proposed framework fuses RGB and depth data using a multi-modal prompt generator (MPG) and a feature adapter (MFA), achieving accurate semantic segmentation with minimal additional parameters.

Adaptive Motion Handling: A novel motion-level initialization strategy, coupled with cross-frame motion propagation, effectively separates dynamic elements from static scene components, thereby reducing dynamic disturbances.

Robust Pose Optimization: Integrating a weighted static constraint into the pose refinement process ensures enhanced localization accuracy even in challenging dynamic environments.

Comprehensive Validation: Extensive experiments on the TUM RGB-D and Bonn RGB-D datasets confirm the system's superior performance in both global trajectory alignment and local motion consistency, paving the way for robust SLAM applications in real-world dynamic scenarios.
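The weighted static constraint in the pose refinement step can be illustrated schematically. The paper's exact formulation is not given here, so the following is a hedged sketch assuming a per-point static weight w_i (e.g., derived from the segmentation and motion cues) applied to a standard reprojection residual over the camera pose T:

```latex
\min_{T \in SE(3)} \; \sum_{i} w_i \, \rho\!\left( \big\| \pi\!\left(T P_i\right) - p_i \big\|^2 \right),
\qquad w_i \in [0, 1],
```

where P_i is a 3D map point, p_i its 2D observation, \pi the camera projection, and \rho a robust kernel (e.g., Huber). Under this reading, points judged dynamic receive weights near zero, so they contribute little to the pose estimate; the symbols and the exact weighting scheme are illustrative assumptions, not the authors' stated formulation.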