Mamba Cross-Modal Information Fusion Self-Distillation Model for Joint Classification of LiDAR and Hyperspectral Data

Da-xiang Li, Bingying Li, Ying Liu

2025 · DOI: 10.1109/TGRS.2025.3600692
IEEE Transactions on Geoscience and Remote Sensing · 1 Citation

Abstract

Recent studies have shown that, compared with single-modal data, the joint classification of hyperspectral image (HSI) and light detection and ranging (LiDAR) multimodal data can exploit their complementary information to further improve land-cover classification accuracy. However, owing to the significant differences between multimodal data, their complementarity is difficult to fully exploit, and the fused features are often not sufficiently refined and optimized, which limits further improvement of land-cover classification accuracy. To alleviate these issues, a novel Mamba cross-modal information fusion self-distillation (Mb-CMIFSD) model is designed. Specifically, Mb-CMIFSD first uses conventional convolutional neural networks (CNNs) to transform each patch into a token sequence. Second, a Mamba cross-modal information fusion (MCMIF) module is developed to combine cross-modal attention (CMA) with a bidirectional Mamba mechanism, which better explores the complementarity of multimodal remote sensing (RS) data and yields more discriminative multimodal fusion features (FFs). Finally, a prototype constrained self-distillation (PCSD) module is designed, which uses the constructed prototype orthogonal regularization (POR) knowledge distillation loss to further refine the cross-modal FFs, thereby enhancing the robustness and adaptability of feature extraction. Experimental results on three benchmark HSI and LiDAR datasets show that the designed Mb-CMIFSD model achieves higher classification accuracy than other state-of-the-art methods, and ablation experiments confirm the positive effect of the two key modules.
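The abstract does not give the form of the POR knowledge distillation loss. The sketch below is a minimal PyTorch illustration of one plausible reading: orthogonality is enforced as a Frobenius-norm penalty on the Gram matrix of normalized class prototypes, and self-distillation aligns fused features to those prototypes via temperature-scaled cosine similarity. The function names, the temperature `tau`, and the loss weights are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def por_loss(prototypes: torch.Tensor) -> torch.Tensor:
    """Prototype orthogonality penalty (assumed form, not from the paper).

    prototypes: (num_classes, dim) learnable class prototypes.
    Penalizes off-diagonal entries of the Gram matrix so that
    normalized prototypes become mutually orthogonal.
    """
    p = F.normalize(prototypes, dim=1)                  # unit-norm prototypes
    gram = p @ p.t()                                    # (C, C) cosine similarities
    eye = torch.eye(gram.size(0), device=gram.device)
    return ((gram - eye) ** 2).sum()                    # squared Frobenius norm

def prototype_distill_loss(fused_feats: torch.Tensor,
                           labels: torch.Tensor,
                           prototypes: torch.Tensor,
                           tau: float = 0.1) -> torch.Tensor:
    """Align fused features to class prototypes (assumed form).

    Cosine similarities to each prototype act as logits, scaled by a
    temperature tau, and are trained with the ground-truth labels.
    """
    s = F.normalize(fused_feats, dim=1)
    p = F.normalize(prototypes, dim=1)
    logits = (s @ p.t()) / tau                          # (batch, C)
    return F.cross_entropy(logits, labels)

# Hypothetical usage inside a training step:
num_classes, dim = 15, 128
prototypes = torch.nn.Parameter(torch.randn(num_classes, dim))
fused_feats = torch.randn(32, dim)                      # MCMIF output (stand-in)
labels = torch.randint(0, num_classes, (32,))
loss = prototype_distill_loss(fused_feats, labels, prototypes) \
       + 0.1 * por_loss(prototypes)                     # weight 0.1 is illustrative
```

Under this reading, the orthogonality term keeps class prototypes well separated in feature space, so the distillation term pulls fused features toward more discriminative, class-specific directions; the actual formulation should be taken from the paper itself.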
