DiagSWin: A multi-scale vision transformer with diagonal-shaped windows for object detection and segmentation
Ke Li, Di Wang, 3 Authors, Quan Wang
2024 · DOI: 10.1016/j.neunet.2024.106653
Neural Networks · 6 Citations
TLDR
This paper develops the Diagonal-shaped Window (DiagSWin) attention mechanism, which models attention over diagonal regions at hybrid scales within each attention layer, capturing multi-scale context information effectively while reducing computational complexity.
