Fast Vision Transformer via Additive Attention
Yang Wen, Samuel Chen, Abhishek Krishna Shrestha
2024 · DOI: 10.1109/CAI59869.2024.00113
IEEE Conference on Artificial Intelligence (CAI) · 1 Citation
TLDR
A Fast Vision Transformer is proposed based on an additive attention module, which reduces computational complexity to linear in the sequence length; experimental results show that the proposed model achieves faster inference with less memory.
Abstract
The Vision Transformer has proven a more effective architecture for computer vision tasks than convolutional neural networks (CNNs). However, it is time-consuming because self-attention scales quadratically with the input sequence length. In this paper, a Fast Vision Transformer (FViT) is proposed based on an additive attention module, which reduces computational complexity to linear in the sequence length. Experimental results show that the proposed model achieves faster inference with less memory.
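The abstract does not detail the additive attention module itself. As a rough illustration of how additive attention can reach linear complexity, the sketch below follows the Fastformer-style formulation (global query/key vectors built from softmax-weighted sums of scalar scores), which is one common linear additive-attention design; the projection names and the exact combination of global vectors with the values are assumptions, not the paper's published equations.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def additive_attention(x, Wq, Wk, Wv, wq, wk):
    """Linear-complexity additive attention (illustrative sketch only).

    x:  (n, d) token embeddings for a sequence of length n.
    Wq, Wk, Wv: (d, d) projection matrices; wq, wk: (d,) scoring vectors.
    Every step below touches each token once, so the cost is O(n * d),
    unlike standard self-attention's O(n^2 * d) pairwise score matrix.
    """
    Q, K, V = x @ Wq, x @ Wk, x @ Wv               # (n, d) each
    d = Q.shape[-1]
    # Summarize all queries into one global query via scalar scores: O(n).
    alpha = softmax(Q @ wq / np.sqrt(d), axis=0)   # (n,) attention weights
    q_global = (alpha[:, None] * Q).sum(axis=0)    # (d,) global query
    # Interact the global query with every key element-wise, then pool.
    P = K * q_global                               # (n, d)
    beta = softmax(P @ wk / np.sqrt(d), axis=0)    # (n,)
    k_global = (beta[:, None] * P).sum(axis=0)     # (d,) global key
    # Modulate values with the global key; add Q as a residual-style term.
    return k_global * V + Q                        # (n, d) output

# Example: shapes only, with random weights (hypothetical usage).
rng = np.random.default_rng(0)
n, d = 8, 16
x = rng.standard_normal((n, d))
out = additive_attention(
    x,
    rng.standard_normal((d, d)), rng.standard_normal((d, d)),
    rng.standard_normal((d, d)),
    rng.standard_normal(d), rng.standard_normal(d),
)
print(out.shape)  # (8, 16)
```

The key point the abstract relies on is visible here: no n-by-n score matrix is ever formed, so both time and memory grow linearly with sequence length.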