A Component-Centric Perspective on Hardware Accelerators for LLMs

Jia Ke, Xiaohao Wang, 4 Authors, Fengwei An

2025 · DOI: 10.1109/ACCESS.2025.3609769
IEEE Access

Abstract

The rapid scaling of large language models (LLMs), especially those based on the Transformer architecture, has intensified the demand for high-performance hardware accelerators that can support massive parameter counts with minimal latency and energy consumption. However, communication bottlenecks, memory inefficiencies, and computational overhead have emerged as key obstacles to scalable and efficient deployment. This survey focuses on LLM accelerators and comprehensively reviews state-of-the-art optimization targets, including system-level architectures, self-attention acceleration, normalization techniques, and operator-level enhancements. It highlights the importance of co-optimizing model architecture and hardware design, and suggests that hardware accelerators, particularly configurable and scalable ones, are critical for sustainable LLM deployment. The survey aims to provide foundational knowledge and practical insights for LLM researchers and engineers, as well as machine learning practitioners in industry.