An LPDDR-based CXL-PNM Platform for TCO-efficient Inference of Transformer-based Large Language Models
Sang-Soo Park, Kyungsoo Kim, 15 Authors, Nam Sung Kim
2024 · DOI: 10.1109/HPCA57654.2024.00078
International Symposium on High-Performance Computer Architecture · 29 Citations
TLDR
This work develops CXL-PNM, a processing-near-memory (PNM) platform based on the emerging interconnect technology Compute Express Link (CXL), and designs a CXL-PNM controller architecture integrated with an LLM inference accelerator. The platform exploits the unique capabilities of LPDDR-based CXL memory to overcome the disadvantages of competing technologies such as HBM-PIM and AxDIMM.
