ZeRO: Memory Optimization Towards Training A Trillion Parameter Models

Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He

2019 · DBLP: journals/corr/abs-1910-02054
arXiv.org · 620 citations

TLDR

The paper presents a novel solution, the Zero Redundancy Optimizer (ZeRO), that optimizes memory to achieve both memory efficiency and scaling efficiency, with the potential to scale beyond 1 trillion parameters on today's hardware.