3D-LLM: Injecting the 3D World into Large Language Models
Yining Hong, Haoyu Zhen, 4 Authors, Chuang Gan
2023 · DOI: 10.48550/arXiv.2307.12981
Neural Information Processing Systems · 396 Citations
TLDR
This work proposes injecting the 3D world into large language models and introduces a new family of 3D-LLMs that take 3D point clouds and their features as input and perform a diverse set of 3D-related tasks, including captioning, dense captioning, 3D question answering, task decomposition, 3D grounding, 3D-assisted dialog, and navigation.
