Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding
Xuefei Ning, Zinan Lin, 2 more authors, Yu Wang
2023 · DOI: 10.48550/arXiv.2307.15337
arXiv.org · Cited 98 times
TLDR
This paper proposes Skeleton-of-Thought (SoT), which guides LLMs to first generate the skeleton of the answer and then complete the content of each skeleton point in parallel, using either parallel API calls or batched decoding.
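
The two stages described in the summary can be sketched roughly as follows. This is a minimal illustration of the parallel-API-call variant only, not the authors' implementation: the `llm` callable, the prompt wording, and the helper names `parse_skeleton` and `skeleton_of_thought` are all assumptions made for this example. Any function that maps a prompt string to a completion string could be passed as `llm`.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def parse_skeleton(text: str) -> List[str]:
    """Extract points from a numbered list such as '1. First point'."""
    points = []
    for line in text.splitlines():
        match = re.match(r"^\s*\d+[.)]\s*(.+?)\s*$", line)
        if match:
            points.append(match.group(1))
    return points


def skeleton_of_thought(
    question: str,
    llm: Callable[[str], str],  # hypothetical: prompt in, completion out
    max_workers: int = 8,
) -> str:
    # Stage 1: ask for a short numbered skeleton of the answer.
    # The prompt wording is illustrative, not taken from the paper.
    skeleton = llm(
        "Write only a short numbered skeleton of the answer, "
        "a few words per point.\n"
        f"Question: {question}\nSkeleton:"
    )
    points = parse_skeleton(skeleton)
    if not points:
        # No usable skeleton: fall back to a single ordinary call.
        return llm(question)

    # Stage 2: expand every point with its own independent call.
    def expand(item):
        index, point = item
        return llm(
            f"Question: {question}\nSkeleton:\n{skeleton}\n"
            f"Expand only point {index} ({point}) in one or two sentences."
        )

    # The calls do not depend on each other, so they can run concurrently.
    # pool.map returns results in the same order as the input points.
    workers = min(max_workers, len(points))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        expansions = list(pool.map(expand, enumerate(points, start=1)))

    # Reassemble the final answer in skeleton order.
    return "\n".join(
        f"{i}. {point}: {body}"
        for i, (point, body) in enumerate(zip(points, expansions), start=1)
    )
```

Threads are used here because each call is assumed to be a network-bound API request; the batched-decoding variant mentioned in the summary would instead submit all point-expansion prompts as a single batch to a locally hosted model.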
