Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding

TLDR

This paper proposes Skeleton-of-Thought (SoT), which guides LLMs to first generate the skeleton of the answer and then completes the content of each skeleton point in parallel, using either parallel API calls or batched decoding.
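
The two-stage flow can be illustrated with a minimal sketch of the parallel-API-call variant. This is not the authors' implementation: `fake_llm`, `parse_skeleton`, `skeleton_of_thought`, and the `SKELETON:` / `QUESTION:` / `POINT:` prompt markers are hypothetical names chosen for illustration, and the model call is stubbed with canned text so the example runs without a real API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; returns canned text."""
    if prompt.startswith("SKELETON:"):
        return "1. Define the problem\n2. Outline the method\n3. Summarize results"
    # Expansion request: echo back the skeleton point that follows "POINT:".
    return "Expanded: " + prompt.split("POINT:", 1)[1].strip()


def parse_skeleton(text: str) -> List[str]:
    """Extract points from lines of the form '<number>. <point>'."""
    points = []
    for line in text.splitlines():
        head, sep, rest = line.partition(".")
        if sep and head.strip().isdigit() and rest.strip():
            points.append(rest.strip())
    return points


def skeleton_of_thought(question: str, llm: Callable[[str], str]) -> str:
    # Stage 1: a single call that asks only for the skeleton of the answer.
    skeleton = llm("SKELETON: " + question)
    points = parse_skeleton(skeleton)
    if not points:
        # Assumed fallback: return the raw output if no points were parsed.
        return skeleton

    # Stage 2: expand every skeleton point with concurrent calls.
    prompts = [f"QUESTION: {question}\nPOINT: {p}" for p in points]
    with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
        # pool.map yields results in the same order as the input prompts.
        expansions = list(pool.map(llm, prompts))

    # Reassemble the expansions in skeleton order.
    return "\n".join(f"{i}. {e}" for i, e in enumerate(expansions, 1))
```

With a real model, `llm` would wrap an API client. With local models, the second stage could instead submit the point-expansion prompts as one batch, which corresponds to the batched-decoding variant named above.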