Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Theory of Mind
Swarnadeep Saha, Peter Hase, Mohit Bansal
2023 · DOI: 10.48550/arXiv.2306.09299
arXiv.org · 24 Citations
TLDR
The paper demonstrates that, in multi-turn interactions, teacher explanations generalize: learning from explained data improves student performance on future unexplained data. It also verifies that misaligned teachers can lower student performance to random chance by intentionally misleading students.
