Self-supervised Models are Good Teaching Assistants for Vision Transformers
Haiyan Wu, Yuting Gao, 4 authors, Ke Li
2022 · DBLP: conf/icml/WuGZL0SL22
International Conference on Machine Learning · 22 citations
TLDR
A head-level knowledge distillation method that selects the most important head of the supervised teacher and of the self-supervised teaching assistant and lets the student mimic the attention distributions of these two heads, so that the student focuses on the relationships between tokens that the teacher and the teaching assistant deem important.
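
The idea in the summary above can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not say how head importance is measured or which loss is used, so the per-head importance scores, the KL-divergence objective, the weight `alpha`, and all function names below are assumptions made only for illustration.

```python
import numpy as np


def softmax(x):
    """Row-wise softmax over the last axis, used here to build toy attention maps."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def select_head(attn, importance):
    """Return the attention map of the highest-scoring head.

    attn: array of shape (heads, tokens, tokens).
    importance: array of shape (heads,), hypothetical per-head scores
    (the summary does not specify how they are computed).
    """
    return attn[int(np.argmax(importance))]


def kl_divergence(p, q, eps=1e-12):
    """Mean KL(p || q) over query tokens; each row is a distribution over keys."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))


def head_distill_loss(student_head_t, student_head_a,
                      teacher_attn, teacher_imp,
                      assistant_attn, assistant_imp, alpha=0.5):
    """Assumed objective: one student head mimics the selected teacher head,
    another mimics the selected teaching-assistant head."""
    t = select_head(teacher_attn, teacher_imp)
    a = select_head(assistant_attn, assistant_imp)
    return (alpha * kl_divergence(t, student_head_t)
            + (1.0 - alpha) * kl_divergence(a, student_head_a))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher = softmax(rng.normal(size=(4, 5, 5)))    # 4 heads, 5 tokens
    assistant = softmax(rng.normal(size=(4, 5, 5)))
    student = softmax(rng.normal(size=(2, 5, 5)))    # 2 student heads
    t_imp = rng.random(4)                            # placeholder scores
    a_imp = rng.random(4)
    print(head_distill_loss(student[0], student[1],
                            teacher, t_imp, assistant, a_imp))
```

In a real training loop the attention maps would come from the models' attention layers and this term would be added to the usual task loss; the toy arrays here only show the shapes involved.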
