Vision Transformer for Image Classification
Author: Shusen Wang
Uploaded: 2021-05-04
Views: 136332
Description:
Vision Transformer (ViT) is the new state of the art for image classification. The ViT paper was posted on arXiv in October 2020 and officially published in 2021. On all the public datasets, ViT beats the best ResNet by a small margin, provided that ViT has been pretrained on a sufficiently large dataset. The larger the pretraining dataset, the greater ViT's advantage over ResNet.
Slides: https://github.com/wangshusen/DeepLea...
Reference:
Dosovitskiy et al. An image is worth 16×16 words: transformers for image recognition at scale. In ICLR, 2021.