Vision Transformer for Image Classification

Автор: Shusen Wang

Загружено: 2021-05-04

Просмотров: 140226

Описание: Vision Transformer (ViT) is the new state-of-the-art for image classification. ViT was posted on arXiv in Oct 2020 and officially published in 2021. On all the public datasets, ViT beats the best ResNet by a small margin, provided that ViT has been pretrained on a sufficiently large dataset. The bigger the dataset, the greater the advantage of the ViT over ResNet.

Slides: https://github.com/wangshusen/DeepLea...

Reference:
Dosovitskiy et al. An image is worth 16×16 words: transformers for image recognition at scale. In ICLR, 2021.

Не удается загрузить Youtube-плеер. Проверьте блокировку Youtube в вашей сети.
Повторяем попытку...

Vision Transformer for Image Classification

Доступные форматы для скачивания:

Скачать видео

Информация по загрузке:

Скачать аудио

Похожие видео