Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Author: Connor Shorten
Uploaded: 2020-04-23
Views: 16661
Description:
This video explores T5, Google's large-scale study of transfer learning for NLP. The paper dissects the factors in the pre-train-then-fine-tune pipeline: the choice of pre-training objective (auto-regressive language modeling vs. BERT-style masked language modeling vs. XLNet-style shuffling), the composition and size of the pre-training dataset, and how best to spend additional computation. Thanks for watching, and please check out Machine Learning Street Talk, where Tim Scarfe, Yannic Kilcher, and I discuss this paper!
Machine Learning Street Talk: / @machinelearningstreettalk
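To make the objectives mentioned above concrete, here is a minimal, illustrative sketch (not from the paper or the video) of how each one shapes input/target pairs. The sentinel tokens follow T5's span-corruption convention; the specific example sentence and variable names are just a toy illustration.

```python
# Toy input/target pairs for three pre-training objectives discussed in the video.
# Illustrative sketch only; the sentinel format (<extra_id_0>, ...) follows T5,
# everything else is a simplified, hypothetical example.

text = "Thank you for inviting me to your party last week"
tokens = text.split()

# 1) Auto-regressive language modeling: predict each token from its prefix.
lm_pairs = [(" ".join(tokens[:i]), tokens[i]) for i in range(1, len(tokens))]

# 2) BERT-style masked language modeling: mask some tokens in place and
#    predict only the masked positions.
masked = list(tokens)
masked[4] = "[MASK]"                      # mask "me"
mlm_input, mlm_target = " ".join(masked), {4: "me"}

# 3) T5-style span corruption: drop contiguous spans, replace each span with a
#    sentinel token, and predict the dropped spans (in order) as the target,
#    terminated by a final sentinel.
span_input = "Thank you <extra_id_0> to your party <extra_id_1> week"
span_target = "<extra_id_0> for inviting me <extra_id_1> last <extra_id_2>"

print(lm_pairs[:2])
print(mlm_input, mlm_target)
print(span_input, "->", span_target)
```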
Paper Links:
T5: https://arxiv.org/abs/1910.10683
Google AI Blog Post on T5: https://ai.googleblog.com/2020/02/exp...
Train Large, Then Compress: https://arxiv.org/pdf/2002.11794.pdf
Scaling Laws for Neural Language Models: https://arxiv.org/pdf/2001.08361.pdf
The Illustrated Transformer: http://jalammar.github.io/illustrated...
ELECTRA: https://arxiv.org/pdf/2003.10555.pdf
Transformer-XL: https://arxiv.org/pdf/1901.02860.pdf
Reformer: The Efficient Transformer: https://openreview.net/pdf?id=rkgNKkHtvB
The Evolved Transformer: https://arxiv.org/pdf/1901.11117.pdf
DistilBERT: https://arxiv.org/pdf/1910.01108.pdf
How to generate text (HIGHLY RECOMMEND): https://huggingface.co/blog/how-to-ge...
Tokenizers: https://blog.floydhub.com/tokenizatio...
Thanks for watching! Please Subscribe!