Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Author: Connor Shorten

Uploaded: 2020-04-23

Views: 16661

Description: This video explores T5, a large-scale study of Transfer Learning. The paper takes apart many different factors of the Pre-Training then Fine-Tuning pipeline for NLP. These include Auto-Regressive Language Modeling vs. BERT-style Masked Language Modeling and XLNet-style shuffling, as well as the impact of dataset composition and size, and how best to use more computation. Thanks for watching, and please check out Machine Learning Street Talk, where Tim Scarfe, Yannic Kilcher, and I discuss this paper!

Machine Learning Street Talk: @machinelearningstreettalk

Paper Links:
T5: https://arxiv.org/abs/1910.10683
Google AI Blog Post on T5: https://ai.googleblog.com/2020/02/exp...
Train Large, Then Compress: https://arxiv.org/pdf/2002.11794.pdf
Scaling Laws for Neural Language Models: https://arxiv.org/pdf/2001.08361.pdf
The Illustrated Transformer: http://jalammar.github.io/illustrated...
ELECTRA: https://arxiv.org/pdf/2003.10555.pdf
Transformer-XL: https://arxiv.org/pdf/1901.02860.pdf
Reformer: The Efficient Transformer: https://openreview.net/pdf?id=rkgNKkHtvB
The Evolved Transformer: https://arxiv.org/pdf/1901.11117.pdf
DistilBERT: https://arxiv.org/pdf/1910.01108.pdf
How to generate text (HIGHLY RECOMMEND): https://huggingface.co/blog/how-to-ge...
Tokenizers: https://blog.floydhub.com/tokenizatio...

Thanks for watching! Please Subscribe!
