Resurrecting Recurrent Neural Networks for Long Sequences | Razvan Pascanu

Author: ICARL

Uploaded: 2024-04-17

Views: 1816

Description: ICARL Seminar Series - 2023 Winter

Resurrecting Recurrent Neural Networks for Long Sequences
Seminar by Razvan Pascanu

Abstract:
In this talk, Razvan Pascanu will focus on State Space Models (SSMs), a recently introduced family of sequence models, and specifically discuss the relationship between SSMs and recurrent neural networks. He will start with a short history of architecture design for language modelling, which he will use as a motivating task. This will provide some insight into the evolution of RNN architectures and into why some of the choices behind the SSM architecture seemed counter-intuitive. Most of the talk will focus on introducing the Linear Recurrent Unit architecture and explaining the role of each modification it makes to traditional non-linear recurrent models.

The talk will conclude with some open questions about the role recurrent architectures could or should play and, potentially, about the less well understood relationship between SSMs and transformer-like architectures.

About the Speaker

Razvan Pascanu has been a research scientist at Google DeepMind since 2014. Before this, he did his PhD in Montréal with Prof. Yoshua Bengio, working on understanding deep networks, recurrent models and optimization. Since joining DeepMind he has also made significant contributions to deep reinforcement learning, continual learning, meta-learning and graph neural networks, while continuing his research agenda of understanding deep learning, recurrent models and optimization. Please see his scholar page for specific contributions. He also actively promotes AI research and education as a main organizer of the Conference on Lifelong Learning Agents (CoLLAs, lifelong-ml.cc), the Eastern European Machine Learning Summer School (EEML, www.eeml.eu and www.workshops.eeml.eu), and various workshops at NeurIPS, ICML and ICLR.

——————————————————
Links
Razvan Pascanu
Site: https://sites.google.com/view/razp

ICARL
Site: icarl.doc.ic.ac.uk
Twitter: twitter.com/ic_arl
YouTube: @ICARLSeminars
——————————————————

