Were RNNs All We Needed - Google Illuminate Podcast
Author: Rohan-Paul-AI
Uploaded: 2024-10-06
Views: 615
Description:
🐦 Follow me on Twitter with 34.7K others at: / rohanpaul_ai - to be on the bleeding edge of AI
------------
A super interesting paper that gets new value out of good old RNNs, with a huge computational efficiency win 🥇
It finds that by removing the hidden state dependencies from the input, forget, and update gates, LSTMs and GRUs no longer need backpropagation through time (BPTT) and can be trained efficiently in parallel.
This change makes LSTMs and GRUs competitive with Transformers and Mamba for long sequence tasks.
• Training speedup over the standard GRU and LSTM: 175x (minGRU), 235x (minLSTM) for 512-length sequences
• Comparable performance to Mamba in selective copying, RL, and language modeling
• Uses 56% less memory than Mamba during training
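To make the gate change concrete, here is a minimal pure-Python sketch of a minGRU-style layer. It assumes the update `h_t = (1 - z_t) * h_{t-1} + z_t * h~_t`, where both the gate `z_t` and the candidate `h~_t` are computed from the input `x_t` alone. The helper names (`linear`, `min_gru_gates`, `min_gru_sequential`), the random weights, and the tiny sizes are illustrative assumptions, not code from the paper.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def linear(W, x):
    """Matrix-vector product: W is a list of rows, x is a list of floats."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def min_gru_gates(Wz, Wh, xs):
    """Compute update gates z_t and candidates h~_t for all timesteps up front.

    Neither quantity looks at h_{t-1}, so this part needs no recurrence.
    """
    zs = [[sigmoid(v) for v in linear(Wz, x)] for x in xs]
    h_tildes = [linear(Wh, x) for x in xs]  # no tanh on the candidate
    return zs, h_tildes

def min_gru_sequential(zs, h_tildes, h0):
    """Reference recurrence: h_t = (1 - z_t) * h_{t-1} + z_t * h~_t."""
    h, out = list(h0), []
    for z, ht in zip(zs, h_tildes):
        h = [(1.0 - zi) * hi + zi * hti for zi, hi, hti in zip(z, h, ht)]
        out.append(h)
    return out

# Hypothetical toy sizes and random weights, purely for illustration.
random.seed(0)
d_x, d_h, T = 3, 2, 5
Wz = [[random.uniform(-1, 1) for _ in range(d_x)] for _ in range(d_h)]
Wh = [[random.uniform(-1, 1) for _ in range(d_x)] for _ in range(d_h)]
xs = [[random.uniform(-1, 1) for _ in range(d_x)] for _ in range(T)]

zs, h_tildes = min_gru_gates(Wz, Wh, xs)
hs = min_gru_sequential(zs, h_tildes, h0=[0.0] * d_h)
```

The only step that still walks through time is the last one, and it is a plain element-wise linear recurrence, which is the part a scan can parallelize.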
📚 https://arxiv.org/abs/2410.01201
👇 All arXiv Paper Podcasts are on my YouTube channel playlist 👇
• Large Language Model (LLM) Research Paper ...
-----
*Solution in this Paper* 🛠️:
• Introduces minLSTM and minGRU, which:
  • Remove hidden state dependencies from the gates
  • Eliminate output range constraints (no tanh)
  • Ensure the output scale is time-independent
• Trainable via the parallel scan algorithm
• Significantly fewer parameters than the standard LSTM and GRU
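The points above can be sketched in a few lines of plain Python. Once the gates no longer depend on the hidden state, each step is an affine map `h_t = a_t * h_{t-1} + b_t`, and composing affine maps is associative, so a scan can combine them in about log2(T) rounds. This is a simplified scalar illustration: the function names are made up for the example, the gate values are random stand-ins for sigmoid outputs, and it does not reproduce the numerically stable log-space formulation a real implementation would likely use. The `min_lstm_coeffs` helper assumes the minLSTM-style normalization `f' = f / (f + i)`, `i' = i / (f + i)`, which keeps the output scale independent of time.

```python
import math
import random

def combine(left, right):
    """Compose two affine steps h -> a*h + b, applying `left` first."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)

def inclusive_scan(pairs):
    """Hillis-Steele style inclusive scan over (a_t, b_t) pairs.

    All updates inside one round are independent of each other, which is
    what parallel hardware would exploit; here the rounds run serially.
    """
    out = list(pairs)
    step = 1
    while step < len(out):
        prev = list(out)
        for t in range(step, len(out)):
            out[t] = combine(prev[t - step], prev[t])
        step *= 2
    return out

def recurrence_loop(pairs, h0):
    """Sequential reference: h_t = a_t * h_{t-1} + b_t."""
    h, hs = h0, []
    for a, b in pairs:
        h = a * h + b
        hs.append(h)
    return hs

def min_lstm_coeffs(f, i, h_tilde):
    """Normalize the forget and input gates so that f' + i' = 1."""
    total = f + i
    return (f / total, (i / total) * h_tilde)

# Hypothetical scalar example: random gate values stand in for sigmoid outputs.
random.seed(0)
T, h0 = 8, 0.5
pairs = [
    min_lstm_coeffs(random.uniform(0.05, 1), random.uniform(0.05, 1),
                    random.uniform(-1, 1))
    for _ in range(T)
]
scanned = inclusive_scan(pairs)
hs_parallel = [a * h0 + b for a, b in scanned]
hs_loop = recurrence_loop(pairs, h0)
```

Both `hs_parallel` and `hs_loop` hold the same hidden states up to floating-point rounding; the scan version just gets there without a step-by-step dependency on the previous hidden state.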
-----------------
You can find me here:
🐦 TWITTER: / rohanpaul_ai
👨🏻💼 LINKEDIN: / rohan-paul-ai
👨🔧 Kaggle: https://www.kaggle.com/paulrohan2020
👨💻 GITHUB: https://github.com/rohan-paul
Check out the MASSIVELY UPGRADED 2nd Edition of my Book (with 1300+ pages of Dense Python Knowledge) 🐍🔥
Covering 350+ core Python 🐍 concepts (1300+ pages) 🚀
📚 Book Link - https://rohanpaul.gumroad.com/l/pytho...
**********************************************
Other playlists you might like 👇
🟠 MachineLearning & DeepLearning Concepts & interview Question Playlist - https://bit.ly/380eYDj
🟠 DataScience | MachineLearning Projects Implementation Playlist - https://bit.ly/39MEigt
🟠 Natural Language Processing Playlist : https://bit.ly/3P6r2CL
----------------------
#Paper #AIPaper #AI #ArtificialIntelligence #podcast #LLM #Largelanguagemodels #Llama3 #LLMfinetuning #opensource #NLP #datascience #deeplearning #100daysofmlcode #neuralnetworks #generativeai #OpenAI #GPT4 #chatgpt #genai