RAG (Retrieval Augmented Generation) || Docling, Re-ranker, Query Rewriting
Author: Intelligent Machines
Uploaded: 2025-04-22
Views: 424
Description:
📌 GitHub Code: https://github.com/mohan696matlab/NLP
📌 1:1 AI Consulting: https://topmate.io/balyogi_mohan_dash...
📌 freelance profile: https://www.upwork.com/freelancers/~0...
=============================
LLM Playlist: • Small to Large Language Models || Full Cou...
GenAI Project Playlist: • Generative AI Projects
Whisper ASR Playlist: • Whisper finetuning on pytorch
=============================
This video walks through the development of an advanced RAG pipeline for retrieving relevant information from the Bhagavad Gita. The pipeline involves several stages: text extraction, chunking, vectorization, query rewriting, and re-ranking. The process begins with extracting text from a PDF of the Bhagavad Gita using the Docling library, which yields cleaner, more coherent text. The text is then chunked with a recursive text splitter, which divides it into meaningful chunks without breaking sentences. These chunks are embedded into vectors using the SentenceTransformer library's MiniLM model, and the embeddings are stored in a PyTorch tensor that serves as the vector database. The pipeline is further enhanced by query rewriting, which rephrases the user's question to better match the language of the Bhagavad Gita. The rewritten query is embedded and used to retrieve the most similar chunks from the database, the top 10 candidates are re-ranked with a cross-encoder model, and the resulting context is passed to a large language model (LLM) to generate the final answer. The entire pipeline runs in under a second, making it practical for real-time applications. The video also shows how to deploy the pipeline as a Python script and discusses the potential for fine-tuning the LLM.
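
A minimal sketch of the retrieval and re-ranking stages described above (assumptions, not from the video: the PDF has already been converted to a plain-text file "gita.txt", fixed-size chunking stands in for the recursive splitter, and the checkpoint names "all-MiniLM-L6-v2" and "cross-encoder/ms-marco-MiniLM-L-6-v2" are common defaults that may differ from the ones used in the video):

from sentence_transformers import SentenceTransformer, CrossEncoder, util

# 1. Chunk the extracted text (fixed-size chunking as a stand-in for the
#    recursive splitter used in the video).
with open("gita.txt", encoding="utf-8") as f:
    text = f.read()
chunk_size = 500
chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

# 2. Embed the chunks with a MiniLM bi-encoder and keep the embeddings as a
#    single tensor, which acts as the vector database.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
chunk_embeddings = embedder.encode(chunks, convert_to_tensor=True)

# 3. Embed the (rewritten) query and fetch the 10 most similar chunks by
#    cosine similarity.
query = "How should one act without attachment to results?"  # example query
query_embedding = embedder.encode(query, convert_to_tensor=True)
scores = util.cos_sim(query_embedding, chunk_embeddings)[0]
top_idx = scores.topk(min(10, len(chunks))).indices.tolist()
candidates = [chunks[i] for i in top_idx]

# 4. Re-rank the candidates with a cross-encoder and keep the best few as
#    the context that goes into the LLM prompt.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
rerank_scores = reranker.predict([(query, c) for c in candidates])
ranked = sorted(zip(rerank_scores, candidates), reverse=True)
context = "\n\n".join(chunk for _, chunk in ranked[:3])
print(context)

In the full pipeline, the raw query would first be rewritten by the LLM and the recursive splitter would replace the fixed-size chunking shown here; everything else follows the same retrieve-then-re-rank pattern.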
🔥 Don’t forget to LIKE, SUBSCRIBE
=============================
🔗Links🔗
LinkedIn: / balyogi-mohan-dash
Google Scholar: https://scholar.google.com/citations?...
=============================
Please reach out via email for any questions: [email protected]
=============================
LLM videos:
part 1: • Simple Guide to Training Small LLMs | Part 1
part 2: • How AI Models Learn to Follow Prompts | Pa...
part 3: • How to Use Meta’s LLaMA Model – Step-by-St...
Whisper videos:
part 1: • Master Fine-Tuning OpenAI Whisper with PyT...
part 2: • Word Error Rate in Automatic Speech Recogni...
part 3: • Fine tuning Whisper with Pytorch (Simples...
TIMESTAMPS
00:00 – Understanding the Basics of a Simple RAG Pipeline
00:27 – Using Bhagavad Gita as Reference for Answering Queries
02:16 – Using Vector Databases for Large-Scale Text Search
03:46 – Running LLaMA 3.2B Model on a Consumer GPU
05:32 – Optimizing Large Language Models for Better Performance
07:07 – Improved Chunking for Better Embeddings
09:22 – Improving Document Retrieval with Enhanced Query Matching
12:47 – Extracting Text from PDF of Bhagavad Gita
18:27 – How to Find Top Similar Query Embeddings in a Database
24:43 – How to Fine-Tune a LLaMA Model with a Simple PyTorch Loop