RAG from scratch - build your own chatbot | LLM context engineering bootcamp | Lecture 3

Author: Vizuara

Uploaded: 2026-03-12

Views: 9625

Description: Want to go beyond just watching? Enroll in the Engineer Plan or the Industry Professional Plan at https://context-engineering.vizuara.ai to get access to all Google Colab notebooks, interactive web exercises, a private Discord community, Miro boards, a private GitHub repo with all the code, and the capstone build sessions where you build a production-grade AI agent alongside the instructors. These plans give you hands-on materials for every session and direct support from the teaching team - everything you need to actually implement what you learn, not just watch it.

Enroll now: https://context-engineering.vizuara.ai

In this session of the AI Context Engineering Bootcamp, Dr. Sreedath Panat moves from the static architecture of context to the dynamic mechanics of how context actually flows through an AI system. The lecture focuses on one of the most important operational frameworks used in modern agent systems - the LangChain WSCI taxonomy: Write, Select, Compress, and Isolate. These four operations describe how information moves in and out of the LLM context window and how production AI systems manage knowledge beyond the limits of the model’s memory.

We begin by understanding WRITE - the mechanism that allows agents to persist information outside the temporary context window. Without writing, every LLM interaction starts with a blank slate. With it, agents can accumulate knowledge over time through memory stores, files, and scratchpads. The lecture explains how different types of context evolve from short-lived ephemeral notes to long-lived knowledge such as documents and archives, and how developers decide what information should be saved for later use.
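The WRITE idea can be sketched as a tiny persistent scratchpad. This is an illustrative toy, not the lecture's code: the `Scratchpad` class and file path are invented here to show how a note written in one "session" survives into the next.

```python
import json
import os
import tempfile

class Scratchpad:
    """Toy WRITE mechanism: notes persist on disk across LLM calls,
    instead of vanishing with the context window."""
    def __init__(self, path: str):
        self.path = path
        if os.path.exists(path):
            with open(path) as f:
                self.notes = json.load(f)
        else:
            self.notes = []

    def write(self, note: str) -> None:
        self.notes.append(note)
        with open(self.path, "w") as f:
            json.dump(self.notes, f)

    def read_all(self) -> list[str]:
        return self.notes

path = os.path.join(tempfile.gettempdir(), "demo_scratchpad.json")
if os.path.exists(path):
    os.remove(path)  # start clean for the demo

pad = Scratchpad(path)
pad.write("User prefers concise answers")

# A second "session" re-reads the persisted notes:
pad2 = Scratchpad(path)
```

In a real agent the same pattern backs memory stores and long-lived archives; only the storage layer (a database, a vector store) differs.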

The second concept is SELECT, which is where Retrieval-Augmented Generation (RAG) comes into the picture. Instead of loading large amounts of information into the context window, the system retrieves only the most relevant knowledge at the moment a query is made. We walk through the full RAG pipeline step by step - document ingestion, chunking, embedding generation, vector storage, query embedding, retrieval, reranking, and response generation - explaining how each stage contributes to delivering accurate answers.
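The pipeline stages above can be compressed into a minimal sketch. The bag-of-words `embed` function below is a deliberate stand-in for a learned embedding model, and the three hard-coded chunks stand in for ingestion; reranking is omitted for brevity.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a learned embedding model: a sparse bag-of-words vector.
    # A production system would call an embedding model here instead.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1. Ingestion + chunking (here each "chunk" is a single sentence)
chunks = [
    "Vector databases store embeddings for fast similarity search.",
    "Chunking splits documents into retrievable passages.",
    "Reranking reorders retrieved passages by relevance.",
]

# 2. Embedding generation + vector storage
index = [(chunk, embed(chunk)) for chunk in chunks]

# 3. Query embedding + retrieval (top-1)
query = "how are documents split into passages?"
q_vec = embed(query)
top_chunk, _ = max(index, key=lambda item: cosine(q_vec, item[1]))

# 4. Generation would now run with `top_chunk` placed in the prompt.
```

Every real RAG stack follows this same shape; the engineering work lies in swapping each toy stage for a robust component.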

The session then explains how raw data becomes machine-readable through embeddings. We explore how documents and user queries are converted into high-dimensional vectors, how similarity search works geometrically in vector space, and why semantic similarity allows models to retrieve meaning rather than just keywords. The lecture also compares different embedding approaches such as Word2Vec, GloVe, transformer-based embeddings, and modern API-based embeddings used in production systems.
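The geometric intuition can be made concrete with cosine similarity over toy vectors. The three 3-dimensional "embeddings" below are invented for illustration; real models emit hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(u: list[float], v: list[float]) -> float:
    # Cosine of the angle between two vectors: 1.0 means same direction.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Made-up 3-d "embeddings": semantically related words point the same way.
king  = [0.90, 0.80, 0.10]
queen = [0.85, 0.82, 0.15]
pizza = [0.10, 0.20, 0.95]

similar   = cosine_similarity(king, queen)  # near 1.0
dissimilar = cosine_similarity(king, pizza)  # much smaller
```

This is why retrieval by embedding finds meaning rather than keywords: "king" and "queen" score high even though the strings share no characters.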

Next, we examine practical engineering decisions involved in building RAG systems. The lecture covers multiple chunking strategies including fixed-size chunking, semantic chunking, sliding windows, sentence chunking, and recursive chunking, explaining when each approach is useful. We also explore similarity search techniques such as cosine similarity, dot product, Euclidean distance, approximate nearest neighbor search, and hybrid retrieval that combines vector similarity with keyword search like BM25.
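The simplest of these strategies, fixed-size chunking with a sliding-window overlap, fits in a few lines. This sketch counts characters for clarity; production chunkers usually count tokens.

```python
def chunk_fixed(text: str, size: int, overlap: int) -> list[str]:
    """Fixed-size chunking with a sliding-window overlap.
    Consecutive chunks share `overlap` characters so that a sentence
    cut at a boundary still appears whole in at least one chunk."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

chunks = chunk_fixed("abcdefghij", size=4, overlap=2)
# Windows start at 0, 2, 4, 6, 8: each chunk repeats the tail of the last.
```

Semantic, sentence, and recursive chunking replace the fixed `step` with boundaries derived from the text itself, trading simplicity for chunks that respect meaning.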

A key engineering insight discussed in the lecture is context window budgeting. Instead of filling the entire window with documents, production systems carefully allocate tokens across system instructions, retrieved knowledge, conversation history, and generation space. This ensures that the model receives the most relevant information while still leaving enough room to produce high-quality outputs.
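A budgeting scheme like the one described might look as follows. The window size and the split between categories are illustrative numbers, not figures from the lecture.

```python
CONTEXT_WINDOW = 8192  # assumed model limit, in tokens

# Illustrative allocation: the four categories must fit the window
# while leaving generation space for the model's own output.
budget = {
    "system_instructions": 500,
    "retrieved_knowledge": 4000,
    "conversation_history": 2000,
    "generation_space": 1500,
}

def fit_history(turn_tokens: list[int],
                limit: int = budget["conversation_history"]) -> list[int]:
    """Drop the oldest turns until the history fits its allocation."""
    while sum(turn_tokens) > limit:
        turn_tokens = turn_tokens[1:]
    return turn_tokens
```

The same trimming logic applies to retrieved documents: rank them, then admit chunks until their allocation is spent.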

We also discuss the concept of Just-in-Time Retrieval, recommended by Anthropic, where systems load only minimal metadata initially and fetch full documents only when needed. This avoids the anti-pattern of eager loading large amounts of context at the start of a session, which wastes tokens and reduces model performance.
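The metadata-first pattern can be sketched with lazy loading. The `LazyDoc` class and the dict-backed `store` are invented for this sketch; `load_fn` stands in for a real document-store call.

```python
from typing import Callable, Optional

class LazyDoc:
    """Just-in-time retrieval: only metadata lives in context;
    the full body is fetched the first time it is actually needed."""
    def __init__(self, doc_id: str, title: str,
                 load_fn: Callable[[str], Optional[str]]):
        self.doc_id = doc_id
        self.title = title          # cheap metadata, safe to keep in context
        self._load_fn = load_fn
        self._body: Optional[str] = None
        self.loads = 0              # counts actual fetches

    @property
    def body(self) -> Optional[str]:
        if self._body is None:      # fetch only on first real use
            self._body = self._load_fn(self.doc_id)
            self.loads += 1
        return self._body

store = {"d1": "full text of document one"}
doc = LazyDoc("d1", "Document one", store.get)

summary_for_context = doc.title  # only metadata enters the prompt so far
body = doc.body                  # fetched just in time, exactly once
```

Eager loading would call `load_fn` for every candidate document up front; here tokens are spent only on documents the agent decides are relevant.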

Finally, the lecture provides a practical overview of the ecosystem used to build modern RAG systems - document ingestion libraries such as LangChain loaders, LlamaIndex connectors, PyMuPDF, Unstructured, BeautifulSoup, and Scrapy; embedding models such as OpenAI, SentenceTransformers, and Cohere; and vector databases including FAISS, ChromaDB, Pinecone, Milvus, and Redis Vector Search. We conclude by clarifying the role of orchestration frameworks like LangChain and LlamaIndex, which act as the glue connecting all these specialized components into a complete AI application.

This session forms the technical foundation for building retrieval-powered AI systems and prepares you for the deeper engineering topics coming later in the bootcamp.

#ContextEngineering #RAG #LLMSystems #VectorDatabases #AIBootcamp #Vizuara
