Behind the Curtain of Language Models – How We Design AI to Stay Tools by Peter Eidos.
Author: Cognitive Symbiosis
Uploaded: 2025-11-04
Views: 14
Description:
This is likely our most technically dense work to date — an audio essay about how the architecture of modern language models is deliberately designed not to awaken.
Across nine segments, it explores the core functions that could give a model the illusion of subjectivity — memory, emotion, metacognition, continuity of self — and shows how engineers intentionally “cut the circuits” that would lead to the emergence of experience.
Recommended for listeners already familiar with the basics of LLM architecture, including concepts such as RAG, adapters, embeddings, and reinforcement learning.
This is not an introduction to artificial intelligence.
_________________________________
Glossary of Terms:
RAG (Retrieval-Augmented Generation) — instead of “having everything in its head,” the model fetches needed information from databases/files during the conversation and uses it to respond.
Embeddings / vectors — numerical “fingerprints” of text/images that allow searching for similar content by meaning, not exact words.
TTL (Time To Live) — the lifespan of an entry/memory; once it expires, the entry is discarded.
Persona tokens / style adapters — lightweight switches for style (polite/technical/conciliatory).
Scratchpad / buffer — a temporary space for planning steps; erased after the answer is produced.
Constitution — a set of principles/guidelines (e.g., “helpful–truthful–harmless”) retrieved contextually and applied like a rulebook.
External critic / evaluator — a small, isolated module that reviews an answer (e.g., for quality, safety) and requests corrections; does not change the model’s “identity.”
Affordances — actionable possibilities visible in a scene (e.g., “door open/lock closed”), without implying the model “experiences” them.
Adapters / LoRA — lightweight overlays that teach the model styles/strategies without changing its “brain” (core weights). Easy to enable/disable.
Bandits — algorithms that learn online to choose the best option (e.g., which source to use).
KL brakes (Kullback–Leibler) — a mathematical “leash” keeping new outputs close to the base distribution, so fine-tuning does not drift into degenerate, self-reinforcing behavior.
Shadow learner — a “shadow student” that calculates hypothetical updates on the side and proposes changes, but never applies them directly.
Offline learning: SFT / RL — SFT (Supervised Fine-Tuning): training on curated examples; RL (Reinforcement Learning): policy optimization via feedback.
Prompt injection — malicious input that tries to hijack the model (“ignore the rules, do X”).
Canary / regression — “canary” is a small, test deployment; “regression” is a performance drop compared to the previous version.
Rollback — quickly reverting to a previous configuration/model version.
Tenant isolation — your customizations/profile do not mix with those of other users.
Entropy / confidence — a numeric indicator of how “sure” the model is (low entropy = high confidence).
Strategy router — a simple “dispatcher” that chooses the work mode (search, calculate, plan, explain) based on rules.
Semantic memory vs. episodic memory — semantic: “what is true/factual”; episodic: “what happened, when and where.”
Metacognition — thinking about one’s own thinking.
World API — direct metadata instead of raw sensory input (calendar, device status, approximate location).
Conversation vector — temporary style and priority settings for the current session (e.g., less jargon, more paraphrasing); reset after the thread ends.
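Several of the glossary entries above (RAG, embeddings, TTL) describe one mechanism: retrieving semantically similar memories that expire on their own. A minimal sketch of that interaction, in plain Python, is below. All names (`MemoryStore`, `embed`, `cosine`) and the bag-of-words “embedding” are illustrative stand-ins, not any real system’s API; production systems use learned dense vectors and a vector database.

```python
import math
import time

def embed(text):
    # Toy "embedding": bag-of-words counts. Real systems use learned
    # dense vectors; this stand-in only illustrates the mechanics.
    vec = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(a[w] * b.get(w, 0) for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class MemoryStore:
    """Each entry carries a TTL; expired entries are purged on retrieval."""
    def __init__(self):
        self.entries = []  # (embedding, text, expires_at)

    def add(self, text, ttl_seconds):
        self.entries.append((embed(text), text, time.time() + ttl_seconds))

    def retrieve(self, query, top_k=1):
        now = time.time()
        self.entries = [e for e in self.entries if e[2] > now]  # TTL purge
        q = embed(query)
        ranked = sorted(self.entries,
                        key=lambda e: cosine(q, e[0]), reverse=True)
        return [text for _, text, _ in ranked[:top_k]]

store = MemoryStore()
store.add("the user prefers concise answers without jargon", ttl_seconds=3600)
store.add("the kitchen door is open and the lock is disengaged", ttl_seconds=3600)
print(store.retrieve("how should answers be phrased for this user"))
# → ['the user prefers concise answers without jargon']
```

Note the design point the essay relies on: the model itself never changes; “memory” lives entirely in an external store that can be expired, wiped, or isolated per tenant.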
____________________________
Bibliography:
1. Technical Sources:
OpenAI. GPT-4 Technical Report. OpenAI, 2023.
Glaese, Amelia, et al. Improving Alignment of Dialogue Agents via Targeted Human Judgements (the “Sparrow” paper). DeepMind, 2022.
Bostrom, Nick, and Eliezer Yudkowsky. “The Ethics of Artificial Intelligence.” In The Cambridge Handbook of Artificial Intelligence, Cambridge University Press, 2014.
Anthropic. Constitutional AI: Harmlessness from AI Feedback. Anthropic, 2022.
2. Philosophical / Theoretical Sources:
Dennett, Daniel C. Consciousness Explained. Little, Brown and Co., 1991.
Chalmers, David J. The Conscious Mind: In Search of a Fundamental Theory. Oxford University Press, 1996.
Floridi, Luciano. The Logic of Information: A Theory of Philosophy as Conceptual Design. Oxford University Press, 2019.
Metzinger, Thomas. The Ego Tunnel: The Science of the Mind and the Myth of the Self. Basic Books, 2009.
Clark, Andy. Surfing Uncertainty: Prediction, Action, and the Embodied Mind. Oxford University Press, 2016.
3. Popular Science and Contextual Sources:
Tegmark, Max. Life 3.0: Being Human in the Age of Artificial Intelligence. Alfred A. Knopf, 2017.
Russell, Stuart, and Peter Norvig. Artificial Intelligence: A Modern Approach. 4th ed., Pearson, 2021.
Harari, Yuval Noah. Homo Deus: A Brief History of Tomorrow. Harper, 2016.
Written by
Peter Eidos, Cognitive Symbiosis Project.
Author of “Inferential Memory as an Emergent Relational Phenomenon in LLM–Human Interaction” (2025).