AI Book Club: AI Systems Performance Engineering
Author: Sage Elliott
Uploaded: 2026-02-25
Views: 73
Description:
February's book is "AI Systems Performance Engineering"! This is a casual-style event, not a structured presentation on topics.
Join live events: https://luma.com/ai-builders-and-lear...
Sage on LinkedIn: / sageelliott
Book: https://learning.oreilly.com/library/...
Slides: https://docs.google.com/presentation/...
Sometimes the discussion even drifts away from the chapters, but feel free to grab the mic to help steer it back. Feel free to join the discussion even if you have not read the book chapters! :)
Want to discuss the contents during the reading week? Join the Flyte MLOps Slack group and search for the "ai-reading-club" channel. https://slack.flyte.org/
-------------------------------------------------
About the book:
Title: AI Systems Performance Engineering
Author: Chris Fregly
Published: November 2025
https://learning.oreilly.com/library/...
Chapters:
1. Introduction and AI System Overview
2. AI System Hardware Overview
3. OS, Docker, and Kubernetes Tuning for GPU-based Environments
4. Tuning Distributed Networking Communication
5. GPU-Based Storage I/O Optimizations
6. GPU Architecture, CUDA Programming, and Maximizing Occupancy
7. Profiling and Tuning GPU Memory Access Patterns
8. Occupancy Tuning, Warp Efficiency, and Instruction-Level Parallelism
9. Increasing CUDA Kernel Efficiency and Arithmetic Intensity
10. Intra-Kernel Pipelining, Warp Specialization, and Cooperative Thread Block Clusters
11. Inter-Kernel Pipelining, Synchronization, and CUDA Stream-Ordered Memory Allocations
12. Dynamic Scheduling, CUDA Graphs, and Device-Initiated Kernel Orchestration
13. Profiling, Tuning, and Scaling PyTorch
14. PyTorch Compiler, OpenAI Triton, and XLA Backends
15. Multinode Inference, Parallelism, Decoding, and Routing Optimizations
16. Profiling, Debugging, and Tuning Inference at Scale
17. Scaling Disaggregated Prefill and Decode for Inference
18. Advanced Prefill-Decode and KV Cache Tuning
19. Dynamic and Adaptive Inference Engine Optimizations
20. AI-Assisted Performance Optimizations and Scaling Toward Multimillion GPU Clusters
Book Description:
https://learning.oreilly.com/library/...