Run A Local LLM Across Multiple Computers! (vLLM Distributed Inference)
Author: Bijan Bowen
Uploaded: 2024-12-04
Views: 15,178
Description:
Timestamps:
00:00 - Intro
01:24 - Technical Demo
09:48 - Results
11:02 - Intermission
11:57 - Considerations
15:48 - Conclusion
In this video, we explore distributed inference using vLLM and Ray. To demonstrate this exciting functionality, we set up two nodes: one equipped with two RTX 3090 Ti GPUs and another with two RTX 3060 GPUs. After configuring the nodes, we test distributed inference by loading a model across both nodes, enabling interaction with a fully distributed inference setup.
Join us as we dive into the technical details, share results, and discuss considerations for using distributed inference in your own projects!
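The description doesn't include the exact commands used in the video, but the general workflow it describes looks like the sketch below: start a Ray cluster spanning both machines, then let vLLM shard the model across all four GPUs. The model name, IP address, and parallelism split are placeholders, not the video's actual settings.

```python
# Minimal sketch of two-node distributed inference with vLLM + Ray.
# Placeholder setup, not the video's exact configuration.
#
# On the head node (e.g. the 2x RTX 3090 Ti machine):
#   ray start --head --port=6379
# On the worker node (e.g. the 2x RTX 3060 machine):
#   ray start --address='<HEAD_NODE_IP>:6379'

from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model
    tensor_parallel_size=2,        # shard across the 2 GPUs within a node
    pipeline_parallel_size=2,      # one pipeline stage per node
    distributed_executor_backend="ray",  # use the Ray cluster started above
)

outputs = llm.generate(
    ["Explain distributed inference in one paragraph."],
    SamplingParams(max_tokens=128, temperature=0.7),
)
print(outputs[0].outputs[0].text)
```

With mismatched GPUs like these, splitting pipeline stages across nodes (rather than tensor-parallelizing across the network) keeps the bandwidth-heavy tensor-parallel traffic inside each machine; the slower node still bounds overall throughput.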