Dask Workshop | Doing Nothing Poorly | Dask Summit 2021
Author: Dask
Uploaded: 2021-06-22
Views: 318
Description:
Speakers - Benjamin Zaitlen, Gil Forsyth, James Bourbeau, John Kirkham, Mads R. B. Kristensen, Matt Rocklin, Richard Zamora
This workshop will cover the recent effort to improve the performance of Dask’s distributed scheduler. As workloads scale to more data and more workers, performance can degrade because of the strain placed on the scheduler. In such cases, unlocking more performance does not require faster computation, but rather faster coordination of the work. For example, an extremely common ETL workflow may involve setting a new index: df.set_index('id'). This simply expressed statement triggers the construction of a large graph, because Dask must coordinate an all-to-all exchange of distributed data. As the dataframe grows, the number of partitions in the graph grows with it. For scheduling to scale without loss of performance, we need to consider several domains within Dask’s internals:
- Where the graph is generated
- How the graph is communicated between the client and the scheduler
- How the graph is processed within the scheduler
- How the scheduler communicates those tasks to the workers
None of these steps can be done poorly: each one, if done poorly, can and will degrade performance and increase task scheduling overhead.
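To make the scaling concern concrete, here is a minimal sketch of how the task count of a naive all-to-all exchange grows with the number of partitions. It is a toy model in plain Python, not Dask's actual shuffle implementation; the function name `naive_shuffle_graph` and the task labels `"split"` and `"concat"` are hypothetical and chosen only for illustration.

```python
def naive_shuffle_graph(npartitions):
    """Build a toy task graph for a naive all-to-all shuffle (illustrative only)."""
    graph = {}
    # One "split" task per (input, output) pair:
    # input partition i sends a slice to output partition j.
    for i in range(npartitions):
        for j in range(npartitions):
            graph[("split", i, j)] = ("input", i)
    # One "concat" task per output partition,
    # depending on a slice from every input partition.
    for j in range(npartitions):
        graph[("concat", j)] = [("split", i, j) for i in range(npartitions)]
    return graph


if __name__ == "__main__":
    # The graph has n*n split tasks plus n concat tasks.
    for n in (4, 16, 64):
        print(n, "partitions ->", len(naive_shuffle_graph(n)), "tasks")
```

In this model the number of tasks grows quadratically with the partition count, so the cost of generating, sending, and processing the graph can come to dominate even when each individual task is cheap.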
During the workshop we’ll discuss scheduler internals, motivating workloads where scaling becomes a problem, and how the Dask community is moving forward to improve performance. We’ll also share several of the profiling techniques we used to measure our progress along the way.
===
The Dask Distributed Summit is where users, contributors, and newcomers can share experiences to learn from one another and grow together. The Dask Distributed Summit provides content, information, and learning opportunities for attendees of all levels of Dask familiarity and expertise.
https://summit.dask.org/