Dask Workshop | Dask SQL Query Engines | Dask Summit 2021
Author: Dask
Uploaded: 2021-06-22
Views: 1704
Description:
This workshop demonstrates Dask SQL Query Engines and was given at Dask Summit 2021.
The speakers are Nils Braun, Han Wang, Mike Klaczynski, Miles Adkins, and Tom Drabas.
In this workshop, we will discuss the different ways to run SQL queries on and with Dask using CPUs and GPUs. Being able to write SQL commands to query and transform data allows users to integrate the vast Dask and RAPIDS ecosystem into their BI workflows. We will discuss the current state of PyData SQL query engines and SQL integrations, and together find out where to head next.
Data is the new gold of this century, and being able to digest and analyze the growing amount of data is key. The Dask and RAPIDS ecosystems play a huge role in enabling this, and their Python APIs allow users to build complex distributed pipelines. However, not all users are able to write elaborate and optimized distributed pipelines, and many legacy applications can only connect to traditional SQL databases. SQL query engines combine the best of both worlds: they enable querying and transforming huge amounts of data efficiently and in a distributed way using standard SQL statements, all without requiring a database.
The Dask ecosystem includes at least two Python-based SQL query engines: dask-sql for distributed computations on CPUs and BlazingSQL for GPU processing. Fugue SQL, on the other hand, is an abstraction layer that utilizes these SQL engines. It changes the way SQL is used, from ‘commands’ to an end-to-end workflow language. Dask also comes with a large variety of integrations with SQL databases, Snowflake Data Cloud being one of them.
In this workshop, we will talk in detail about the landscape of SQL query engines and integrations available to Dask users, explore different projects and applications, get some hands-on experience, and discuss important features that are still missing.
KEY MOMENTS
00:00:00 - Intro
00:03:00 - Tom Drabas
00:13:00 - Miles Adkins
00:21:00 - Han Wang
00:32:00 - Mike Klaczynski
01:44:00 - Nils Braun
01:56:00 - Closing
What is the Dask Summit?
The Dask Distributed Summit is where users, contributors, and newcomers can share experiences to learn from one another and grow together. The Dask Distributed Summit provides content, information, and learning opportunities for attendees of all levels of Dask familiarity and expertise.
Share your feedback with us in the comments and let us know:
Did you find this talk helpful?
What is your experience with Dask?
Learn more at summit.dask.org and dask.org