Spark, Dask, DuckDB, Polars: TPC-H Benchmarks at Scale
Author: Coiled
Uploaded: 2023-11-07
Views: 9122
Description:
We run the common TPC-H benchmark suite at 10 GB, 100 GB, 1 TB, and 10 TB scale, both in the cloud and on a local machine, and compare performance for common large dataframe libraries.
No tool does universally well. We look at common bottlenecks and compare performance between the different systems.
This talk was originally given at PyData NYC 2023. These results are preliminary and come from only a couple of weeks of exploration.
00:00 Introduction
01:58 Background
13:30 Charts
20:00 Analysis
30:12 Deployment
Learn More:
Latest TPC-H results and more details: https://docs.coiled.io/blog/tpch.html
Performance improvements for Dask DataFrame: https://docs.coiled.io/blog/dask-data...