Accelerating OpenSearch With Streaming: Apache Arrow, Flight, Data... - Saurabh Singh & Harsha Vamsi
Author: OpenSearch
Uploaded: 2025-09-12
Views: 107
Description:
Accelerating OpenSearch With Streaming: Apache Arrow, Flight, DataFusion and gRPC - Saurabh Singh & Harsha Vamsi, Amazon Web Services
As search and analytics workloads scale, OpenSearch needs faster, cloud-native foundations. This talk presents a new streaming framework for OpenSearch that integrates Apache Arrow, Flight, DataFusion, and gRPC to boost throughput by 2× and cut latency for queries and aggregations. Arrow's columnar, SIMD-friendly format enables efficient in-memory processing. Flight introduces bi-directional streaming for real-time partial results and dynamic pruning, improving Scroll and federated search.
DataFusion adds a modern query layer for multi-stage aggregations and future JOIN support. Native Arrow support reduces serialization overhead, enabling seamless use with Pandas, TensorFlow, and other ML tools. We'll dive into system architecture, show benchmarking highlights, and discuss how this stack powers ML pipelines, smart scaling, and Parquet-based lake integrations. This session is ideal for developers, architects, and ML engineers looking to push OpenSearch into the next era of performance and interoperability.
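To make the idea of streamed partial results and multi-stage aggregation concrete, here is a minimal sketch in plain Python. It is not the OpenSearch, Arrow, Flight, or DataFusion API: the `Batch` type, `shard_stream`, `partial_sum`, and `streaming_totals` are hypothetical names, and the dict-of-lists batch is only a stand-in for a columnar record batch. It assumes each shard emits small batches, each batch is aggregated on its own (stage 1), and a coordinator merges those partials and can report a running result after every batch (stage 2).

```python
from collections import Counter
from typing import Dict, Iterator, List

# A "batch" is a dict of column name -> list of values, a stand-in for a
# columnar record batch (hypothetical; not the Arrow or OpenSearch API).
Batch = Dict[str, List]


def shard_stream(rows: List[tuple], batch_size: int) -> Iterator[Batch]:
    """Yield a shard's (status, bytes) rows as small columnar batches."""
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        yield {
            "status": [status for status, _ in chunk],
            "bytes": [size for _, size in chunk],
        }


def partial_sum(batch: Batch) -> Counter:
    """Stage 1: aggregate one batch into per-status byte totals."""
    totals = Counter()
    for status, size in zip(batch["status"], batch["bytes"]):
        totals[status] += size
    return totals


def streaming_totals(streams: List[Iterator[Batch]]) -> Iterator[Dict[int, int]]:
    """Stage 2: merge partials and emit a running total after every batch."""
    running = Counter()
    for stream in streams:
        for batch in stream:
            running.update(partial_sum(batch))  # Counter.update adds counts
            yield dict(running)


# Made-up sample data for two shards.
shard_a = [(200, 10), (200, 20), (404, 5)]
shard_b = [(200, 30), (500, 7)]
snapshots = list(streaming_totals([
    shard_stream(shard_a, batch_size=2),
    shard_stream(shard_b, batch_size=2),
]))
final_totals = snapshots[-1]
```

Each entry in `snapshots` is a partial result a client could render before the query finishes; the last entry is the complete aggregation. A real implementation would read shards concurrently and exchange Arrow record batches over Flight rather than Python dicts.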