Data Parallel C++: Enhancing SYCL Through Extensions for Productivity and Performance

Author: IWOCL

Uploaded: 2020-04-28

Views: 779

Description: This video was presented at the online edition of IWOCL / SYCLcon 2020.

Authors: James Brodman, Michael Kinsner, Ben Ashbaugh, Jeff Hammond, Alexey Bader, John Pennycook, Jason Sewall and Roland Schulz (Intel)

Additional Information and Slides:
https://www.iwocl.org/iwocl-2020/conf...

Presentation Abstract
SYCL is a heterogeneous programming framework built on top of modern C++. Data Parallel C++, recently introduced as part of Intel’s oneAPI project, is Intel’s implementation of SYCL. Data Parallel C++, or DPC++, is being developed as an open-source project on top of Clang and LLVM. It combines C++, SYCL, and new extensions to improve programmer productivity when writing highly performant code for heterogeneous architectures.

This talk will describe several extensions that DPC++ has proposed and implemented on top of SYCL. While many of the extensions can help to improve application performance, all of them work to improve programmer productivity, both by enabling easy integration into existing C++ applications and by simplifying common patterns found in SYCL and C++. DPC++ is a proving ground where the value of extensions can be demonstrated before they are proposed for inclusion in future versions of the SYCL specification. Intel contributes DPC++ extensions back to SYCL to enable a unified, standards-based solution.

The extensions that this talk will cover include:

Unified Shared Memory, which adds support for pointer-based programming to SYCL and provides a shared-memory programming model that significantly improves upon the shared virtual memory (SVM) model defined in OpenCL

Unnamed Kernel Lambdas, which simplify development for applications and libraries

In-order Queues, which simplify the common pattern of kernels that execute in sequence

Subgroups, which enable efficient execution of specific collective operations across work items

Reductions, which allow easily expressing an important computational pattern across subgroups, workgroups, and entire devices

Language and API simplifications, which include C++ improvements such as template argument deduction guides, type aliases, and additional overloads of methods to reduce the verbosity of code
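
As a rough illustration of the last item, the sketch below shows how a C++17 deduction guide and a type alias can cut template boilerplate. It is plain C++17 and does not use the SYCL headers; the class `mini_buffer`, the alias `float_buffer`, and the function `demo_size` are hypothetical names invented for this example and are not part of SYCL or DPC++.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a buffer-like class; NOT the real sycl::buffer API.
template <typename T, int Dims = 1>
class mini_buffer {
public:
    // Copies the elements of [first, last) into owned storage.
    template <typename InputIt>
    mini_buffer(InputIt first, InputIt last) : data_(first, last) {}

    std::size_t size() const { return data_.size(); }
    const T& operator[](std::size_t i) const { return data_[i]; }

private:
    std::vector<T> data_;
};

// Deduction guide: take T from the iterator's value type, so callers can
// write `mini_buffer b(first, last)` instead of `mini_buffer<float, 1> b(...)`.
template <typename InputIt>
mini_buffer(InputIt, InputIt)
    -> mini_buffer<typename std::iterator_traits<InputIt>::value_type, 1>;

// Type alias: a shorter name for a common instantiation.
using float_buffer = mini_buffer<float>;

// Small usage demo: builds a buffer without spelling out template arguments
// and returns its element count.
inline std::size_t demo_size() {
    std::vector<float> v{1.0f, 2.0f, 3.0f};
    mini_buffer b(v.begin(), v.end());  // deduced as mini_buffer<float, 1>
    static_assert(std::is_same_v<decltype(b), float_buffer>);
    assert(b[1] == 2.0f);
    return b.size();
}
```

Without the guide, the compiler cannot deduce `T` from the iterator arguments and the caller has to write the full template argument list. The same mechanism is what allows a library to offer shorter declarations for its own class templates.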

The DPC++ compiler project is located at http://github.com/intel/llvm.

Definitions for the extensions can be found in the compiler and runtime code, as well as in a repository located at: https://github.com/intel/llvm/tree/sy...

IWOCL Newsletter
Sign up to receive regular updates on IWOCL, OpenCL and SYCL at: https://www.iwocl.org/opencl-newsletter/
