
Conference Paper

A C++ Library for Memory Layout and Performance Portability of Scientific Applications

MPS-Authors
Incardona, Pietro (Max Planck Institute for Molecular Cell Biology and Genetics, Max Planck Society)
Gupta, Aryaman (Max Planck Institute for Molecular Cell Biology and Genetics, Max Planck Society)
Yaskovets, Serhii (Max Planck Institute for Molecular Cell Biology and Genetics, Max Planck Society)
Sbalzarini, Ivo F. (Max Planck Institute for Molecular Cell Biology and Genetics, Max Planck Society)

Citation

Incardona, P., Gupta, A., Yaskovets, S., & Sbalzarini, I. F. (2023). A C++ Library for Memory Layout and Performance Portability of Scientific Applications. In Euro-Par 2022: Parallel Processing Workshops: Euro-Par 2022 International Workshops, Glasgow, UK, August 22–26, 2022, Revised Selected Papers (pp. 109-120). New York: Springer.


Cite as: https://hdl.handle.net/21.11116/0000-000E-AB72-1
Abstract
We present a C++14 library for performance portability of scientific computing codes across CPU and GPU architectures. Our library combines generic data structures like vectors, multi-dimensional arrays, maps, graphs, and sparse grids with basic, reusable algorithms like convolutions, sorting, prefix sum, reductions, and scan. The memory layout of the data structures is adapted at compile-time using tuples with optional memory mirroring between CPU and GPU. We combine this transparent memory mapping with generic algorithms under two alternative programming interfaces: a CUDA-like kernel interface for multi-core CPUs, Nvidia GPUs, and AMD GPUs, as well as a lambda interface. We validate and benchmark the presented library using micro-benchmarks, showing that the abstractions introduce negligible performance overhead, and we compare performance against the current state of the art.