A CUDA fast multipole method with highly efficient M2L far field evaluation

Kohnke, B., Kutzner, C., Beckmann, A., Lube, G., Kabadshow, I., Dachsel, H., et al. (2021). A CUDA fast multipole method with highly efficient M2L far field evaluation. The International Journal of High Performance Computing Applications, 35(1), 97-117. doi:10.1177/1094342020964857.

Basic data

Genre: Journal article

Files

3260951.pdf (publisher version), 3 MB
Name: 3260951.pdf
Description: -
OA status:
Visibility: Public
MIME type / checksum: application/pdf / [MD5]
Technical metadata:
Copyright date: -
Copyright info: -

Creators

Kohnke, B. (1), author
Kutzner, C. (2), author
Beckmann, A., author
Lube, G., author
Kabadshow, I., author
Dachsel, H., author
Grubmüller, H. (2), author

Affiliations:
(1) Department of Theoretical and Computational Biophysics, MPI for Biophysical Chemistry, Max Planck Society, ou_578631
(2) Department of Theoretical and Computational Biophysics, MPI for Biophysical Chemistry, Max Planck Society, ou_578631

Content

Keywords: Fast multipole method, Multipole-to-Local, molecular dynamics, electrostatics, CUDA
Abstract: Solving an N-body problem, electrostatic or gravitational, is a crucial task and the main computational bottleneck in many scientific applications. Its direct solution is a ubiquitous showcase example for the compute power of graphics processing units (GPUs). However, the naive pairwise summation has O(N^2) computational complexity. The fast multipole method (FMM) can reduce runtime and complexity to O(N) for any specified precision. Here, we present a CUDA-accelerated C++ FMM implementation for multi-particle systems with 1/r potential that are found, e.g., in biomolecular simulations. The algorithm involves several operators to exchange information in an octree data structure. We focus on the Multipole-to-Local (M2L) operator, as its runtime is limiting for the overall performance. We propose, implement and benchmark three different M2L parallelization approaches. Approach (1) utilizes Unified Memory to minimize programming and porting efforts. It achieves decent speedups for only little implementation work. Approach (2) employs CUDA Dynamic Parallelism to significantly improve performance for high approximation accuracies. The presorted list-based approach (3) fits periodic boundary conditions particularly well. It exploits FMM operator symmetries to minimize both memory access and the number of complex multiplications. The result is a compute-bound implementation, i.e., performance is limited by arithmetic operations rather than by memory accesses. The complete CUDA-parallelized FMM is incorporated within the GROMACS molecular dynamics package as an alternative Coulomb solver.
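
Illustration (not part of the record): the quadratic cost of naive pairwise summation mentioned in the abstract can be seen in a minimal C++ sketch of a direct 1/r potential evaluation. This is not taken from the authors' code. The `Particle` struct and the `direct_potentials` function are hypothetical names, and the sketch assumes open boundaries, a unit prefactor, and no two particles at the same position.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical particle type for illustration: position and charge.
struct Particle {
    double x, y, z;
    double q;
};

// Direct pairwise summation of the 1/r potential at every particle:
//   phi_i = sum over j != i of q_j / |r_i - r_j|
// Assumptions: open boundaries, prefactor set to 1.
// Each unordered pair (i, j) is visited once, so the work grows as
// N(N-1)/2 pair evaluations, i.e. O(N^2).
std::vector<double> direct_potentials(const std::vector<Particle>& p) {
    std::vector<double> phi(p.size(), 0.0);
    for (std::size_t i = 0; i < p.size(); ++i) {
        for (std::size_t j = i + 1; j < p.size(); ++j) {
            const double dx = p[i].x - p[j].x;
            const double dy = p[i].y - p[j].y;
            const double dz = p[i].z - p[j].z;
            const double r = std::sqrt(dx * dx + dy * dy + dz * dz);
            assert(r > 0.0 && "particles must not coincide");
            phi[i] += p[j].q / r;  // contribution of j at i
            phi[j] += p[i].q / r;  // contribution of i at j (same distance)
        }
    }
    return phi;
}
```

Doubling the particle count roughly quadruples the number of pair evaluations in this loop, which is the scaling that an FMM is designed to avoid.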

Details

Language(s): eng - English
Dates: 2020-10-12; 2021-01
Publication status: Published
Pages: -
Place, publisher, edition: -
Table of contents: -
Review method: Peer review
Identifiers: DOI: 10.1177/1094342020964857
Degree type: -

Project information

Project name: -
Grant ID: -
Funding program: Software for Exascale Computing (SPP 1648)
Funding organization: DFG

Source 1

Title: The International Journal of High Performance Computing Applications
Source genre: Journal
Creators:
Affiliations:
Place, publisher, edition: -
Pages: -
Volume / issue: 35 (1)
Article number: -
Start / end page: 97 - 117
Identifier: -