

  Towards efficient data exchange and sharing for big-data driven materials science: metadata and data formats

Ghiringhelli, L. M., Carbogno, C., Levchenko, S., Mohamed, F., Huhs, G., Lüders, M., et al. (2017). Towards efficient data exchange and sharing for big-data driven materials science: metadata and data formats. npj Computational Materials, 3: UNSP 46. doi:10.1038/s41524-017-0048-5.

s41524-017-0048-5.pdf (Publisher version), 736KB
Authors retain copyright; Published source must be acknowledged and DOI cited; Must link to publisher version; http://www.sherpa.ac.uk/romeo/issn/2057-3960/
MIME-Type / Checksum: application/pdf / [MD5]
Technical Metadata: -
Copyright Date: -
Copyright Info: © the Author(s)


https://dx.doi.org/10.1038/s41524-017-0048-5 (Publisher version)


Ghiringhelli, L. M. (1), Author
Carbogno, C. (1), Author
Levchenko, S. (1), Author
Mohamed, F. (1), Author
Huhs, G. (2, 3), Author
Lüders, M. (4), Author
Oliveira, M. J. T. (5, 6), Author
Scheffler, Matthias (1, 7), Author
(1) Fritz Haber Institute of the Max Planck Society, Berlin
(2) Barcelona Supercomputing Center
(3) Humboldt-Universität zu Berlin
(4) Daresbury Laboratory, Warrington
(5) Theory Group, Theory Department, Max Planck Institute for the Structure and Dynamics of Matter, Max Planck Society
(6) University of Liège
(7) Department of Chemistry and Biochemistry and Materials Department, University of California, Santa Barbara


Free keywords: -
 Abstract: With big-data driven materials research, the new paradigm of materials science, sharing and wide accessibility of data are becoming crucial aspects. Obviously, a prerequisite for data exchange and big-data analytics is standardization, which means using consistent and unique conventions for, e.g., units, zero base lines, and file formats. There are two main strategies to achieve this goal. One accepts the heterogeneous nature of the community, which comprises scientists from physics, chemistry, bio-physics, and materials science, by complying with the diverse ecosystem of computer codes and thus develops “converters” for the input and output files of all important codes. These converters then translate the data of each code into a standardized, code-independent format. The other strategy is to provide standardized open libraries that code developers can adopt for shaping their inputs, outputs, and restart files, directly into the same code-independent format. In this perspective paper, we present both strategies and argue that they can and should be regarded as complementary, if not even synergetic. The represented appropriate format and conventions were agreed upon by two teams, the Electronic Structure Library (ESL) of the European Center for Atomic and Molecular Computations (CECAM) and the NOvel MAterials Discovery (NOMAD) Laboratory, a European Centre of Excellence (CoE). A key element of this work is the definition of hierarchical metadata describing state-of-the-art electronic-structure calculations.


Language(s): eng - English
 Dates: 2017-09-17, 2017-03-23, 2017-09-19, 2017-11-06
 Publication Status: Published online
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: Peer
 Identifiers: DOI: 10.1038/s41524-017-0048-5
 Degree: -




Project information

Project name : We thank James Kermode and Saulius Gražulis for their contribution to the discussion on the metadata, and Pasquale Pavone for precious suggestions on the metadata structure and names. We thank Patrick Rinke and Ghanshyam Pilania for carefully reading the manuscript. We thank Claudia Draxl and Kristian Thygesen for their contribution to the discussions on the necessary information to be stored for excited-state calculations and on the error bars and uncertainties. We gratefully acknowledge Damien Caliste, Fabiano Corsetti, Hubert Ebert, Jan Minar, Yann Pouillon, Thomas Ruh, David Strubbe, and Marc Torrent for their contributions to the ESCDF specifications. We acknowledge Benjamin Regler for the development of the graphical interface for the query on the NOMAD Archive. We acknowledge inspiring discussions with Georg Kresse, Peter Blaha, Xavier Gonze, Bernard Delley, and Jörg Hutter on the energy-zero definition and scalar-field representation. We thank Ole Andersen, Evert Jan Baerends, Peter Blaha, Lambert Colin, Bernard Delley, Thierry Deutsch, Claudia Draxl, John Kay Dewhurst, Roberto Dovesi, Paolo Giannozzi, Mike Gillan, Xavier Gonze, Michael Frisch, Martin Head-Gordon, Juerg Hutter, Klaus Koepernik, Georg Kresse, Roland Lindh, Hans Lischka, Andrea Marini, Todd Martinez, Jens Jørgen Mortensen, Frank Neese, Richard Needs, Taisuke Ozaki, Mike Payne, Angel Rubio, Trond Saue, Chris Skylaris, Jose Soler, John Stanton, James Stewart, Marat Valiev for checking the information provided in Table 1 and for useful suggestions. This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No. 676580, The NOMAD Laboratory, a European Center of Excellence, and the BBDC (contract 01IS14013E).
Grant ID : -
Funding program : Horizon 2020 (H2020)
Funding organization : European Commission (EC)

Source 1

Title: npj Computational Materials
Source Genre: Journal
Publ. Info: London : Springer Nature
Pages: -
Volume / Issue: 3
Sequence Number: UNSP 46
Start / End Page: -
Identifier: ISSN: 2057-3960
CoNE: https://pure.mpg.de/cone/journals/resource/2057-3960