  Identifying domains of applicability of machine learning models for materials science

Sutton, C. A., Boley, M., Ghiringhelli, L. M., Rupp, M., Vreeken, J., & Scheffler, M. (2020). Identifying domains of applicability of machine learning models for materials science. Nature Communications, 11: 4428. doi:10.26434/chemrxiv.9778670.

Files

manuscript.domain_of_app.pdf (Preprint), 9MB
Name: manuscript.domain_of_app.pdf
Description: -
OA-Status: Green
Visibility: Public
MIME-Type / Checksum: application/pdf / [MD5]
Technical Metadata:
Copyright Date: 2019
Copyright Info: -

s41467-020-17112-9.pdf (Publisher version), 6MB
Name: s41467-020-17112-9.pdf
Description: -
OA-Status: Gold
Visibility: Public
MIME-Type / Checksum: application/pdf / [MD5]
Technical Metadata:
Copyright Date: 2020
Copyright Info: The Author(s)

Creators

Creators:
Sutton, Christopher A. [1], Author
Boley, Mario [2], Author
Ghiringhelli, Luca M. [1], Author
Rupp, Matthias [1, 3], Author
Vreeken, Jilles [4], Author
Scheffler, Matthias [1, 5], Author
Affiliations:
[1] NOMAD, Fritz Haber Institute, Max Planck Society, ou_3253022
[2] Faculty of Information Technology, Monash University, Clayton, Australia, ou_persistent22
[3] Citrine Informatics, Redwood City, California, USA, ou_persistent22
[4] CISPA Helmholtz Center for Information Security, Saarbrücken, Germany, ou_persistent22
[5] Physics Department and IRIS Adlershof, Humboldt-Universität, Berlin, Germany, ou_persistent22

Content

Free keywords: machine learning model analysis, density functional theory, high-throughput screening, ground-state search
Abstract: We present an extension to the usual machine learning process that allows for the identification of the domain of applicability of a fitted model, i.e., the region in its domain where it performs most accurately. This approach is applied to several vastly different but commonly used materials representations (namely the n-gram approach, SOAP, and the many-body tensor representation), which are practically indistinguishable based on performance using a single error statistic. Moreover, these models appear unsatisfactory for screening applications, as they fail to reliably identify the ground-state polymorphs. When applying our newly developed analysis to each of the models, we can identify its domain of applicability according to a simple set of interpretable conditions. We show that identifying the domain of applicability in the prediction of the formation energy enables a more accurate ground-state search, a crucial step for the discovery of novel materials.

Details

Language(s): eng - English
Dates: 2019-09-09, 2019-08-29, 2020-05-22, 2020-09-04
 Publication Status: Published online
 Pages: 9
 Publishing info: -
 Table of Contents: -
 Rev. Type: Peer
 Degree: -

Project information

Project name : TEC1p - Big-Data Analytics for the Thermal and Electrical Conductivity of Materials from First Principles
Grant ID : 740233
Funding program : Horizon 2020 (H2020)
Funding organization : European Commission (EC)

Source 1

Title: Nature Communications
  Abbreviation : Nat. Commun.
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: London : Nature Publishing Group
Pages: 9
Volume / Issue: 11
Sequence Number: 4428
Start / End Page: -
Identifier: ISSN: 2041-1723
CoNE: https://pure.mpg.de/cone/journals/resource/2041-1723