Journal Article

Advancing descriptor search in materials science: feature engineering and selection strategies

MPS-Authors
Hoock, Benedikt
NOMAD, Fritz Haber Institute, Max Planck Society

Draxl, Claudia
NOMAD, Fritz Haber Institute, Max Planck Society

Fulltext (public)

Hoock_2022_New_J._Phys._24_113049.pdf
(Publisher version), 3MB

Citation

Hoock, B., Rigamonti, S., & Draxl, C. (2022). Advancing descriptor search in materials science: feature engineering and selection strategies. New Journal of Physics, 24(11): 113049. doi:10.1088/1367-2630/aca49c.


Cite as: https://hdl.handle.net/21.11116/0000-000C-2250-3
Abstract
A main goal of data-driven materials research is to find optimal low-dimensional descriptors that allow us to predict a physical property and that can be interpreted in a human-understandable way. In this work, we advance methods to identify descriptors out of a large pool of candidate features by means of compressed sensing. To this end, we develop schemes for engineering appropriate candidate features that are based on simple basic properties of the building blocks that constitute the materials and that are able to represent a multi-component system by scalar numbers. Cross-validation-based feature-selection methods are developed for identifying the most relevant features, with a focus on high generalizability. We apply our approaches to an ab initio dataset of ternary group-IV compounds to obtain a set of descriptors for predicting lattice constants and energies of mixing. In particular, we introduce simple complexity measures in terms of the algebraic operations involved and the number of basic properties used.
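The workflow summarized in the abstract (engineer candidate features from basic properties, select among them by cross-validation, and track a complexity measure) can be illustrated with a minimal sketch. This is not the authors' implementation: the compressed-sensing step is replaced here by an exhaustive search for a single one-dimensional descriptor, the basic properties `a` and `b` and the target values are invented toy numbers, and all function names are hypothetical. Complexity is recorded simply as the number of algebraic operations and the number of basic properties in each candidate.

```python
from itertools import combinations


def build_candidates(props):
    """Map candidate name -> (values, n_operations, n_basic_properties)."""
    cands = {name: (list(vals), 0, 1) for name, vals in props.items()}
    for a, b in combinations(sorted(props), 2):
        xa, xb = props[a], props[b]
        cands[f"({a}+{b})"] = ([p + q for p, q in zip(xa, xb)], 1, 2)
        cands[f"({a}-{b})"] = ([p - q for p, q in zip(xa, xb)], 1, 2)
        cands[f"({a}*{b})"] = ([p * q for p, q in zip(xa, xb)], 1, 2)
    return cands


def fit_line(x, y):
    """Closed-form least-squares fit y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:  # constant feature: fall back to predicting the mean
        return 0.0, my
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def loo_cv_rmse(x, y):
    """Leave-one-out cross-validated RMSE of the one-feature linear model."""
    sq_errs = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        sq_errs.append((y[i] - (slope * x[i] + intercept)) ** 2)
    return (sum(sq_errs) / len(sq_errs)) ** 0.5


def select_descriptor(props, target):
    """Return (cv_rmse, n_ops, n_props, name) of the best candidate.

    Ties in cross-validation error are broken in favor of lower complexity.
    """
    scored = [
        (loo_cv_rmse(vals, target), n_ops, n_props, name)
        for name, (vals, n_ops, n_props) in build_candidates(props).items()
    ]
    return min(scored)


# Toy data (assumption): the target depends linearly on the product a*b.
props = {"a": [1.0, 2.0, 3.0, 4.0, 5.0], "b": [2.0, 1.0, 4.0, 3.0, 6.0]}
target = [2.0 * p * q + 1.0 for p, q in zip(props["a"], props["b"])]

best = select_descriptor(props, target)
print(best[3], "ops:", best[1], "basic properties:", best[2])
```

Ranking by cross-validated error rather than training error is what favors generalizable descriptors in this sketch; a realistic candidate pool would be far too large for exhaustive search, which is where a sparsity-promoting method such as compressed sensing comes in.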