SISSO++: A C++ Implementation of the Sure-Independence Screening and Sparsifying Operator Approach

Purcell, T., Scheffler, M., Carbogno, C., & Ghiringhelli, L. M. (2022). SISSO++: A C++ Implementation of the Sure-Independence Screening and Sparsifying Operator Approach. The Journal of Open Source Software, 7(71): 3960. doi:10.21105/joss.03960.

Files

joss.03960.pdf (Publisher version), 210KB
Name: joss.03960.pdf
Description: -
OA-Status: Gold
Visibility: Public
MIME-Type / Checksum: application/pdf / [MD5]
Technical Metadata:
Copyright Date: 2022
Copyright Info: The Author(s)

Creators

 Creators:
Purcell, Thomas (1), Author
Scheffler, Matthias (1), Author
Carbogno, Christian (1), Author
Ghiringhelli, Luca M. (1), Author
Affiliations:
(1) NOMAD, Fritz Haber Institute, Max Planck Society, ou_3253022

Content

Free keywords: -
 Abstract: The sure independence screening and sparsifying operator (SISSO) approach (Ouyang et al., 2018) is an artificial intelligence algorithm that combines symbolic regression and compressed sensing. As a symbolic regression method, SISSO identifies the mathematical functions, i.e. the descriptors, that best predict the target property of a data set. Furthermore, the compressed sensing aspect of SISSO allows it to find sparse linear models using tens to thousands of data points. SISSO is introduced for both regression and classification tasks. In practice, SISSO first constructs a large and exhaustive feature space of trillions of potential descriptors by taking in a set of user-provided primary features as a dataframe, and then iteratively applying a set of unary and binary operators, e.g. addition, multiplication, exponentiation, and squaring, according to a user-defined specification. From this exhaustive pool of candidate descriptors, the ones most correlated to a target property are identified via sure-independence screening, from which the low-dimensional linear models with the lowest error are found via an ℓ0 regularization.
Because symbolic regression generates an interpretable equation, it has become an increasingly popular concept across scientific disciplines (Neumann et al., 2020; Udrescu & Tegmark, 2020; Wang et al., 2019). A particular advantage of these approaches is their capability to model complex phenomena using relatively simple descriptors. SISSO has been used successfully in the past to model, explore, and predict important material properties, including the stability of different phases (Bartel et al., 2018; Schleder et al., 2020); the catalytic activity and reactivity (Andersen et al., 2019; Andersen & Reuter, 2021; Han et al., 2021; W. Xu et al., 2021); and glass transition temperatures (Pilania et al., 2019). Beyond regression problems, SISSO has also been used successfully to classify materials into different crystal prototypes (Ouyang et al., 2019), to determine whether a material crystallizes in its ground state as a perovskite (Bartel et al., 2019), and to determine whether a material is a topological insulator (Cao et al., 2020).
The SISSO++ package is an open-source (Apache-2.0 licence), modular, and extensible C++ implementation of the SISSO method with Python bindings. Specifically, SISSO++ applies this methodology for regression, log regression, and classification problems. Additionally, the library includes multiple Python functions to facilitate post-processing, analyzing, and visualizing the resulting models.
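The three stages named in the abstract (feature-space construction, sure-independence screening, and an ℓ0 search over low-dimensional linear models) can be illustrated with a toy sketch. The code below is not the SISSO++ API. Every function name (expand_features, screen, l0_search, and so on) and the example data are hypothetical, only one round of three operators (squaring, addition, multiplication) is applied, and the screening is done once against the target rather than iteratively. It uses only the Python standard library.

```python
import itertools
import math

def expand_features(primary):
    """Apply one round of operators (square, add, multiply) to the primary features."""
    feats = dict(primary)
    for name, col in primary.items():
        feats[f"({name})^2"] = [v * v for v in col]
    for (na, a), (nb, b) in itertools.combinations(primary.items(), 2):
        feats[f"({na} + {nb})"] = [x + y for x, y in zip(a, b)]
        feats[f"({na} * {nb})"] = [x * y for x, y in zip(a, b)]
    return feats

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def screen(feats, target, n_keep):
    """Screening step: keep the n_keep features most correlated with the target."""
    ranked = sorted(feats, key=lambda k: abs(pearson(feats[k], target)), reverse=True)
    return ranked[:n_keep]

def least_squares(columns, target):
    """Fit target ~ c0 + sum(c_i * column_i) via the normal equations.

    Returns (coefficients, rmse), or None if the system is singular.
    """
    n = len(target)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    A = [[sum(r[j] * r[k] for r in rows) for k in range(p)] for j in range(p)]
    b = [sum(r[j] * t for r, t in zip(rows, target)) for j in range(p)]
    # Gaussian elimination with partial pivoting.
    for i in range(p):
        pivot = max(range(i, p), key=lambda r: abs(A[r][i]))
        if abs(A[pivot][i]) < 1e-12:
            return None
        A[i], A[pivot] = A[pivot], A[i]
        b[i], b[pivot] = b[pivot], b[i]
        for r in range(i + 1, p):
            f = A[r][i] / A[i][i]
            for k in range(i, p):
                A[r][k] -= f * A[i][k]
            b[r] -= f * b[i]
    # Back substitution.
    coefs = [0.0] * p
    for i in reversed(range(p)):
        s = b[i] - sum(A[i][k] * coefs[k] for k in range(i + 1, p))
        coefs[i] = s / A[i][i]
    pred = [sum(c * v for c, v in zip(coefs, r)) for r in rows]
    rmse = math.sqrt(sum((q - t) ** 2 for q, t in zip(pred, target)) / n)
    return coefs, rmse

def l0_search(feats, candidates, target, dim):
    """Exhaustive search: try every dim-sized subset and keep the lowest-RMSE fit."""
    best = None
    for combo in itertools.combinations(candidates, dim):
        fit = least_squares([feats[k] for k in combo], target)
        if fit is None:
            continue
        coefs, rmse = fit
        if best is None or rmse < best[2]:
            best = (combo, coefs, rmse)
    return best

# Hypothetical toy data: the target is exactly 3 * x1 * x2 + 1.
primary = {"x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
           "x2": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]}
target = [3.0 * a * b + 1.0 for a, b in zip(primary["x1"], primary["x2"])]

features = expand_features(primary)
screened = screen(features, target, n_keep=3)
best_features, best_coefs, best_rmse = l0_search(features, screened, target, dim=1)
print(best_features, best_coefs, best_rmse)
```

The exhaustive subset search is what makes the ℓ0 step expensive: its cost grows combinatorially with the number of screened candidates and the model dimension, which is why the screening step first reduces the candidate pool to a small subspace.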

Details

Language(s): eng - English
 Dates: 2021-10-01, 2021-10-01, 2022-03-16
 Publication Status: Published online
 Pages: 5
 Publishing info: -
 Table of Contents: -
 Rev. Type: Peer
 Identifiers: DOI: 10.21105/joss.03960
 Degree: -

Project information

Project name : TEC1p - Big-Data Analytics for the Thermal and Electrical Conductivity of Materials from First Principles
Grant ID : 740233
Funding program : h20
Funding organization : -

Source 1

Title: The Journal of Open Source Software
  Abbreviation : J. Open Source Softw.
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: -
Pages: 5
Volume / Issue: 7 (71)
Sequence Number: 3960
Start / End Page: -
Identifier: ISSN: 2475-9066
CoNE: https://pure.mpg.de/cone/journals/resource/2475-9066