Keywords:
-
Abstract:
The combination of machine learning (ML) and computational chemistry offers unprecedented opportunities to gain new insights into chemical processes. Established computational chemistry methods are often either too computationally demanding or do not provide the required accuracy. ML methods can overcome these limitations and predict physical properties very accurately, at a fraction of the cost of quantum mechanical (QM) methods. However, generating the large reference databases that are often required for training ML models remains computationally costly. As a result, only a few such large databases exist, each covering certain sub-parts of chemical space. The focus of this thesis is therefore to explore the transferability of ML models trained on such fixed databases when they are applied to predictions on other subsets of chemical or reaction space. This exploration is shown and discussed on the basis of three examples.
In the first example, established ML methods for chemical compound space were used to predict reaction energies in chemical reaction space. The predicted reaction energies can then be utilized to explore and reduce complex reaction networks. As a first step, a QM-based reference database of closed-shell molecules and radical systems was generated to describe chemical reactions. The analysis further demonstrated that compound-space ML methods must satisfy certain requirements to yield transferable models that give adequate predictions in reaction space. The resulting model could be used for the non-empirical reduction of reaction networks, with methane combustion as an example.
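The step from compound space to reaction space can be sketched as follows. Assuming some trained compound-space model that maps a molecule to a total energy (the dictionary below is a purely illustrative stand-in, not data from the thesis), the reaction energy is the difference of the predicted product and reactant energies:

```python
def reaction_energy(predict, reactants, products):
    """Reaction energy from a compound-space energy predictor:
    sum of product energies minus sum of reactant energies."""
    return sum(predict(m) for m in products) - sum(predict(m) for m in reactants)

# Toy stand-in for a trained ML model (values are arbitrary, for illustration).
toy_energies = {"CH4": -40.5, "O2": -150.3, "CO2": -188.6, "H2O": -76.4}
predict = toy_energies.get

# CH4 + 2 O2 -> CO2 + 2 H2O (methane combustion, the thesis example)
dE = reaction_energy(predict, ["CH4", "O2", "O2"], ["CO2", "H2O", "H2O"])
```

Because every reaction energy reduces to per-molecule predictions, any systematic error of the compound-space model propagates into reaction space, which is one reason transferability requirements on the underlying model matter.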
The second example focused on the exploration of different parts of chemical space with molecules of widely varying size. An important requirement for this is the use of size-extensive ML models. To this end, this part of the thesis showed how size-extensive ML models can be built to satisfactorily predict properties of large molecules even when trained only on small systems. The results further showed that non-size-extensive models completely failed at this task.
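A minimal sketch of size extensivity, assuming the common construction in which a molecular property is a sum of learned atomic contributions (the per-element values below are hypothetical, not fitted quantities from the thesis):

```python
# Hypothetical per-element contributions, as might be learned from small molecules.
atomic_contrib = {"C": -38.0, "H": -0.6, "O": -75.1}

def extensive_energy(atoms):
    """Size-extensive prediction: a sum over atomic contributions,
    so the prediction scales linearly with system size by construction."""
    return sum(atomic_contrib[a] for a in atoms)

small = ["C", "H", "H", "H", "H"]  # a methane-like fragment
large = small * 10                 # ten non-interacting copies
# Ten copies give ten times the energy, regardless of training-set size range.
</```

A model that instead regresses a single fixed-size descriptor to the target has no such guarantee, which is consistent with the failure of non-size-extensive models reported above.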
In the last example, the robustness of advanced graph neural network (GNN) models in atomistic simulations was investigated. To this end, models were trained on different subsets of the fixed QM7-x database. This is an interesting test scenario, as the capabilities of GNNs have mostly been tested on established databases, whereas fewer studies have examined their applicability in chemical simulations. The results showed that stable dynamics could be achieved for GNN models trained on sufficiently large training sets. Furthermore, it was found that instabilities during the simulations could occur even when the model produced low errors on a fixed test set.
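One common way to probe the kind of stability discussed here is to run short dynamics with the model potential and monitor total-energy drift, rather than relying on test-set errors alone. The sketch below is illustrative only: a one-dimensional harmonic toy potential stands in for a trained GNN potential, integrated with velocity Verlet.

```python
def velocity_verlet(x, v, force, mass=1.0, dt=0.01, steps=1000):
    """Integrate one particle with velocity Verlet, recording total energy.
    In a real study the force would come from a trained GNN potential;
    here force(x) = -x, i.e. a harmonic toy potential, stands in."""
    energies = []
    f = force(x)
    for _ in range(steps):
        x += v * dt + 0.5 * (f / mass) * dt * dt
        f_new = force(x)
        v += 0.5 * ((f + f_new) / mass) * dt
        f = f_new
        # total energy = kinetic + harmonic potential (matches force = -x)
        energies.append(0.5 * mass * v * v + 0.5 * x * x)
    return energies

e = velocity_verlet(x=1.0, v=0.0, force=lambda x: -x)
drift = max(e) - min(e)  # small drift indicates a stable trajectory
```

With a learned potential, large drift or diverging trajectories can appear even when static test-set errors are low, which is exactly the mismatch between benchmark accuracy and simulation robustness described above.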