Quantifying performance of machine learning methods for neuroimaging data

Jollans, Lee; Boyle, Rory; Artiges, Eric; Banaschewski, Tobias; Desrivieres, Sylvane; Grigis, Antoine; Martinot, Jean-Luc; Paus, Tomas; Smolka, Michael N.; Walter, Henrik; Schumann, Gunter; Garavan, Hugh; Whelan, Robert

doi:10.1016/j.neuroimage.2019.05.082


Jollans, L., Boyle, R., Artiges, E., Banaschewski, T., Desrivieres, S., Grigis, A., et al. (2019). Quantifying performance of machine learning methods for neuroimaging data. NEUROIMAGE, 199, 351-365. doi:10.1016/j.neuroimage.2019.05.082.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/21.11116/0000-0009-8002-3
Version Permalink: https://hdl.handle.net/21.11116/0000-0009-8003-2

Genre: Journal Article

Creators

Creators:
Jollans, Lee¹, Author
Boyle, Rory, Author
Artiges, Eric, Author
Banaschewski, Tobias, Author
Desrivieres, Sylvane, Author
Grigis, Antoine, Author
Martinot, Jean-Luc, Author
Paus, Tomas, Author
Smolka, Michael N., Author
Walter, Henrik, Author
Schumann, Gunter, Author
Garavan, Hugh, Author
Whelan, Robert, Author

Affiliations:
¹ Dept. Translational Research in Psychiatry, Max Planck Institute of Psychiatry, Max Planck Society, ou_2035295

Content

Free keywords: -

Abstract: Machine learning is increasingly being applied to neuroimaging data. However, most machine learning algorithms have not been designed to accommodate neuroimaging data, which typically has many more data points than subjects, in addition to multicollinearity and low signal-to-noise. Consequently, the relative efficacy of different machine learning regression algorithms for different types of neuroimaging data is not known. Here, we sought to quantify the performance of a variety of machine learning algorithms for use with neuroimaging data with various sample sizes, feature set sizes, and predictor effect sizes. The contribution of additional machine learning techniques - embedded feature selection and bootstrap aggregation (bagging) - to model performance was also quantified. Five machine learning regression methods - Gaussian Process Regression, Multiple Kernel Learning, Kernel Ridge Regression, the Elastic Net and Random Forest - were examined with both real and simulated MRI data, and in comparison to standard multiple regression. The different machine learning regression algorithms produced varying results, which depended on sample size, feature set size, and predictor effect size. When the effect size was large, the Elastic Net, Kernel Ridge Regression and Gaussian Process Regression performed well at most sample sizes and feature set sizes. However, when the effect size was small, only the Elastic Net made accurate predictions, but this was limited to analyses with sample sizes greater than 400. Random Forest also produced moderate performance for small effect sizes, but could do so across all sample sizes. Machine learning techniques also improved prediction accuracy for multiple regression. These data provide empirical evidence for the differential performance of various machines on neuroimaging data, which depends on sample size, number of features, and effect size.
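
Illustrative sketch (not part of the record or the paper): the abstract mentions bootstrap aggregation combined with feature selection on data with many features per subject. The toy Python example below shows the general idea only. Everything in it is an assumption made for illustration: the simulated data, the function names, the correlation-based filter (a simple stand-in, not the paper's embedded feature selection), and plain ridge regression (a stand-in, not one of the five methods the paper evaluated).

```python
import random

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0

def simulate(n, p, n_informative, effect, rng):
    """Hypothetical toy data: only the first n_informative features carry signal."""
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [effect * sum(row[:n_informative]) + rng.gauss(0, 1) for row in X]
    return X, y

def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, k):
            f = M[r][c] / M[c][c]
            for j in range(c, k + 1):
                M[r][j] -= f * M[c][j]
    w = [0.0] * k
    for i in reversed(range(k)):
        w[i] = (M[i][k] - sum(M[i][j] * w[j] for j in range(i + 1, k))) / M[i][i]
    return w

def fit(X, y, top_k, lam):
    """Keep the top_k features by |correlation| with y, then fit ridge on them."""
    p = len(X[0])
    cols = sorted(range(p), key=lambda j: -abs(pearson([r[j] for r in X], y)))[:top_k]
    mx = [sum(r[j] for r in X) / len(X) for j in cols]
    my = sum(y) / len(y)
    Z = [[r[j] - m for j, m in zip(cols, mx)] for r in X]  # centered features
    A = [[sum(z[a] * z[b] for z in Z) + (lam if a == b else 0.0)
          for b in range(top_k)] for a in range(top_k)]
    rhs = [sum(z[a] * (yi - my) for z, yi in zip(Z, y)) for a in range(top_k)]
    return cols, mx, my, solve(A, rhs)

def predict(model, X):
    cols, mx, my, w = model
    return [my + sum(wi * (r[j] - m) for wi, j, m in zip(w, cols, mx)) for r in X]

def bagged_fit(X, y, n_bags, top_k, lam, rng):
    """Bagging: redo selection and fitting on each bootstrap resample."""
    models = []
    for _ in range(n_bags):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        models.append(fit([X[i] for i in idx], [y[i] for i in idx], top_k, lam))
    return models

def bagged_predict(models, X):
    """Average the predictions of all bootstrap models."""
    preds = [predict(m, X) for m in models]
    return [sum(col) / len(models) for col in zip(*preds)]

rng = random.Random(0)
X, y = simulate(n=150, p=50, n_informative=5, effect=1.0, rng=rng)
X_train, y_train, X_test, y_test = X[:100], y[:100], X[100:], y[100:]
models = bagged_fit(X_train, y_train, n_bags=20, top_k=10, lam=1.0, rng=rng)
y_hat = bagged_predict(models, X_test)
print("out-of-sample correlation:", round(pearson(y_hat, y_test), 3))
```

Feature selection is repeated inside every bootstrap resample, so the held-out subjects never influence which features are kept. Lowering `effect` or raising `p` relative to `n` in this toy setup is one way to explore the kind of sample size, feature set size, and effect size trade-offs the abstract describes.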

Details

Language(s):

Dates: Date issued: 2019

Publication Status: Issued

Pages: -

Publishing info: -

Table of Contents: -

Rev. Type: -

Identifiers: ISI: 000478780200033
DOI: 10.1016/j.neuroimage.2019.05.082

Degree: -

Source 1

Title: NEUROIMAGE

Source Genre: Journal

Creator(s):

Affiliations:

Publ. Info: -

Pages: -

Volume / Issue: 199

Sequence Number: -

Start / End Page: 351 - 365

Identifier: ISSN: 1053-8119