  An efficient data partitioning to improve classification performance while keeping parameters interpretable

Korjus, K., Hebart, M. N., & Vicente, R. (2016). An efficient data partitioning to improve classification performance while keeping parameters interpretable. PLoS One, 11(8): e0161788. doi:10.1371/journal.pone.0161788.

Item Permalink: http://hdl.handle.net/21.11116/0000-0005-210D-8 Version Permalink: http://hdl.handle.net/21.11116/0000-0005-210E-7
Genre: Journal Article

Files

Korjus_2016.PDF (Publisher version), 2MB
Name: Korjus_2016.PDF
Description: -
Visibility: Public
MIME-Type / Checksum: application/pdf / [MD5]
Technical Metadata:
Copyright Date: -
Copyright Info: -
License: -

Creators

Creators:
Korjus, Kristjan (1), Author
Hebart, Martin N. (1), Author
Vicente, Raul (1), Author
Affiliations:
(1) External Organizations, ou_persistent22

Content

Free keywords: -
Abstract: Supervised machine learning methods typically require splitting data into multiple chunks for training, validating, and finally testing classifiers. For finding the best parameters of a classifier, training and validation are usually carried out with cross-validation. This is followed by application of the classifier with optimized parameters to a separate test set for estimating the classifier’s generalization performance. With limited data, this separation of test data creates a difficult trade-off between having more statistical power in estimating generalization performance versus choosing better parameters and fitting a better model. We propose a novel approach, which we term “cross-validation and cross-testing,” that improves this trade-off by re-using test data without biasing classifier performance. The novel approach is validated using simulated data and electrophysiological recordings in humans and rodents. The results demonstrate that the approach has a higher probability of discovering significant results than the standard approach of cross-validation and testing, while maintaining the nominal alpha level. In contrast to nested cross-validation, which is maximally efficient in re-using data, the proposed approach additionally maintains the interpretability of individual parameters. Taken together, we suggest an addition to currently used machine learning approaches which may be particularly useful in cases where model weights do not require interpretation, but parameters do.
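The baseline scheme the abstract contrasts against ("cross-validation and testing") can be sketched as follows. This is a minimal illustrative toy, not the authors' code: it uses synthetic two-class Gaussian data and a plain numpy k-nearest-neighbour classifier, selects the parameter k by 5-fold cross-validation on the training portion, and then evaluates once on a held-out test set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data (illustrative assumption): Gaussian blobs
# centred at -1 and +1 in two dimensions.
n = 200
X = np.concatenate([rng.normal(-1.0, 1.0, (n // 2, 2)),
                    rng.normal(+1.0, 1.0, (n // 2, 2))])
y = np.concatenate([np.zeros(n // 2, int), np.ones(n // 2, int)])
perm = rng.permutation(n)
X, y = X[perm], y[perm]

# Standard "cross-validation and testing": hold out a test set, tune
# the parameter by cross-validation on the rest, then test exactly once.
n_test = 50
X_tr, y_tr = X[:-n_test], y[:-n_test]
X_te, y_te = X[-n_test:], y[-n_test:]

def knn_predict(X_train, y_train, X_query, k):
    """Plain k-nearest-neighbour majority vote."""
    d = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    nn = np.argsort(d, axis=1)[:, :k]          # indices of k nearest points
    return (y_train[nn].mean(axis=1) > 0.5).astype(int)

def cv_accuracy(X, y, k, n_folds=5):
    """Mean accuracy of k-NN over n_folds cross-validation folds."""
    folds = np.array_split(np.arange(len(y)), n_folds)
    accs = []
    for fold in folds:
        mask = np.ones(len(y), bool)
        mask[fold] = False                     # held-out validation fold
        pred = knn_predict(X[mask], y[mask], X[fold], k)
        accs.append((pred == y[fold]).mean())
    return float(np.mean(accs))

ks = [1, 3, 5, 7]                              # candidate parameter values
best_k = max(ks, key=lambda k: cv_accuracy(X_tr, y_tr, k))
test_acc = (knn_predict(X_tr, y_tr, X_te, best_k) == y_te).mean()
print(best_k, round(test_acc, 2))
```

With limited data, the 50 samples reserved for testing here cannot inform parameter choice at all, which is exactly the trade-off the paper's cross-validation and cross-testing scheme is designed to soften by re-using test data without bias.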

Details

Language(s): eng - English
Dates: 2016-05-09 / 2016-08-11 / 2016-08-26
Publication Status: Published online
Pages: -
Publishing info: -
Table of Contents: -
Rev. Method: Peer
Identifiers: DOI: 10.1371/journal.pone.0161788
PMID: 27564393
PMC: PMC5001642
Other: eCollection 2016
Degree: -

Source 1

Title: PLoS One
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: San Francisco, CA : Public Library of Science
Pages: -
Volume / Issue: 11 (8)
Sequence Number: e0161788
Start / End Page: -
Identifier: ISSN: 1932-6203
CoNE: https://pure.mpg.de/cone/journals/resource/1000000000277850