Analysis and Optimization of Loss Functions for Multiclass, Top-k, and 
Multilabel Classification

Lapin, Maksim; Hein, Matthias; Schiele, Bernt

Please note that a newer version of this item is available:
https://pure.mpg.de/pubman/item/item_2376679_3

Released

Paper

MPS-Authors

Lapin, Maksim
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

Schiele, Bernt
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

Fulltext (public)

1612.03663.pdf (Preprint), 8MB

Citation

Lapin, M., Hein, M., & Schiele, B. (2016). Analysis and Optimization of Loss Functions for Multiclass, Top-k, and Multilabel Classification. Retrieved from http://arxiv.org/abs/1612.03663.

Cite as: https://hdl.handle.net/11858/00-001M-0000-002C-2985-3

Abstract

Top-k error is currently a popular performance measure on large-scale image classification benchmarks such as ImageNet and Places. Despite its wide acceptance, our understanding of this metric is limited, as most previous research has focused on its special case, the top-1 error. In this work, we explore two directions that shed more light on the top-k error. First, we provide an in-depth analysis of established and recently proposed single-label multiclass methods along with a detailed account of efficient optimization algorithms for them. Our results indicate that the softmax loss and the smooth multiclass SVM are surprisingly competitive in top-k error uniformly across all k, which can be explained by our analysis of multiclass top-k calibration. Further improvements for a specific k are possible with a number of proposed top-k loss functions. Second, we use the top-k methods to explore the transition from multiclass to multilabel learning. In particular, we find that it is possible to obtain effective multilabel classifiers on Pascal VOC using a single label per image for training, while the gap between multiclass and multilabel methods on MS COCO is more significant. Finally, our contribution of efficient algorithms for training with the considered top-k and multilabel loss functions is of independent interest.
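
To make the metric discussed in the abstract concrete, the following is a minimal sketch of how top-k error could be computed from per-class scores. It is not taken from the paper: the function name `top_k_error`, the example scores, and the tie-handling rule (ties are resolved in favor of the true class) are all assumptions made for illustration.

```python
def top_k_error(scores, labels, k):
    """Fraction of samples whose true class is not among the k highest scores.

    scores: list of per-sample score lists, one score per class (hypothetical data).
    labels: list of true class indices, one per sample.
    Assumption: ties are resolved in favor of the true class, i.e. a sample
    counts as an error only if at least k classes score strictly higher.
    """
    errors = 0
    for row, y in zip(scores, labels):
        # Count classes that score strictly higher than the true class.
        higher = sum(1 for s in row if s > row[y])
        if higher >= k:
            errors += 1
    return errors / len(labels)


# Hypothetical scores for 3 samples over 3 classes.
scores = [
    [0.1, 0.7, 0.2],  # true class 2 is ranked second
    [0.5, 0.3, 0.2],  # true class 0 is ranked first
    [0.2, 0.3, 0.5],  # true class 0 is ranked third
]
labels = [2, 0, 0]

for k in (1, 2, 3):
    print(k, top_k_error(scores, labels, k))
```

On this toy data, two of the three samples are top-1 errors, one is a top-2 error, and none is a top-3 error, which illustrates that top-k error is non-increasing in k and reduces to the usual classification error at k = 1.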