Interactive visual labelling versus active learning: an experimental comparison

Chegini, Mohammad; Bernard, Jurgen; Cui, Jian; Chegini, Fatemeh; Sourin, Alexei; Andrews, Keith; Schreck, Tobias

doi:10.1631/FITEE.1900549

Chegini, M., Bernard, J., Cui, J., Chegini, F., Sourin, A., Andrews, K., et al. (2020). Interactive visual labelling versus active learning: an experimental comparison. Frontiers of Information Technology & Electronic Engineering, 21, 524-535. doi:10.1631/FITEE.1900549.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/21.11116/0000-0006-633F-5

Version Permalink: https://hdl.handle.net/21.11116/0000-0007-7938-3

Genre: Journal Article

Creators

Creators:
Chegini, Mohammad¹, Author
Bernard, Jurgen¹, Author
Cui, Jian¹, Author
Chegini, Fatemeh², Author
Sourin, Alexei¹, Author
Andrews, Keith¹, Author
Schreck, Tobias¹, Author

Affiliations:
¹ external, ou_persistent22
² Ocean Biogeochemistry, The Ocean in the Earth System, MPI for Meteorology, Max Planck Society, Bundesstraße 53, 20146 Hamburg, DE, ou_913556

Content

Free keywords: -

Abstract: Methods from supervised machine learning allow new data to be classified automatically and are tremendously helpful for data analysis. The quality of supervised machine learning depends not only on the type of algorithm used, but also on the quality of the labelled dataset used to train the classifier. Labelling instances in a training dataset is often done manually, relying on selections and annotations by expert analysts, and is often a tedious and time-consuming process. Active learning algorithms can automatically determine a subset of data instances for which labels would provide useful input to the learning process. Interactive visual labelling techniques are a promising alternative, providing effective visual overviews from which an analyst can simultaneously explore data records and select items to label. By putting the analyst in the loop, higher accuracy can be achieved in the resulting classifier. While initial results of interactive visual labelling techniques are promising in the sense that user labelling can improve supervised learning, many aspects of these techniques are still largely unexplored. This paper presents a study conducted using the mVis tool to compare three interactive visualisations (similarity map, scatterplot matrix (SPLOM), and parallel coordinates) with each other and with active learning for the purpose of labelling a multivariate dataset. The results show that all three interactive visual labelling techniques surpass active learning algorithms in terms of classifier accuracy, and that users subjectively prefer the similarity map over SPLOM and parallel coordinates for labelling. Users also employ different labelling strategies depending on the visualisation used.

Details

Language(s): eng - English

Dates: Published Online: 2020; Date issued: 2020

Publication Status: Issued

Pages: -

Publishing info: -

Table of Contents: -

Rev. Type: Peer

Identifiers: DOI: 10.1631/FITEE.1900549

Degree: -

Source 1

Title: Frontiers of Information Technology & Electronic Engineering

Source Genre: Journal

Creator(s):

Affiliations:

Publ. Info: Springer & Zhejiang University Press

Pages: -

Volume / Issue: 21

Sequence Number: -

Start / End Page: 524 - 535

Identifier: ISSN: 2095-9184