  Comparing Search Strategies of Humans and Machines in Clutter

Michaelis, C., Weller, M., Funke, C., Ecker, A., Wallis, T., & Bethge, M. (2019). Comparing Search Strategies of Humans and Machines in Clutter. Poster presented at Nineteenth Annual Meeting of the Vision Sciences Society (VSS 2019), St. Pete Beach, FL, USA. doi:10.1167/19.10.309c.


Creators

 Creators:
Michaelis, C, Author
Weller, M, Author
Funke, C, Author
Ecker, AS, Author
Wallis, TSA, Author
Bethge, M1, 2, Author
Affiliations:
1Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497794
2Research Group Computational Vision and Neuroscience, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497805

Content

Free keywords: -
Abstract: While many perceptual tasks become more difficult in the presence of clutter, the human visual system has, in general, evolved tolerance to cluttered environments. In contrast, current machine learning approaches struggle in the presence of clutter. We compare human observers and CNNs on two target localization tasks with cluttered images created from characters or rendered objects. Each task sample consists of such a cluttered image as well as a separate image of one object that has to be localized. Human observers are asked to report whether the object lies in the left or right half of the image, and accuracy, reaction time, and eye movements are recorded. CNNs are trained to segment the object, and the center of mass of the segmentation mask is then used to predict its position. Clutter levels are defined by the set size, which ranges from 2 to 256 objects per image. We find that for humans, processing times increase with the amount of clutter, while for machine learning models, accuracy drops. This points to a critical difference between human and machine processing: humans search serially, whereas current machine learning models typically process a whole image in one pass. Following this line of thought, we show that machine learning models with two iterations of processing perform significantly better than the purely feed-forward CNNs that dominate current object recognition applications. This finding suggests that, when confronted with challenging scenes, iterative processing might be just as important for machines as it is for humans.
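The localization rule described in the abstract (take the center of mass of the predicted segmentation mask, then report which image half it falls in) can be sketched as below. This is a minimal illustration assuming a NumPy binary mask; the function name localize_from_mask is a hypothetical placeholder, not the authors' code.

    import numpy as np

    def localize_from_mask(seg_mask: np.ndarray) -> str:
        """Predict whether the target lies in the left or right image half
        from a binary segmentation mask of shape (H, W)."""
        ys, xs = np.nonzero(seg_mask)   # pixel coordinates predicted as target
        if xs.size == 0:
            return "left"               # no target pixels predicted; arbitrary fallback
        center_x = xs.mean()            # x-coordinate of the mask's center of mass
        return "left" if center_x < seg_mask.shape[1] / 2 else "right"

    # Toy example: a 4x8 mask with the target in the right half.
    mask = np.zeros((4, 8), dtype=int)
    mask[1:3, 5:7] = 1
    print(localize_from_mask(mask))     # -> "right"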

Details

Language(s):
 Dates: 2019-05, 2019-09
 Publication Status: Issued
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: DOI: 10.1167/19.10.309c
 Degree: -

Event

Title: Nineteenth Annual Meeting of the Vision Sciences Society (VSS 2019)
Place of Event: St. Pete Beach, FL, USA
Start-/End Date: 2019-05-17 - 2019-05-22

Source 1

Title: Journal of Vision
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: Charlottesville, VA : Scholar One, Inc.
Pages: -
Volume / Issue: 19 (10)
Sequence Number: 63.412
Start / End Page: 309 - 310
Identifier: ISSN: 1534-7362
CoNE: https://pure.mpg.de/cone/journals/resource/111061245811050