Person Recognition in Social Media Photos

Oh, Seong Joon; Benenson, Rodrigo; Fritz, Mario; Schiele, Bernt

Please note that a newer version of this item is available:
https://pure.mpg.de/pubman/item/item_2537222_2

Released

Paper

MPS-Authors

Oh, Seong Joon
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

Benenson, Rodrigo
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

Fritz, Mario
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

Schiele, Bernt
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

Fulltext (public)

arXiv:1710.03224.pdf
(Preprint), 18MB

Citation

Oh, S. J., Benenson, R., Fritz, M., & Schiele, B. (2017). Person Recognition in Social Media Photos. Retrieved from http://arxiv.org/abs/1710.03224.

Cite as: https://hdl.handle.net/21.11116/0000-0000-4342-A

Abstract

People nowadays share large parts of their personal lives through social media. Being able to automatically recognise people in personal photos may greatly enhance user convenience by easing photo album organisation. For the human identification task, however, the traditional focus of computer vision has been face recognition and pedestrian re-identification. Person recognition in social media photos sets new challenges for computer vision, including non-cooperative subjects (e.g. backward viewpoints, unusual poses) and great changes in appearance. To tackle this problem, we build a simple person recognition framework that leverages convnet features from multiple image regions (head, body, etc.). We propose new recognition scenarios that focus on the time and appearance gap between training and testing samples. We present an in-depth analysis of the importance of different features according to time and viewpoint generalisability. In the process, we verify that our simple approach achieves the state-of-the-art result on the PIPA benchmark, arguably the largest social media based benchmark for person recognition to date, with diverse poses, viewpoints, social groups, and events. Compared to the conference version of the paper, this paper additionally presents (1) an analysis of a face recogniser (DeepID2+), (2) a new method, naeil2, that combines the conference version method naeil and DeepID2+ to achieve state-of-the-art results even compared to post-conference works, (3) a discussion of related work since the conference version, (4) additional analysis including the head viewpoint-wise breakdown of performance, and (5) results on the open-world setup.
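
The idea of combining features from multiple image regions can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that per-region feature vectors (here tiny hand-written lists standing in for convnet features) are already available, and it swaps in a nearest-centroid classifier purely for illustration, since the abstract does not specify the classifier. All function names and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def l2_normalise(vec):
    """Scale a vector to unit length so that no region dominates by magnitude."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine_regions(region_features, regions=("head", "body")):
    """Concatenate the normalised features of each region into one descriptor."""
    combined = []
    for name in regions:
        combined.extend(l2_normalise(region_features[name]))
    return combined


def fit_centroids(samples):
    """Average the combined descriptors per identity (illustrative classifier)."""
    sums = {}
    counts = defaultdict(int)
    for identity, region_features in samples:
        vec = combine_regions(region_features)
        if identity not in sums:
            sums[identity] = [0.0] * len(vec)
        sums[identity] = [s + v for s, v in zip(sums[identity], vec)]
        counts[identity] += 1
    return {k: [s / counts[k] for s in v] for k, v in sums.items()}


def predict(centroids, region_features):
    """Return the identity whose centroid is closest in squared Euclidean distance."""
    vec = combine_regions(region_features)

    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))

    return min(centroids, key=lambda k: sq_dist(centroids[k]))


# Toy training set: two hypothetical identities with 2-D features per region.
train = [
    ("alice", {"head": [1.0, 0.0], "body": [0.9, 0.1]}),
    ("alice", {"head": [0.9, 0.1], "body": [1.0, 0.0]}),
    ("bob", {"head": [0.0, 1.0], "body": [0.1, 0.9]}),
]
centroids = fit_centroids(train)

# A query whose head feature is ambiguous (e.g. a back view) but whose
# body feature still resembles one identity.
query = {"head": [0.5, 0.5], "body": [0.2, 0.8]}
prediction = predict(centroids, query)
```

In this toy example the head feature alone cannot separate the two identities, so the body feature decides the outcome. This mirrors the motivation stated in the abstract for using regions beyond the face when subjects are non-cooperative.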