Free keywords:
Computer Science, Computer Vision and Pattern Recognition, cs.CV
Abstract:
We present an approach for real-time, robust, and accurate hand pose
estimation from moving egocentric RGB-D cameras in cluttered real environments.
Existing methods typically fail for hand-object interactions in cluttered
scenes imaged from egocentric viewpoints, which are common in virtual or
augmented reality applications. Our approach uses two Convolutional Neural
Networks (CNNs), applied in sequence, to localize the hand and regress 3D joint locations.
Hand localization is achieved by using a CNN to estimate the 2D position of the
hand center in the input, even in the presence of clutter and occlusions. The
localized hand position, together with the corresponding input depth value, is
used to generate a normalized cropped image that is fed into a second CNN to
regress relative 3D hand joint locations in real time. For added accuracy,
robustness, and temporal stability, we refine the pose estimates using a
kinematic pose tracking energy. To train the CNNs, we introduce a new
photorealistic dataset that uses a merged reality approach to capture and
synthesize large amounts of annotated data of natural hand interaction in
cluttered scenes. Through quantitative and qualitative evaluation, we show that
our method is robust to self-occlusions and occlusions by objects, particularly
from moving egocentric viewpoints.
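
To make the two-stage pipeline concrete, the sketch below shows how the first CNN's 2D hand-center estimate and its depth value can be turned into a normalized crop for the second CNN, and how the regressed hand-relative 3D joints are shifted back into camera coordinates. This is an illustrative reconstruction from the abstract, not the authors' code; the function names, the 0.3 m crop extent, the 128-pixel input resolution, and the pinhole intrinsics (fx, fy, cx, cy) are assumptions.

import numpy as np

def normalized_crop(depth_m, center_uv, center_z, fx, fy,
                    crop_size_m=0.3, out_px=128):
    # Cut a depth patch around the 2D hand center predicted by the first CNN
    # and normalize it for the second (joint-regression) CNN.
    u, v = center_uv
    # Project the assumed metric crop extent into pixels at the hand's depth.
    half_u = max(1, int(round(0.5 * crop_size_m * fx / center_z)))
    half_v = max(1, int(round(0.5 * crop_size_m * fy / center_z)))
    h, w = depth_m.shape
    u0, u1 = max(0, u - half_u), min(w, u + half_u)
    v0, v1 = max(0, v - half_v), min(h, v + half_v)
    patch = depth_m[v0:v1, u0:u1].astype(np.float32)

    # Express depth relative to the hand center and clip to the crop cube so the
    # second CNN sees a translation-invariant input, scaled to [-1, 1].
    patch = np.clip(patch - center_z, -0.5 * crop_size_m, 0.5 * crop_size_m)
    patch /= 0.5 * crop_size_m

    # Resize to the network's fixed input resolution (nearest-neighbour sampling
    # keeps this example dependency-free).
    rows = np.linspace(0, patch.shape[0] - 1, out_px).astype(int)
    cols = np.linspace(0, patch.shape[1] - 1, out_px).astype(int)
    return patch[np.ix_(rows, cols)]

def joints_to_camera(rel_joints_m, center_uv, center_z, fx, fy, cx, cy):
    # Shift joint positions predicted relative to the hand center back into
    # absolute 3D camera coordinates via pinhole back-projection of the center.
    u, v = center_uv
    center_xyz = np.array([(u - cx) * center_z / fx,
                           (v - cy) * center_z / fy,
                           center_z], dtype=np.float32)
    return rel_joints_m + center_xyz  # (num_joints, 3) + (3,)

Cropping in metric units at the hand's depth, rather than at a fixed pixel size, keeps the second CNN's input roughly invariant to how far the hand is from the camera, which is what the depth-based normalization in the abstract is for.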
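
The kinematic refinement is only named in the abstract; a generic per-frame tracking energy of the following shape illustrates what such a stage typically minimizes (the exact terms and weights used in the paper may differ). Here \theta denotes the kinematic pose parameters of a hand model, p_j(\theta) the joint positions obtained by forward kinematics, \hat{p}_j the joints regressed by the second CNN, and \theta^{t-1} the pose from the previous frame:

E(\theta) = \sum_{j} \left\| p_j(\theta) - \hat{p}_j \right\|_2^2 + w_{\mathrm{limit}}\, E_{\mathrm{limit}}(\theta) + w_{\mathrm{temp}}\, \left\| \theta - \theta^{t-1} \right\|_2^2

Fitting a kinematic hand model to the per-frame CNN predictions enforces valid joint limits and bone lengths, and the temporal term ties consecutive frames together, which accounts for the added accuracy, robustness, and temporal stability claimed above.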