DeepCap: Monocular Human Performance Capture Using Weak Supervision

Habermann, Marc; Xu, Weipeng; Zollhöfer, Michael; Pons-Moll, Gerard; Theobalt, Christian

Habermann, M., Xu, W., Zollhöfer, M., Pons-Moll, G., & Theobalt, C. (2020). DeepCap: Monocular Human Performance Capture Using Weak Supervision. Retrieved from https://arxiv.org/abs/2003.08325.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/21.11116/0000-0007-E010-9
Version Permalink: https://hdl.handle.net/21.11116/0000-0007-E011-8

Genre: Paper

LaTeX: {DeepCap}: {M}onocular Human Performance Capture Using Weak Supervision

Files

arXiv:2003.08325.pdf (Preprint), 3MB

File Permalink:
https://hdl.handle.net/21.11116/0000-0007-E012-7

Name:
arXiv:2003.08325.pdf

Description:
File downloaded from arXiv at 2021-02-03 07:46

OA-Status:

Visibility:
Public

MIME-Type / Checksum:
application/pdf / [MD5]

Copyright Date:
-

Copyright Info:
-

License:
http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Creators

Creators:
Habermann, Marc¹, Author
Xu, Weipeng¹, Author
Zollhöfer, Michael², Author
Pons-Moll, Gerard³, Author
Theobalt, Christian¹, Author

Affiliations:
¹ Computer Graphics, MPI for Informatics, Max Planck Society, ou_40047
² External Organizations, ou_persistent22
³ Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society, ou_persistent22

Content

Free keywords: Computer Science, Computer Vision and Pattern Recognition, cs.CV

Abstract: Human performance capture is a highly important computer vision problem with
many applications in movie production and virtual/augmented reality. Many
previous performance capture approaches either required expensive multi-view
setups or did not recover dense space-time coherent geometry with
frame-to-frame correspondences. We propose a novel deep learning approach for
monocular dense human performance capture. Our method is trained in a weakly
supervised manner based on multi-view supervision completely removing the need
for training data with 3D ground truth annotations. The network architecture is
based on two separate networks that disentangle the task into a pose estimation
and a non-rigid surface deformation step. Extensive qualitative and
quantitative evaluations show that our approach outperforms the state of the
art in terms of quality and robustness.

Details

Language(s): eng - English

Dates: Created: 2020-03-18; Published Online: 2020

Publication Status: Published online

Pages: 12 p.

Publishing info: -

Table of Contents: -

Rev. Type: -

Identifiers: arXiv: 2003.08325
BibTex Citekey: Habermann2003.08325
URI: https://arxiv.org/abs/2003.08325

Degree: -
