Conference Paper

MoCapDeform: Monocular 3D Human Motion Capture in Deformable Scenes

MPS-Authors

Li, Zhi
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society

Shimada, Soshi
Visual Computing and Artificial Intelligence, MPI for Informatics, Max Planck Society

Schiele, Bernt
Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society

Theobalt, Christian
Visual Computing and Artificial Intelligence, MPI for Informatics, Max Planck Society

Golyanik, Vladislav
Visual Computing and Artificial Intelligence, MPI for Informatics, Max Planck Society

Fulltext (public)

arXiv:2208.08439.pdf (Preprint), 5 MB

Citation

Li, Z., Shimada, S., Schiele, B., Theobalt, C., & Golyanik, V. (in press). MoCapDeform: Monocular 3D Human Motion Capture in Deformable Scenes. In International Conference on 3D Vision. Piscataway, NJ: IEEE.


Cite as: https://hdl.handle.net/21.11116/0000-000B-9CB9-5
Abstract
3D human motion capture from monocular RGB images that respects interactions of a subject with complex and possibly deformable environments is a very challenging, ill-posed and under-explored problem. Existing methods address it only weakly and do not model the surface deformations that often occur when humans interact with scene surfaces. In contrast, this paper proposes MoCapDeform, a new framework for monocular 3D human motion capture that is the first to explicitly model non-rigid deformations of a 3D scene for improved 3D human pose estimation and deformable environment reconstruction. MoCapDeform accepts a monocular RGB video and a 3D scene mesh aligned in camera space. It first localises the subject in the input video along with dense contact labels using a new raycasting-based strategy. Next, human-environment interaction constraints are leveraged to jointly optimise global 3D human poses and non-rigid surface deformations. MoCapDeform achieves higher accuracy than competing methods on several datasets, including our newly recorded one with deforming background scenes.
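
To make the raycasting-based contact labelling mentioned in the abstract more concrete, below is a minimal illustrative sketch in Python. It assumes trimesh for ray-mesh intersection; the function name label_contacts, the contact threshold tau, and the inputs scene_mesh and body_verts are hypothetical, and the sketch is not the authors' actual implementation.

import numpy as np
import trimesh

def label_contacts(scene_mesh, body_verts, cam_origin=np.zeros(3), tau=0.02):
    """Label each estimated body vertex as in contact with the scene.

    scene_mesh : trimesh.Trimesh aligned in camera space (assumed input)
    body_verts : (N, 3) array of estimated 3D body vertices (assumed input)
    tau        : contact distance threshold in metres (assumed value)
    """
    # Cast one ray from the camera centre through every body vertex.
    dirs = body_verts - cam_origin
    dirs = dirs / np.linalg.norm(dirs, axis=1, keepdims=True)
    origins = np.tile(cam_origin, (len(body_verts), 1))

    # First intersection of each ray with the scene mesh.
    locations, ray_ids, _ = scene_mesh.ray.intersects_location(
        origins, dirs, multiple_hits=False)

    # A vertex counts as a contact when it lies within tau of the
    # scene surface point hit by its ray.
    labels = np.zeros(len(body_verts), dtype=bool)
    dists = np.linalg.norm(body_verts[ray_ids] - locations, axis=1)
    labels[ray_ids] = dists < tau
    return labels

The joint optimisation of global body pose and non-rigid scene deformation described in the abstract would then consume such labels, using contact points as human-environment correspondences in its objective.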