Released

Paper

High-Fidelity Neural Human Motion Transfer from Monocular Video

MPS-Authors

Golyanik, Vladislav (/persons/resource/persons239654)
Computer Graphics, MPI for Informatics, Max Planck Society

Elgharib, Mohamed (/persons/resource/persons229949)
Computer Graphics, MPI for Informatics, Max Planck Society

Seidel, Hans-Peter (/persons/resource/persons45449)
Computer Graphics, MPI for Informatics, Max Planck Society

Theobalt, Christian (/persons/resource/persons45610)
Computer Graphics, MPI for Informatics, Max Planck Society

Fulltext (public)

arXiv:2012.10974.pdf (Preprint), 25 MB

Citation

Kappel, M., Golyanik, V., Elgharib, M., Henningson, J.-O., Seidel, H.-P., Castillo, S., et al. (2020). High-Fidelity Neural Human Motion Transfer from Monocular Video. Retrieved from https://arxiv.org/abs/2012.10974.


Cite as: https://hdl.handle.net/21.11116/0000-0007-B715-3
Abstract
Video-based human motion transfer creates video animations of humans following a source motion. Current methods show remarkable results for tightly-clad subjects. However, the lack of temporally consistent handling of plausible clothing dynamics, including fine and high-frequency details, significantly limits the attainable visual quality. We address these limitations for the first time in the literature and present a new framework that performs high-fidelity and temporally consistent human motion transfer with natural pose-dependent non-rigid deformations, for several types of loose garments. In contrast to previous techniques, we perform image generation in three consecutive stages, synthesizing human shape, structure, and appearance. Given a monocular RGB video of an actor, we train a stack of recurrent deep neural networks that generate these intermediate representations from 2D poses and their temporal derivatives. Splitting the difficult motion-transfer problem into subtasks that are aware of the temporal motion context helps us synthesize results with plausible dynamics and pose-dependent detail. It also allows artistic control over the results by manipulating individual framework stages. In our experiments, we significantly outperform the state of the art in terms of video realism. Our code and data will be made publicly available.
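
To make the stage decomposition in the abstract more concrete, here is a minimal, hypothetical Python (PyTorch) sketch of a stacked recurrent pipeline that maps a 2D-pose sequence to shape, structure, and appearance maps in turn. The module names (RecurrentStage, MotionTransferPipeline), resolutions, layer choices, and tensor shapes are illustrative assumptions only and do not reflect the authors' actual implementation.

# Hypothetical sketch of the three-stage pipeline outlined in the abstract:
# pose sequence -> shape -> structure -> appearance. All names, shapes, and
# layer choices are assumptions for illustration, not the authors' code.
import torch
import torch.nn as nn


class RecurrentStage(nn.Module):
    """One stage: a GRU over per-frame pose features, followed by a linear
    decoder that produces an image-space map per frame."""

    def __init__(self, in_dim, hidden_dim=256, out_channels=3, size=64):
        super().__init__()
        self.size = size
        self.out_channels = out_channels
        self.rnn = nn.GRU(in_dim, hidden_dim, batch_first=True)
        self.to_map = nn.Linear(hidden_dim, out_channels * size * size)

    def forward(self, seq):
        # seq: (batch, time, in_dim), e.g. 2D joint positions and their
        # temporal derivatives, flattened per frame.
        h, _ = self.rnn(seq)                      # (batch, time, hidden_dim)
        maps = self.to_map(h)                     # (batch, time, C*H*W)
        b, t, _ = maps.shape
        return maps.view(b, t, self.out_channels, self.size, self.size)


class MotionTransferPipeline(nn.Module):
    """Stacks three recurrent stages: shape (silhouette-like map), structure
    (e.g. garment/body part layout), and appearance (RGB). Each later stage
    also consumes the flattened output of the previous stage."""

    def __init__(self, pose_dim):
        super().__init__()
        self.shape_net = RecurrentStage(pose_dim, out_channels=1)
        self.structure_net = RecurrentStage(pose_dim + 1 * 64 * 64, out_channels=8)
        self.appearance_net = RecurrentStage(pose_dim + 8 * 64 * 64, out_channels=3)

    def forward(self, poses):
        shape = self.shape_net(poses)                                   # (b, t, 1, 64, 64)
        structure = self.structure_net(
            torch.cat([poses, shape.flatten(2)], dim=-1))               # (b, t, 8, 64, 64)
        rgb = self.appearance_net(
            torch.cat([poses, structure.flatten(2)], dim=-1))           # (b, t, 3, 64, 64)
        return shape, structure, rgb


# Toy usage: 25 joints with (x, y) positions plus temporal derivatives -> 100-D input.
poses = torch.randn(2, 16, 100)
shape, structure, rgb = MotionTransferPipeline(pose_dim=100)(poses)
print(shape.shape, structure.shape, rgb.shape)

The point of the sketch is only the cascaded, temporally aware design: each stage sees the pose sequence (and the previous stage's output), so errors can be corrected progressively and individual stages can be manipulated for artistic control, as described in the abstract.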