Pose-Guided Human Animation from a Single Image in the Wild

Yoon, Jae Shin; Liu, Lingjie; Golyanik, Vladislav; Sarkar, Kripasindhu; Park, Hyun Soo; Theobalt, Christian

Yoon, J. S., Liu, L., Golyanik, V., Sarkar, K., Park, H. S., & Theobalt, C. (2020). Pose-Guided Human Animation from a Single Image in the Wild. Retrieved from https://arxiv.org/abs/2012.03796.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/21.11116/0000-0007-E9F3-0
Version Permalink: https://hdl.handle.net/21.11116/0000-0007-E9F4-F

Genre: Paper

Files

arXiv:2012.03796.pdf (Preprint), 3MB

File Permalink:
https://hdl.handle.net/21.11116/0000-0007-E9F5-E

Name:
arXiv:2012.03796.pdf

Description:
File downloaded from arXiv at 2021-02-08 14:04

OA-Status:

Visibility:
Public

MIME-Type / Checksum:
application/pdf / [MD5]

Copyright Date:
-

Copyright Info:
-

License:
http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Creators

Creators:
Yoon, Jae Shin¹, Author
Liu, Lingjie², Author
Golyanik, Vladislav², Author
Sarkar, Kripasindhu¹, Author
Park, Hyun Soo¹, Author
Theobalt, Christian², Author

Affiliations:
¹ External Organizations, ou_persistent22
² Computer Graphics, MPI for Informatics, Max Planck Society, ou_40047

Content

Free keywords: Computer Science, Computer Vision and Pattern Recognition, cs.CV

Abstract: We present a new pose transfer method for synthesizing a human animation from
a single image of a person, controlled by a sequence of body poses. Existing
pose transfer methods exhibit significant visual artifacts when applied to a
novel scene, resulting in temporal inconsistency and failure to preserve the
identity and textures of the person. To address these limitations, we design a
compositional neural network that predicts the silhouette, garment labels, and
textures. Each modular network is explicitly dedicated to a subtask that can be
learned from synthetic data. At inference time, we use the trained
network to produce a unified representation of appearance and its labels in UV
coordinates, which remains constant across poses. The unified representation
provides incomplete yet strong guidance for generating the appearance in
response to the pose change. We use the trained network to complete the
appearance and render it with the background. With these strategies, we are
able to synthesize human animations that preserve the identity and
appearance of the person in a temporally coherent way without any fine-tuning
of the network on the testing scene. Experiments show that our method
outperforms the state of the art in terms of synthesis quality, temporal
coherence, and generalization ability.
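
To illustrate why an appearance map stored in UV coordinates stays constant across poses yet gives only incomplete guidance, here is a minimal NumPy sketch. It is not the authors' network. The function names `scatter_to_uv` and `gather_from_uv`, the tiny texture size, and the per-pixel UV input (assumed to come from some dense body-surface estimator) are all hypothetical. Colours visible in the source image are written into a texture map, and a new pose is rendered by looking each body pixel up in that map. Surface points never seen in the source stay unfilled, and a learned model would have to complete them.

```python
import numpy as np

def _texel_index(uv_values, size):
    # Map continuous UV coordinates in [0, 1] to integer texel indices.
    return np.clip((uv_values * size).astype(int), 0, size - 1)

def scatter_to_uv(image, uv, mask, size=4):
    """Accumulate visible body pixels into a UV texture map.

    image: (H, W, 3) colours, uv: (H, W, 2) per-pixel (u, v), mask: (H, W) bool.
    Returns the texture and a boolean map of texels that received any colour.
    """
    tex_sum = np.zeros((size, size, 3), dtype=np.float64)
    count = np.zeros((size, size), dtype=np.int64)
    ys, xs = np.nonzero(mask)
    u = _texel_index(uv[ys, xs, 0], size)
    v = _texel_index(uv[ys, xs, 1], size)
    # np.add.at handles several pixels landing on the same texel.
    np.add.at(tex_sum, (v, u), image[ys, xs])
    np.add.at(count, (v, u), 1)
    filled = count > 0
    tex = np.zeros_like(tex_sum)
    tex[filled] = tex_sum[filled] / count[filled][:, None]
    return tex, filled

def gather_from_uv(tex, filled, uv, mask):
    """Render a target pose by sampling the pose-independent texture map.

    Returns the partial rendering and a mask of pixels whose texel was known.
    """
    size = tex.shape[0]
    out = np.zeros(mask.shape + (3,), dtype=np.float64)
    known = np.zeros(mask.shape, dtype=bool)
    ys, xs = np.nonzero(mask)
    u = _texel_index(uv[ys, xs, 0], size)
    v = _texel_index(uv[ys, xs, 1], size)
    out[ys, xs] = tex[v, u]
    known[ys, xs] = filled[v, u]
    return out, known
```

In this sketch the texture is built once from the source image and reused for every target pose. Only the UV lookup changes from frame to frame, which is one plausible reading of why such a representation helps temporal coherence.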

Details

Language(s): eng - English

Dates: Created: 2020-12-07; Published Online: 2020

Publication Status: Published online

Pages: 14 p.

Publishing info: -

Table of Contents: -

Rev. Type: -

Identifiers: arXiv: 2012.03796
URI: https://arxiv.org/abs/2012.03796
BibTex Citekey: Yoon_2012.03796

Degree: -
