Free keywords:
Computer Science, Computer Vision and Pattern Recognition, cs.CV
Abstract:
Photo-realistic re-rendering of a human from a single image with explicit
control over body pose, shape and appearance enables a wide range of
applications, such as human appearance transfer, virtual try-on, motion
imitation, and novel view synthesis. While significant progress has been made
in this direction using learning-based image generation tools, such as GANs,
existing approaches yield noticeable artefacts, such as blurred fine
details, unrealistic distortions of body parts and garments, and severe
texture changes. We therefore propose StylePoseGAN, a new method for
synthesising photo-realistic human images with explicit control over pose and
part-based appearance, in which we extend a non-controllable
generator to accept separate conditioning on pose and appearance. Our network
can be trained in a fully supervised way with human images to disentangle pose,
appearance and body parts, and it significantly outperforms existing single
image re-rendering methods. Our disentangled representation opens up further
applications such as garment transfer, motion transfer, virtual try-on, head
(identity) swap and appearance interpolation. StylePoseGAN achieves
state-of-the-art image generation fidelity on common perceptual metrics
compared to the current best-performing methods and is judged convincing in a
comprehensive user study.
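
The disentanglement described above — encoding pose and appearance separately
and merging them only inside the generator, so that either code can be swapped
at inference time (e.g. for garment or motion transfer) — can be illustrated
with a toy sketch. This is not the authors' implementation: all weights are
random stand-ins, and the style-modulation step only loosely mimics the
conditioning used in StyleGAN-like generators.

```python
import numpy as np

# Toy sketch of disentangled pose/appearance conditioning (hypothetical,
# not the StylePoseGAN architecture): two independent encodings feed one
# generator, so appearance can be swapped while pose is held fixed.
rng = np.random.default_rng(0)
POSE_DIM, APP_DIM, OUT_DIM = 6, 4, 5

W_pose = rng.standard_normal((POSE_DIM, OUT_DIM))  # pose pathway (random stand-in)
W_app = rng.standard_normal((APP_DIM, OUT_DIM))    # appearance pathway (random stand-in)

def generate(pose_code, app_code):
    # Pose features modulated by an appearance-derived "style" vector,
    # loosely mimicking style-based conditioning.
    pose_feat = pose_code @ W_pose
    style = np.tanh(app_code @ W_app)
    return pose_feat * (1.0 + style)

pose_a = rng.standard_normal(POSE_DIM)
app_x = rng.standard_normal(APP_DIM)
app_y = rng.standard_normal(APP_DIM)

img_ax = generate(pose_a, app_x)
img_ay = generate(pose_a, app_y)  # same pose, different appearance
print(img_ax.shape)                 # output dimensionality is unchanged
print(np.allclose(img_ax, img_ay))  # appearance swap changes the output
```

Because the two codes enter the generator through separate pathways, holding
one fixed while varying the other is exactly the mechanism that enables the
applications listed in the abstract (garment transfer, head swap,
appearance interpolation).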