Keywords:
Computer Science, Computer Vision and Pattern Recognition, cs.CV, Computer Science, Graphics, cs.GR, Computer Science, Learning, cs.LG
Abstract:
Can we make virtual characters in a scene interact with their surrounding
objects through simple instructions? Is it possible to synthesize such motion
plausibly with a diverse set of objects and instructions? Inspired by these
questions, we present the first framework to synthesize the full-body motion of
virtual human characters performing specified actions with 3D objects placed
within their reach. Our system takes as input textual instructions specifying
the objects and the associated intentions of the virtual characters and outputs
diverse sequences of full-body motions. This is in contrast to existing work,
where full-body action synthesis methods generally do not consider object
interactions, and human-object interaction methods focus mainly on synthesizing
hand or finger movements for grasping objects. We accomplish our objective by
designing an intent-driven full-body motion generator, which uses a pair of
decoupled conditional variational autoencoders (CVAEs) to learn the motion of
the body parts in an autoregressive manner. We also optimize for the positions
of the objects with six degrees of freedom (6DoF) such that they plausibly fit
within the hands of the synthesized characters. We compare our proposed method
with the existing methods of motion synthesis and establish a new and stronger
state of the art for the task of intent-driven motion synthesis. Through a user
study, we further show that our synthesized full-body motions appear more
realistic to the participants in more than 80% of scenarios compared to the
current state-of-the-art methods, and are perceived to be as good as the ground
truth on several occasions.