Abstract:
Human motion capture data has been widely used in data-driven character animation. In
order to generate realistic, natural-looking motions, most data-driven approaches require
considerable pre-processing effort, including motion segmentation and annotation.
Existing (semi-)automatic solutions either require hand-crafted features for
motion segmentation or do not produce the semantic annotations required for motion
synthesis and for building large-scale motion databases. This thesis develops a
semi-automatic framework for semantic segmentation of motion capture data based on
(semi-)supervised machine learning techniques. The motion capture data is
first transformed into a “motion image” so that common convolutional neural networks
for image segmentation can be applied. Convolutions over the time domain extract
temporal information, and dilated convolutions enlarge the receptive field
exponentially with depth while using comparatively few layers and parameters. The resulting dilated
temporal fully-convolutional model is compared against state-of-the-art models in action
segmentation, as well as a popular network for sequence modeling. The models are
further tested with noisy and inaccurate training labels, and the developed model is found
to be surprisingly robust and self-correcting.
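
The claim about dilated convolutions can be illustrated with a small calculation. The sketch below is not the thesis's model: the kernel size of 3, the depth of 8 layers, and the doubling dilation schedule are assumptions chosen only for illustration, and `receptive_field` is a hypothetical helper. It compares how many frames a stack of stride-1 temporal convolutions can see with and without dilation.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in frames, of stacked stride-1 1D convolutions.

    Each layer with dilation d extends the receptive field by
    (kernel_size - 1) * d frames.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Illustrative values only (assumptions, not taken from the thesis).
num_layers = 8
kernel_size = 3

doubling = [2 ** i for i in range(num_layers)]  # dilations 1, 2, 4, ..., 128
undilated = [1] * num_layers                    # ordinary convolutions

rf_dilated = receptive_field(kernel_size, doubling)
rf_plain = receptive_field(kernel_size, undilated)

print(f"dilated stack:   {rf_dilated} frames")
print(f"undilated stack: {rf_plain} frames")
```

Under these assumptions both stacks have the same number of layers and parameters, yet the dilated stack covers 511 frames while the undilated one covers 17, which is the sense in which the receptive field grows exponentially rather than linearly with depth.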