Free keywords:
Computer Science, Computer Vision and Pattern Recognition, cs.CV
Abstract:
New approaches to synthesize and manipulate face videos at very high quality
have paved the way for new applications in computer animation, virtual and
augmented reality, or face video analysis. However, there are concerns that
they may be used in a malicious way, e.g. to manipulate videos of public
figures, politicians or reporters, to spread false information. The research
community therefore developed techniques for automated detection of modified
imagery, and assembled benchmark datasets showing manipulations by
state-of-the-art techniques. In this paper, we contribute to this initiative in
two ways: First, we present a new audio-visual benchmark dataset. It shows some
of the highest quality visual manipulations available today. Human observers
find them significantly harder to identify as forged than videos from other
benchmarks. Second, we propose a new family of deep-learning-based fake
detectors and demonstrate that existing detectors are not well suited to
detecting fakes of a quality as high as presented in our dataset. Our detectors
examine spatial and temporal features. This allows them to outperform existing
approaches in terms of both detection accuracy and generalization to
unseen fake generation methods and unseen identities.