
Paper

Neural View-Interpolation for Sparse Light Field Video

MPS-Authors

Bemana, Mojtaba
Computer Graphics, MPI for Informatics, Max Planck Society

Myszkowski, Karol
Computer Graphics, MPI for Informatics, Max Planck Society

Seidel, Hans-Peter
Computer Graphics, MPI for Informatics, Max Planck Society

External Resource
No external resources are shared
Fulltext (public)

arXiv:1910.13921.pdf
(Preprint), 4MB

Supplementary Material (public)
There is no public supplementary material available
Citation

Bemana, M., Myszkowski, K., Seidel, H.-P., & Ritschel, T. (2019). Neural View-Interpolation for Sparse Light Field Video. Retrieved from http://arxiv.org/abs/1910.13921.


Cite as: http://hdl.handle.net/21.11116/0000-0005-7B16-9
Abstract
We suggest representing light field (LF) videos as "one-off" neural networks (NN), i.e., a learned mapping from view-plus-time coordinates to high-resolution color values, trained on sparse views. Initially, this sounds like a bad idea for three main reasons: First, a NN LF will likely have lower quality than a same-sized pixel-basis representation. Second, only little training data is available for sparse LF videos, e.g., nine exemplars per frame. Third, there is no generalization across LFs, but across view and time instead; consequently, a network needs to be trained for each LF video. Surprisingly, these problems can turn into substantial advantages: Unlike the linear pixel basis, a NN has to come up with a compact, non-linear, i.e., more intelligent, explanation of color, conditioned on the sparse view and time coordinates. As observed for many NNs, however, this representation is interpolatable: if the image output is plausible for the sparse view coordinates, it is plausible for all intermediate, continuous coordinates as well. Our specific network architecture involves a differentiable occlusion-aware warping step, which leads to a compact set of trainable parameters and consequently fast learning and fast execution.
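
To make the "one-off" idea concrete, below is a minimal sketch in PyTorch (a framework assumption; the paper does not prescribe one) of a coordinate network overfit to a single LF video: a plain MLP mapping normalized (u, v, t, x, y) coordinates to RGB. It is illustrative only; the paper's actual architecture additionally involves the differentiable occlusion-aware warping step, which this sketch omits, and the random tensors here merely stand in for coordinate/color pairs sampled from the sparse captured views.

import torch
import torch.nn as nn

class CoordinateLF(nn.Module):
    # Maps normalized (u, v, t, x, y) coordinates to an RGB color.
    def __init__(self, hidden=256, depth=4):
        super().__init__()
        layers, in_dim = [], 5
        for _ in range(depth):
            layers += [nn.Linear(in_dim, hidden), nn.ReLU()]
            in_dim = hidden
        layers += [nn.Linear(in_dim, 3), nn.Sigmoid()]
        self.net = nn.Sequential(*layers)

    def forward(self, coords):       # coords: (N, 5) in [0, 1]
        return self.net(coords)      # colors: (N, 3) in (0, 1)

model = CoordinateLF()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

# Placeholder batch: in real training, these pairs would be sampled only
# from the sparse views (e.g., a 3x3 camera grid per frame).
coords = torch.rand(4096, 5)
colors = torch.rand(4096, 3)

for step in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(coords), colors)
    loss.backward()
    opt.step()

# Interpolation then amounts to querying intermediate, continuous coordinates:
novel = model(torch.tensor([[0.5, 0.5, 0.25, 0.1, 0.9]]))

Because the trained weights are the representation, the per-video training cost is the price paid for continuous view-plus-time interpolation at query time.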