Neural Sparse Voxel Fields

Liu, Lingjie; Gu, Jiatao; Lin, Kyaw Zaw; Chua, Tat-Seng; Theobalt, Christian

Released

Paper

MPS-Authors

Liu, Lingjie
Computer Graphics, MPI for Informatics, Max Planck Society

Theobalt, Christian
Computer Graphics, MPI for Informatics, Max Planck Society

Fulltext (public)

arXiv:2007.11571.pdf (Preprint), 10MB

Citation

Liu, L., Gu, J., Lin, K. Z., Chua, T.-S., & Theobalt, C. (2020). Neural Sparse Voxel Fields. Retrieved from https://arxiv.org/abs/2007.11571.

Cite as: https://hdl.handle.net/21.11116/0000-0007-E8B2-A

Abstract

Photo-realistic free-viewpoint rendering of real-world scenes using classical
computer graphics techniques is challenging, because it requires the difficult
step of capturing detailed appearance and geometry models. Recent studies have
demonstrated promising results by learning scene representations that
implicitly encode both geometry and appearance without 3D supervision. However,
existing approaches in practice often show blurry renderings caused by the
limited network capacity or the difficulty in finding accurate intersections of
camera rays with the scene geometry. Synthesizing high-resolution imagery from
these representations often requires time-consuming optical ray marching. In
this work, we introduce Neural Sparse Voxel Fields (NSVF), a new neural scene
representation for fast and high-quality free-viewpoint rendering. NSVF defines
a set of voxel-bounded implicit fields organized in a sparse voxel octree to
model local properties in each cell. We progressively learn the underlying
voxel structures with a differentiable ray-marching operation from only a set
of posed RGB images. With the sparse voxel octree structure, rendering novel
views can be accelerated by skipping the voxels containing no relevant scene
content. Our method is typically over 10 times faster than the state-of-the-art
(namely, NeRF (Mildenhall et al., 2020)) at inference time while achieving
higher quality results. Furthermore, by utilizing an explicit sparse voxel
representation, our method can easily be applied to scene editing and scene
composition. We also demonstrate several challenging tasks, including
multi-scene learning, free-viewpoint rendering of a moving human, and
large-scale scene rendering. Code and data are available at our website:
https://github.com/facebookresearch/NSVF.
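
The empty-space skipping described in the abstract can be illustrated with a
minimal sketch. It is not the authors' implementation and rests on several
simplifying assumptions: the occupied voxels are held in a flat set of integer
grid indices rather than a sparse voxel octree, the per-voxel learned implicit
field is replaced by a hypothetical constant function (toy_field), and samples
are composited with standard front-to-back alpha blending. All function and
parameter names here are illustrative only.

```python
import math

def ray_box(origin, direction, box_min, box_max):
    """Slab test: return (t_near, t_far) where the ray crosses the box, or None."""
    t_near, t_far = 0.0, math.inf
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if abs(d) < 1e-12:
            if o < lo or o > hi:
                return None  # parallel to this slab and outside it
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near >= t_far:
            return None
    return t_near, t_far

def toy_field(point):
    """Hypothetical stand-in for a learned field: constant density and colour."""
    return 2.0, (1.0, 0.5, 0.0)  # (density sigma, RGB)

def render_ray(origin, direction, occupied, voxel_size, step, field=toy_field):
    """Composite colour along one ray, sampling only inside occupied voxels."""
    hits = []
    for (i, j, k) in occupied:
        box_min = (i * voxel_size, j * voxel_size, k * voxel_size)
        box_max = tuple(c + voxel_size for c in box_min)
        span = ray_box(origin, direction, box_min, box_max)
        if span is not None:
            hits.append(span)
    hits.sort()  # front-to-back order along the ray

    color = [0.0, 0.0, 0.0]
    transmittance = 1.0
    samples = 0
    for t_near, t_far in hits:
        t = t_near + 0.5 * step  # sample at the midpoint of each step
        while t < t_far:
            point = tuple(o + t * d for o, d in zip(origin, direction))
            sigma, rgb = field(point)
            alpha = 1.0 - math.exp(-sigma * step)
            for c in range(3):
                color[c] += transmittance * alpha * rgb[c]
            transmittance *= 1.0 - alpha
            samples += 1
            t += step
    return tuple(color), transmittance, samples

# Two occupied unit voxels with an empty gap between them along the x axis.
color, transmittance, samples = render_ray(
    (-1.0, 0.5, 0.5), (1.0, 0.0, 0.0), {(0, 0, 0), (3, 0, 0)}, 1.0, 0.25
)
```

In this example the ray evaluates the field 8 times, all inside the two
occupied voxels. Marching uniformly from the first entry point to the last
exit point at the same step size would take 16 evaluations, half of them in
empty space. An octree, as used in the paper, would additionally avoid testing
the ray against every occupied voxel.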