latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction
We present latentSplat, a method to predict semantic Gaussians in a 3D latent space that can be splatted and decoded by a light-weight generative 2D architecture. Existing methods for generalizable 3D reconstruction either do not scale to large scenes and resolutions, or are limited to interpolation of close input views. latentSplat combines the strengths of regression-based and generative approaches while being trained purely on readily available real video data. At the core of our method are variational 3D Gaussians, a representation that efficiently encodes varying uncertainty within a latent space consisting of 3D feature Gaussians. From these Gaussians, specific instances can be sampled and rendered via efficient splatting and a fast, generative decoder. We show that latentSplat outperforms previous works in reconstruction quality and generalization, while being fast and scalable to high-resolution data.
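To make the idea of sampling specific instances from variational Gaussians concrete, here is a minimal sketch in NumPy. It is not the authors' implementation. It assumes that each 3D Gaussian carries a feature distribution parameterized by a mean and a log-variance (a common choice in variational autoencoders, not stated in the abstract), and it draws one feature vector per Gaussian with the reparameterization trick. The names `sample_gaussian_features`, `mean`, and `log_var` are hypothetical, and the splatting and decoding steps are omitted.

```python
import numpy as np


def sample_gaussian_features(mean, log_var, rng):
    """Draw one latent feature vector per 3D Gaussian.

    mean, log_var: arrays of shape (num_gaussians, feature_dim).
    The mean/log-variance parameterization is an assumption made for
    this sketch, not a detail taken from the abstract.
    """
    std = np.exp(0.5 * log_var)            # per-feature standard deviation
    eps = rng.standard_normal(mean.shape)  # unit Gaussian noise
    return mean + std * eps                # reparameterized sample


rng = np.random.default_rng(0)
num_gaussians, feature_dim = 4, 8

# Hypothetical predicted parameters for four feature Gaussians.
mean = rng.standard_normal((num_gaussians, feature_dim))
log_var = np.full((num_gaussians, feature_dim), -2.0)
log_var[0] = -20.0  # a Gaussian with almost no uncertainty

features = sample_gaussian_features(mean, log_var, rng)
```

In this sketch, a Gaussian with very low variance yields samples that stay close to its mean, while a Gaussian with higher variance yields different features on each draw. This is one way a representation could encode varying uncertainty across a scene.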
Further reading
- Access the paper on arXiv.org