Abstract

Although recent NeRF-based techniques have achieved remarkable success in novel view synthesis, they entangle intrinsic surface properties with environmental lighting, which restricts the use of NeRF in many downstream applications. This paper proposes an inverse rendering pipeline that simultaneously reconstructs scene geometry, lighting, and spatially-varying materials from a set of multi-view images. Specifically, the proposed pipeline combines volume rendering and physics-based rendering, performed separately in two steps: exploration and exploitation. During the exploration step, our method uses the compactness of neural radiance fields and a flexible differentiable volume rendering technique to learn an initial volumetric field. Here, we introduce a novel cascaded tensorial radiance field method built on the Canonical Polyadic (CP) decomposition to improve model compactness beyond conventional methods. In the exploitation step, a shading pass that incorporates a differentiable physics-based shading method jointly optimizes the scene's geometry, spatially-varying materials, and lighting using an image reconstruction loss. Experimental results demonstrate that our proposed inverse rendering pipeline, IRCasTRF, outperforms prior work in inverse rendering quality. The final output is highly compatible with downstream applications such as scene editing and advanced simulations.
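To make the two ingredients of the exploration step concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a plain rank-R CP factorization of a dense 3D grid and the standard volume-rendering quadrature along a single ray. The cascaded structure described in the abstract is not specified on this page and is not modeled here. The function names `cp_field` and `composite` are hypothetical.

```python
import numpy as np

def cp_field(factors_x, factors_y, factors_z, ix, iy, iz):
    """Evaluate a rank-R CP-factorized 3D grid at integer voxel indices.

    Each factors_* array has shape (R, N). The dense grid value at
    (ix, iy, iz) is sum_r fx[r, ix] * fy[r, iy] * fz[r, iz], so storage
    is 3 * R * N numbers instead of N**3 for the dense grid.
    """
    return np.sum(factors_x[:, ix] * factors_y[:, iy] * factors_z[:, iz], axis=0)

def composite(sigma, rgb, delta):
    """Standard volume-rendering quadrature along one ray.

    sigma: (S,) densities, rgb: (S, 3) colors, delta: (S,) step lengths.
    Returns the composited color and the per-sample weights.
    """
    alpha = 1.0 - np.exp(-sigma * delta)                  # opacity per sample
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # transmittance
    weights = trans * alpha
    return weights @ rgb, weights

# Hypothetical usage: query densities from the factorized grid, then composite.
rng = np.random.default_rng(0)
R, N, S = 4, 8, 5
fx, fy, fz = (np.abs(rng.normal(size=(R, N))) for _ in range(3))
ix, iy, iz = (rng.integers(0, N, size=S) for _ in range(3))
sigma = cp_field(fx, fy, fz, ix, iy, iz)                  # (S,) non-negative densities
color, weights = composite(sigma, rng.uniform(size=(S, 3)), np.full(S, 0.1))
```

In this sketch the factor vectors are taken as non-negative only so that the toy densities are valid. A real model would typically apply an activation to the decoded density and interpolate between grid entries rather than index them directly.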

Overall Pipeline

Pipeline of CasCPRF

Introduction Video

Results of CasCPRF on the RefNeRF- and NeRF-synthetic datasets

Inverse rendering results of the exploitation stage on the NeRF- and RefNeRF-synthetic datasets

Results of CasCPRF at three cascaded levels, coarse to fine from left to right.

Training convergence visualization of the first 200 iterations in the exploitation stage

Training convergence visualization of the first 200 iterations in the exploitation stage (without per-vertex movement)

Training convergence visualization of the first 200 iterations in the exploitation stage (without global movement)

Acknowledgements

The website template was borrowed from Dor Verbin.