Bilateral Guided Radiance Field Processing
SIGGRAPH 2024 (ACM TOG)
- Yuehao Wang 1
- Chaoyi Wang 1
- Bingchen Gong 1
- Tianfan Xue 1 2
We propose a radiance field processing pipeline: 1) bilateral guided training merges input views with photometric variation during radiance field optimization; 2) bilateral guided finishing achieves 3D-level enhancement by lifting a user-provided single-view edit to 3D.
Abstract
Neural Radiance Fields (NeRF) achieves unprecedented performance in novel view synthesis by exploiting multi-view consistency. When capturing multiple input views, however, the image signal processing (ISP) pipeline in modern cameras enhances each of them independently, including exposure adjustment, color correction, local tone mapping, etc. While these operations greatly improve image quality, they often break the multi-view consistency assumption, leading to "floaters" in the reconstructed radiance fields. To address this concern without compromising visual aesthetics, we aim to first disentangle the ISP enhancements at the NeRF training stage and re-apply user-desired enhancements to the reconstructed radiance fields at the finishing stage. Furthermore, to make the re-applied enhancements consistent across novel views, we need to perform image signal processing in 3D space (i.e., "3D ISP"). To this end, we adopt the bilateral grid, a locally affine model, as a generalized representation of ISP operations. Specifically, we optimize per-view 3D bilateral grids jointly with the radiance field to approximate the effects of the camera pipeline for each input view. To achieve user-adjustable 3D finishing, we propose learning a low-rank 4D bilateral grid from a given single-view edit, lifting photo enhancements to the whole 3D scene. We demonstrate that our approach boosts the visual quality of novel view synthesis by effectively removing floaters and applying enhancements from user retouching.
Video
Disentangle ISP enhancements
When capturing multi-view images for training NeRF, the image signal processing (ISP) pipeline in the camera enhances each captured view independently. This introduces photometric variation across the NeRF inputs, leading to "floaters" in the synthesized novel views. Our bilateral guided NeRF training addresses this issue by disentangling the per-view enhancements.
In this case, the scene is captured with a cell-phone camera under varying exposure and ISO. Floaters appear in the baseline result (left), while our method (right) synthesizes clean novel views.
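For concreteness, here is a minimal PyTorch sketch of this training scheme under some simplifying assumptions: `nerf`, `render_rays`, and `sample_training_batch` are hypothetical placeholders for a generic NeRF backbone, the grid resolutions are illustrative, and any regularization of the grids is omitted. Each input view gets its own 3D bilateral grid of affine color transforms, which is sliced at every rendered pixel (guided by luminance) before the photometric loss is computed:

```python
import torch
import torch.nn.functional as F

N_VIEWS, GD, GH, GW = 50, 8, 16, 16  # 8 luminance bins, 16x16 spatial cells per view

# One 3D bilateral grid per input view; each cell stores a 3x4 affine color
# transform, initialized to the identity ("no enhancement").
grids = torch.zeros(N_VIEWS, 12, GD, GH, GW)
grids[:, [0, 5, 10]] = 1.0  # diagonal of the 3x3 part
grids.requires_grad_(True)

def slice_grid(grid, uv, rgb):
    """Slice one view's grid (12,GD,GH,GW) at pixel coords uv (R,2) in [0,1]^2,
    guided by the luminance of the rendered colors rgb (R,3)."""
    lum = (rgb * torch.tensor([0.299, 0.587, 0.114])).sum(-1, keepdim=True)
    coords = torch.cat([uv, lum], dim=-1) * 2.0 - 1.0    # grid_sample wants [-1,1]
    affine = F.grid_sample(grid[None], coords.view(1, -1, 1, 1, 3),
                           align_corners=True)           # (1,12,R,1,1)
    affine = affine.view(12, -1).t().view(-1, 3, 4)      # (R,3,4)
    rgb_h = torch.cat([rgb, torch.ones_like(rgb[:, :1])], dim=-1)  # homogeneous
    return (affine @ rgb_h[..., None]).squeeze(-1)       # (R,3)

# Joint optimization: the radiance field learns the "clean" scene, while each
# grid absorbs the ISP enhancement of its own view.
optimizer = torch.optim.Adam([{"params": nerf.parameters()},
                              {"params": [grids], "lr": 2e-3}])
for step in range(10_000):
    view_idx, rays, uv, target_rgb = sample_training_batch()  # hypothetical loader
    rendered = render_rays(nerf, rays)                    # (R,3), shared clean scene
    enhanced = slice_grid(grids[view_idx], uv, rendered)  # re-apply this view's ISP
    loss = F.mse_loss(enhanced, target_rgb)               # compare in enhanced domain
    optimizer.zero_grad(); loss.backward(); optimizer.step()
```

Because the loss is computed after slicing, photometric variation is explained by the per-view grids instead of being baked into the shared radiance field.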
Lift 2D enhancements to 3D
Our proposed bilateral guided finishing enables 3D-level, human-adjusted enhancements. Users simply select a rendered view and retouch it in an image editor (e.g., Adobe Lightroom®). Our method then lifts the 2D edit to the whole scene, achieving compelling renditions that remain consistent across synthesized views.
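As a rough illustration, the following sketch factorizes a 4D bilateral grid over (x, y, z, luminance) CP-style, in the spirit of TensoRF. The rank, resolutions, and residual-around-identity parameterization are our assumptions for illustration, not the paper's exact implementation:

```python
import torch

RANK, C = 4, 12                  # CP rank; 12 = 3x4 affine coefficients per cell
SIZES = {"x": 16, "y": 16, "z": 16, "l": 8}  # factor resolutions (l = luminance)

# One learnable 1D factor per axis; their outer product reconstructs the full
# (x, y, z, luminance) -> affine-transform grid without ever materializing it.
factors = {ax: (torch.randn(C, RANK, g) * 0.01).requires_grad_(True)
           for ax, g in SIZES.items()}

def interp1d(f, t):
    """Linearly interpolate a factor f (C,R,G) at t (N,) in [0,1] -> (C,R,N)."""
    s = t.clamp(0, 1) * (f.shape[-1] - 1)
    i0 = s.floor().long()
    i1 = (i0 + 1).clamp(max=f.shape[-1] - 1)
    w = s - s.floor()
    return f[..., i0] * (1 - w) + f[..., i1] * w

def slice4d(xyz, rgb):
    """Slice affine color transforms at 3D points xyz (N,3) in [0,1]^3,
    guided by the luminance of the rendered colors rgb (N,3)."""
    lum = (rgb * torch.tensor([0.299, 0.587, 0.114])).sum(-1)
    prod = (interp1d(factors["x"], xyz[:, 0]) * interp1d(factors["y"], xyz[:, 1]) *
            interp1d(factors["z"], xyz[:, 2]) * interp1d(factors["l"], lum))
    affine = prod.sum(dim=1).t().view(-1, 3, 4)  # sum over rank -> (N,3,4)
    affine = affine + torch.eye(3, 4)            # residual around the identity
    rgb_h = torch.cat([rgb, torch.ones_like(rgb[:, :1])], dim=-1)
    return (affine @ rgb_h[..., None]).squeeze(-1)  # enhanced colors (N,3)
```

The low-rank structure keeps the 4D grid compact, and the luminance axis makes the affine transforms edge-aware.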
HDR fusion
Our bilateral guided training can achieve HDR fusion in NeRF. The best-exposed parts of the input images (a) with varying exposure are fused into an HDR radiance field (b). Our radiance-finishing can further adjust the color tone of the fused radiance field by lifting a single-view enhancement (c).
(a) Input samples w/ varying exposure
(b) Radiance field with HDR fusion
(c) Tone-mapped by our 3D finishing
Disentangle varying lighting
Varying lighting across input views (a) causes "disco" artifacts in the baseline results (b), while our method (c) can partially disentangle the effects of the varying lighting.
(a) Input samples w/ varying lighting
(b) Results of ZipNeRF baseline
(c) Results of bilateral guided training
Object recoloring
In this case, we first edit the color of the bulldozer in a single view, then train the low-rank 4D bilateral grid on that 2D edit to perform 3D-level recoloring. The local, edge-aware operator in bilateral space recolors the subject without significantly affecting the surrounding areas.
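Continuing the illustrative sketch above, fitting the grid to a single edited view reduces to a small optimization: render the chosen view once with the frozen radiance field, then adjust the grid factors so the sliced output matches the user's edit. Here `render_view`, `edited_view_camera`, and `edited_rgb` are hypothetical placeholders:

```python
import torch
import torch.nn.functional as F

# Render the user's chosen view once with the frozen radiance field, getting
# per-pixel colors and the 3D surface points they correspond to.
rendered_rgb, points_xyz = render_view(nerf, edited_view_camera)  # (N,3), (N,3)

optimizer = torch.optim.Adam(list(factors.values()), lr=1e-2)
for step in range(2_000):
    out = slice4d(points_xyz, rendered_rgb)  # re-enhance the frozen render
    loss = F.mse_loss(out, edited_rgb)       # match the user's edited pixels
    optimizer.zero_grad(); loss.backward(); optimizer.step()

# At novel views, applying slice4d to every rendered pixel reproduces the edit
# consistently in 3D; the luminance axis keeps the operator edge-aware, so the
# subject is recolored without bleeding into its surroundings.
```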
Photorealistic style transfer
We first transfer the reference style to a selected view, then lift the 2D view stylization to the whole 3D scene using our proposed radiance-finishing via a low-rank 4D bilateral grid.
Live demo
We develop a simple interactive 3D editor with our method, built on the ngp_pl backbone (a PyTorch implementation of Instant-NGP). To try our method with other backbones, please check out the lib_bilagrid.py module.
Related work
This work stands on the shoulders of many prior papers. In particular, the following works were especially illuminating:
- Real-time Edge-aware Image Processing with the Bilateral Grid introduces the bilateral grid data structure, which is the foundation of our method. Bilateral Guided Upsampling demonstrates that the bilateral grid is a universal approximator for various image enhancements. We further adopt this operator to process 3D scenes.
- HDRNet demonstrates that the bilateral grid can work with neural networks due to its differentiability.
- TensoRF shows the effectiveness of low-rank approximation for 3D representations.
- RawNeRF first incorporates camera pipelines into NeRF.
- Our pipeline resembles the classic HDR+ algorithm, which first merges a burst of frames into an HDR image and then applies local tone mapping for photo-finishing.
Citation
If you find this work helpful, please consider citing:
@article{wang2024bilateral,
title={Bilateral Guided Radiance Field Processing},
author={Wang, Yuehao and Wang, Chaoyi and Gong, Bingchen and Xue, Tianfan},
journal={ACM Transactions on Graphics (TOG)},
volume={43},
number={4},
pages={1--13},
year={2024},
publisher={ACM New York, NY, USA}
}
Acknowledgements
We thank all anonymous reviewers for their constructive feedback and suggestions. We thank Ruikang Li for discussing image signal processing with us. We also thank Wang Wei, Xuejing Huang, and Chengkun Li for their assistance in making the video demo and capturing the dataset.
The website template was borrowed from Michaël Gharbi and Ref-NeRF.