Kernel-Predicting Convolutional Networks
for Denoising Monte Carlo Renderings
Evaluation on production data
This section shows the performance of our models on test frames from Cars 3 and Coco, which differ significantly in style from our training set, Finding Dory. We compare our results against previous methods: RDFC [Rousselle et al. 2013], APR [Moon et al. 2016], NFOR [Bitterli et al. 2016], LBF [Kalantari et al. 2015], and the Hyperion denoiser used in the production of these films. To obtain the LBF [Kalantari et al. 2015] results, we trained it on data different from what its system expects, which led to suboptimal output with excessive residual noise (see the paper text for details). We trained separate systems for 32 spp and 128 spp data. A large subset of the frames also includes 16 spp results produced with our network trained on 32 spp, demonstrating its ability to extrapolate to other sampling rates.
Hyperparameters
For this dataset, we used a learning rate of 10⁻⁵ and a batch size of 5 patches with Adam optimization [Kingma and Ba 2014]. For the fine-tuning stage (Sec. 4.3), we used a learning rate of 10⁻⁶.
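To make these settings concrete, below is a minimal sketch of the optimizer setup. The deep-learning framework is not specified in this document, so the use of PyTorch, the make_optimizer helper, and the model argument are illustrative assumptions; only the learning rates and batch size come from the text above.

```python
# Minimal sketch of the optimizer settings described above, assuming a
# PyTorch-style API; `make_optimizer` and `model` are illustrative placeholders.
import torch

BATCH_SIZE = 5  # patches per batch for the production dataset

def make_optimizer(model, fine_tuning=False):
    # 1e-5 for regular training on this dataset, 1e-6 during fine-tuning (Sec. 4.3).
    lr = 1e-6 if fine_tuning else 1e-5
    return torch.optim.Adam(model.parameters(), lr=lr)
```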
Test set
Evaluation on an open-source dataset
The focus of this paper is the application of deep-learning-based denoising to production-quality Monte Carlo renderings. Most of the training and evaluation has therefore been done on proprietary data. To facilitate comparisons with future methods, we also provide results for our models when trained and tested on a publicly available dataset.
Training data
We trained our models on a training set consisting of perturbations of scenes available at https://benedikt-bitterli.me/resources/. To ensure the training data covers a sufficient range of lighting situations, color patterns, and geometry, we randomly perturbed the scenes by varying camera parameters, materials, and lighting. In this way, we generated 1484 (noisy image, high-quality image) pairs from the following 8 base scenes (a sketch of this data-generation loop follows the scene list below). We extracted patches from these images at four different sampling rates: 128, 256, 512, and 1024 spp.
- Contemporary Bathroom (Mareck, Blendswap.com)
- Pontiac GTO 67 (MrChimp2313, Blendswap.com)
- Bedroom (SlykDrako, Blendswap.com)
- 4060.b Spaceship (thecali, Blendswap.com)
- Victorian Style House (MrChimp2313, Blendswap.com)
- The Breakfast Room (Wig42, Blendswap.com)
- The Wooden Staircase (Wig42, Blendswap.com)
- Japanese Classroom (NovaZeeke, Blendswap.com)
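The sketch below illustrates the kind of data-generation loop described above: random perturbation of the base scenes, rendered at several sampling rates. The perturb function and the rendering step are hypothetical placeholders standing in for renderer-specific code; only the overall structure (random camera/material/lighting variation, 1484 pairs, four sampling rates) follows the text.

```python
# Illustrative sketch of the training-data generation loop; `perturb` and the
# render step are hypothetical placeholders, not the authors' actual pipeline.
import random

BASE_SCENES = [
    "bathroom", "car", "bedroom", "spaceship",
    "house", "breakfast-room", "staircase", "classroom",
]
SAMPLE_COUNTS = [128, 256, 512, 1024]  # spp values used for the noisy inputs

def perturb(scene_name, rng):
    """Randomly vary camera parameters, materials, and lighting (placeholder)."""
    return {
        "scene": scene_name,
        "camera_seed": rng.randrange(1 << 16),
        "material_seed": rng.randrange(1 << 16),
        "light_scale": rng.uniform(0.5, 2.0),
    }

def generate_pairs(num_pairs=1484, seed=0):
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_pairs):
        variant = perturb(rng.choice(BASE_SCENES), rng)
        spp = rng.choice(SAMPLE_COUNTS)
        # A real pipeline would render `variant` at `spp` (noisy input) and at a
        # high sample count (reference), then extract patches from both images.
        pairs.append((variant, spp))
    return pairs
```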
Hyperparameters
We found that a learning rate of 10⁻⁴ and a batch size of 100 patches worked well for this dataset. The larger batch size helps cope with the noise level in these scenes, which is much higher than in our production frames. For the fine-tuning stage, we used a learning rate of 10⁻⁶.
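For illustration, the sketch below shows one way such batches of patch pairs could be assembled, assuming the extracted patches are stored in NumPy arrays; the array layout is an assumption, while the batch size and learning rates are the values stated above.

```python
# Sketch of mini-batch assembly for the open-source dataset; the array layout
# is an assumption made for illustration.
import numpy as np

BATCH_SIZE = 100      # larger batches help average out the heavier noise
LEARNING_RATE = 1e-4  # 1e-6 during the fine-tuning stage

def sample_batch(noisy_patches, reference_patches, rng=None):
    """Draw a random batch of (noisy, reference) patch pairs.

    Both inputs are assumed to be arrays of shape (N, H, W, C).
    """
    rng = rng or np.random.default_rng()
    idx = rng.choice(len(noisy_patches), size=BATCH_SIZE, replace=False)
    return noisy_patches[idx], reference_patches[idx]
```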
Evaluation
To assess the quality of our models, we evaluate them on scenes from the same repository that are not part of the training set.