diff --git a/README.md b/README.md index f044c821..258fadcc 100644 --- a/README.md +++ b/README.md @@ -3,11 +3,207 @@ CUDA Denoiser For CUDA Path Tracer **University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 4** -* (TODO) YOUR NAME HERE -* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab) +* Megan Reddy + * [LinkedIn](https://www.linkedin.com/in/meganr25a949125/), [personal website](https://meganr28.github.io/) +* Tested on: Windows 10, AMD Ryzen 9 5900HS with Radeon Graphics @ 3301 MHz 16GB, NVIDIA GeForce RTX 3060 Laptop GPU 6GB (Personal Computer) +* Compute Capability: 8.6 -### (TODO: Your README) +### Overview -*DO NOT* leave the README to the last minute! It is a crucial part of the -project, and we will not be able to grade you without a good README. +Denoising is a technique used to remove noise from path-traced images. In scenes where ray paths are unlikely to hit light sources, we perceive +a lot of noise. In real-time ray tracing, we often desire high quality images that can be rendered at interactive rates. In traditional path tracing, +we must take hundreds of samples per pixel to achieve the high visual quality we desire. For real-time applications, it is impractical to wait such a long time +for a frame to render, so we desire a technique that can achieve an acceptably "smooth" image in fewer iterations. This is where denoising comes in handy. + +| Iterations | Raw Pathtraced | Denoised | +|:----------: |:-------------: |:------:| +| 100 | ![](img/cover_pathtraced_100.png) | ![](img/cover_denoised_100.PNG) | +| 1000 | ![](img/cover_pathtraced_1000.PNG)| ![](img/cover_denoised_1000.PNG) | +

Figure 1. Raw path-traced and denoised image comparison with 80 x 80 À-Trous filter.

+ +#### Edge-Avoiding À-Trous Wavelet Transform + +This implementation uses the technique described in the paper [Edge-Avoiding À-Trous Wavelet Transform for fast Global Illumination Filtering](https://jo.dreggn.org/home/2010_atrous.pdf) +to implement denoising. The basis of the technique is to take in a noisy path-traced image, as well as information from a G-buffer (normals and positions), to strategically perform a blur on the image to +remove noise. The À-Trous filtering technique is an approximation of Gaussian blur, but it uses a fixed kernel size. In our implementation, the kernel remains at a fixed size of 5x5, but we increasingly +space out samples each iteration instead of using a larger filter. The step size increases by a factor of 2 each iteration. The illustration below demonstrates this idea. + +![](img/atrous_kernels.PNG) +

Figure 2. Spacing of À-Trous kernel at iteration 0, 1, and 2 (left to right).
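
The spacing idea can be illustrated with a small host-side sketch. It is not taken from the project source: `atrousOffsets` is a hypothetical helper, and it assumes the step size starts at 1 and doubles each pass, as described above.

```cpp
#include <array>
#include <utility>

// Hypothetical helper (not from the project): returns the 25 (dx, dy) sample
// offsets of a 5x5 A-Trous kernel for a given filter pass.
// Assumption: the step size is 1 on pass 0 and doubles on every later pass.
std::array<std::pair<int, int>, 25> atrousOffsets(int pass) {
    const int step = 1 << pass;  // 1, 2, 4, ...
    std::array<std::pair<int, int>, 25> offsets{};
    int i = 0;
    for (int dy = -2; dy <= 2; ++dy) {
        for (int dx = -2; dx <= 2; ++dx) {
            // The kernel always has 25 taps; only their spacing grows.
            offsets[i++] = {dx * step, dy * step};
        }
    }
    return offsets;
}
```

The number of taps stays at 25 on every pass, so the cost per pass is constant while the footprint of the filter grows.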

+ +At the first path tracing iteration, we store information in our G-buffer. We store intersection depth, normal, and position in a `GBufferPixel` struct to read later. +We use this information to detect edges in our filtering scheme. When we blur an image, we want to preserve edges between objects. If normals or positions differ significantly between pixels, +we most likely have encountered an edge. A visualization of the normal, position, and depth buffers is provided below. + + Per-Pixel Normals (remapped to [0, 1]) | Per-Pixel Positions (abs value and scaled down) | Per-Pixel Depth | +|---|---|---| +|![](img/results/normals.PNG)|![](img/results/positions.PNG)|![](img/results/depth.PNG)| +

Figure 3. G-buffer visualizations

+ +Our next step is to implement a weighted filter that we will use to gather an accumulated color for each pixel. Without any weighting, the result +of denoising is simply a blur across the entire image since we have no edge detection scheme in place. We have provided a comparison to GIMP's Gaussian blur +in the table below. We notice that the À-Trous filter and Gaussian filter produce similar results. + + Raw Pathtraced (100 iterations) | Simple Blur 80x80 Filter | GIMP Gaussian Blur 80x80 Filter | +|---|---|---| +|![](img/results/basic_blur/no_blur_100samp.PNG)|![](img/results/basic_blur/basic_blur_100samp.PNG)|![](img/results/basic_blur/gimp_blur_100samp.PNG)| +

Figure 4. A comparison of basic blur between the À-Trous filter and GIMP's Gaussian filter (100 iterations). Note that the À-Trous algorithm is an approximation of Gaussian blur.

+ +Lastly, we implement edge-avoiding filtering using the weighting function described in the paper. We compute the edge-stopping function for the path-traced pixel color, +pixel normal, and pixel position (see Equation 5 in the paper) and multiply these together to get a pixel weight. We attenuate the current color by the weight and kernel value and +add this to the accumulated sum. We also keep a cumulative sum of weights. At the end, we set the denoised pixel color equal to the `accumulated_color_sum / weight_sum`. + + Raw Pathtraced (100 iterations) | Simple Blur 80x80 Filter | Edge-Avoiding 80x80 Filter | +|---|---|---| +|![](img/results/basic_blur/no_blur_100samp.PNG)|![](img/results/basic_blur/basic_blur_100samp.PNG)|![](img/results/basic_blur/edge_avoiding_100samp.PNG)| +

Figure 5. Result of adding in weighting functions. We use a color weight of 25.0, normal weight of 0.35, and position weight of 0.2.
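
The weighting described above can be sketched on the CPU as follows. This is a simplified illustration rather than the project's kernel: `Vec3`, `Sample`, `edgeStoppingWeight`, and `filterPixel` are hypothetical names, every term uses the plain form `min(exp(-d^2 / sigma^2), 1)` without any per-term scaling, and the kernel value is assumed to be folded into both the color sum and the weight sum.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical minimal types for illustration only.
struct Vec3 { float x, y, z; };
struct Sample {
    Vec3 color, normal, position;
    float kernel;  // kernel coefficient for this tap
};

// Edge-stopping term: close to 1 when a and b are similar, falls toward 0
// as they differ. sigma controls how quickly the weight drops off.
float edgeStoppingWeight(const Vec3& a, const Vec3& b, float sigma) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    const float dist2 = dx * dx + dy * dy + dz * dz;
    return std::min(std::exp(-dist2 / (sigma * sigma)), 1.0f);
}

// Weighted average over the taps: accumulated_color_sum / weight_sum.
Vec3 filterPixel(const Sample& center, const std::vector<Sample>& taps,
                 float sigmaCol, float sigmaNorm, float sigmaPos) {
    Vec3 sum{0.f, 0.f, 0.f};
    float weightSum = 0.f;
    for (const Sample& s : taps) {
        // Multiply the three edge-stopping terms to get the pixel weight.
        const float w = edgeStoppingWeight(center.color, s.color, sigmaCol) *
                        edgeStoppingWeight(center.normal, s.normal, sigmaNorm) *
                        edgeStoppingWeight(center.position, s.position, sigmaPos);
        const float wk = w * s.kernel;
        sum.x += s.color.x * wk;
        sum.y += s.color.y * wk;
        sum.z += s.color.z * wk;
        weightSum += wk;
    }
    // Fall back to the unfiltered color if every tap was rejected.
    if (weightSum <= 0.f) return center.color;
    return Vec3{sum.x / weightSum, sum.y / weightSum, sum.z / weightSum};
}
```

Because the three terms are multiplied, a large difference in any one of color, normal, or position is enough to suppress a neighbor's contribution, which is what keeps the blur from crossing edges.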

+ +### GUI Controls + +* `iterations` - number of path tracing iterations. +* `denoise` - check to show denoised image. +* `filter size` - size of À-Trous filter. Determines the number of filter passes. +* `color weight` - sigma value in color edge-stopping function. +* `normal weight` - sigma value in normal edge-stopping function. +* `position weight` - sigma value in position edge-stopping function. +* `show gbuffer` - show the g-buffer. Use dropdown to select which buffer you want to see. + +### Visual Analysis + +For the following results, we use a `color_weight` of 150.0, a `normal_weight` of 0.5, and +a `position_weight` of 0.4. The resolution of the image is fixed at 800 x 800. + +#### Varying Filter Size + +To observe the effect of filter size on visual quality, we denoise the same scene using 100 path-tracing iterations +and varying filter sizes. The results suggest that visual quality increases as filter size increases. In the images below, we can still see noise +at low filter sizes. As we increase the filter size, it becomes less noticeable. The visual improvement +does not scale uniformly with filter size; we can see a large improvement in quality between sizes 10, 20, and 40, +but the amount of quality we gain afterwards is much less noticeable. + +| Filter Size | Result (100 iterations) | Filter Size | Result (100 iterations) +|:----------: |:-------------: |:------: |:------:| +| 10x10 | ![](img/results/filter_size/filter10_2.PNG) | 160x160 | ![](img/results/filter_size/filter160_2.PNG) | +| 20x20 | ![](img/results/filter_size/filter20_2.PNG) | 320x320 | ![](img/results/filter_size/filter320_2.PNG) | +| 40x40 | ![](img/results/filter_size/filter40_2.PNG) | 640x640 | ![](img/results/filter_size/filter640_2.PNG) | +| 80x80 | ![](img/results/filter_size/filter80_2.PNG) | 1280x1280 | ![](img/results/filter_size/filter1280_2.PNG) | +

Figure 6. Visual impact of varying filter size on an 800 x 800 image.

+ +#### Different Material Types + +The following scenes were rendered with an 80 x 80 filter. We compare our denoising results to these ground-truth images +of a diffuse and specular reflective sphere rendered at 10,000 iterations to judge visual quality. + +| Iterations | Diffuse | Specular Reflective | +|:----------: |:-------------: |:------:| +| 10000 | ![](img/results/materials/diffuse_pathtraced_10000samp.PNG) | ![](img/results/materials/specular_pathtraced_10000samp.PNG) | +

Figure 7. Ground-truth reference images for diffuse and reflective scenes (10,000 iterations).

+ +Denoising is very effective for scenes with diffuse materials, since there aren't many fine details to capture. We can see that +the denoised result is close to the outcome that we would expect after running the program for many iterations. + +| Iterations | Raw Pathtraced | Denoised | +|:----------: |:-------------: |:------:| +| 100 | ![](img/results/materials/diffuse_pathtraced_100samp.PNG) | ![](img/results/materials/diffuse_denoised_100samp.PNG) | +| 1000 | ![](img/results/materials/diffuse_pathtraced_1000samp.PNG)| ![](img/results/materials/diffuse_denoised_1000samp.PNG) | +

Figure 8. Effect of denoising a scene with a diffuse sphere.

+ +Denoising is less effective for specular materials, especially at lower iteration counts. Since the image is less converged, +many of the fine details are less apparent and therefore the denoised image blurs the reflection at the edges. At 1000 iterations, +the reflective detail is more clear. + +| Iterations | Raw Pathtraced | Denoised | +|:----------: |:-------------: |:------:| +| 100 | ![](img/results/materials/specular_pathtraced_100samp.PNG) | ![](img/results/materials/specular_denoised_100samp.PNG) | +| 1000 | ![](img/results/materials/specular_pathtraced_1000samp.PNG)| ![](img/results/materials/specular_denoised_1000samp.PNG) | +

Figure 9. Effect of denoising a scene with a reflective sphere.

+ +#### Different Scenes + +The scene with the larger ceiling light produces much better denoised results than the scene with the smaller light. +Since the light is bigger in the first scene, rays are more likely to hit it, meaning that the image will converge faster. +In the smaller light scene, rays are more likely to miss the light, which leads to a noisier image. Since the first image +produces less noise at lower iterations, the denoiser is able to produce an image much closer to the expected outcome quickly. + +| Iterations | Raw Pathtraced | Denoised | +|:----------:|:-------------:|:------:| +| 100 | ![](img/results/materials/specular_pathtraced_100samp.PNG) | ![](img/results/materials/specular_denoised_100samp.PNG) | +| 1000 | ![](img/results/materials/specular_pathtraced_1000samp.PNG) | ![](img/results/materials/specular_denoised_1000samp.PNG) | +

Figure 10. Denoising a scene with a large area light.

+ +| Iterations | Raw Pathtraced | Denoised | +|:----------:|:-------------:|:------:| +| 100 | ![](img/results/cornell_pathtraced_100samp.PNG) | ![](img/results/cornell_denoised_100samp.PNG) | +| 1000 | ![](img/results/cornell_pathtraced_1000samp.PNG) | ![](img/results/cornell_denoised_1000samp.PNG) | +

Figure 11. Denoising a scene with a small area light.

+ +### Performance Analysis + +We measure performance by timing the denoising kernel, which is only run once at the end of path tracing. We use CUDA events (`cudaEvent_t`) +to record the total execution time of the kernel. Note that we do not include the path-tracing time in the measurements, just the +additional time spent denoising. Additionally, each number displayed on the graphs is the average of 10 runs of the denoising kernel. + +For the following measurements, we use a `color_weight` of 25.0, a `normal_weight` of 0.35, and +a `position_weight` of 0.2. The resolution of the image is 800 x 800 unless otherwise noted. + +#### How Much Time Denoising Adds to Renders + +Denoising time is independent of total path tracing iterations since it only runs once at the end. +In the graph below, we can see that total denoising time did not vary much between iteration counts. + +![](img/results/graphs/denoising_iterations.png) +

Figure 12. Impact of increasing iteration count on denoising kernel execution time.

+ +#### Varying Filter Size + +If we increase filter size, we do see an increase in the total time spent denoising. This is because +we must perform more passes over the image (i.e. increase the number of denoising iterations). The filter size +directly determines the number of times we apply the À-Trous filter. If `filterSize = 5 * 2^(iterations)`, then +we can calculate iterations from filter size using `iterations = floor(log2(filterSize / 5))`. + +![](img/results/graphs/denoising_filtersize.png) +

Figure 13. Impact of increasing filter size on denoising kernel execution time.
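
As a worked example of the relationship `iterations = floor(log2(filterSize / 5))`, the sketch below computes the pass count with integer arithmetic. `atrousPassCount` is a hypothetical helper written for this README, not a function from the project, and how the project maps this count onto its filter loop is not shown here.

```cpp
// Hypothetical helper: number of A-Trous passes implied by a filter size,
// following iterations = floor(log2(filterSize / 5)) from the text above.
// Integer halving is used instead of std::log2 to avoid rounding issues.
int atrousPassCount(int filterSize) {
    int passes = 0;
    // filterSize / 5 is the ratio of the requested size to the 5x5 base kernel.
    for (int ratio = filterSize / 5; ratio > 1; ratio >>= 1) {
        ++passes;
    }
    return passes;
}
```

Under this formula the default 80 x 80 filter corresponds to 4 passes and the 1280 x 1280 filter to 8, so denoising time should grow roughly with the logarithm of the filter size rather than with its area.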

+ +#### Number of Iterations to "Acceptably Smooth" Result + +Using 10,000 iterations as the "perfectly smoothed" reference, we'll take 1000 iterations as the "acceptably smoothed" result. +We observe that the difference image shows variation along the edges, but this is fine for our purposes since the largest areas of the scene +match. We will observe how many iterations it will take us to get a comparable result using denoising. + +| Raw Pathtraced (10000 iterations) | "Acceptably Smooth" Pathtraced (1000 iterations) | Difference | +|:----------: |:-------------: |:------:| +| ![](img/results/acceptably_smooth/cornell_ceiling_light_10000samp.PNG) | ![](img/results/acceptably_smooth/cornell_ceiling_light_1000samp2.PNG) | ![](img/results/acceptably_smooth/cornell_ceiling_10000_1000_diff.PNG) +

Figure 14. Comparison of the ground-truth result to an acceptably-smoothed image.

+ +With denoising, it takes about 100 iterations to achieve the "acceptably smooth" result. The difference image is about the same as the one above. +This is about a 90% decrease in iterations. + +| "Acceptably Smooth" Pathtraced (1000 iterations) | Denoised (100 iterations) | Difference | +|:----------: |:-------------: |:------:| +| ![](img/results/acceptably_smooth/cornell_ceiling_light_1000samp2.PNG) | ![](img/results/acceptably_smooth/cornell_ceiling_light_100samp_denoised.PNG) | ![](img/results/acceptably_smooth/cornell_ceiling_denoised_diff.PNG) +

Figure 15. Fewer iterations are necessary to achieve an acceptably smooth result using denoising.

+ +#### Denoising at Different Resolutions + +Denoising time increases with increasing resolution. This is because we have to run our filter over more pixels, which +will directly increase runtime. + +![](img/results/graphs/denoising_resolution.png) +

Figure 16. Impact of increasing image resolution on denoising kernel execution time.

+ +### Bloopers + +**Foggy Cornell** + +![](img/results/bloopers/blooper1.PNG) + +**Ghost Lights** + +![](img/results/bloopers/blooper2.PNG) + +### References + +* Paper - [Edge-Avoiding À-Trous Wavelet Transform for fast Global Illumination Filtering](https://jo.dreggn.org/home/2010_atrous.pdf) +* Convolution Filter - [The à trous algorithm](https://www.eso.org/sci/software/esomidas/doc/user/18NOV/volb/node317.html) +* UPenn CIS 565 Course Notes diff --git a/img/atrous_kernels.PNG b/img/atrous_kernels.PNG new file mode 100644 index 00000000..f758bc46 Binary files /dev/null and b/img/atrous_kernels.PNG differ diff --git a/img/cover_denoised_100.PNG b/img/cover_denoised_100.PNG new file mode 100644 index 00000000..52ea76b4 Binary files /dev/null and b/img/cover_denoised_100.PNG differ diff --git a/img/cover_denoised_1000.PNG b/img/cover_denoised_1000.PNG new file mode 100644 index 00000000..db9d0549 Binary files /dev/null and b/img/cover_denoised_1000.PNG differ diff --git a/img/cover_pathtraced_100.png b/img/cover_pathtraced_100.png new file mode 100644 index 00000000..2a1caf06 Binary files /dev/null and b/img/cover_pathtraced_100.png differ diff --git a/img/cover_pathtraced_1000.PNG b/img/cover_pathtraced_1000.PNG new file mode 100644 index 00000000..b67e6dfe Binary files /dev/null and b/img/cover_pathtraced_1000.PNG differ diff --git a/img/results/acceptably_smooth/cornell_ceiling_10000_1000_diff.PNG b/img/results/acceptably_smooth/cornell_ceiling_10000_1000_diff.PNG new file mode 100644 index 00000000..c48c7b61 Binary files /dev/null and b/img/results/acceptably_smooth/cornell_ceiling_10000_1000_diff.PNG differ diff --git a/img/results/acceptably_smooth/cornell_ceiling_denoised_diff.PNG b/img/results/acceptably_smooth/cornell_ceiling_denoised_diff.PNG new file mode 100644 index 00000000..26ce6b78 Binary files /dev/null and b/img/results/acceptably_smooth/cornell_ceiling_denoised_diff.PNG differ diff --git 
a/img/results/acceptably_smooth/cornell_ceiling_light_10000samp.PNG b/img/results/acceptably_smooth/cornell_ceiling_light_10000samp.PNG new file mode 100644 index 00000000..46750aad Binary files /dev/null and b/img/results/acceptably_smooth/cornell_ceiling_light_10000samp.PNG differ diff --git a/img/results/acceptably_smooth/cornell_ceiling_light_1000samp.PNG b/img/results/acceptably_smooth/cornell_ceiling_light_1000samp.PNG new file mode 100644 index 00000000..87786945 Binary files /dev/null and b/img/results/acceptably_smooth/cornell_ceiling_light_1000samp.PNG differ diff --git a/img/results/acceptably_smooth/cornell_ceiling_light_1000samp2.PNG b/img/results/acceptably_smooth/cornell_ceiling_light_1000samp2.PNG new file mode 100644 index 00000000..da38dcd8 Binary files /dev/null and b/img/results/acceptably_smooth/cornell_ceiling_light_1000samp2.PNG differ diff --git a/img/results/acceptably_smooth/cornell_ceiling_light_100samp_denoised.PNG b/img/results/acceptably_smooth/cornell_ceiling_light_100samp_denoised.PNG new file mode 100644 index 00000000..c54139d8 Binary files /dev/null and b/img/results/acceptably_smooth/cornell_ceiling_light_100samp_denoised.PNG differ diff --git a/img/results/base_cornell_ceiling_light_10iter.png b/img/results/base_cornell_ceiling_light_10iter.png new file mode 100644 index 00000000..163c3199 Binary files /dev/null and b/img/results/base_cornell_ceiling_light_10iter.png differ diff --git a/img/results/basic_blur/basic_blur_100samp.PNG b/img/results/basic_blur/basic_blur_100samp.PNG new file mode 100644 index 00000000..c5ad7e90 Binary files /dev/null and b/img/results/basic_blur/basic_blur_100samp.PNG differ diff --git a/img/results/basic_blur/cornell_10iter_100samp.PNG b/img/results/basic_blur/cornell_10iter_100samp.PNG new file mode 100644 index 00000000..c2f6a002 Binary files /dev/null and b/img/results/basic_blur/cornell_10iter_100samp.PNG differ diff --git a/img/results/basic_blur/cornell_1iter_100samp.PNG 
b/img/results/basic_blur/cornell_1iter_100samp.PNG new file mode 100644 index 00000000..3f5859b4 Binary files /dev/null and b/img/results/basic_blur/cornell_1iter_100samp.PNG differ diff --git a/img/results/basic_blur/cornell_2iter_100samp.PNG b/img/results/basic_blur/cornell_2iter_100samp.PNG new file mode 100644 index 00000000..e4251e29 Binary files /dev/null and b/img/results/basic_blur/cornell_2iter_100samp.PNG differ diff --git a/img/results/basic_blur/cornell_3iter_100samp.PNG b/img/results/basic_blur/cornell_3iter_100samp.PNG new file mode 100644 index 00000000..b6322be7 Binary files /dev/null and b/img/results/basic_blur/cornell_3iter_100samp.PNG differ diff --git a/img/results/basic_blur/cornell_4iter_100samp.PNG b/img/results/basic_blur/cornell_4iter_100samp.PNG new file mode 100644 index 00000000..78734733 Binary files /dev/null and b/img/results/basic_blur/cornell_4iter_100samp.PNG differ diff --git a/img/results/basic_blur/cornell_5iter_100samp.PNG b/img/results/basic_blur/cornell_5iter_100samp.PNG new file mode 100644 index 00000000..6c8e69e1 Binary files /dev/null and b/img/results/basic_blur/cornell_5iter_100samp.PNG differ diff --git a/img/results/basic_blur/cornell_6iter_100samp.PNG b/img/results/basic_blur/cornell_6iter_100samp.PNG new file mode 100644 index 00000000..1185f837 Binary files /dev/null and b/img/results/basic_blur/cornell_6iter_100samp.PNG differ diff --git a/img/results/basic_blur/cornell_7iter_100samp.PNG b/img/results/basic_blur/cornell_7iter_100samp.PNG new file mode 100644 index 00000000..cbf949b4 Binary files /dev/null and b/img/results/basic_blur/cornell_7iter_100samp.PNG differ diff --git a/img/results/basic_blur/cornell_8iter_100samp.PNG b/img/results/basic_blur/cornell_8iter_100samp.PNG new file mode 100644 index 00000000..8e5015e1 Binary files /dev/null and b/img/results/basic_blur/cornell_8iter_100samp.PNG differ diff --git a/img/results/basic_blur/cornell_9iter_100samp.PNG 
b/img/results/basic_blur/cornell_9iter_100samp.PNG new file mode 100644 index 00000000..3032f58b Binary files /dev/null and b/img/results/basic_blur/cornell_9iter_100samp.PNG differ diff --git a/img/results/basic_blur/edge_avoiding_100samp.PNG b/img/results/basic_blur/edge_avoiding_100samp.PNG new file mode 100644 index 00000000..c2bc7e73 Binary files /dev/null and b/img/results/basic_blur/edge_avoiding_100samp.PNG differ diff --git a/img/results/basic_blur/gimp_blur_100samp.PNG b/img/results/basic_blur/gimp_blur_100samp.PNG new file mode 100644 index 00000000..72b30948 Binary files /dev/null and b/img/results/basic_blur/gimp_blur_100samp.PNG differ diff --git a/img/results/basic_blur/no_blur_100samp.PNG b/img/results/basic_blur/no_blur_100samp.PNG new file mode 100644 index 00000000..dc621313 Binary files /dev/null and b/img/results/basic_blur/no_blur_100samp.PNG differ diff --git a/img/results/bloopers/blooper1.PNG b/img/results/bloopers/blooper1.PNG new file mode 100644 index 00000000..b9d4b933 Binary files /dev/null and b/img/results/bloopers/blooper1.PNG differ diff --git a/img/results/bloopers/blooper2.PNG b/img/results/bloopers/blooper2.PNG new file mode 100644 index 00000000..6c364e47 Binary files /dev/null and b/img/results/bloopers/blooper2.PNG differ diff --git a/img/results/cornell_ceiling_light_depth_10samp.PNG b/img/results/cornell_ceiling_light_depth_10samp.PNG new file mode 100644 index 00000000..28f4b08b Binary files /dev/null and b/img/results/cornell_ceiling_light_depth_10samp.PNG differ diff --git a/img/results/cornell_ceiling_light_normal_10samp.PNG b/img/results/cornell_ceiling_light_normal_10samp.PNG new file mode 100644 index 00000000..a24bcf12 Binary files /dev/null and b/img/results/cornell_ceiling_light_normal_10samp.PNG differ diff --git a/img/results/cornell_ceiling_light_position_10samp.PNG b/img/results/cornell_ceiling_light_position_10samp.PNG new file mode 100644 index 00000000..3bfe34f0 Binary files /dev/null and 
b/img/results/cornell_ceiling_light_position_10samp.PNG differ diff --git a/img/results/cornell_denoised_1000samp.PNG b/img/results/cornell_denoised_1000samp.PNG new file mode 100644 index 00000000..27ee5f2a Binary files /dev/null and b/img/results/cornell_denoised_1000samp.PNG differ diff --git a/img/results/cornell_denoised_100samp.PNG b/img/results/cornell_denoised_100samp.PNG new file mode 100644 index 00000000..d422c9d6 Binary files /dev/null and b/img/results/cornell_denoised_100samp.PNG differ diff --git a/img/results/cornell_pathtraced_1000samp.PNG b/img/results/cornell_pathtraced_1000samp.PNG new file mode 100644 index 00000000..e330cf1a Binary files /dev/null and b/img/results/cornell_pathtraced_1000samp.PNG differ diff --git a/img/results/cornell_pathtraced_100samp.PNG b/img/results/cornell_pathtraced_100samp.PNG new file mode 100644 index 00000000..b65fa11e Binary files /dev/null and b/img/results/cornell_pathtraced_100samp.PNG differ diff --git a/img/results/depth.PNG b/img/results/depth.PNG new file mode 100644 index 00000000..80cee3da Binary files /dev/null and b/img/results/depth.PNG differ diff --git a/img/results/filter_size/filter10.PNG b/img/results/filter_size/filter10.PNG new file mode 100644 index 00000000..7845998d Binary files /dev/null and b/img/results/filter_size/filter10.PNG differ diff --git a/img/results/filter_size/filter10_2.PNG b/img/results/filter_size/filter10_2.PNG new file mode 100644 index 00000000..e2a537c1 Binary files /dev/null and b/img/results/filter_size/filter10_2.PNG differ diff --git a/img/results/filter_size/filter1280.PNG b/img/results/filter_size/filter1280.PNG new file mode 100644 index 00000000..b8ffe49a Binary files /dev/null and b/img/results/filter_size/filter1280.PNG differ diff --git a/img/results/filter_size/filter1280_2.PNG b/img/results/filter_size/filter1280_2.PNG new file mode 100644 index 00000000..a8c77a47 Binary files /dev/null and b/img/results/filter_size/filter1280_2.PNG differ diff --git 
a/img/results/filter_size/filter160.PNG b/img/results/filter_size/filter160.PNG new file mode 100644 index 00000000..158077e0 Binary files /dev/null and b/img/results/filter_size/filter160.PNG differ diff --git a/img/results/filter_size/filter160_2.PNG b/img/results/filter_size/filter160_2.PNG new file mode 100644 index 00000000..2e9e91b1 Binary files /dev/null and b/img/results/filter_size/filter160_2.PNG differ diff --git a/img/results/filter_size/filter20.PNG b/img/results/filter_size/filter20.PNG new file mode 100644 index 00000000..e873d526 Binary files /dev/null and b/img/results/filter_size/filter20.PNG differ diff --git a/img/results/filter_size/filter20_2.PNG b/img/results/filter_size/filter20_2.PNG new file mode 100644 index 00000000..51b97aba Binary files /dev/null and b/img/results/filter_size/filter20_2.PNG differ diff --git a/img/results/filter_size/filter320.PNG b/img/results/filter_size/filter320.PNG new file mode 100644 index 00000000..aacaf84c Binary files /dev/null and b/img/results/filter_size/filter320.PNG differ diff --git a/img/results/filter_size/filter320_2.PNG b/img/results/filter_size/filter320_2.PNG new file mode 100644 index 00000000..b5f2713c Binary files /dev/null and b/img/results/filter_size/filter320_2.PNG differ diff --git a/img/results/filter_size/filter40.PNG b/img/results/filter_size/filter40.PNG new file mode 100644 index 00000000..4002e814 Binary files /dev/null and b/img/results/filter_size/filter40.PNG differ diff --git a/img/results/filter_size/filter40_2.PNG b/img/results/filter_size/filter40_2.PNG new file mode 100644 index 00000000..2154d3c8 Binary files /dev/null and b/img/results/filter_size/filter40_2.PNG differ diff --git a/img/results/filter_size/filter640.PNG b/img/results/filter_size/filter640.PNG new file mode 100644 index 00000000..53649a71 Binary files /dev/null and b/img/results/filter_size/filter640.PNG differ diff --git a/img/results/filter_size/filter640_2.PNG b/img/results/filter_size/filter640_2.PNG new 
file mode 100644 index 00000000..ed6ec329 Binary files /dev/null and b/img/results/filter_size/filter640_2.PNG differ diff --git a/img/results/filter_size/filter80.PNG b/img/results/filter_size/filter80.PNG new file mode 100644 index 00000000..62d91e73 Binary files /dev/null and b/img/results/filter_size/filter80.PNG differ diff --git a/img/results/filter_size/filter80_2.PNG b/img/results/filter_size/filter80_2.PNG new file mode 100644 index 00000000..94c69d2d Binary files /dev/null and b/img/results/filter_size/filter80_2.PNG differ diff --git a/img/results/graphs/denoising_filtersize.png b/img/results/graphs/denoising_filtersize.png new file mode 100644 index 00000000..c8a03c30 Binary files /dev/null and b/img/results/graphs/denoising_filtersize.png differ diff --git a/img/results/graphs/denoising_iterations.png b/img/results/graphs/denoising_iterations.png new file mode 100644 index 00000000..d3b02a10 Binary files /dev/null and b/img/results/graphs/denoising_iterations.png differ diff --git a/img/results/graphs/denoising_resolution.png b/img/results/graphs/denoising_resolution.png new file mode 100644 index 00000000..24594f53 Binary files /dev/null and b/img/results/graphs/denoising_resolution.png differ diff --git a/img/results/materials/diffuse_denoised_1000samp.PNG b/img/results/materials/diffuse_denoised_1000samp.PNG new file mode 100644 index 00000000..65329499 Binary files /dev/null and b/img/results/materials/diffuse_denoised_1000samp.PNG differ diff --git a/img/results/materials/diffuse_denoised_100samp.PNG b/img/results/materials/diffuse_denoised_100samp.PNG new file mode 100644 index 00000000..49eec7ea Binary files /dev/null and b/img/results/materials/diffuse_denoised_100samp.PNG differ diff --git a/img/results/materials/diffuse_pathtraced_10000samp.PNG b/img/results/materials/diffuse_pathtraced_10000samp.PNG new file mode 100644 index 00000000..bf0dca81 Binary files /dev/null and b/img/results/materials/diffuse_pathtraced_10000samp.PNG differ diff 
--git a/img/results/materials/diffuse_pathtraced_1000samp.PNG b/img/results/materials/diffuse_pathtraced_1000samp.PNG new file mode 100644 index 00000000..c0138cd9 Binary files /dev/null and b/img/results/materials/diffuse_pathtraced_1000samp.PNG differ diff --git a/img/results/materials/diffuse_pathtraced_100samp.PNG b/img/results/materials/diffuse_pathtraced_100samp.PNG new file mode 100644 index 00000000..5d5ef493 Binary files /dev/null and b/img/results/materials/diffuse_pathtraced_100samp.PNG differ diff --git a/img/results/materials/specular_denoised_1000samp.PNG b/img/results/materials/specular_denoised_1000samp.PNG new file mode 100644 index 00000000..53f106f8 Binary files /dev/null and b/img/results/materials/specular_denoised_1000samp.PNG differ diff --git a/img/results/materials/specular_denoised_100samp.PNG b/img/results/materials/specular_denoised_100samp.PNG new file mode 100644 index 00000000..8f354445 Binary files /dev/null and b/img/results/materials/specular_denoised_100samp.PNG differ diff --git a/img/results/materials/specular_pathtraced_10000samp.PNG b/img/results/materials/specular_pathtraced_10000samp.PNG new file mode 100644 index 00000000..ef4004fd Binary files /dev/null and b/img/results/materials/specular_pathtraced_10000samp.PNG differ diff --git a/img/results/materials/specular_pathtraced_1000samp.PNG b/img/results/materials/specular_pathtraced_1000samp.PNG new file mode 100644 index 00000000..e0f1fd18 Binary files /dev/null and b/img/results/materials/specular_pathtraced_1000samp.PNG differ diff --git a/img/results/materials/specular_pathtraced_100samp.PNG b/img/results/materials/specular_pathtraced_100samp.PNG new file mode 100644 index 00000000..cf6a7f15 Binary files /dev/null and b/img/results/materials/specular_pathtraced_100samp.PNG differ diff --git a/img/results/normals.PNG b/img/results/normals.PNG new file mode 100644 index 00000000..a41dad03 Binary files /dev/null and b/img/results/normals.PNG differ diff --git 
a/img/results/positions.PNG b/img/results/positions.PNG new file mode 100644 index 00000000..0b6c6176 Binary files /dev/null and b/img/results/positions.PNG differ diff --git a/scenes/cornell.txt b/scenes/cornell.txt index 83ff8202..d6155ea2 100644 --- a/scenes/cornell.txt +++ b/scenes/cornell.txt @@ -52,7 +52,7 @@ EMITTANCE 0 CAMERA RES 800 800 FOVY 45 -ITERATIONS 5000 +ITERATIONS 1000 DEPTH 8 FILE cornell EYE 0.0 5 10.5 diff --git a/scenes/cornell_ceiling_light.txt b/scenes/cornell_ceiling_light.txt index 15af5f19..113033d5 100644 --- a/scenes/cornell_ceiling_light.txt +++ b/scenes/cornell_ceiling_light.txt @@ -52,7 +52,7 @@ EMITTANCE 0 CAMERA RES 800 800 FOVY 45 -ITERATIONS 10 +ITERATIONS 100 DEPTH 8 FILE cornell EYE 0.0 5 10.5 diff --git a/scenes/custom_scene.txt b/scenes/custom_scene.txt new file mode 100644 index 00000000..dc53e370 --- /dev/null +++ b/scenes/custom_scene.txt @@ -0,0 +1,188 @@ +// Emissive material (light) +MATERIAL 0 +RGB 1 1 1 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 5 + +// Diffuse white +MATERIAL 1 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse red +MATERIAL 2 +RGB .85 .35 .55 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse green +MATERIAL 3 +RGB .35 .85 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular white +MATERIAL 4 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 1 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse aqua +MATERIAL 5 +RGB 0.4 0.9 0.6 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse grey +MATERIAL 6 +RGB 0.6 0.6 0.6 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse blue +MATERIAL 7 +RGB 0.5 0.5 0.9 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular pink +MATERIAL 8 +RGB 0.98 0.98 0.98 +SPECEX 0 +SPECRGB 0.9 0.6 0.9 +REFL 1 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Camera +CAMERA +RES 1000 800 +FOVY 35 
+ITERATIONS 1000 +DEPTH 8 +FILE custom_scene +EYE 0.0 5 10.5 +LOOKAT 0 5 0 +UP 0 1 0 + +// Ceiling light middle +OBJECT 0 +cube +material 0 +TRANS 0 10 0 +ROTAT 0 0 0 +SCALE 3 .3 3 + +// Ceiling light left +OBJECT 1 +cube +material 0 +TRANS -5 10 0 +ROTAT 0 0 0 +SCALE 3 .3 3 + +// Ceiling light right +OBJECT 2 +cube +material 0 +TRANS 5 10 0 +ROTAT 0 0 0 +SCALE 3 .3 3 + +// Floor +OBJECT 3 +cube +material 1 +TRANS 0 0 0 +ROTAT 0 0 0 +SCALE 20 .01 40 + +// Ceiling +OBJECT 4 +cube +material 2 +TRANS 0 10 0 +ROTAT 0 0 90 +SCALE .01 20 40 + +// Back wall +OBJECT 5 +cube +material 1 +TRANS 0 5 -5 +ROTAT 0 90 0 +SCALE .01 10 20 + +// Left wall +OBJECT 6 +cube +material 5 +TRANS -10 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Right wall +OBJECT 7 +cube +material 5 +TRANS 10 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Sphere +OBJECT 8 +sphere +material 1 +TRANS 2.0 1.25 -1 +ROTAT 0 0 0 +SCALE 3 3 3 + +// Cube +OBJECT 9 +cube +material 1 +TRANS -0.9 3 -3 +ROTAT 0 27.5 0 +SCALE 3 6 3 + +// Sphere +OBJECT 10 +sphere +material 8 +TRANS -2.75 1.25 0.1 +ROTAT 0 0 0 +SCALE 3 3 3 diff --git a/src/main.cpp b/src/main.cpp index 4092ae4a..922cf748 100644 --- a/src/main.cpp +++ b/src/main.cpp @@ -23,12 +23,14 @@ int ui_iterations = 0; int startupIterations = 0; int lastLoopIterations = 0; bool ui_showGbuffer = false; +int ui_currentBuffer = 0; bool ui_denoise = false; int ui_filterSize = 80; -float ui_colorWeight = 0.45f; -float ui_normalWeight = 0.35f; -float ui_positionWeight = 0.2f; +float ui_colorWeight = 150.f; +float ui_normalWeight = 0.5f; +float ui_positionWeight = 0.4f; bool ui_saveAndExit = false; +int denoise_call_count = 0; static bool camchanged = true; static float dtheta = 0, dphi = 0; @@ -144,6 +146,7 @@ void runCuda() { cameraPosition += cam.lookAt; cam.position = cameraPosition; camchanged = false; + //ui_denoise = true; } // Map OpenGL buffer object for writing from CUDA on a single GPU @@ -166,9 +169,16 @@ void runCuda() { } if (ui_showGbuffer) { - showGBuffer(pbo_dptr); - 
} else { - showImage(pbo_dptr, iteration); + showGBuffer(pbo_dptr, ui_currentBuffer); + } + else if (ui_denoise) { + if (iteration == ui_iterations) { + denoise(pbo_dptr, iteration, ui_filterSize, ui_colorWeight, ui_normalWeight, ui_positionWeight, denoise_call_count); + denoise_call_count++; + } + } + else { + showImage(pbo_dptr, iteration); } // unmap buffer object diff --git a/src/main.h b/src/main.h index 06d311a8..6bedbe47 100644 --- a/src/main.h +++ b/src/main.h @@ -35,6 +35,7 @@ extern int height; extern int ui_iterations; extern int startupIterations; extern bool ui_showGbuffer; +extern int ui_currentBuffer; extern bool ui_denoise; extern int ui_filterSize; extern float ui_colorWeight; diff --git a/src/pathtrace.cu b/src/pathtrace.cu index 23e5f909..d1b7a9da 100644 --- a/src/pathtrace.cu +++ b/src/pathtrace.cu @@ -15,6 +15,7 @@ #include "interactions.h" #define ERRORCHECK 1 +#define DENOISE_ITERATIONS 5 #define FILENAME (strrchr(__FILE__, '/') ? strrchr(__FILE__, '/') + 1 : __FILE__) #define checkCUDAError(msg) checkCUDAErrorFn(msg, FILENAME, __LINE__) @@ -38,12 +39,108 @@ void checkCUDAErrorFn(const char *msg, const char *file, int line) { #endif } +PerformanceTimer& timer() +{ + static PerformanceTimer timer; + return timer; +} + +template<typename T> +void printElapsedTime(T time, std::string note = "") +{ + std::cout << " elapsed time: " << time << "ms " << note << std::endl; +} + __host__ __device__ thrust::default_random_engine makeSeededRandomEngine(int iter, int index, int depth) { int h = utilhash((1 << 31) | (depth << 22) | iter) ^ utilhash(index); return thrust::default_random_engine(h); } +__global__ void denoiseBasicBlur(glm::vec3* denoisedImage, glm::ivec2 resolution, + int iter, int stepsize, glm::vec3* image, GBufferPixel* dev_gBuffer, + float* dev_kernel, glm::ivec2* dev_offset) { + int x = (blockIdx.x * blockDim.x) + threadIdx.x; + int y = (blockIdx.y * blockDim.y) + threadIdx.y; + + if (x < resolution.x && y < resolution.y) { + int index = x + (y * 
resolution.x); + + glm::vec3 sum = glm::vec3(0.f, 0.f, 0.f); + for (int i = 0; i < 25; i++) { + glm::ivec2 offset = dev_offset[i] * stepsize; + glm::ivec2 uv = glm::ivec2(x, y) + offset; + + // Clamp indices to image width and height + uv = glm::clamp(uv, glm::ivec2(0, 0), glm::ivec2(resolution.x - 1, resolution.y - 1)); + + // Apply kernel + glm::vec3 col = image[uv.x + resolution.x * uv.y]; + sum += col * dev_kernel[i]; + } + + // Write color to OpenGL PBO + denoisedImage[index] = sum; + } +} + +__global__ void denoiseWeighted(glm::vec3* denoisedImage, glm::ivec2 resolution, + int iter, int stepsize, float sigma_col, float sigma_norm, float sigma_pos, + glm::vec3* image, GBufferPixel* dev_gBuffer, + float* dev_kernel, glm::ivec2* dev_offset) { + int x = (blockIdx.x * blockDim.x) + threadIdx.x; + int y = (blockIdx.y * blockDim.y) + threadIdx.y; + + if (x < resolution.x && y < resolution.y) { + int index = x + (y * resolution.x); + + // Read G-Buffer values for current pixel + glm::vec3 col = image[index]; + glm::vec3 norm = dev_gBuffer[index].normal; + glm::vec3 pos = dev_gBuffer[index].position; + + // Sum to accumulate color + glm::vec3 sum = glm::vec3(0.f, 0.f, 0.f); + float sum_weights = 0.f; + + glm::vec3 test_col; + + for (int i = 0; i < 25; i++) { + // Find pixel in image at desired offset (clamp to image boundaries) + glm::ivec2 offset = dev_offset[i] * stepsize; + glm::ivec2 uv = glm::ivec2(x, y) + offset; + uv = clamp(uv, glm::ivec2(0, 0), glm::ivec2(resolution.x - 1, resolution.y - 1)); + + // Find difference in color between current and neighboring pixel + glm::vec3 col_n = image[uv.x + resolution.x * uv.y]; + glm::vec3 col_diff = col - col_n; + float dist2 = glm::dot(col_diff, col_diff); + float col_w = glm::min(glm::exp(-(dist2) / (sigma_col * sigma_col)), 1.f); + + // Normal + glm::vec3 norm_n = dev_gBuffer[uv.x + resolution.x * uv.y].normal; + glm::vec3 norm_diff = norm - norm_n; + dist2 = glm::max(glm::dot(norm_diff, norm_diff) / (float)(stepsize 
* stepsize), 0.f); + float norm_w = glm::min(glm::exp(-(dist2) / (sigma_norm * sigma_norm)), 1.f); + + // Position + glm::vec3 pos_n = dev_gBuffer[uv.x + resolution.x * uv.y].position; + glm::vec3 pos_diff = pos - pos_n; + dist2 = glm::dot(pos_diff, pos_diff); + float pos_w = glm::min(glm::exp(-(dist2) / (sigma_pos * sigma_pos)), 1.f); + + // Calculate weighting + float weight = col_w * norm_w * pos_w; + sum += col_n * weight * dev_kernel[i]; + sum_weights += weight * dev_kernel[i]; + } + + // Write color to OpenGL PBO + denoisedImage[index] = sum / sum_weights; + } + +} + //Kernel that writes the image to the OpenGL PBO directly. __global__ void sendImageToPBO(uchar4* pbo, glm::ivec2 resolution, int iter, glm::vec3* image) { @@ -67,39 +164,84 @@ __global__ void sendImageToPBO(uchar4* pbo, glm::ivec2 resolution, } } -__global__ void gbufferToPBO(uchar4* pbo, glm::ivec2 resolution, GBufferPixel* gBuffer) { +__global__ void gbufferToPBO(uchar4* pbo, int ui_currentBuffer, glm::ivec2 resolution, GBufferPixel* gBuffer) { int x = (blockIdx.x * blockDim.x) + threadIdx.x; int y = (blockIdx.y * blockDim.y) + threadIdx.y; if (x < resolution.x && y < resolution.y) { int index = x + (y * resolution.x); float timeToIntersect = gBuffer[index].t * 256.0; - - pbo[index].w = 0; - pbo[index].x = timeToIntersect; - pbo[index].y = timeToIntersect; - pbo[index].z = timeToIntersect; + glm::vec3 position = 0.1f * abs(gBuffer[index].position); + glm::vec3 normal = 0.5f * (gBuffer[index].normal + glm::vec3(1.f, 1.f, 1.f)); + + if (ui_currentBuffer == 0) { + pbo[index].w = 0; + pbo[index].x = timeToIntersect; + pbo[index].y = timeToIntersect; + pbo[index].z = timeToIntersect; + } + else if (ui_currentBuffer == 1) { + pbo[index].w = 0; + pbo[index].x = normal.x * 255.0; + pbo[index].y = normal.y * 255.0; + pbo[index].z = normal.z * 255.0; + } + else { + pbo[index].w = 0; + pbo[index].x = position.x * 256.0; + pbo[index].y = position.y * 256.0; + pbo[index].z = position.z * 256.0; + } } } 
static Scene * hst_scene = NULL; static glm::vec3 * dev_image = NULL; +static glm::vec3* dev_denoised_image_in = NULL; +static glm::vec3* dev_denoised_image_out = NULL; static Geom * dev_geoms = NULL; static Material * dev_materials = NULL; static PathSegment * dev_paths = NULL; static ShadeableIntersection * dev_intersections = NULL; static GBufferPixel* dev_gBuffer = NULL; +static float* dev_kernel = NULL; +static glm::ivec2* dev_offsets = NULL; // TODO: static variables for device memory, any extra info you need, etc // ... +glm::ivec2 offsets[25]; + +float kernel[25] = { 1.f / 256.f, 1.f / 64.f, 3.f / 128.f, 1.f / 64.f, 1.f / 256.f, + 1.f / 64.f, 1.f / 16.f, 3.f / 32.f, 1.f / 16.f, 1 / 64.f, + 3.f / 128.f, 3.f / 32.f, 9.f / 64.f, 3.f / 32.f, 3.f / 128.f, + 1.f / 64.f, 1.f / 16.f, 3.f / 32.f, 1.f / 16.f, 1 / 64.f, + 1.f / 256.f, 1.f / 64.f, 3.f / 128.f, 1.f / 64.f, 1.f / 256.f }; + void pathtraceInit(Scene *scene) { hst_scene = scene; const Camera &cam = hst_scene->state.camera; const int pixelcount = cam.resolution.x * cam.resolution.y; + // Fill offset array + int count = 0; + for (int j = -2; j <= 2; ++j) { + for (int i = -2; i <= 2; ++i) { + offsets[count] = glm::ivec2(i, j); + //std::cout << "(" << count << "): " << "(" << offsets[count].x << ", " << offsets[count].y << ")" << std::endl; + ++count; + } + } + cudaMalloc(&dev_image, pixelcount * sizeof(glm::vec3)); cudaMemset(dev_image, 0, pixelcount * sizeof(glm::vec3)); + // To store intermediate results after denoising + cudaMalloc(&dev_denoised_image_in, pixelcount * sizeof(glm::vec3)); + cudaMemset(dev_denoised_image_in, 0, pixelcount * sizeof(glm::vec3)); + + cudaMalloc(&dev_denoised_image_out, pixelcount * sizeof(glm::vec3)); + cudaMemset(dev_denoised_image_out, 0, pixelcount * sizeof(glm::vec3)); + cudaMalloc(&dev_paths, pixelcount * sizeof(PathSegment)); cudaMalloc(&dev_geoms, scene->geoms.size() * sizeof(Geom)); @@ -114,6 +256,11 @@ void pathtraceInit(Scene *scene) { cudaMalloc(&dev_gBuffer, 
pixelcount * sizeof(GBufferPixel)); // TODO: initialize any extra device memeory you need + cudaMalloc(&dev_kernel, 25 * sizeof(float)); + cudaMemcpy(dev_kernel, kernel, 25 * sizeof(float), cudaMemcpyHostToDevice); + + cudaMalloc(&dev_offsets, 25 * sizeof(glm::ivec2)); + cudaMemcpy(dev_offsets, offsets, 25 * sizeof(glm::ivec2), cudaMemcpyHostToDevice); checkCUDAError("pathtraceInit"); } @@ -126,6 +273,10 @@ void pathtraceFree() { cudaFree(dev_intersections); cudaFree(dev_gBuffer); // TODO: clean up any extra device memory you created + cudaFree(dev_denoised_image_in); + cudaFree(dev_denoised_image_out); + cudaFree(dev_kernel); + cudaFree(dev_offsets); checkCUDAError("pathtraceFree"); } @@ -148,7 +299,7 @@ __global__ void generateRayFromCamera(Camera cam, int iter, int traceDepth, Path PathSegment & segment = pathSegments[index]; segment.ray.origin = cam.position; - segment.color = glm::vec3(1.0f, 1.0f, 1.0f); + segment.color = glm::vec3(1.0f, 1.0f, 1.0f); segment.ray.direction = glm::normalize(cam.view - cam.right * cam.pixelLength.x * ((float)x - (float)cam.resolution.x * 0.5f) @@ -281,7 +432,9 @@ __global__ void generateGBuffer ( int idx = blockIdx.x * blockDim.x + threadIdx.x; if (idx < num_paths) { - gBuffer[idx].t = shadeableIntersections[idx].t; + gBuffer[idx].t = shadeableIntersections[idx].t * 0.05f; + gBuffer[idx].position = getPointOnRay(pathSegments[idx].ray, shadeableIntersections[idx].t); + gBuffer[idx].normal = shadeableIntersections[idx].surfaceNormal; } } @@ -356,46 +509,47 @@ void pathtrace(int frame, int iter) { // --- PathSegment Tracing Stage --- // Shoot ray into scene, bounce between objects, push shading chunks - // Empty gbuffer - cudaMemset(dev_gBuffer, 0, pixelcount * sizeof(GBufferPixel)); + // Empty gbuffer + cudaMemset(dev_gBuffer, 0, pixelcount * sizeof(GBufferPixel)); // clean shading chunks cudaMemset(dev_intersections, 0, pixelcount * sizeof(ShadeableIntersection)); - bool iterationComplete = false; + bool iterationComplete = false; 
while (!iterationComplete) { - // tracing - dim3 numblocksPathSegmentTracing = (num_paths + blockSize1d - 1) / blockSize1d; - computeIntersections <<<numblocksPathSegmentTracing, blockSize1d>>> ( - depth - , num_paths - , dev_paths - , dev_geoms - , hst_scene->geoms.size() - , dev_intersections - ); - checkCUDAError("trace one bounce"); - cudaDeviceSynchronize(); - - if (depth == 0) { - generateGBuffer<<<numblocksPathSegmentTracing, blockSize1d>>>(num_paths, dev_intersections, dev_paths, dev_gBuffer); - } - - depth++; - - shadeSimpleMaterials<<<numblocksPathSegmentTracing, blockSize1d>>> ( - iter, - num_paths, - dev_intersections, - dev_paths, - dev_materials - ); - iterationComplete = depth == traceDepth; + // tracing + dim3 numblocksPathSegmentTracing = (num_paths + blockSize1d - 1) / blockSize1d; + computeIntersections <<<numblocksPathSegmentTracing, blockSize1d>>> ( + depth + , num_paths + , dev_paths + , dev_geoms + , hst_scene->geoms.size() + , dev_intersections + ); + checkCUDAError("trace one bounce"); + cudaDeviceSynchronize(); + + // For first depth, generate gBuffer + if (depth == 0) { + generateGBuffer<<<numblocksPathSegmentTracing, blockSize1d>>>(num_paths, dev_intersections, dev_paths, dev_gBuffer); + } + + depth++; + + shadeSimpleMaterials<<<numblocksPathSegmentTracing, blockSize1d>>> ( + iter, + num_paths, + dev_intersections, + dev_paths, + dev_materials + ); + iterationComplete = depth == traceDepth; } - // Assemble this iteration and apply it to the image - dim3 numBlocksPixels = (pixelcount + blockSize1d - 1) / blockSize1d; + // Assemble this iteration and apply it to the image + dim3 numBlocksPixels = (pixelcount + blockSize1d - 1) / blockSize1d; finalGather<<<numBlocksPixels, blockSize1d>>>(num_paths, dev_image, dev_paths); /////////////////////////////////////////////////////////////////////////// @@ -410,7 +564,7 @@ void pathtrace(int frame, int iter) { } // CHECKITOUT: this kernel "post-processes" the gbuffer/gbuffers into something that you can visualize for debugging. 
-void showGBuffer(uchar4* pbo) { +void showGBuffer(uchar4* pbo, int ui_currentBuffer) { const Camera &cam = hst_scene->state.camera; const dim3 blockSize2d(8, 8); const dim3 blocksPerGrid2d( @@ -418,7 +572,47 @@ void showGBuffer(uchar4* pbo) { (cam.resolution.y + blockSize2d.y - 1) / blockSize2d.y); // CHECKITOUT: process the gbuffer results and send them to OpenGL buffer for visualization - gbufferToPBO<<<blocksPerGrid2d, blockSize2d>>>(pbo, cam.resolution, dev_gBuffer); + gbufferToPBO<<<blocksPerGrid2d, blockSize2d>>>(pbo, ui_currentBuffer, cam.resolution, dev_gBuffer); +} + +void denoise(uchar4* pbo, int iter, int filterSize, float sigma_col, float sigma_norm, float sigma_pos, int call_count) { + const Camera& cam = hst_scene->state.camera; + const dim3 blockSize2d(8, 8); + const dim3 blocksPerGrid2d( + (cam.resolution.x + blockSize2d.x - 1) / blockSize2d.x, + (cam.resolution.y + blockSize2d.y - 1) / blockSize2d.y); + + timer().startGpuTimer(); + // Copy initial image input data + int pixelcount = cam.resolution.x * cam.resolution.y; + cudaMemcpy(dev_denoised_image_in, dev_image, pixelcount * sizeof(glm::vec3), cudaMemcpyDeviceToDevice); + + // Calculate iterations based on filter size + int num_iterations = glm::floor(glm::log2(filterSize / 5.f)); + + // Denoise image + int stepsize = 1; + float sigma_col_div = sigma_col; + for (int i = 0; i < num_iterations; ++i) { + denoiseWeighted <<<blocksPerGrid2d, blockSize2d>>> (dev_denoised_image_out, cam.resolution, iter, stepsize, + sigma_col_div, sigma_norm, sigma_pos, + dev_denoised_image_in, dev_gBuffer, dev_kernel, dev_offsets); + + // Do not swap on last iterations + if (i != num_iterations - 1) { + std::swap(dev_denoised_image_in, dev_denoised_image_out); + } + stepsize *= 2; + sigma_col_div /= (float)stepsize; + } + + // Send results to OpenGL buffer for rendering + sendImageToPBO <<<blocksPerGrid2d, blockSize2d>>> (pbo, cam.resolution, iter, dev_denoised_image_out); + timer().endGpuTimer(); + + if (call_count < 10) { + std::cout << timer().getGpuElapsedTimeForPreviousOperation() << std::endl; + } } void showImage(uchar4* 
pbo, int iter) { diff --git a/src/pathtrace.h b/src/pathtrace.h index 9e12f440..2ac23c36 100644 --- a/src/pathtrace.h +++ b/src/pathtrace.h @@ -1,10 +1,104 @@ #pragma once +#include #include #include "scene.h" void pathtraceInit(Scene *scene); void pathtraceFree(); void pathtrace(int frame, int iteration); -void showGBuffer(uchar4 *pbo); +void showGBuffer(uchar4 *pbo, int ui_currentBuffer); +void denoise(uchar4* pbo, int iter, int filterSize, float sigma_col, float sigma_norm, float sigma_pos, int call_count); void showImage(uchar4 *pbo, int iter); + +/** +* This class is used for timing the performance +* Uncopyable and unmovable +* +* Adapted from WindyDarian(https://github.com/WindyDarian) +*/ +class PerformanceTimer +{ +public: + PerformanceTimer() + { + cudaEventCreate(&event_start); + cudaEventCreate(&event_end); + } + + ~PerformanceTimer() + { + cudaEventDestroy(event_start); + cudaEventDestroy(event_end); + } + + void startCpuTimer() + { + if (cpu_timer_started) { throw std::runtime_error("CPU timer already started"); } + cpu_timer_started = true; + + time_start_cpu = std::chrono::high_resolution_clock::now(); + } + + void endCpuTimer() + { + time_end_cpu = std::chrono::high_resolution_clock::now(); + + if (!cpu_timer_started) { throw std::runtime_error("CPU timer not started"); } + + std::chrono::duration duro = time_end_cpu - time_start_cpu; + prev_elapsed_time_cpu_milliseconds = + static_cast(duro.count()); + + cpu_timer_started = false; + } + + void startGpuTimer() + { + if (gpu_timer_started) { throw std::runtime_error("GPU timer already started"); } + gpu_timer_started = true; + + cudaEventRecord(event_start); + } + + void endGpuTimer() + { + cudaEventRecord(event_end); + cudaEventSynchronize(event_end); + + if (!gpu_timer_started) { throw std::runtime_error("GPU timer not started"); } + + cudaEventElapsedTime(&prev_elapsed_time_gpu_milliseconds, event_start, event_end); + gpu_timer_started = false; + } + + float 
getCpuElapsedTimeForPreviousOperation() //noexcept //(damn I need VS 2015 + { + return prev_elapsed_time_cpu_milliseconds; + } + + float getGpuElapsedTimeForPreviousOperation() //noexcept + { + return prev_elapsed_time_gpu_milliseconds; + } + + // remove copy and move functions + PerformanceTimer(const PerformanceTimer&) = delete; + PerformanceTimer(PerformanceTimer&&) = delete; + PerformanceTimer& operator=(const PerformanceTimer&) = delete; + PerformanceTimer& operator=(PerformanceTimer&&) = delete; + +private: + cudaEvent_t event_start = nullptr; + cudaEvent_t event_end = nullptr; + + using time_point_t = std::chrono::high_resolution_clock::time_point; + time_point_t time_start_cpu; + time_point_t time_end_cpu; + + bool cpu_timer_started = false; + bool gpu_timer_started = false; + + float prev_elapsed_time_cpu_milliseconds = 0.f; + float prev_elapsed_time_gpu_milliseconds = 0.f; +}; diff --git a/src/preview.cpp b/src/preview.cpp index 3ca27180..d78f69b5 100644 --- a/src/preview.cpp +++ b/src/preview.cpp @@ -215,13 +215,14 @@ void drawGui(int windowWidth, int windowHeight) { ImGui::Checkbox("Denoise", &ui_denoise); ImGui::SliderInt("Filter Size", &ui_filterSize, 0, 100); - ImGui::SliderFloat("Color Weight", &ui_colorWeight, 0.0f, 10.0f); + ImGui::SliderFloat("Color Weight", &ui_colorWeight, 0.0f, 80.0f); ImGui::SliderFloat("Normal Weight", &ui_normalWeight, 0.0f, 10.0f); ImGui::SliderFloat("Position Weight", &ui_positionWeight, 0.0f, 10.0f); ImGui::Separator(); ImGui::Checkbox("Show GBuffer", &ui_showGbuffer); + ImGui::Combo("Buffer", &ui_currentBuffer, "Depth\0Normal\0Position\0"); ImGui::Separator(); diff --git a/src/sceneStructs.h b/src/sceneStructs.h index da7e558a..de38ec2c 100644 --- a/src/sceneStructs.h +++ b/src/sceneStructs.h @@ -79,4 +79,6 @@ struct ShadeableIntersection { // What information might be helpful for guiding a denoising filter? struct GBufferPixel { float t; + glm::vec3 position; + glm::vec3 normal; };
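
As a reading aid for the host loop in `denoise()` in `src/pathtrace.cu`, here is a minimal CPU-only sketch of the pass schedule that loop produces: the number of À-Trous passes derived from the filter size, the step size of each pass, and the color sigma handed to `denoiseWeighted`. The struct `AtrousPass` and the helper `atrousSchedule` are hypothetical names that do not exist in the diff, and the guard for filter sizes below 5 is an added assumption (the diff applies `floor(log2(filterSize / 5.f))` directly).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical record (not in the diff): parameters of one filter pass.
struct AtrousPass {
    int stepSize;      // spacing between the 5x5 kernel taps, in pixels
    float sigmaColor;  // color sigma passed to the weighted kernel for this pass
};

// Hypothetical helper (not in the diff): mirrors the host-side loop of denoise().
std::vector<AtrousPass> atrousSchedule(int filterSize, float sigmaColor) {
    assert(filterSize >= 0);

    // Same formula as denoise(): floor(log2(filterSize / 5)).
    // Assumption: sizes below 5 are treated as zero passes instead of
    // evaluating log2 on a value less than 1 (or on 0).
    int numIterations = 0;
    if (filterSize >= 5) {
        numIterations = static_cast<int>(std::floor(std::log2(filterSize / 5.0f)));
    }

    std::vector<AtrousPass> passes;
    int stepSize = 1;
    float sigma = sigmaColor;
    for (int i = 0; i < numIterations; ++i) {
        passes.push_back({stepSize, sigma});
        // Order matters: the step size doubles first, then the color sigma is
        // divided by the new step size, exactly as in the diff.
        stepSize *= 2;
        sigma /= static_cast<float>(stepSize);
    }
    return passes;
}
```

With the defaults from `src/main.cpp` (`ui_filterSize = 80`, `ui_colorWeight = 150.f`), this gives four passes with step sizes 1, 2, 4, 8 and color sigmas 150, 75, 18.75, 2.34375. Because the loop divides by the already-doubled step size, the color sigma shrinks by a factor of 2, then 4, then 8, rather than halving each pass. The sketch also suggests that a filter size below 10 yields zero passes, in which case `denoise()` appears to send `dev_denoised_image_out` to the PBO without any kernel having written to it during that call.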