diff --git a/README.md b/README.md
index f044c821..258fadcc 100644
--- a/README.md
+++ b/README.md
@@ -3,11 +3,207 @@ CUDA Denoiser For CUDA Path Tracer
 
 **University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 4**
 
-* (TODO) YOUR NAME HERE
-* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
+* Megan Reddy
+  * [LinkedIn](https://www.linkedin.com/in/meganr25a949125/), [personal website](https://meganr28.github.io/)
+* Tested on: Windows 10, AMD Ryzen 9 5900HS with Radeon Graphics @ 3301 MHz 16GB, NVIDIA GeForce RTX 3060 Laptop GPU 6GB (Personal Computer)
+* Compute Capability: 8.6
 
-### (TODO: Your README)
+### Overview
 
-*DO NOT* leave the README to the last minute! It is a crucial part of the
-project, and we will not be able to grade you without a good README.
+Denoising is a technique used to remove noise from path-traced images. In scenes where ray paths are unlikely to hit light sources, we perceive 
+a lot of noise. In real-time ray tracing, we often desire high quality images that can be rendered at interactive rates. In traditional path tracing,
+we must take hundreds of samples per pixel to achieve the high visual quality we desire. For real-time applications, it is impractical to wait such a long time
+for a frame to render, so we desire a technique that can achieve an acceptably "smooth" image in fewer iterations. This is where denoising comes in handy.
+
+| Iterations      |      Raw Pathtraced      |  Denoised |
+|:----------:       |:-------------:           |:------:|
+| 100             |  ![](img/cover_pathtraced_100.png) | ![](img/cover_denoised_100.PNG) |
+| 1000            |  ![](img/cover_pathtraced_1000.PNG)| ![](img/cover_denoised_1000.PNG) | 
+<p align="center"><em>Figure 1. Raw path-traced and denoised image comparison with 80 x 80 À-Trous filter. </em></p>
+
+#### Edge-Avoiding À-Trous Wavelet Transform
+
+This implementation uses the technique described in the paper [Edge-Avoiding À-Trous Wavelet Transform for fast Global Illumination Filtering](https://jo.dreggn.org/home/2010_atrous.pdf)
+to implement denoising. The basis of the technique is to take in a noisy path-traced image, as well as information from a G-buffer (normals and positions), to strategically perform a blur on the image to 
+remove noise. The À-Trous filter approximates a Gaussian blur while keeping the kernel size fixed. In our implementation, the kernel stays at 5x5, but we space its samples
+increasingly far apart each iteration instead of using a larger filter. The step size doubles each iteration. The illustration below demonstrates this idea.
+
+![](img/atrous_kernels.PNG)
+<p align="center"><em>Figure 2. Spacing of À-Trous kernel at iteration 0, 1, and 2 (left to right). </em></p>
+
+At the first path tracing iteration, we store information in our G-buffer. We store intersection depth, normal, and position in a `GBufferPixel` struct to read later.
+We use this information to detect edges in our filtering scheme. When we blur an image, we want to preserve edges between objects. If normals or positions differ significantly between pixels,
+we most likely have encountered an edge. A visualization of the normal, position, and depth buffers is provided below. 
+
+| Per-Pixel Normals (remapped to [0, 1]) | Per-Pixel Positions (abs value and scaled down) | Per-Pixel Depth |
+|---|---|---|
+|![](img/results/normals.PNG)|![](img/results/positions.PNG)|![](img/results/depth.PNG)|
+<p align="center"><em>Figure 3. G-buffer visualizations </em></p>
+
+Our next step is to implement a weighted filter that we use to gather an accumulated color for each pixel. Without any weighting, the result
+of denoising is simply a blur across the entire image since we have no edge detection scheme in place. We have provided a comparison to GIMP's Gaussian blur
+in the table below. We notice that the À-Trous filter and Gaussian filter produce similar results.
+
+| Raw Pathtraced (100 iterations) | Simple Blur 80x80 Filter | GIMP Gaussian Blur 80x80 Filter |
+|---|---|---|
+|![](img/results/basic_blur/no_blur_100samp.PNG)|![](img/results/basic_blur/basic_blur_100samp.PNG)|![](img/results/basic_blur/gimp_blur_100samp.PNG)|
+<p align="center"><em>Figure 4. A comparison of basic blur between the À-Trous filter and GIMP's Gaussian filter (100 iterations). Note that the À-Trous algorithm is an approximation of Gaussian blur.</em></p>
+
+Lastly, we implement edge-avoiding filtering using the weighting function described in the paper. We compute the edge-stopping function for the path-traced pixel color,
+pixel normal, and pixel position (see Equation 5 in the paper) and multiply these together to get a pixel weight. We attenuate the current color by the weight and kernel value and
+add this to the accumulated sum. We also keep a running sum of the weights. At the end, we set the denoised pixel color to `accumulated_color_sum / weight_sum`.
+
+| Raw Pathtraced (100 iterations) | Simple Blur 80x80 Filter | Edge-Avoiding 80x80 Filter |
+|---|---|---|
+|![](img/results/basic_blur/no_blur_100samp.PNG)|![](img/results/basic_blur/basic_blur_100samp.PNG)|![](img/results/basic_blur/edge_avoiding_100samp.PNG)|
+<p align="center"><em>Figure 5. Result of adding in weighting functions. We use a color weight of 25.0, normal weight of 0.35, and position weight of 0.2.</em></p>
+
+### GUI Controls
+
+* `iterations`     - number of path tracing iterations.
+* `denoise`         - check to show denoised image.
+* `filter size`     - size of À-Trous filter. Determines the number of filter passes.
+* `color weight`     - sigma value in color edge-stopping function.
+* `normal weight`     - sigma value in normal edge-stopping function.
+* `position weight`     - sigma value in position edge-stopping function.
+* `show gbuffer`     - show the g-buffer. Use dropdown to select which buffer you want to see.
+
+### Visual Analysis
+
+For the following results, we use a `color_weight` of 150.0, a `normal_weight` of 0.5, and 
+a `position_weight` of 0.4. The resolution of the image is fixed at 800 x 800. 
+
+#### Varying Filter Size
+
+To observe the effect of filter size on visual quality, we denoise the same scene using 100 path-tracing iterations 
+and varying filter sizes. The results suggest that visual quality increases as filter size increases. In the images below, we can still see noise
+at low filter sizes. As we increase the filter size, it becomes less noticeable. The visual improvement
+does not scale uniformly with filter size; we can see a large improvement in quality between sizes 10, 20, and 40,
+but the amount of quality we gain afterwards is much less noticeable. 
+
+| Filter Size  |      Result (100 iterations)     |  Filter Size | Result (100 iterations) |
+|:----------:    |:-------------:  |:------:       |:------:|
+| 10x10        |  ![](img/results/filter_size/filter10_2.PNG)   | 160x160      | ![](img/results/filter_size/filter160_2.PNG) |
+| 20x20        |  ![](img/results/filter_size/filter20_2.PNG)   | 320x320      | ![](img/results/filter_size/filter320_2.PNG) |
+| 40x40        |  ![](img/results/filter_size/filter40_2.PNG)   | 640x640      | ![](img/results/filter_size/filter640_2.PNG) |
+| 80x80        |  ![](img/results/filter_size/filter80_2.PNG)   | 1280x1280    | ![](img/results/filter_size/filter1280_2.PNG) |
+<p align="center"><em>Figure 6. Visual impact of varying filter size on an 800 x 800 image.</em></p>
+
+#### Different Material Types
+
+The following scenes were rendered with an 80 x 80 filter. We compare our denoising results to these ground-truth images
+of a diffuse and specular reflective sphere rendered at 10,000 iterations to judge visual quality. 
+
+| Iterations      |      Diffuse      |  Specular Reflective |
+|:----------:       |:-------------:           |:------:|
+| 10000            |  ![](img/results/materials/diffuse_pathtraced_10000samp.PNG) | ![](img/results/materials/specular_pathtraced_10000samp.PNG) |
+<p align="center"><em>Figure 7. Ground-truth reference images for diffuse and reflective scenes (10,000 iterations).</em></p>
+
+Denoising is very effective for scenes with diffuse materials, since there aren't many fine details to capture. We can see that
+the denoised result is close to the outcome that we would expect after running the program for many iterations.
+
+| Iterations      |      Raw Pathtraced      |  Denoised |
+|:----------:       |:-------------:           |:------:|
+| 100             |  ![](img/results/materials/diffuse_pathtraced_100samp.PNG) | ![](img/results/materials/diffuse_denoised_100samp.PNG) |
+| 1000            |  ![](img/results/materials/diffuse_pathtraced_1000samp.PNG)| ![](img/results/materials/diffuse_denoised_1000samp.PNG) | 
+<p align="center"><em>Figure 8. Effect of denoising a scene with a diffuse sphere.</em></p>
+
+Denoising is less effective for specular materials, especially at lower iteration counts. Since the image is less converged,
+many of the fine details are less apparent and therefore the denoised image blurs the reflection at the edges. At 1000 iterations, 
+the reflective detail is clearer.
+
+| Iterations      |      Raw Pathtraced      |  Denoised |
+|:----------:       |:-------------:           |:------:|
+| 100             |  ![](img/results/materials/specular_pathtraced_100samp.PNG) | ![](img/results/materials/specular_denoised_100samp.PNG) |
+| 1000            |  ![](img/results/materials/specular_pathtraced_1000samp.PNG)| ![](img/results/materials/specular_denoised_1000samp.PNG) |  
+<p align="center"><em>Figure 9. Effect of denoising a scene with a reflective sphere.</em></p>
+
+#### Different Scenes
+
+The scene with the larger ceiling light produces much better denoised results than the scene with the smaller light.
+Since the light is bigger in the first scene, rays are more likely to hit it, meaning that the image will converge faster.
+In the smaller light scene, rays are more likely to miss the light, which leads to a noisier image. Since the first image 
+produces less noise at lower iterations, the denoiser is able to produce an image much closer to the expected outcome quickly. 
+
+| Iterations | Raw Pathtraced |  Denoised |
+|:----------:|:-------------:|:------:|
+| 100             |  ![](img/results/materials/specular_pathtraced_100samp.PNG)   | ![](img/results/materials/specular_denoised_100samp.PNG) |
+| 1000            |  ![](img/results/materials/specular_pathtraced_1000samp.PNG)  | ![](img/results/materials/specular_denoised_1000samp.PNG) | 
+<p align="center"><em>Figure 10. Denoising a scene with a large area light.</em></p>
+
+| Iterations | Raw Pathtraced | Denoised |
+|:----------:|:-------------:|:------:|
+| 100             |  ![](img/results/cornell_pathtraced_100samp.PNG)   | ![](img/results/cornell_denoised_100samp.PNG) |
+| 1000            |  ![](img/results/cornell_pathtraced_1000samp.PNG)  | ![](img/results/cornell_denoised_1000samp.PNG) | 
+<p align="center"><em>Figure 11. Denoising a scene with a small area light.</em></p>
+
+### Performance Analysis
+
+We measure performance by timing the denoising kernel, which runs only once, at the end of path tracing. We use CUDA events (`cudaEvent_t`)
+to record the total execution time of the kernel. Note that we do not include the path-tracing time in the measurements, just the
+additional time spent denoising. Additionally, each number displayed on the graphs is the average of 10 runs of the denoising kernel.
+
+For the following measurements, we use a `color_weight` of 25.0, a `normal_weight` of 0.35, and 
+a `position_weight` of 0.2. The resolution of the image is 800 x 800 unless otherwise noted.
+
+#### How Much Time Denoising Adds to Renders
+
+Denoising time is independent of the total number of path-tracing iterations since the denoiser only runs once at the end.
+In the graph below, we can see that total denoising time did not vary much between iteration counts.
+
+![](img/results/graphs/denoising_iterations.png)
+<p align="center"><em>Figure 12. Impact of increasing iteration count on denoising kernel execution time.</em></p>
+
+#### Varying Filter Size
+
+If we increase filter size, we do see an increase in the total time spent denoising. This is because
+we must perform more passes over the image (i.e. increase the number of denoising iterations). The filter size
+directly determines the number of times we apply the À-Trous filter. If `filterSize = 5 * 2^(iterations)`, then
+we can calculate iterations from filter size using `iterations = floor(log2(filterSize / 5))`.
+
+![](img/results/graphs/denoising_filtersize.png)
+<p align="center"><em>Figure 13. Impact of increasing filter size on denoising kernel execution time.</em></p>
+
+#### Number of Iterations to "Acceptably Smooth" Result
+
+Using 10,000 iterations as the "perfectly smoothed" reference, we'll take 1000 iterations as the "acceptably smoothed" result.
+We observe that the difference image shows variation along the edges, but this is fine for our purposes since the largest areas of the scene
+match. We will observe how many iterations it will take us to get a comparable result using denoising. 
+
+| Raw Pathtraced (10000 iterations) |  "Acceptably Smooth" Pathtraced (1000 iterations) | Difference |
+|:----------:       |:-------------:           |:------:|
+| ![](img/results/acceptably_smooth/cornell_ceiling_light_10000samp.PNG) | ![](img/results/acceptably_smooth/cornell_ceiling_light_1000samp2.PNG) | ![](img/results/acceptably_smooth/cornell_ceiling_10000_1000_diff.PNG) |
+<p align="center"><em>Figure 14. Comparison of the ground-truth result to an acceptably-smoothed image.</em></p>
+
+With denoising, it takes about 100 iterations to achieve the "acceptably smooth" result. The difference image is about the same as the one above.
+This is about a 90% decrease in iterations.
+
+| "Acceptably Smooth" Pathtraced (1000 iterations) |  Denoised (100 iterations) | Difference |
+|:----------:       |:-------------:           |:------:|
+| ![](img/results/acceptably_smooth/cornell_ceiling_light_1000samp2.PNG) | ![](img/results/acceptably_smooth/cornell_ceiling_light_100samp_denoised.PNG) | ![](img/results/acceptably_smooth/cornell_ceiling_denoised_diff.PNG) |
+<p align="center"><em>Figure 15. Fewer iterations are necessary to achieve an acceptably-smoothed result using denoising.</em></p>
+
+#### Denoising at Different Resolutions
+
+Denoising time increases with increasing resolution. This is because we have to run our filter over more pixels, which 
+will directly increase runtime.
+
+![](img/results/graphs/denoising_resolution.png)
+<p align="center"><em>Figure 16. Impact of increasing image resolution on denoising kernel execution time.</em></p>
+
+### Bloopers
+
+**Foggy Cornell**
+
+![](img/results/bloopers/blooper1.PNG)
+
+**Ghost Lights**
+
+![](img/results/bloopers/blooper2.PNG)
+
+### References
+
+* Paper - [Edge-Avoiding À-Trous Wavelet Transform for fast Global Illumination Filtering](https://jo.dreggn.org/home/2010_atrous.pdf)
+* Convolution Filter - [The à trous algorithm](https://www.eso.org/sci/software/esomidas/doc/user/18NOV/volb/node317.html)
+* UPenn CIS 565 Course Notes
 
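The arithmetic described in the README's "Edge-Avoiding À-Trous Wavelet Transform" and "Varying Filter Size" sections can be sketched in host-side C++ as follows. This is a minimal illustration, not the project's kernel code: the function names (`atrousIterations`, `atrousStepSize`, `atrousTapOffset`, `edgeStoppingWeight`) are hypothetical, and the weight assumes the common form `exp(-d^2 / sigma^2)` clamped to 1, which may differ in detail from the weighting actually used.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: number of À-Trous passes for a given filter size,
// following the README's relation filterSize = 5 * 2^iterations.
// Equivalent to floor(log2(filterSize / 5)) for filterSize >= 5; returns 0
// for smaller sizes (an assumption, since the formula is negative there).
int atrousIterations(int filterSize) {
    int iterations = 0;
    for (int size = 5; size <= filterSize / 2; size *= 2) {
        ++iterations;
    }
    return iterations;
}

// Spacing between neighboring kernel taps at a given pass: 1, 2, 4, ...
int atrousStepSize(int iteration) {
    return 1 << iteration;
}

// Pixel offset of tap k (0..4) along one axis of the fixed 5x5 kernel.
// The kernel keeps 5 taps per axis; only their spacing grows.
int atrousTapOffset(int k, int iteration) {
    return (k - 2) * atrousStepSize(iteration);
}

// Assumed edge-stopping weight for one G-buffer channel (color, normal, or
// position): close to 1 when the two pixels agree, falling toward 0 as the
// squared distance between their values grows relative to sigma.
float edgeStoppingWeight(float squaredDistance, float sigma) {
    return std::min(std::exp(-squaredDistance / (sigma * sigma)), 1.0f);
}
```

Under these assumptions, an 80 x 80 filter corresponds to 4 passes, and the per-pixel weight would be the product of `edgeStoppingWeight` evaluated for the color, normal, and position differences with their respective sigma values from the GUI.
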
diff --git a/img/atrous_kernels.PNG b/img/atrous_kernels.PNG
new file mode 100644
index 00000000..f758bc46
Binary files /dev/null and b/img/atrous_kernels.PNG differ
diff --git a/img/cover_denoised_100.PNG b/img/cover_denoised_100.PNG
new file mode 100644
index 00000000..52ea76b4
Binary files /dev/null and b/img/cover_denoised_100.PNG differ
diff --git a/img/cover_denoised_1000.PNG b/img/cover_denoised_1000.PNG
new file mode 100644
index 00000000..db9d0549
Binary files /dev/null and b/img/cover_denoised_1000.PNG differ
diff --git a/img/cover_pathtraced_100.png b/img/cover_pathtraced_100.png
new file mode 100644
index 00000000..2a1caf06
Binary files /dev/null and b/img/cover_pathtraced_100.png differ
diff --git a/img/cover_pathtraced_1000.PNG b/img/cover_pathtraced_1000.PNG
new file mode 100644
index 00000000..b67e6dfe
Binary files /dev/null and b/img/cover_pathtraced_1000.PNG differ
diff --git a/img/results/acceptably_smooth/cornell_ceiling_10000_1000_diff.PNG b/img/results/acceptably_smooth/cornell_ceiling_10000_1000_diff.PNG
new file mode 100644
index 00000000..c48c7b61
Binary files /dev/null and b/img/results/acceptably_smooth/cornell_ceiling_10000_1000_diff.PNG differ
diff --git a/img/results/acceptably_smooth/cornell_ceiling_denoised_diff.PNG b/img/results/acceptably_smooth/cornell_ceiling_denoised_diff.PNG
new file mode 100644
index 00000000..26ce6b78
Binary files /dev/null and b/img/results/acceptably_smooth/cornell_ceiling_denoised_diff.PNG differ
diff --git a/img/results/acceptably_smooth/cornell_ceiling_light_10000samp.PNG b/img/results/acceptably_smooth/cornell_ceiling_light_10000samp.PNG
new file mode 100644
index 00000000..46750aad
Binary files /dev/null and b/img/results/acceptably_smooth/cornell_ceiling_light_10000samp.PNG differ
diff --git a/img/results/acceptably_smooth/cornell_ceiling_light_1000samp.PNG b/img/results/acceptably_smooth/cornell_ceiling_light_1000samp.PNG
new file mode 100644
index 00000000..87786945
Binary files /dev/null and b/img/results/acceptably_smooth/cornell_ceiling_light_1000samp.PNG differ
diff --git a/img/results/acceptably_smooth/cornell_ceiling_light_1000samp2.PNG b/img/results/acceptably_smooth/cornell_ceiling_light_1000samp2.PNG
new file mode 100644
index 00000000..da38dcd8
Binary files /dev/null and b/img/results/acceptably_smooth/cornell_ceiling_light_1000samp2.PNG differ
diff --git a/img/results/acceptably_smooth/cornell_ceiling_light_100samp_denoised.PNG b/img/results/acceptably_smooth/cornell_ceiling_light_100samp_denoised.PNG
new file mode 100644
index 00000000..c54139d8
Binary files /dev/null and b/img/results/acceptably_smooth/cornell_ceiling_light_100samp_denoised.PNG differ
diff --git a/img/results/base_cornell_ceiling_light_10iter.png b/img/results/base_cornell_ceiling_light_10iter.png
new file mode 100644
index 00000000..163c3199
Binary files /dev/null and b/img/results/base_cornell_ceiling_light_10iter.png differ
diff --git a/img/results/basic_blur/basic_blur_100samp.PNG b/img/results/basic_blur/basic_blur_100samp.PNG
new file mode 100644
index 00000000..c5ad7e90
Binary files /dev/null and b/img/results/basic_blur/basic_blur_100samp.PNG differ
diff --git a/img/results/basic_blur/cornell_10iter_100samp.PNG b/img/results/basic_blur/cornell_10iter_100samp.PNG
new file mode 100644
index 00000000..c2f6a002
Binary files /dev/null and b/img/results/basic_blur/cornell_10iter_100samp.PNG differ
diff --git a/img/results/basic_blur/cornell_1iter_100samp.PNG b/img/results/basic_blur/cornell_1iter_100samp.PNG
new file mode 100644
index 00000000..3f5859b4
Binary files /dev/null and b/img/results/basic_blur/cornell_1iter_100samp.PNG differ
diff --git a/img/results/basic_blur/cornell_2iter_100samp.PNG b/img/results/basic_blur/cornell_2iter_100samp.PNG
new file mode 100644
index 00000000..e4251e29
Binary files /dev/null and b/img/results/basic_blur/cornell_2iter_100samp.PNG differ
diff --git a/img/results/basic_blur/cornell_3iter_100samp.PNG b/img/results/basic_blur/cornell_3iter_100samp.PNG
new file mode 100644
index 00000000..b6322be7
Binary files /dev/null and b/img/results/basic_blur/cornell_3iter_100samp.PNG differ
diff --git a/img/results/basic_blur/cornell_4iter_100samp.PNG b/img/results/basic_blur/cornell_4iter_100samp.PNG
new file mode 100644
index 00000000..78734733
Binary files /dev/null and b/img/results/basic_blur/cornell_4iter_100samp.PNG differ
diff --git a/img/results/basic_blur/cornell_5iter_100samp.PNG b/img/results/basic_blur/cornell_5iter_100samp.PNG
new file mode 100644
index 00000000..6c8e69e1
Binary files /dev/null and b/img/results/basic_blur/cornell_5iter_100samp.PNG differ
diff --git a/img/results/basic_blur/cornell_6iter_100samp.PNG b/img/results/basic_blur/cornell_6iter_100samp.PNG
new file mode 100644
index 00000000..1185f837
Binary files /dev/null and b/img/results/basic_blur/cornell_6iter_100samp.PNG differ
diff --git a/img/results/basic_blur/cornell_7iter_100samp.PNG b/img/results/basic_blur/cornell_7iter_100samp.PNG
new file mode 100644
index 00000000..cbf949b4
Binary files /dev/null and b/img/results/basic_blur/cornell_7iter_100samp.PNG differ
diff --git a/img/results/basic_blur/cornell_8iter_100samp.PNG b/img/results/basic_blur/cornell_8iter_100samp.PNG
new file mode 100644
index 00000000..8e5015e1
Binary files /dev/null and b/img/results/basic_blur/cornell_8iter_100samp.PNG differ
diff --git a/img/results/basic_blur/cornell_9iter_100samp.PNG b/img/results/basic_blur/cornell_9iter_100samp.PNG
new file mode 100644
index 00000000..3032f58b
Binary files /dev/null and b/img/results/basic_blur/cornell_9iter_100samp.PNG differ
diff --git a/img/results/basic_blur/edge_avoiding_100samp.PNG b/img/results/basic_blur/edge_avoiding_100samp.PNG
new file mode 100644
index 00000000..c2bc7e73
Binary files /dev/null and b/img/results/basic_blur/edge_avoiding_100samp.PNG differ
diff --git a/img/results/basic_blur/gimp_blur_100samp.PNG b/img/results/basic_blur/gimp_blur_100samp.PNG
new file mode 100644
index 00000000..72b30948
Binary files /dev/null and b/img/results/basic_blur/gimp_blur_100samp.PNG differ
diff --git a/img/results/basic_blur/no_blur_100samp.PNG b/img/results/basic_blur/no_blur_100samp.PNG
new file mode 100644
index 00000000..dc621313
Binary files /dev/null and b/img/results/basic_blur/no_blur_100samp.PNG differ
diff --git a/img/results/bloopers/blooper1.PNG b/img/results/bloopers/blooper1.PNG
new file mode 100644
index 00000000..b9d4b933
Binary files /dev/null and b/img/results/bloopers/blooper1.PNG differ
diff --git a/img/results/bloopers/blooper2.PNG b/img/results/bloopers/blooper2.PNG
new file mode 100644
index 00000000..6c364e47
Binary files /dev/null and b/img/results/bloopers/blooper2.PNG differ
diff --git a/img/results/cornell_ceiling_light_depth_10samp.PNG b/img/results/cornell_ceiling_light_depth_10samp.PNG
new file mode 100644
index 00000000..28f4b08b
Binary files /dev/null and b/img/results/cornell_ceiling_light_depth_10samp.PNG differ
diff --git a/img/results/cornell_ceiling_light_normal_10samp.PNG b/img/results/cornell_ceiling_light_normal_10samp.PNG
new file mode 100644
index 00000000..a24bcf12
Binary files /dev/null and b/img/results/cornell_ceiling_light_normal_10samp.PNG differ
diff --git a/img/results/cornell_ceiling_light_position_10samp.PNG b/img/results/cornell_ceiling_light_position_10samp.PNG
new file mode 100644
index 00000000..3bfe34f0
Binary files /dev/null and b/img/results/cornell_ceiling_light_position_10samp.PNG differ
diff --git a/img/results/cornell_denoised_1000samp.PNG b/img/results/cornell_denoised_1000samp.PNG
new file mode 100644
index 00000000..27ee5f2a
Binary files /dev/null and b/img/results/cornell_denoised_1000samp.PNG differ
diff --git a/img/results/cornell_denoised_100samp.PNG b/img/results/cornell_denoised_100samp.PNG
new file mode 100644
index 00000000..d422c9d6
Binary files /dev/null and b/img/results/cornell_denoised_100samp.PNG differ
diff --git a/img/results/cornell_pathtraced_1000samp.PNG b/img/results/cornell_pathtraced_1000samp.PNG
new file mode 100644
index 00000000..e330cf1a
Binary files /dev/null and b/img/results/cornell_pathtraced_1000samp.PNG differ
diff --git a/img/results/cornell_pathtraced_100samp.PNG b/img/results/cornell_pathtraced_100samp.PNG
new file mode 100644
index 00000000..b65fa11e
Binary files /dev/null and b/img/results/cornell_pathtraced_100samp.PNG differ
diff --git a/img/results/depth.PNG b/img/results/depth.PNG
new file mode 100644
index 00000000..80cee3da
Binary files /dev/null and b/img/results/depth.PNG differ
diff --git a/img/results/filter_size/filter10.PNG b/img/results/filter_size/filter10.PNG
new file mode 100644
index 00000000..7845998d
Binary files /dev/null and b/img/results/filter_size/filter10.PNG differ
diff --git a/img/results/filter_size/filter10_2.PNG b/img/results/filter_size/filter10_2.PNG
new file mode 100644
index 00000000..e2a537c1
Binary files /dev/null and b/img/results/filter_size/filter10_2.PNG differ
diff --git a/img/results/filter_size/filter1280.PNG b/img/results/filter_size/filter1280.PNG
new file mode 100644
index 00000000..b8ffe49a
Binary files /dev/null and b/img/results/filter_size/filter1280.PNG differ
diff --git a/img/results/filter_size/filter1280_2.PNG b/img/results/filter_size/filter1280_2.PNG
new file mode 100644
index 00000000..a8c77a47
Binary files /dev/null and b/img/results/filter_size/filter1280_2.PNG differ
diff --git a/img/results/filter_size/filter160.PNG b/img/results/filter_size/filter160.PNG
new file mode 100644
index 00000000..158077e0
Binary files /dev/null and b/img/results/filter_size/filter160.PNG differ
diff --git a/img/results/filter_size/filter160_2.PNG b/img/results/filter_size/filter160_2.PNG
new file mode 100644
index 00000000..2e9e91b1
Binary files /dev/null and b/img/results/filter_size/filter160_2.PNG differ
diff --git a/img/results/filter_size/filter20.PNG b/img/results/filter_size/filter20.PNG
new file mode 100644
index 00000000..e873d526
Binary files /dev/null and b/img/results/filter_size/filter20.PNG differ
diff --git a/img/results/filter_size/filter20_2.PNG b/img/results/filter_size/filter20_2.PNG
new file mode 100644
index 00000000..51b97aba
Binary files /dev/null and b/img/results/filter_size/filter20_2.PNG differ
diff --git a/img/results/filter_size/filter320.PNG b/img/results/filter_size/filter320.PNG
new file mode 100644
index 00000000..aacaf84c
Binary files /dev/null and b/img/results/filter_size/filter320.PNG differ
diff --git a/img/results/filter_size/filter320_2.PNG b/img/results/filter_size/filter320_2.PNG
new file mode 100644
index 00000000..b5f2713c
Binary files /dev/null and b/img/results/filter_size/filter320_2.PNG differ
diff --git a/img/results/filter_size/filter40.PNG b/img/results/filter_size/filter40.PNG
new file mode 100644
index 00000000..4002e814
Binary files /dev/null and b/img/results/filter_size/filter40.PNG differ
diff --git a/img/results/filter_size/filter40_2.PNG b/img/results/filter_size/filter40_2.PNG
new file mode 100644
index 00000000..2154d3c8
Binary files /dev/null and b/img/results/filter_size/filter40_2.PNG differ
diff --git a/img/results/filter_size/filter640.PNG b/img/results/filter_size/filter640.PNG
new file mode 100644
index 00000000..53649a71
Binary files /dev/null and b/img/results/filter_size/filter640.PNG differ
diff --git a/img/results/filter_size/filter640_2.PNG b/img/results/filter_size/filter640_2.PNG
new file mode 100644
index 00000000..ed6ec329
Binary files /dev/null and b/img/results/filter_size/filter640_2.PNG differ
diff --git a/img/results/filter_size/filter80.PNG b/img/results/filter_size/filter80.PNG
new file mode 100644
index 00000000..62d91e73
Binary files /dev/null and b/img/results/filter_size/filter80.PNG differ
diff --git a/img/results/filter_size/filter80_2.PNG b/img/results/filter_size/filter80_2.PNG
new file mode 100644
index 00000000..94c69d2d
Binary files /dev/null and b/img/results/filter_size/filter80_2.PNG differ
diff --git a/img/results/graphs/denoising_filtersize.png b/img/results/graphs/denoising_filtersize.png
new file mode 100644
index 00000000..c8a03c30
Binary files /dev/null and b/img/results/graphs/denoising_filtersize.png differ
diff --git a/img/results/graphs/denoising_iterations.png b/img/results/graphs/denoising_iterations.png
new file mode 100644
index 00000000..d3b02a10
Binary files /dev/null and b/img/results/graphs/denoising_iterations.png differ
diff --git a/img/results/graphs/denoising_resolution.png b/img/results/graphs/denoising_resolution.png
new file mode 100644
index 00000000..24594f53
Binary files /dev/null and b/img/results/graphs/denoising_resolution.png differ
diff --git a/img/results/materials/diffuse_denoised_1000samp.PNG b/img/results/materials/diffuse_denoised_1000samp.PNG
new file mode 100644
index 00000000..65329499
Binary files /dev/null and b/img/results/materials/diffuse_denoised_1000samp.PNG differ
diff --git a/img/results/materials/diffuse_denoised_100samp.PNG b/img/results/materials/diffuse_denoised_100samp.PNG
new file mode 100644
index 00000000..49eec7ea
Binary files /dev/null and b/img/results/materials/diffuse_denoised_100samp.PNG differ
diff --git a/img/results/materials/diffuse_pathtraced_10000samp.PNG b/img/results/materials/diffuse_pathtraced_10000samp.PNG
new file mode 100644
index 00000000..bf0dca81
Binary files /dev/null and b/img/results/materials/diffuse_pathtraced_10000samp.PNG differ
diff --git a/img/results/materials/diffuse_pathtraced_1000samp.PNG b/img/results/materials/diffuse_pathtraced_1000samp.PNG
new file mode 100644
index 00000000..c0138cd9
Binary files /dev/null and b/img/results/materials/diffuse_pathtraced_1000samp.PNG differ
diff --git a/img/results/materials/diffuse_pathtraced_100samp.PNG b/img/results/materials/diffuse_pathtraced_100samp.PNG
new file mode 100644
index 00000000..5d5ef493
Binary files /dev/null and b/img/results/materials/diffuse_pathtraced_100samp.PNG differ
diff --git a/img/results/materials/specular_denoised_1000samp.PNG b/img/results/materials/specular_denoised_1000samp.PNG
new file mode 100644
index 00000000..53f106f8
Binary files /dev/null and b/img/results/materials/specular_denoised_1000samp.PNG differ
diff --git a/img/results/materials/specular_denoised_100samp.PNG b/img/results/materials/specular_denoised_100samp.PNG
new file mode 100644
index 00000000..8f354445
Binary files /dev/null and b/img/results/materials/specular_denoised_100samp.PNG differ
diff --git a/img/results/materials/specular_pathtraced_10000samp.PNG b/img/results/materials/specular_pathtraced_10000samp.PNG
new file mode 100644
index 00000000..ef4004fd
Binary files /dev/null and b/img/results/materials/specular_pathtraced_10000samp.PNG differ
diff --git a/img/results/materials/specular_pathtraced_1000samp.PNG b/img/results/materials/specular_pathtraced_1000samp.PNG
new file mode 100644
index 00000000..e0f1fd18
Binary files /dev/null and b/img/results/materials/specular_pathtraced_1000samp.PNG differ
diff --git a/img/results/materials/specular_pathtraced_100samp.PNG b/img/results/materials/specular_pathtraced_100samp.PNG
new file mode 100644
index 00000000..cf6a7f15
Binary files /dev/null and b/img/results/materials/specular_pathtraced_100samp.PNG differ
diff --git a/img/results/normals.PNG b/img/results/normals.PNG
new file mode 100644
index 00000000..a41dad03
Binary files /dev/null and b/img/results/normals.PNG differ
diff --git a/img/results/positions.PNG b/img/results/positions.PNG
new file mode 100644
index 00000000..0b6c6176
Binary files /dev/null and b/img/results/positions.PNG differ
diff --git a/scenes/cornell.txt b/scenes/cornell.txt
index 83ff8202..d6155ea2 100644
--- a/scenes/cornell.txt
+++ b/scenes/cornell.txt
@@ -52,7 +52,7 @@ EMITTANCE   0
 CAMERA
 RES         800 800
 FOVY        45
-ITERATIONS  5000
+ITERATIONS  1000
 DEPTH       8
 FILE        cornell
 EYE         0.0 5 10.5
diff --git a/scenes/cornell_ceiling_light.txt b/scenes/cornell_ceiling_light.txt
index 15af5f19..113033d5 100644
--- a/scenes/cornell_ceiling_light.txt
+++ b/scenes/cornell_ceiling_light.txt
@@ -52,7 +52,7 @@ EMITTANCE   0
 CAMERA
 RES         800 800
 FOVY        45
-ITERATIONS  10
+ITERATIONS  100
 DEPTH       8
 FILE        cornell
 EYE         0.0 5 10.5
diff --git a/scenes/custom_scene.txt b/scenes/custom_scene.txt
new file mode 100644
index 00000000..dc53e370
--- /dev/null
+++ b/scenes/custom_scene.txt
@@ -0,0 +1,188 @@
+// Emissive material (light)
+MATERIAL 0
+RGB         1 1 1
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   5
+
+// Diffuse white
+MATERIAL 1
+RGB         .98 .98 .98
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse red
+MATERIAL 2
+RGB         .85 .35 .55
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse green
+MATERIAL 3
+RGB         .35 .85 .35
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Specular white
+MATERIAL 4
+RGB         .98 .98 .98
+SPECEX      0
+SPECRGB     .98 .98 .98
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse aqua
+MATERIAL 5
+RGB         0.4 0.9 0.6
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse grey
+MATERIAL 6
+RGB         0.6 0.6 0.6
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Diffuse blue
+MATERIAL 7
+RGB         0.5 0.5 0.9
+SPECEX      0
+SPECRGB     0 0 0
+REFL        0
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Specular pink
+MATERIAL 8
+RGB         0.98 0.98 0.98
+SPECEX      0
+SPECRGB     0.9 0.6 0.9
+REFL        1
+REFR        0
+REFRIOR     0
+EMITTANCE   0
+
+// Camera
+CAMERA
+RES         1000 800
+FOVY        35
+ITERATIONS  1000
+DEPTH       8
+FILE        custom_scene
+EYE         0.0 5 10.5
+LOOKAT      0 5 0
+UP          0 1 0
+
+// Ceiling light middle
+OBJECT 0
+cube
+material 0
+TRANS       0 10 0
+ROTAT       0 0 0
+SCALE       3 .3 3
+
+// Ceiling light left
+OBJECT 1
+cube
+material 0
+TRANS       -5 10 0
+ROTAT       0 0 0
+SCALE       3 .3 3
+
+// Ceiling light right
+OBJECT 2
+cube
+material 0
+TRANS       5 10 0
+ROTAT       0 0 0
+SCALE       3 .3 3
+
+// Floor
+OBJECT 3
+cube
+material 1
+TRANS       0 0 0
+ROTAT       0 0 0
+SCALE       20 .01 40
+
+// Ceiling
+OBJECT 4
+cube
+material 2
+TRANS       0 10 0
+ROTAT       0 0 90
+SCALE       .01 20 40
+
+// Back wall
+OBJECT 5
+cube
+material 1
+TRANS       0 5 -5
+ROTAT       0 90 0
+SCALE       .01 10 20
+
+// Left wall
+OBJECT 6
+cube
+material 5
+TRANS       -10 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+// Right wall
+OBJECT 7
+cube
+material 5
+TRANS       10 5 0
+ROTAT       0 0 0
+SCALE       .01 10 10
+
+// Sphere
+OBJECT 8
+sphere
+material 1
+TRANS       2.0 1.25 -1
+ROTAT       0 0 0
+SCALE       3 3 3
+
+// Cube
+OBJECT 9
+cube
+material 1
+TRANS       -0.9 3 -3
+ROTAT       0 27.5 0
+SCALE       3 6 3
+
+// Sphere
+OBJECT 10
+sphere
+material 8
+TRANS       -2.75 1.25 0.1
+ROTAT       0 0 0
+SCALE       3 3 3
diff --git a/src/main.cpp b/src/main.cpp
index 4092ae4a..922cf748 100644
--- a/src/main.cpp
+++ b/src/main.cpp
@@ -23,12 +23,14 @@ int ui_iterations = 0;
 int startupIterations = 0;
 int lastLoopIterations = 0;
 bool ui_showGbuffer = false;
+int ui_currentBuffer = 0;
 bool ui_denoise = false;
 int ui_filterSize = 80;
-float ui_colorWeight = 0.45f;
-float ui_normalWeight = 0.35f;
-float ui_positionWeight = 0.2f;
+float ui_colorWeight = 150.f;
+float ui_normalWeight = 0.5f;
+float ui_positionWeight = 0.4f;
 bool ui_saveAndExit = false;
+int denoise_call_count = 0;
 
 static bool camchanged = true;
 static float dtheta = 0, dphi = 0;
@@ -144,6 +146,7 @@ void runCuda() {
         cameraPosition += cam.lookAt;
         cam.position = cameraPosition;
         camchanged = false;
+        //ui_denoise = true;
       }
 
     // Map OpenGL buffer object for writing from CUDA on a single GPU
@@ -166,9 +169,16 @@ void runCuda() {
     }
 
     if (ui_showGbuffer) {
-      showGBuffer(pbo_dptr);
-    } else {
-      showImage(pbo_dptr, iteration);
+      showGBuffer(pbo_dptr, ui_currentBuffer);
+    }
+    else if (ui_denoise) {
+        if (iteration == ui_iterations) {
+            denoise(pbo_dptr, iteration, ui_filterSize, ui_colorWeight, ui_normalWeight, ui_positionWeight, denoise_call_count);
+            denoise_call_count++;
+        }
+    } 
+    else {
+        showImage(pbo_dptr, iteration);
     }
 
     // unmap buffer object
diff --git a/src/main.h b/src/main.h
index 06d311a8..6bedbe47 100644
--- a/src/main.h
+++ b/src/main.h
@@ -35,6 +35,7 @@ extern int height;
 extern int ui_iterations;
 extern int startupIterations;
 extern bool ui_showGbuffer;
+extern int ui_currentBuffer;
 extern bool ui_denoise;
 extern int ui_filterSize;
 extern float ui_colorWeight;
diff --git a/src/pathtrace.cu b/src/pathtrace.cu
index 23e5f909..d1b7a9da 100644
--- a/src/pathtrace.cu
+++ b/src/pathtrace.cu
@@ -15,6 +15,7 @@
 #include "interactions.h"
 
 #define ERRORCHECK 1
+#define DENOISE_ITERATIONS 5
 
 #define FILENAME (strrchr(__FILE__, '/') ? strrchr(__FILE__, '/') + 1 : __FILE__)
 #define checkCUDAError(msg) checkCUDAErrorFn(msg, FILENAME, __LINE__)
@@ -38,12 +39,108 @@ void checkCUDAErrorFn(const char *msg, const char *file, int line) {
 #endif
 }
 
+PerformanceTimer& timer()
+{
+    static PerformanceTimer timer;
+    return timer;
+}
+
+template<typename T>
+void printElapsedTime(T time, std::string note = "")
+{
+    std::cout << "   elapsed time: " << time << "ms    " << note << std::endl;
+}
+
 __host__ __device__
 thrust::default_random_engine makeSeededRandomEngine(int iter, int index, int depth) {
     int h = utilhash((1 << 31) | (depth << 22) | iter) ^ utilhash(index);
     return thrust::default_random_engine(h);
 }
 
+__global__ void denoiseBasicBlur(glm::vec3* denoisedImage, glm::ivec2 resolution,
+    int iter, int stepsize, glm::vec3* image, GBufferPixel* dev_gBuffer,
+    float* dev_kernel, glm::ivec2* dev_offset) {
+    int x = (blockIdx.x * blockDim.x) + threadIdx.x;
+    int y = (blockIdx.y * blockDim.y) + threadIdx.y;
+
+    if (x < resolution.x && y < resolution.y) {
+        int index = x + (y * resolution.x);
+
+        glm::vec3 sum = glm::vec3(0.f, 0.f, 0.f);
+        for (int i = 0; i < 25; i++) {
+            glm::ivec2 offset = dev_offset[i] * stepsize;
+            glm::ivec2 uv = glm::ivec2(x, y) + offset;
+
+            // Clamp indices to image width and height
+            uv = glm::clamp(uv, glm::ivec2(0, 0), glm::ivec2(resolution.x - 1, resolution.y - 1));
+
+            // Apply kernel
+            glm::vec3 col = image[uv.x + resolution.x * uv.y];
+            sum += col * dev_kernel[i];
+        }
+
+        // Write the blurred color to the output image buffer
+        denoisedImage[index] = sum;
+    }
+}
+
+__global__ void denoiseWeighted(glm::vec3* denoisedImage, glm::ivec2 resolution,
+    int iter, int stepsize, float sigma_col, float sigma_norm, float sigma_pos,
+    glm::vec3* image, GBufferPixel* dev_gBuffer,
+    float* dev_kernel, glm::ivec2* dev_offset) {
+    int x = (blockIdx.x * blockDim.x) + threadIdx.x;
+    int y = (blockIdx.y * blockDim.y) + threadIdx.y;
+
+    if (x < resolution.x && y < resolution.y) {
+        int index = x + (y * resolution.x);
+
+        // Read G-Buffer values for current pixel
+        glm::vec3 col = image[index];
+        glm::vec3 norm = dev_gBuffer[index].normal;
+        glm::vec3 pos = dev_gBuffer[index].position;
+
+        // Sum to accumulate color
+        glm::vec3 sum = glm::vec3(0.f, 0.f, 0.f);
+        float sum_weights = 0.f;
+        
+        // Visit the 25 taps of the 5x5 kernel, spaced stepsize pixels apart
+
+        for (int i = 0; i < 25; i++) {
+            // Find pixel in image at desired offset (clamp to image boundaries)
+            glm::ivec2 offset = dev_offset[i] * stepsize;
+            glm::ivec2 uv = glm::ivec2(x, y) + offset;
+            uv = glm::clamp(uv, glm::ivec2(0, 0), glm::ivec2(resolution.x - 1, resolution.y - 1));
+
+            // Find difference in color between current and neighboring pixel
+            glm::vec3 col_n = image[uv.x + resolution.x * uv.y];
+            glm::vec3 col_diff = col - col_n;
+            float dist2 = glm::dot(col_diff, col_diff);
+            float col_w = glm::min(glm::exp(-(dist2) / (sigma_col * sigma_col)), 1.f);
+
+            // Find difference in normal (scaled down by the squared step size)
+            glm::vec3 norm_n = dev_gBuffer[uv.x + resolution.x * uv.y].normal;
+            glm::vec3 norm_diff = norm - norm_n;
+            dist2 = glm::max(glm::dot(norm_diff, norm_diff) / (float)(stepsize * stepsize), 0.f);
+            float norm_w = glm::min(glm::exp(-(dist2) / (sigma_norm * sigma_norm)), 1.f);
+
+            // Find difference in position between current and neighboring pixel
+            glm::vec3 pos_n = dev_gBuffer[uv.x + resolution.x * uv.y].position;
+            glm::vec3 pos_diff = pos - pos_n;
+            dist2 = glm::dot(pos_diff, pos_diff);
+            float pos_w = glm::min(glm::exp(-(dist2) / (sigma_pos * sigma_pos)), 1.f);
+
+            // Calculate weighting
+            float weight = col_w * norm_w * pos_w;
+            sum += col_n * weight * dev_kernel[i];
+            sum_weights += weight * dev_kernel[i];
+        }
+
+        // Write normalized color to the output buffer; keep the input color if all weights vanish
+        denoisedImage[index] = (sum_weights > 0.f) ? sum / sum_weights : col;
+    }
+
+}
+
 //Kernel that writes the image to the OpenGL PBO directly.
 __global__ void sendImageToPBO(uchar4* pbo, glm::ivec2 resolution,
         int iter, glm::vec3* image) {
@@ -67,39 +164,84 @@ __global__ void sendImageToPBO(uchar4* pbo, glm::ivec2 resolution,
     }
 }
 
-__global__ void gbufferToPBO(uchar4* pbo, glm::ivec2 resolution, GBufferPixel* gBuffer) {
+__global__ void gbufferToPBO(uchar4* pbo, int ui_currentBuffer, glm::ivec2 resolution, GBufferPixel* gBuffer) {
     int x = (blockIdx.x * blockDim.x) + threadIdx.x;
     int y = (blockIdx.y * blockDim.y) + threadIdx.y;
 
     if (x < resolution.x && y < resolution.y) {
         int index = x + (y * resolution.x);
         float timeToIntersect = gBuffer[index].t * 256.0;
-
-        pbo[index].w = 0;
-        pbo[index].x = timeToIntersect;
-        pbo[index].y = timeToIntersect;
-        pbo[index].z = timeToIntersect;
+        glm::vec3 position = 0.1f * abs(gBuffer[index].position);
+        glm::vec3 normal = 0.5f * (gBuffer[index].normal + glm::vec3(1.f, 1.f, 1.f));
+
+        if (ui_currentBuffer == 0) {
+            pbo[index].w = 0;
+            pbo[index].x = timeToIntersect;
+            pbo[index].y = timeToIntersect;
+            pbo[index].z = timeToIntersect;
+        }
+        else if (ui_currentBuffer == 1) {
+            pbo[index].w = 0;
+            pbo[index].x = normal.x * 255.0;
+            pbo[index].y = normal.y * 255.0;
+            pbo[index].z = normal.z * 255.0;
+        }
+        else {
+            pbo[index].w = 0;
+            pbo[index].x = glm::min(position.x * 255.f, 255.f);
+            pbo[index].y = glm::min(position.y * 255.f, 255.f);
+            pbo[index].z = glm::min(position.z * 255.f, 255.f);
+        }
     }
 }
 
 static Scene * hst_scene = NULL;
 static glm::vec3 * dev_image = NULL;
+static glm::vec3* dev_denoised_image_in = NULL;
+static glm::vec3* dev_denoised_image_out = NULL;
 static Geom * dev_geoms = NULL;
 static Material * dev_materials = NULL;
 static PathSegment * dev_paths = NULL;
 static ShadeableIntersection * dev_intersections = NULL;
 static GBufferPixel* dev_gBuffer = NULL;
+static float* dev_kernel = NULL;
+static glm::ivec2* dev_offsets = NULL;
 // TODO: static variables for device memory, any extra info you need, etc
 // ...
 
+glm::ivec2 offsets[25];
+
+float kernel[25] = { 1.f / 256.f, 1.f / 64.f, 3.f / 128.f, 1.f / 64.f, 1.f / 256.f,
+                     1.f / 64.f, 1.f / 16.f, 3.f / 32.f, 1.f / 16.f, 1.f / 64.f,
+                     3.f / 128.f, 3.f / 32.f, 9.f / 64.f, 3.f / 32.f, 3.f / 128.f,
+                     1.f / 64.f, 1.f / 16.f, 3.f / 32.f, 1.f / 16.f, 1.f / 64.f,
+                     1.f / 256.f, 1.f / 64.f, 3.f / 128.f, 1.f / 64.f, 1.f / 256.f };
+
 void pathtraceInit(Scene *scene) {
     hst_scene = scene;
     const Camera &cam = hst_scene->state.camera;
     const int pixelcount = cam.resolution.x * cam.resolution.y;
 
+    // Fill offset array
+    int count = 0;
+    for (int j = -2; j <= 2; ++j) {
+        for (int i = -2; i <= 2; ++i) {
+            offsets[count] = glm::ivec2(i, j);
+            //std::cout << "(" << count << "): " << "(" << offsets[count].x << ", " << offsets[count].y << ")" << std::endl;
+            ++count;
+        }
+    }
+
     cudaMalloc(&dev_image, pixelcount * sizeof(glm::vec3));
     cudaMemset(dev_image, 0, pixelcount * sizeof(glm::vec3));
 
+    // Ping-pong buffers that hold intermediate results between denoising passes
+    cudaMalloc(&dev_denoised_image_in, pixelcount * sizeof(glm::vec3));
+    cudaMemset(dev_denoised_image_in, 0, pixelcount * sizeof(glm::vec3));
+
+    cudaMalloc(&dev_denoised_image_out, pixelcount * sizeof(glm::vec3));
+    cudaMemset(dev_denoised_image_out, 0, pixelcount * sizeof(glm::vec3));
+
   	cudaMalloc(&dev_paths, pixelcount * sizeof(PathSegment));
 
   	cudaMalloc(&dev_geoms, scene->geoms.size() * sizeof(Geom));
@@ -114,6 +256,11 @@ void pathtraceInit(Scene *scene) {
     cudaMalloc(&dev_gBuffer, pixelcount * sizeof(GBufferPixel));
 
     // TODO: initialize any extra device memeory you need
+    cudaMalloc(&dev_kernel, 25 * sizeof(float));
+    cudaMemcpy(dev_kernel, kernel, 25 * sizeof(float), cudaMemcpyHostToDevice);
+
+    cudaMalloc(&dev_offsets, 25 * sizeof(glm::ivec2));
+    cudaMemcpy(dev_offsets, offsets, 25 * sizeof(glm::ivec2), cudaMemcpyHostToDevice);
 
     checkCUDAError("pathtraceInit");
 }
@@ -126,6 +273,10 @@ void pathtraceFree() {
   	cudaFree(dev_intersections);
     cudaFree(dev_gBuffer);
     // TODO: clean up any extra device memory you created
+    cudaFree(dev_denoised_image_in);
+    cudaFree(dev_denoised_image_out);
+    cudaFree(dev_kernel);
+    cudaFree(dev_offsets);
 
     checkCUDAError("pathtraceFree");
 }
@@ -148,7 +299,7 @@ __global__ void generateRayFromCamera(Camera cam, int iter, int traceDepth, Path
 		PathSegment & segment = pathSegments[index];
 
 		segment.ray.origin = cam.position;
-    segment.color = glm::vec3(1.0f, 1.0f, 1.0f);
+        segment.color = glm::vec3(1.0f, 1.0f, 1.0f);
 
 		segment.ray.direction = glm::normalize(cam.view
 			- cam.right * cam.pixelLength.x * ((float)x - (float)cam.resolution.x * 0.5f)
@@ -281,7 +432,9 @@ __global__ void generateGBuffer (
   int idx = blockIdx.x * blockDim.x + threadIdx.x;
   if (idx < num_paths)
   {
-    gBuffer[idx].t = shadeableIntersections[idx].t;
+    gBuffer[idx].t = shadeableIntersections[idx].t * 0.05f;
+    gBuffer[idx].position = getPointOnRay(pathSegments[idx].ray, shadeableIntersections[idx].t);
+    gBuffer[idx].normal = shadeableIntersections[idx].surfaceNormal;
   }
 }
 
@@ -356,46 +509,47 @@ void pathtrace(int frame, int iter) {
 	// --- PathSegment Tracing Stage ---
 	// Shoot ray into scene, bounce between objects, push shading chunks
 
-  // Empty gbuffer
-  cudaMemset(dev_gBuffer, 0, pixelcount * sizeof(GBufferPixel));
+    // Empty gbuffer
+    cudaMemset(dev_gBuffer, 0, pixelcount * sizeof(GBufferPixel));
 
 	// clean shading chunks
 	cudaMemset(dev_intersections, 0, pixelcount * sizeof(ShadeableIntersection));
 
-  bool iterationComplete = false;
+    bool iterationComplete = false;
 	while (!iterationComplete) {
 
-	// tracing
-	dim3 numblocksPathSegmentTracing = (num_paths + blockSize1d - 1) / blockSize1d;
-	computeIntersections <<<numblocksPathSegmentTracing, blockSize1d>>> (
-		depth
-		, num_paths
-		, dev_paths
-		, dev_geoms
-		, hst_scene->geoms.size()
-		, dev_intersections
-		);
-	checkCUDAError("trace one bounce");
-	cudaDeviceSynchronize();
-
-  if (depth == 0) {
-    generateGBuffer<<<numblocksPathSegmentTracing, blockSize1d>>>(num_paths, dev_intersections, dev_paths, dev_gBuffer);
-  }
-
-	depth++;
-
-  shadeSimpleMaterials<<<numblocksPathSegmentTracing, blockSize1d>>> (
-    iter,
-    num_paths,
-    dev_intersections,
-    dev_paths,
-    dev_materials
-  );
-  iterationComplete = depth == traceDepth;
+	    // tracing
+	    dim3 numblocksPathSegmentTracing = (num_paths + blockSize1d - 1) / blockSize1d;
+	    computeIntersections <<<numblocksPathSegmentTracing, blockSize1d>>> (
+		    depth
+		    , num_paths
+		    , dev_paths
+		    , dev_geoms
+		    , hst_scene->geoms.size()
+		    , dev_intersections
+		    );
+	    checkCUDAError("trace one bounce");
+	    cudaDeviceSynchronize();
+
+        // On the first bounce (depth 0), generate the G-buffer
+        if (depth == 0) {
+            generateGBuffer<<<numblocksPathSegmentTracing, blockSize1d>>>(num_paths, dev_intersections, dev_paths, dev_gBuffer);
+        }
+
+	    depth++;
+
+        shadeSimpleMaterials<<<numblocksPathSegmentTracing, blockSize1d>>> (
+            iter,
+            num_paths,
+            dev_intersections,
+            dev_paths,
+            dev_materials
+        );
+        iterationComplete = depth == traceDepth;
 	}
 
-  // Assemble this iteration and apply it to the image
-  dim3 numBlocksPixels = (pixelcount + blockSize1d - 1) / blockSize1d;
+    // Assemble this iteration and apply it to the image
+    dim3 numBlocksPixels = (pixelcount + blockSize1d - 1) / blockSize1d;
 	finalGather<<<numBlocksPixels, blockSize1d>>>(num_paths, dev_image, dev_paths);
 
     ///////////////////////////////////////////////////////////////////////////
@@ -410,7 +564,7 @@ void pathtrace(int frame, int iter) {
 }
 
 // CHECKITOUT: this kernel "post-processes" the gbuffer/gbuffers into something that you can visualize for debugging.
-void showGBuffer(uchar4* pbo) {
+void showGBuffer(uchar4* pbo, int ui_currentBuffer) {
     const Camera &cam = hst_scene->state.camera;
     const dim3 blockSize2d(8, 8);
     const dim3 blocksPerGrid2d(
@@ -418,7 +572,47 @@ void showGBuffer(uchar4* pbo) {
             (cam.resolution.y + blockSize2d.y - 1) / blockSize2d.y);
 
     // CHECKITOUT: process the gbuffer results and send them to OpenGL buffer for visualization
-    gbufferToPBO<<<blocksPerGrid2d, blockSize2d>>>(pbo, cam.resolution, dev_gBuffer);
+    gbufferToPBO<<<blocksPerGrid2d, blockSize2d>>>(pbo, ui_currentBuffer, cam.resolution, dev_gBuffer);
+}
+
+void denoise(uchar4* pbo, int iter, int filterSize, float sigma_col, float sigma_norm, float sigma_pos, int call_count) {
+    const Camera& cam = hst_scene->state.camera;
+    const dim3 blockSize2d(8, 8);
+    const dim3 blocksPerGrid2d(
+        (cam.resolution.x + blockSize2d.x - 1) / blockSize2d.x,
+        (cam.resolution.y + blockSize2d.y - 1) / blockSize2d.y);
+
+    timer().startGpuTimer();
+    // Copy initial image input data
+    int pixelcount = cam.resolution.x * cam.resolution.y;
+    cudaMemcpy(dev_denoised_image_in, dev_image, pixelcount * sizeof(glm::vec3), cudaMemcpyDeviceToDevice);
+
+    // Calculate iterations based on filter size (always at least one pass)
+    int num_iterations = glm::max(1, (int)glm::floor(glm::log2(glm::max(filterSize, 5) / 5.f)));
+
+    // Denoise image
+    int stepsize = 1;
+    float sigma_col_div = sigma_col;
+    for (int i = 0; i < num_iterations; ++i) {
+        denoiseWeighted<<<blocksPerGrid2d, blockSize2d>>>(dev_denoised_image_out, cam.resolution, iter, stepsize,
+            sigma_col_div, sigma_norm, sigma_pos,
+            dev_denoised_image_in, dev_gBuffer, dev_kernel, dev_offsets);
+
+        // Do not swap on the last iteration, so the final result stays in dev_denoised_image_out
+        if (i != num_iterations - 1) {
+            std::swap(dev_denoised_image_in, dev_denoised_image_out);
+        }
+        stepsize *= 2;
+        sigma_col_div /= (float)stepsize;
+    }
+
+    // Send results to OpenGL buffer for rendering
+    sendImageToPBO<<<blocksPerGrid2d, blockSize2d>>>(pbo, cam.resolution, iter, dev_denoised_image_out);
+    timer().endGpuTimer();
+
+    if (call_count < 10) {
+        printElapsedTime(timer().getGpuElapsedTimeForPreviousOperation(), "(denoise, CUDA measured)");
+    }
 }
 
 void showImage(uchar4* pbo, int iter) {
diff --git a/src/pathtrace.h b/src/pathtrace.h
index 9e12f440..2ac23c36 100644
--- a/src/pathtrace.h
+++ b/src/pathtrace.h
@@ -1,10 +1,105 @@
 #pragma once
 
+#include <chrono>
+#include <stdexcept>
 #include <vector>
 #include "scene.h"
 
 void pathtraceInit(Scene *scene);
 void pathtraceFree();
 void pathtrace(int frame, int iteration);
-void showGBuffer(uchar4 *pbo);
+void showGBuffer(uchar4 *pbo, int ui_currentBuffer);
+void denoise(uchar4* pbo, int iter, int filterSize, float sigma_col, float sigma_norm, float sigma_pos, int call_count);
 void showImage(uchar4 *pbo, int iter);
+
+/**
+* Measures elapsed CPU and GPU time in milliseconds.
+* Uncopyable and unmovable.
+*
+* Adapted from WindyDarian (https://github.com/WindyDarian)
+*/
+class PerformanceTimer
+{
+public:
+    PerformanceTimer()
+    {
+        cudaEventCreate(&event_start);
+        cudaEventCreate(&event_end);
+    }
+
+    ~PerformanceTimer()
+    {
+        cudaEventDestroy(event_start);
+        cudaEventDestroy(event_end);
+    }
+
+    void startCpuTimer()
+    {
+        if (cpu_timer_started) { throw std::runtime_error("CPU timer already started"); }
+        cpu_timer_started = true;
+
+        time_start_cpu = std::chrono::high_resolution_clock::now();
+    }
+
+    void endCpuTimer()
+    {
+        time_end_cpu = std::chrono::high_resolution_clock::now();
+
+        if (!cpu_timer_started) { throw std::runtime_error("CPU timer not started"); }
+
+        std::chrono::duration<double, std::milli> duro = time_end_cpu - time_start_cpu;
+        prev_elapsed_time_cpu_milliseconds =
+            static_cast<decltype(prev_elapsed_time_cpu_milliseconds)>(duro.count());
+
+        cpu_timer_started = false;
+    }
+
+    void startGpuTimer()
+    {
+        if (gpu_timer_started) { throw std::runtime_error("GPU timer already started"); }
+        gpu_timer_started = true;
+
+        cudaEventRecord(event_start);
+    }
+
+    void endGpuTimer()
+    {
+        cudaEventRecord(event_end);
+        cudaEventSynchronize(event_end);
+
+        if (!gpu_timer_started) { throw std::runtime_error("GPU timer not started"); }
+
+        cudaEventElapsedTime(&prev_elapsed_time_gpu_milliseconds, event_start, event_end);
+        gpu_timer_started = false;
+    }
+
+    float getCpuElapsedTimeForPreviousOperation() // noexcept (needs VS 2015 or later)
+    {
+        return prev_elapsed_time_cpu_milliseconds;
+    }
+
+    float getGpuElapsedTimeForPreviousOperation() //noexcept
+    {
+        return prev_elapsed_time_gpu_milliseconds;
+    }
+
+    // remove copy and move functions
+    PerformanceTimer(const PerformanceTimer&) = delete;
+    PerformanceTimer(PerformanceTimer&&) = delete;
+    PerformanceTimer& operator=(const PerformanceTimer&) = delete;
+    PerformanceTimer& operator=(PerformanceTimer&&) = delete;
+
+private:
+    cudaEvent_t event_start = nullptr;
+    cudaEvent_t event_end = nullptr;
+
+    using time_point_t = std::chrono::high_resolution_clock::time_point;
+    time_point_t time_start_cpu;
+    time_point_t time_end_cpu;
+
+    bool cpu_timer_started = false;
+    bool gpu_timer_started = false;
+
+    float prev_elapsed_time_cpu_milliseconds = 0.f;
+    float prev_elapsed_time_gpu_milliseconds = 0.f;
+};
diff --git a/src/preview.cpp b/src/preview.cpp
index 3ca27180..d78f69b5 100644
--- a/src/preview.cpp
+++ b/src/preview.cpp
@@ -215,13 +215,14 @@ void drawGui(int windowWidth, int windowHeight) {
     ImGui::Checkbox("Denoise", &ui_denoise);
 
     ImGui::SliderInt("Filter Size", &ui_filterSize, 0, 100);
-    ImGui::SliderFloat("Color Weight", &ui_colorWeight, 0.0f, 10.0f);
+    ImGui::SliderFloat("Color Weight", &ui_colorWeight, 0.0f, 80.0f);
     ImGui::SliderFloat("Normal Weight", &ui_normalWeight, 0.0f, 10.0f);
     ImGui::SliderFloat("Position Weight", &ui_positionWeight, 0.0f, 10.0f);
 
     ImGui::Separator();
 
     ImGui::Checkbox("Show GBuffer", &ui_showGbuffer);
+    ImGui::Combo("Buffer", &ui_currentBuffer, "Depth\0Normal\0Position\0");
 
     ImGui::Separator();
 
diff --git a/src/sceneStructs.h b/src/sceneStructs.h
index da7e558a..de38ec2c 100644
--- a/src/sceneStructs.h
+++ b/src/sceneStructs.h
@@ -79,4 +79,6 @@ struct ShadeableIntersection {
 // What information might be helpful for guiding a denoising filter?
 struct GBufferPixel {
   float t;
+  glm::vec3 position;
+  glm::vec3 normal;
 };