diff --git a/README.md b/README.md index 110697c..6797339 100644 --- a/README.md +++ b/README.md @@ -1,13 +1,120 @@ CUDA Path Tracer ================ +**I used two late days in this project.** + **University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 3** -* (TODO) YOUR NAME HERE -* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab) +* Yiyang Chen + * [LinkedIn](https://www.linkedin.com/in/yiyang-chen-6a7641210/), [personal website](https://cyy0915.github.io/) +* Tested on: Windows 10, i5-8700k @ 3.7GHz, GTX 1080, personal computer + +## My Work + +### BSDF Evaluation + + +![](img/3balls.png) + +*cornell.txt* + +See the image above (I modified the scene file `cornell.txt`). +* Ideal diffuse surfaces (the wall) +* Perfectly specular-reflective surfaces (the left ball) +* Imperfect specular-reflective surfaces (Phong shading) (the middle ball) + * After rendering the image, I found that because the Phong shading model is not physically based, I had to set its parameters carefully to make the ball look realistic. +* Perfectly refractive surfaces [PBRT 8.2] (the right ball) + +### Path Continuation/Termination +I terminate paths whose r, g, and b values are all smaller than `EPSILON` or that hit the light source, using `thrust::partition`. In the performance analysis I show that it removes a large share of the paths in each iteration. + +### Memory Contiguous +I used `thrust::sort_by_key` to sort path segments and intersections by material id after computing intersections. However, in the performance analysis I show that it may not improve performance. + +### Cache First Intersections +I made a toggleable option to cache the first-bounce intersections; it takes effect only when antialiasing and depth-of-field are not enabled. + +### Physically-based Depth-of-field [PBRT 6.2.3] +![](img/depth-of-field.png) + +*depth_of_field.txt. Enable depth-of-field* + +![](img/no-depth-of-field.png) + +*depth_of_field.txt.
No depth-of-field* + +See the images above. + +### Stochastic Sampled Antialiasing +![](img/antialiasing_detail.png) + +*Enable antialiasing* + +![](img/no_antialiasing_detail.png) + +*Disable antialiasing* + +See the comparison above. There is an obvious difference at the edge of the ball. + +When generating a ray from a pixel coordinate, I add uniform noise in `[-0.5, 0.5]` to the pixel coordinate's `x` and `y`. + +### Better Hemisphere Sampling Methods +Sampling a cosine-weighted random direction in a hemisphere can be done by first sampling uniformly in a disk and then computing `z` as `z = sqrt(1-x^2-y^2)`. I rewrote the functions in `interactions.h` and added 2 different methods to sample uniformly in a disk. (reference: http://www.josswhittle.com/concentric-disk-sampling/) + +However, the rendered images are almost the same, so I don't include a comparison here. + +### Re-startable Path Tracing +Press `Enter` to pause rendering, and press `Enter` again to resume. + +## Analysis + +### Core questions +* Stream compaction helps most after a few bounces. Print and plot the effects of stream compaction within a single iteration (i.e. the number of unterminated rays after each bounce) and evaluate the benefits you get from stream compaction. + +![](img/number_of_paths.png) + +![](img/stream_compaction.png) + +Stream compaction removes approximately 2/3 of the paths in an iteration. However, the `thrust::partition` call also takes some time, so the overall benefit in my scene (which is not complex) is not so obvious. + + +* Compare scenes which are open (like the given cornell box) and closed (i.e. no light can escape the scene). Again, compare the performance effects of stream compaction! Remember, stream compaction only affects rays which terminate, so what might you expect? + +![](img/number_of_paths_compare.png) + +![](img/render_time_compare.png) + +In a closed scene there are also terminated paths: those that hit the light source or whose RGB values are too small.
The graphs above show that more paths terminate in the open scene and that its render time is shorter. In addition, in the closed scene, rendering without compaction is even faster than with compaction. I think this is because stream compaction itself takes time, which is significant when the scene is simple. + +### Memory Contiguous +![](img/sort.png) + +The graph above compares render times in my `cornell.txt` scene. +As mentioned in the section above, sorting to make the memory contiguous doesn't improve performance. I think this is because, although shading is a bit faster, the sorting itself takes more time. Sorting may pay off in a much more complex scene. + +### Physically-based Depth-of-field & Stochastic Sampled Antialiasing +Both features jitter rays when generating them from the camera, so I discuss them together. +* Performance impact: I found that the rendering time is almost the same, because I only jitter rays in the function `generateRayFromCamera`, which is relatively simple. +* Compare to hypothetical CPU version: I think the implementation is almost the same on the GPU. On the CPU we generate rays in a loop, and on the GPU we just run that process in parallel. +* Optimization: The sampling method (on the aperture) might be changed. + +Overview write-up of the feature along with before/after images. +Performance impact of the feature +If you did something to accelerate the feature, what did you do and why? +Compare your GPU version of the feature to a HYPOTHETICAL CPU version (you don't have to implement it!)? Does it benefit or suffer from being implemented on the GPU? +How might this feature be optimized beyond your current implementation? -### (TODO: Your README) +### Refraction +* Performance impact: almost the same. Although I thought computing the Fresnel term might cost some time, it turned out that my `shadeFakeMaterial` function always takes about 10 microseconds, with or without a glass ball in my `cornell.txt`.
+* Compare to hypothetical CPU version: The computation happens inside the kernel and is independent of other rays, so I think the GPU version is no different from a CPU version (maybe only the `__host__ __device__` qualifiers differ). +* Optimization: Add imperfect refraction, and even other BSDFs such as a microfacet BSDF. -*DO NOT* leave the README to the last minute! It is a crucial part of the -project, and we will not be able to grade you without a good README. +### Better Hemisphere Sampling Methods +* Performance impact: These are just different sampling methods, so performance is almost the same. +* Compare to hypothetical CPU version: Sampling happens inside the kernel and is independent of other rays, so I think the GPU version is no different from a CPU version (maybe only the `__host__ __device__` qualifiers differ). +* Optimization: The uniform sampling in the unit square could be improved with grid sampling, stratified sampling, etc. +### Re-startable Path Tracing +* Performance impact: None +* Compare to hypothetical CPU version: None +* Optimization: Save the render state and data to files so that the program can be closed while a render is still incomplete and resume the work when it is reopened.
\ No newline at end of file diff --git a/img/3balls.png b/img/3balls.png new file mode 100644 index 0000000..79105bc Binary files /dev/null and b/img/3balls.png differ diff --git a/img/antialiasing.png b/img/antialiasing.png new file mode 100644 index 0000000..aca5eca Binary files /dev/null and b/img/antialiasing.png differ diff --git a/img/antialiasing_detail.png b/img/antialiasing_detail.png new file mode 100644 index 0000000..5a168e7 Binary files /dev/null and b/img/antialiasing_detail.png differ diff --git a/img/depth-of-field.png b/img/depth-of-field.png new file mode 100644 index 0000000..0e27560 Binary files /dev/null and b/img/depth-of-field.png differ diff --git a/img/no-depth-of-field.png b/img/no-depth-of-field.png new file mode 100644 index 0000000..f29e6e2 Binary files /dev/null and b/img/no-depth-of-field.png differ diff --git a/img/no_antialiasing_detail.png b/img/no_antialiasing_detail.png new file mode 100644 index 0000000..da6a210 Binary files /dev/null and b/img/no_antialiasing_detail.png differ diff --git a/img/number_of_paths.png b/img/number_of_paths.png new file mode 100644 index 0000000..1c193ae Binary files /dev/null and b/img/number_of_paths.png differ diff --git a/img/number_of_paths_close.png b/img/number_of_paths_close.png new file mode 100644 index 0000000..9c8bffb Binary files /dev/null and b/img/number_of_paths_close.png differ diff --git a/img/number_of_paths_compare.png b/img/number_of_paths_compare.png new file mode 100644 index 0000000..12198b3 Binary files /dev/null and b/img/number_of_paths_compare.png differ diff --git a/img/render_time_compare.png b/img/render_time_compare.png new file mode 100644 index 0000000..2f62288 Binary files /dev/null and b/img/render_time_compare.png differ diff --git a/img/sort.png b/img/sort.png new file mode 100644 index 0000000..505374d Binary files /dev/null and b/img/sort.png differ diff --git a/img/stream_compaction.png b/img/stream_compaction.png new file mode 100644 index 0000000..18dfe73 
Binary files /dev/null and b/img/stream_compaction.png differ diff --git a/scenes/close_cornell.txt b/scenes/close_cornell.txt new file mode 100644 index 0000000..6d2c49e --- /dev/null +++ b/scenes/close_cornell.txt @@ -0,0 +1,164 @@ +// Emissive material (light) +MATERIAL 0 +RGB 1 1 1 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 5 + +// Diffuse white +MATERIAL 1 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse red +MATERIAL 2 +RGB .85 .35 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse green +MATERIAL 3 +RGB .35 .85 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular white 1 +MATERIAL 4 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 1 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular white 2 +MATERIAL 5 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 0.5 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular white 3 +MATERIAL 6 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 1 +REFR 1 +REFRIOR 1.5 +EMITTANCE 0 +REFRRGB .98 .98 .98 + +// Camera +CAMERA +RES 800 800 +FOVY 45 +ITERATIONS 5000 +DEPTH 8 +FILE cornell +EYE 0.0 5 10.5 +LOOKAT 0 5 0 +UP 0 1 0 +LENS 0.025 +FOCAL 10 + + +// Ceiling light +OBJECT 0 +cube +material 0 +TRANS 0 10 0 +ROTAT 0 0 0 +SCALE 3 .3 3 + +// Floor +OBJECT 1 +cube +material 1 +TRANS 0 0 0 +ROTAT 0 0 0 +SCALE 10 .01 16 + +// Ceiling +OBJECT 2 +cube +material 1 +TRANS 0 10 0 +ROTAT 0 0 90 +SCALE .01 10 16 + +// Back wall +OBJECT 3 +cube +material 1 +TRANS 0 5 -5 +ROTAT 0 90 0 +SCALE .01 10 10 + +// Left wall +OBJECT 4 +cube +material 2 +TRANS -5 5 0 +ROTAT 0 0 0 +SCALE .01 10 16 + +// Right wall +OBJECT 5 +cube +material 3 +TRANS 5 5 0 +ROTAT 0 0 0 +SCALE .01 10 16 + +// Sphere 1 +OBJECT 6 +sphere +material 4 +TRANS -3 2 -1 +ROTAT 0 0 0 +SCALE 3 3 3 + +// Sphere 2 +OBJECT 7 +sphere +material 5 +TRANS 0 4 -1 +ROTAT 0 0 0 +SCALE 3 3 3 + +// Sphere 3 +OBJECT 8 +sphere +material 6 +TRANS 3 2 -1 +ROTAT 0 0 0 
+SCALE 3 3 3 + +// front wall +OBJECT 9 +cube +material 1 +TRANS 0 5 11 +ROTAT 0 -90 0 +SCALE .01 10 10 diff --git a/scenes/cornell.txt b/scenes/cornell.txt index 83ff820..41fa6f8 100644 --- a/scenes/cornell.txt +++ b/scenes/cornell.txt @@ -38,7 +38,7 @@ REFR 0 REFRIOR 0 EMITTANCE 0 -// Specular white +// Specular white 1 MATERIAL 4 RGB .98 .98 .98 SPECEX 0 @@ -48,6 +48,27 @@ REFR 0 REFRIOR 0 EMITTANCE 0 +// Specular white 2 +MATERIAL 5 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 0.5 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular white 3 +MATERIAL 6 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 1 +REFR 1 +REFRIOR 1.5 +EMITTANCE 0 +REFRRGB .98 .98 .98 + // Camera CAMERA RES 800 800 @@ -58,6 +79,8 @@ FILE cornell EYE 0.0 5 10.5 LOOKAT 0 5 0 UP 0 1 0 +LENS 0.025 +FOCAL 10 // Ceiling light @@ -108,10 +131,26 @@ TRANS 5 5 0 ROTAT 0 0 0 SCALE .01 10 10 -// Sphere +// Sphere 1 OBJECT 6 sphere material 4 -TRANS -1 4 -1 +TRANS -3 2 -1 +ROTAT 0 0 0 +SCALE 3 3 3 + +// Sphere 2 +OBJECT 7 +sphere +material 5 +TRANS 0 4 -1 ROTAT 0 0 0 SCALE 3 3 3 + +// Sphere 3 +OBJECT 8 +sphere +material 6 +TRANS 3 2 -1 +ROTAT 0 0 0 +SCALE 3 3 3 \ No newline at end of file diff --git a/scenes/depth_of_field.txt b/scenes/depth_of_field.txt new file mode 100644 index 0000000..a565211 --- /dev/null +++ b/scenes/depth_of_field.txt @@ -0,0 +1,148 @@ +// Emissive material (light) +MATERIAL 0 +RGB 1 1 1 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 5 + +// Diffuse white +MATERIAL 1 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse red +MATERIAL 2 +RGB .85 .35 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Diffuse green +MATERIAL 3 +RGB .35 .85 .35 +SPECEX 0 +SPECRGB 0 0 0 +REFL 0 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular white 1 +MATERIAL 4 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 1 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular white 2 +MATERIAL 5 +RGB .98 .98 .98 +SPECEX 0 
+SPECRGB .98 .98 .98 +REFL 0.5 +REFR 0 +REFRIOR 0 +EMITTANCE 0 + +// Specular white 3 +MATERIAL 6 +RGB .98 .98 .98 +SPECEX 0 +SPECRGB .98 .98 .98 +REFL 1 +REFR 1 +REFRIOR 1.5 +EMITTANCE 0 +REFRRGB .98 .98 .98 + +// Camera +CAMERA +RES 800 800 +FOVY 45 +ITERATIONS 5000 +DEPTH 8 +FILE cornell +EYE 0.0 5 10.5 +LOOKAT 0 5 0 +UP 0 1 0 +LENS 0.3 +FOCAL 7 + + +// Ceiling light +OBJECT 0 +cube +material 0 +TRANS 0 10 0 +ROTAT 0 0 0 +SCALE 3 .3 3 + +// Floor +OBJECT 1 +cube +material 1 +TRANS 0 0 0 +ROTAT 0 0 0 +SCALE 10 .01 10 + +// Ceiling +OBJECT 2 +cube +material 1 +TRANS 0 10 0 +ROTAT 0 0 90 +SCALE .01 10 10 + +// Back wall +OBJECT 3 +cube +material 1 +TRANS 0 5 -5 +ROTAT 0 90 0 +SCALE .01 10 10 + +// Left wall +OBJECT 4 +cube +material 2 +TRANS -5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Right wall +OBJECT 5 +cube +material 3 +TRANS 5 5 0 +ROTAT 0 0 0 +SCALE .01 10 10 + +// Sphere 2 +OBJECT 6 +sphere +material 2 +TRANS -2 7 -3 +ROTAT 0 0 0 +SCALE 3 3 3 + +// Sphere 1 +OBJECT 7 +sphere +material 3 +TRANS 2 2 3 +ROTAT 0 0 0 +SCALE 2 2 2 \ No newline at end of file diff --git a/src/interactions.h b/src/interactions.h index f969e45..b3eb739 100644 --- a/src/interactions.h +++ b/src/interactions.h @@ -2,31 +2,61 @@ #include "intersections.h" -// CHECKITOUT -/** - * Computes a cosine-weighted random direction in a hemisphere. - * Used for diffuse lighting. 
- */ __host__ __device__ -glm::vec3 calculateRandomDirectionInHemisphere( - glm::vec3 normal, thrust::default_random_engine &rng) { - thrust::uniform_real_distribution u01(0, 1); - - float up = sqrt(u01(rng)); // cos(theta) - float over = sqrt(1 - up * up); // sin(theta) - float around = u01(rng) * TWO_PI; +glm::vec3 squareToDiskUniform(const glm::vec2& sample) +{ + float phi, r, u, v; + r = sqrt(sample.x); + phi = 2 * PI * sample.y; + u = r * cos(phi); + v = r * sin(phi); + return glm::vec3(u, v, 0); +} - // Find a direction that is not the normal based off of whether or not the - // normal's components are all equal to sqrt(1/3) or whether or not at - // least one component is less than sqrt(1/3). Learned this trick from - // Peter Kutz. +__host__ __device__ +glm::vec3 squareToDiskConcentric(const glm::vec2& sample) +{ + float phi, r, u, v; + float a = 2 * sample.x - 1; + float b = 2 * sample.y - 1; + if (a > -b) { // region 1 or 2 + if (a > b) {// region 1, also |a| > |b| + r = a; + phi = (PI / 4) * (b / a); + } + else {// region 2, also |b| > |a| + r = b; + phi = (PI / 4) * (2 - (a / b)); + } + } + else {// region 3 or 4 + if (a < b) { // region 3, also |a| >= |b|, a != 0 + r = -a; + phi = (PI / 4) * (4 + (b / a)); + } + else {// region 4, |b| >= |a|, but a==0 and b==0 could occur. 
+ r = -b; + if (b != 0) + phi = (PI / 4) * (6 - (a / b)); + else + phi = 0; + } + } + u = r * cos(phi); + v = r * sin(phi); + return glm::vec3(u, v, 0); +} +__host__ __device__ +glm::vec3 localToWorldWithNormal(glm::vec3 pos, glm::vec3 normal) { glm::vec3 directionNotNormal; if (abs(normal.x) < SQRT_OF_ONE_THIRD) { directionNotNormal = glm::vec3(1, 0, 0); - } else if (abs(normal.y) < SQRT_OF_ONE_THIRD) { + } + else if (abs(normal.y) < SQRT_OF_ONE_THIRD) { directionNotNormal = glm::vec3(0, 1, 0); - } else { + } + else { directionNotNormal = glm::vec3(0, 0, 1); } @@ -36,9 +66,105 @@ glm::vec3 calculateRandomDirectionInHemisphere( glm::vec3 perpendicularDirection2 = glm::normalize(glm::cross(normal, perpendicularDirection1)); - return up * normal - + cos(around) * over * perpendicularDirection1 - + sin(around) * over * perpendicularDirection2; + return pos.x * perpendicularDirection1 + pos.y * perpendicularDirection2 + pos.z * normal; +} + +// CHECKITOUT +/** + * Computes a cosine-weighted random direction in a hemisphere. + * Used for diffuse lighting. + */ +__host__ __device__ +glm::vec3 calculateRandomDirectionInHemisphere( + glm::vec3 normal, thrust::default_random_engine &rng, float& pdf) { + thrust::uniform_real_distribution u01(0, 1); + + glm::vec3 pos = squareToDiskConcentric(glm::vec2(u01(rng), u01(rng))); + pos.z = sqrt(1 - pos.x * pos.x - pos.y * pos.y); + pdf = pos.z * INV_PI; + return localToWorldWithNormal(pos, normal); + + //float up = sqrt(u01(rng)); // cos(theta) + //float over = sqrt(1 - up * up); // sin(theta) + //float around = u01(rng) * TWO_PI; + + //// Find a direction that is not the normal based off of whether or not the + //// normal's components are all equal to sqrt(1/3) or whether or not at + //// least one component is less than sqrt(1/3). Learned this trick from + //// Peter Kutz. 
+ + //glm::vec3 directionNotNormal; + //if (abs(normal.x) < SQRT_OF_ONE_THIRD) { + // directionNotNormal = glm::vec3(1, 0, 0); + //} else if (abs(normal.y) < SQRT_OF_ONE_THIRD) { + // directionNotNormal = glm::vec3(0, 1, 0); + //} else { + // directionNotNormal = glm::vec3(0, 0, 1); + //} + + //// Use not-normal direction to generate two perpendicular directions + //glm::vec3 perpendicularDirection1 = + // glm::normalize(glm::cross(normal, directionNotNormal)); + //glm::vec3 perpendicularDirection2 = + // glm::normalize(glm::cross(normal, perpendicularDirection1)); + + //pdf = up * INV_PI; + //return up * normal + // + cos(around) * over * perpendicularDirection1 + // + sin(around) * over * perpendicularDirection2; +} + +__host__ __device__ +glm::vec3 calculateRandomDirectionInSpecularLobe( + glm::vec3 wiCenter, float specex, thrust::default_random_engine& rng, float& pdf) { + thrust::uniform_real_distribution u01(0, 1); + + float up = powf(u01(rng), 1.f / (specex + 1.f)); // cos(alpha) + float over = sqrt(1.f - up * up); // sin(alpha) + float around = u01(rng) * TWO_PI; + + pdf = (specex + 1) * powf(up, specex) * over / TWO_PI; + return localToWorldWithNormal(glm::vec3(cos(around) * over, sin(around) * over, up), wiCenter); + +} + +__host__ __device__ +float FrDielectric(float cosThetaI, float etaI, float etaT) +{ + cosThetaI = glm::clamp(cosThetaI, -1.f, 1.f); + if (cosThetaI <= 0.f) + { + float tmp = etaI; + etaI = etaT; + etaT = tmp; + cosThetaI = abs(cosThetaI); + } + + float sinThetaI = sqrt(glm::max((float)0, 1 - cosThetaI * cosThetaI)); + float sinThetaT = etaI / etaT * sinThetaI; + if (sinThetaT >= 1) { + return 1.f; + } + + float cosThetaT = sqrt(glm::max((float)0, 1 - sinThetaT * sinThetaT)); + float Rparl = ((etaT * cosThetaI) - (etaI * cosThetaT)) / ((etaT * cosThetaI) + (etaI * cosThetaT)); + float Rperp = ((etaI * cosThetaI) - (etaT * cosThetaT)) / ((etaI * cosThetaI) + (etaT * cosThetaT)); + return (Rparl * Rparl + Rperp * Rperp) / 2; +} + +__host__ 
__device__ +bool Refract(const glm::vec3& wi, const glm::vec3& n, float eta, + glm::vec3* wt) { + // Compute cos theta using Snell's law + float cosThetaI = glm::dot(n, wi); + float sin2ThetaI = glm::max(float(0), float(1 - cosThetaI * cosThetaI)); + float sin2ThetaT = eta * eta * sin2ThetaI; + + // Handle total internal reflection for transmission + if (sin2ThetaT >= 1) return false; + float cosThetaT = sqrt(1 - sin2ThetaT); + *wt = eta * -wi + (eta * cosThetaI - cosThetaT) * glm::vec3(n); + return true; } /** @@ -76,4 +202,98 @@ void scatterRay( // TODO: implement this. // A basic implementation of pure-diffuse shading will just call the // calculateRandomDirectionInHemisphere defined above. + /*if (pathSegment.remainingBounces < 0) { + int a = pathSegment.remainingBounces; + return; + } */ + //todo + + + glm::vec3 scatterDir; + float pdf = 0.f; + glm::vec3 color(0.f); + + if (!m.hasReflective && !m.hasRefractive) { //pure diffuse + scatterDir = calculateRandomDirectionInHemisphere(normal, rng, pdf); + float cosine = glm::dot(normal, scatterDir); + color = glm::max(cosine, 0.f) * m.color * INV_PI; + } + else if (m.hasReflective > 0 && m.hasReflective < 1){ //imperfect reflection + thrust::uniform_real_distribution u01(0, 1); + float randNum = u01(rng); + float frac = m.hasReflective; + + if (randNum < frac) { + scatterDir = calculateRandomDirectionInHemisphere(normal, rng, pdf); + float cosine = glm::dot(normal, scatterDir); + color = glm::max(cosine, 0.f) * m.color * INV_PI * frac; + pdf *= frac; + } + else { + glm::vec3 wiCenter = glm::reflect(pathSegment.ray.direction, normal); + scatterDir = calculateRandomDirectionInSpecularLobe(wiCenter, m.specular.exponent, rng, pdf); + if (glm::dot(normal, scatterDir) <= 0) { + pdf = 0; + } + else { + float cosRI = glm::dot(scatterDir, wiCenter); + color = m.specular.color * powf(cosRI, m.specular.exponent) * INV_PI * (1-frac); + pdf *= (1 - frac); + } + } + } + else if (m.hasReflective == 1 && !m.hasRefractive) { 
//perfect reflection + scatterDir = glm::reflect(pathSegment.ray.direction, normal); + pdf = 1; + color = m.specular.color; + } + else if (m.hasReflective == 1 && m.hasRefractive == 1) { //reflection and refraction, like glass + thrust::uniform_real_distribution u01(0, 1); + float randNum = u01(rng); + + if (randNum < 0.5) { + scatterDir = glm::reflect(pathSegment.ray.direction, normal); + pdf = 0.5; + float cosine = glm::dot(scatterDir, normal); + float fresnel = FrDielectric(cosine, 1, m.indexOfRefraction); + color = fresnel * m.specular.color; + } + else { + float eta = m.indexOfRefraction; + glm::vec3 trueNormal = normal; + if (glm::dot(pathSegment.ray.direction, normal) > 0) { + eta = 1.f / eta; + trueNormal = -normal; + } + + glm::vec3 refractDir; + bool fullReflect = !Refract(-pathSegment.ray.direction, trueNormal, 1.f / eta, &refractDir); + if (fullReflect) { + scatterDir = glm::reflect(pathSegment.ray.direction, trueNormal); + pdf = 0.5; + float cosine = glm::dot(scatterDir, trueNormal); + float fresnel = FrDielectric(cosine, 1, eta); + color = fresnel * m.specular.color; + } + else { + scatterDir = refractDir; + pdf = 0.5; + float cosine = glm::dot(scatterDir, trueNormal); + float fresnel = (1 - FrDielectric(cosine, 1, eta)); + color = fresnel * m.refractionColor; + } + } + } + + if (pdf < 0.01f) { + pathSegment.color = glm::vec3(0.f); + } + else { + pathSegment.color *= color / pdf; + } + pathSegment.ray.direction = scatterDir; + pathSegment.ray.origin = intersect + scatterDir * 0.01f; + pathSegment.remainingBounces--; + + } diff --git a/src/intersections.h b/src/intersections.h index b150407..72f4c78 100644 --- a/src/intersections.h +++ b/src/intersections.h @@ -25,7 +25,7 @@ __host__ __device__ inline unsigned int utilhash(unsigned int a) { * Falls slightly short so that it doesn't intersect the object it's hitting. 
*/ __host__ __device__ glm::vec3 getPointOnRay(Ray r, float t) { - return r.origin + (t - .0001f) * glm::normalize(r.direction); + return r.origin + t * glm::normalize(r.direction); } /** @@ -136,9 +136,9 @@ __host__ __device__ float sphereIntersectionTest(Geom sphere, Ray r, intersectionPoint = multiplyMV(sphere.transform, glm::vec4(objspaceIntersection, 1.f)); normal = glm::normalize(multiplyMV(sphere.invTranspose, glm::vec4(objspaceIntersection, 0.f))); - if (!outside) { + /*if (!outside) { normal = -normal; - } + }*/ return glm::length(r.origin - intersectionPoint); } diff --git a/src/main.cpp b/src/main.cpp index 96127b6..a65fb89 100644 --- a/src/main.cpp +++ b/src/main.cpp @@ -15,6 +15,8 @@ static bool camchanged = true; static float dtheta = 0, dphi = 0; static glm::vec3 cammove; +static bool stoped = false; + float zoom, theta, phi; glm::vec3 cameraPosition; glm::vec3 ogLookAt; // for recentering the camera @@ -107,6 +109,10 @@ void saveImage() { } void runCuda() { + if (stoped) { + return; + } + if (camchanged) { iteration = 0; Camera& cam = renderState->camera; @@ -165,6 +171,9 @@ void keyCallback(GLFWwindow* window, int key, int scancode, int action, int mods case GLFW_KEY_S: saveImage(); break; + case GLFW_KEY_ENTER: + stoped = !stoped; + break; case GLFW_KEY_SPACE: camchanged = true; renderState = &scene->state; diff --git a/src/pathtrace.cu b/src/pathtrace.cu index fd2a464..2cfdb24 100644 --- a/src/pathtrace.cu +++ b/src/pathtrace.cu @@ -4,6 +4,9 @@ #include #include #include +#include +#include +#include #include "sceneStructs.h" #include "scene.h" @@ -14,6 +17,14 @@ #include "intersections.h" #include "interactions.h" +//option +#define CACHE_FIRST_INTERSECTION 1 +#define MATERIAL_CONTIGUOUS 0 +#define ANTIALIASING 1 +#define DEPTH_OF_FIELD 0 + +#define ENABLE_CACHE_FIRST_INTERSECTION (CACHE_FIRST_INTERSECTION && !ANTIALIASING && !DEPTH_OF_FIELD) + #define ERRORCHECK 1 #define FILENAME (strrchr(__FILE__, '/') ? 
strrchr(__FILE__, '/') + 1 : __FILE__) @@ -44,6 +55,17 @@ thrust::default_random_engine makeSeededRandomEngine(int iter, int index, int de return thrust::default_random_engine(h); } +//for stream compaction +struct shouldContinue +{ + __host__ __device__ + bool operator()(const PathSegment x) + { + bool stop = x.remainingBounces <= 0 || (x.color.r < EPSILON&& x.color.b < EPSILON&& x.color.g < EPSILON); + return !stop; + } +}; + //Kernel that writes the image to the OpenGL PBO directly. __global__ void sendImageToPBO(uchar4* pbo, glm::ivec2 resolution, int iter, glm::vec3* image) { @@ -76,6 +98,10 @@ static PathSegment* dev_paths = NULL; static ShadeableIntersection* dev_intersections = NULL; // TODO: static variables for device memory, any extra info you need, etc // ... +#if ENABLE_CACHE_FIRST_INTERSECTION +static ShadeableIntersection* dev_cacheIntersections = NULL; +#endif // ENABLE_CACHE_FIRST_INTERSECTION + void InitDataContainer(GuiDataContainer* imGuiData) { @@ -103,6 +129,11 @@ void pathtraceInit(Scene* scene) { cudaMemset(dev_intersections, 0, pixelcount * sizeof(ShadeableIntersection)); // TODO: initialize any extra device memeory you need +#if ENABLE_CACHE_FIRST_INTERSECTION + cudaMalloc(&dev_cacheIntersections, pixelcount * sizeof(ShadeableIntersection)); + cudaMemset(dev_cacheIntersections, 0, pixelcount * sizeof(ShadeableIntersection)); +#endif // ENABLE_CACHE_FIRST_INTERSECTION + checkCUDAError("pathtraceInit"); } @@ -114,6 +145,10 @@ void pathtraceFree() { cudaFree(dev_materials); cudaFree(dev_intersections); // TODO: clean up any extra device memory you created +#if ENABLE_CACHE_FIRST_INTERSECTION + cudaFree(dev_cacheIntersections); +#endif // ENABLE_CACHE_FIRST_INTERSECTION + checkCUDAError("pathtraceFree"); } @@ -130,20 +165,44 @@ __global__ void generateRayFromCamera(Camera cam, int iter, int traceDepth, Path { int x = (blockIdx.x * blockDim.x) + threadIdx.x; int y = (blockIdx.y * blockDim.y) + threadIdx.y; + if (x < cam.resolution.x && y < 
cam.resolution.y) { int index = x + (y * cam.resolution.x); PathSegment& segment = pathSegments[index]; + thrust::default_random_engine rng = makeSeededRandomEngine(iter, cam.resolution.y * cam.resolution.x - index, 0); + thrust::uniform_real_distribution u01(0, 1); + segment.ray.origin = cam.position; segment.color = glm::vec3(1.0f, 1.0f, 1.0f); + float pixelx = x, pixely = y; + glm::mat4 cameraToWorld(glm::vec4(cam.right, 0), glm::vec4(cam.up, 0), glm::vec4(cam.view, 0), glm::vec4(cam.position, 1)); - // TODO: implement antialiasing by jittering the ray - segment.ray.direction = glm::normalize(cam.view - - cam.right * cam.pixelLength.x * ((float)x - (float)cam.resolution.x * 0.5f) - - cam.up * cam.pixelLength.y * ((float)y - (float)cam.resolution.y * 0.5f) - ); +#if ANTIALIASING + pixelx = x + u01(rng) - 0.5; + pixely = y + u01(rng) - 0.5; +#endif +#if DEPTH_OF_FIELD + float phi, r, u, v; + r = sqrt(u01(rng)); + phi = TWO_PI * u01(rng); + u = r * cos(phi); + v = r * sin(phi); + glm::vec3 pLens = cam.lensRadius * glm::vec3(u, v, 0); + glm::vec3 pPixel = glm::vec3(-cam.pixelLength.x * (pixelx - (float)cam.resolution.x * 0.5f), -cam.pixelLength.y * (pixely - (float)cam.resolution.y * 0.5f), 1); + glm::vec3 pFocus = cam.focalDistance * pPixel; + segment.ray.origin = glm::vec3(cameraToWorld * glm::vec4(pLens, 1)); + segment.ray.direction = glm::normalize(glm::mat3(cameraToWorld) * (pFocus - pLens)); +#else + glm::vec3 pPixel = glm::vec3(-cam.pixelLength.x * (pixelx - (float)cam.resolution.x * 0.5f), -cam.pixelLength.y * (pixely - (float)cam.resolution.y * 0.5f), 1); + segment.ray.direction = glm::mat3(cameraToWorld) * pPixel; + /*segment.ray.direction = glm::normalize(cam.view + - cam.right * cam.pixelLength.x * (pixelx - (float)cam.resolution.x * 0.5f) + - cam.up * cam.pixelLength.y * (pixely - (float)cam.resolution.y * 0.5f) + );*/ +#endif segment.pixelIndex = index; segment.remainingBounces = traceDepth; } @@ -253,14 +312,17 @@ __global__ void shadeFakeMaterial( // 
If the material indicates that the object was a light, "light" the ray if (material.emittance > 0.0f) { pathSegments[idx].color *= (materialColor * material.emittance); + pathSegments[idx].remainingBounces = 0; //stop when hit light source } // Otherwise, do some pseudo-lighting computation. This is actually more // like what you would expect from shading in a rasterizer like OpenGL. // TODO: replace this! you should be able to start with basically a one-liner else { - float lightTerm = glm::dot(intersection.surfaceNormal, glm::vec3(0.0f, 1.0f, 0.0f)); - pathSegments[idx].color *= (materialColor * lightTerm) * 0.3f + ((1.0f - intersection.t * 0.02f) * materialColor) * 0.7f; - pathSegments[idx].color *= u01(rng); // apply some noise because why not + glm::vec3 isectPoint = getPointOnRay(pathSegments[idx].ray, intersection.t); + scatterRay(pathSegments[idx], isectPoint, intersection.surfaceNormal, material, rng); + //float lightTerm = glm::dot(intersection.surfaceNormal, glm::vec3(0.0f, 1.0f, 0.0f)); + //pathSegments[idx].color *= (materialColor * lightTerm) * 0.3f + ((1.0f - intersection.t * 0.02f) * materialColor) * 0.7f; + //pathSegments[idx].color *= u01(rng); // apply some noise because why not } // If there was no intersection, color the ray black. 
         // Lots of renderers use 4 channel color, RGBA, where A = alpha, often
@@ -281,6 +343,9 @@ __global__ void finalGather(int nPaths, glm::vec3* image, PathSegment* iteration
     if (index < nPaths)
     {
         PathSegment iterationPath = iterationPaths[index];
+        for (int i = 0; i < 3; i++) {
+            iterationPath.color[i] = glm::max(iterationPath.color[i], 0.f);
+        }
         image[iterationPath.pixelIndex] += iterationPath.color;
     }
 }
@@ -350,8 +415,32 @@ void pathtrace(uchar4* pbo, int frame, int iter) {
         // clean shading chunks
         cudaMemset(dev_intersections, 0, pixelcount * sizeof(ShadeableIntersection));
 
-        // tracing
         dim3 numblocksPathSegmentTracing = (num_paths + blockSize1d - 1) / blockSize1d;
+
+#if ENABLE_CACHE_FIRST_INTERSECTION
+        if (depth == 0 && iter != 1) {
+            cudaMemcpy(dev_intersections, dev_cacheIntersections, sizeof(ShadeableIntersection) * num_paths, cudaMemcpyDeviceToDevice);
+            checkCUDAError("loadIntersections");
+        }
+        else {
+            computeIntersections <<<numblocksPathSegmentTracing, blockSize1d>>> (
+                depth
+                , num_paths
+                , dev_paths
+                , dev_geoms
+                , hst_scene->geoms.size()
+                , dev_intersections
+                );
+            checkCUDAError("trace one bounce");
+
+            if (depth == 0 && iter == 1) {
+                cudaMemcpy(dev_cacheIntersections, dev_intersections, sizeof(ShadeableIntersection) * num_paths, cudaMemcpyDeviceToDevice);
+                checkCUDAError("cacheIntersections");
+            }
+        }
+
+#else
+        // tracing
         computeIntersections <<<numblocksPathSegmentTracing, blockSize1d>>> (
             depth
             , num_paths
@@ -361,9 +450,19 @@ void pathtrace(uchar4* pbo, int frame, int iter) {
             , dev_intersections
             );
         checkCUDAError("trace one bounce");
+#endif // CACHE_FIRST_INTERSECTION
+
         cudaDeviceSynchronize();
         depth++;
 
+        //thrust ptr
+        thrust::device_ptr<PathSegment> thrust_dev_paths(dev_paths);
+        thrust::device_ptr<ShadeableIntersection> thrust_dev_intersection(dev_intersections);
+
+#if MATERIAL_CONTIGUOUS
+        thrust::sort_by_key(thrust_dev_intersection, thrust_dev_intersection + num_paths, thrust_dev_paths);
+#endif // MATERIAL_CONTIGUOUS
+
         // TODO:
         // --- Shading Stage ---
         // Shade path segments based on intersections and generate new rays by
@@ -380,7 +479,11 @@ void pathtrace(uchar4* pbo, int frame, int iter) {
             dev_paths,
             dev_materials
             );
-        iterationComplete = true; // TODO: should be based off stream compaction results.
+
+        thrust::device_ptr<PathSegment> thrust_dev_paths_end = thrust::partition(thrust_dev_paths, thrust_dev_paths + num_paths, shouldContinue());
+        dev_path_end = thrust_dev_paths_end.get();
+        num_paths = dev_path_end - dev_paths;
+        iterationComplete = depth >= traceDepth || num_paths == 0; // TODO: should be based off stream compaction results.
 
         if (guiData != NULL)
         {
@@ -390,7 +493,7 @@ void pathtrace(uchar4* pbo, int frame, int iter) {
 
     // Assemble this iteration and apply it to the image
     dim3 numBlocksPixels = (pixelcount + blockSize1d - 1) / blockSize1d;
-    finalGather<<<numBlocksPixels, blockSize1d>>>(num_paths, dev_image, dev_paths);
+    finalGather<<<numBlocksPixels, blockSize1d>>>(pixelcount, dev_image, dev_paths);
 
     ///////////////////////////////////////////////////////////////////////////
 
diff --git a/src/scene.cpp b/src/scene.cpp
index 3fb6239..24b636b 100644
--- a/src/scene.cpp
+++ b/src/scene.cpp
@@ -124,6 +124,10 @@ int Scene::loadCamera() {
             camera.lookAt = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
         } else if (strcmp(tokens[0].c_str(), "UP") == 0) {
             camera.up = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
+        } else if (strcmp(tokens[0].c_str(), "LENS") == 0) {
+            camera.lensRadius = atof(tokens[1].c_str());
+        } else if (strcmp(tokens[0].c_str(), "FOCAL") == 0) {
+            camera.focalDistance = atof(tokens[1].c_str());
         }
 
         utilityCore::safeGetline(fp_in, line);
@@ -160,9 +164,9 @@ int Scene::loadMaterial(string materialid) {
         Material newMaterial;
 
         //load static properties
-        for (int i = 0; i < 7; i++) {
-            string line;
-            utilityCore::safeGetline(fp_in, line);
+        string line;
+        utilityCore::safeGetline(fp_in, line);
+        while (!line.empty() && fp_in.good()) {
             vector<string> tokens = utilityCore::tokenizeString(line);
             if (strcmp(tokens[0].c_str(), "RGB") == 0) {
                 glm::vec3 color( atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()) );
@@ -181,6 +185,11 @@ int Scene::loadMaterial(string materialid) {
             } else if (strcmp(tokens[0].c_str(), "EMITTANCE") == 0) {
                 newMaterial.emittance = atof(tokens[1].c_str());
             }
+            else if (strcmp(tokens[0].c_str(), "REFRRGB") == 0) {
+                newMaterial.refractionColor = glm::vec3(atof(tokens[1].c_str()), atof(tokens[2].c_str()), atof(tokens[3].c_str()));
+            }
+            utilityCore::safeGetline(fp_in, line);
+
         }
         materials.push_back(newMaterial);
         return 1;
diff --git a/src/sceneStructs.h b/src/sceneStructs.h
index da4dbf3..f40f59e 100644
--- a/src/sceneStructs.h
+++ b/src/sceneStructs.h
@@ -38,6 +38,7 @@ struct Material {
     float hasRefractive;
     float indexOfRefraction;
     float emittance;
+    glm::vec3 refractionColor;
 };
 
 struct Camera {
@@ -49,6 +50,8 @@ struct Camera {
     glm::vec3 right;
     glm::vec2 fov;
     glm::vec2 pixelLength;
+    float lensRadius;
+    float focalDistance;
 };
 
 struct RenderState {
@@ -66,10 +69,16 @@ struct PathSegment {
     int remainingBounces;
 };
 
+
+
 // Use with a corresponding PathSegment to do:
 // 1) color contribution computation
 // 2) BSDF evaluation: generate a new ray
 struct ShadeableIntersection {
+    __host__ __device__ bool operator<(const ShadeableIntersection& s) const {
+        return materialId < s.materialId;
+    }
+
     float t;
     glm::vec3 surfaceNormal;
     int materialId;
diff --git a/src/utilities.h b/src/utilities.h
index d459e33..e6279cd 100644
--- a/src/utilities.h
+++ b/src/utilities.h
@@ -13,6 +13,7 @@
 #define TWO_PI 6.2831853071795864769252867665590057683943f
 #define SQRT_OF_ONE_THIRD 0.5773502691896257645091487805019574556476f
 #define EPSILON 0.00001f
+#define INV_PI 0.31830988618379067154f
 
 class GuiDataContainer
 {