60 changes: 54 additions & 6 deletions README.md
@@ -1,11 +1,59 @@
**University of Pennsylvania, CIS 565: GPU Programming and Architecture,
Project 1 - Flocking**

* (TODO) YOUR NAME HERE
* (TODO) [LinkedIn](), [personal website](), [twitter](), etc.
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
## Flocking with CUDA on the GPU
### Connie Chang
* [LinkedIn](linkedin.com/in/conniechang44), [Demo Reel](vimeo.com/ConChang/DemoReel)
* Tested on: Windows 10, Intel Xeon CPU E5-1630 v4 @ 3.70 GHz, GTX 1070 8GB (SIG Lab)

### (TODO: Your README)
Introduction
-------------
This project consists of three different flocking algorithms and a performance analysis. Everything is written in CUDA for the GPU. I was responsible for implementing the flocking algorithms, implementing the neighborhood search, and invoking the CUDA kernels.

Include screenshots, analysis, etc. (Remember, this is public, so don't put
anything here that you don't want to share with the world.)
10,000 particles flocking together
![](images/10K.gif)

Screenshot of 5,000 particles
![](images/Screenshot.PNG)

The three algorithms are:
* Naive: Each particle queries every other particle to find its close neighbors.
* Uniform Grid: The space is broken into voxels, and each particle only checks nearby voxels.
* Coherent Uniform Grid: The same as Uniform Grid, but the particle data is rearranged so that it is as contiguous in memory as possible.
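
To make the difference between the naive and uniform grid searches concrete, here is a minimal host-side C++ sketch (not the project's CUDA kernels). The `Vec3`, `Grid`, `buildGrid`, and `countNeighbors*` names are hypothetical, and the sketch assumes a cubic grid starting at the origin with a cell width at least as large as the search radius. Both functions count the same neighbors; the grid version simply tests far fewer candidates.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

// Squared distance between two points.
inline float dist2(const Vec3& a, const Vec3& b) {
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Naive: test every other boid against the search radius.
int countNeighborsNaive(const std::vector<Vec3>& pos, int self, float radius) {
    int count = 0;
    for (int j = 0; j < (int)pos.size(); ++j)
        if (j != self && dist2(pos[self], pos[j]) < radius * radius) ++count;
    return count;
}

// Hypothetical uniform grid covering the cube [0, res * cellWidth)^3.
struct Grid {
    int res;                              // cells per axis
    float cellWidth;                      // assumed >= search radius
    std::vector<std::vector<int>> cells;  // boid indices stored per cell

    // Cell coordinate along one axis, clamped to the grid bounds.
    int coord(float v) const {
        int c = (int)std::floor(v / cellWidth);
        return c < 0 ? 0 : (c >= res ? res - 1 : c);
    }
    // Flatten a 3D cell coordinate into a 1D index.
    int index(int cx, int cy, int cz) const { return cx + cy * res + cz * res * res; }
};

// Bin every boid into the cell that contains it.
Grid buildGrid(const std::vector<Vec3>& pos, int res, float cellWidth) {
    Grid g{res, cellWidth, std::vector<std::vector<int>>(res * res * res)};
    for (int i = 0; i < (int)pos.size(); ++i)
        g.cells[g.index(g.coord(pos[i].x), g.coord(pos[i].y), g.coord(pos[i].z))].push_back(i);
    return g;
}

// Grid search: only boids in the 3x3x3 block of cells around `self` are tested.
int countNeighborsGrid(const Grid& g, const std::vector<Vec3>& pos, int self, float radius) {
    assert(g.cellWidth >= radius);  // otherwise neighbors could lie outside the block
    int cx = g.coord(pos[self].x), cy = g.coord(pos[self].y), cz = g.coord(pos[self].z);
    int count = 0;
    for (int z = cz - 1; z <= cz + 1; ++z)
        for (int y = cy - 1; y <= cy + 1; ++y)
            for (int x = cx - 1; x <= cx + 1; ++x) {
                if (x < 0 || y < 0 || z < 0 || x >= g.res || y >= g.res || z >= g.res) continue;
                for (int j : g.cells[g.index(x, y, z)])
                    if (j != self && dist2(pos[self], pos[j]) < radius * radius) ++count;
            }
    return count;
}
```

In a GPU version, each boid would typically get its own thread, and the per-cell lists would usually be replaced by sorted index arrays with per-cell start and end offsets; the sketch uses nested vectors only to stay short.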

Performance Graphs
------------------
A graph comparing the performance of each algorithm.
![](images/Default_5000Boids_128BlockSize.png)


A graph comparing performance across different numbers of particles (boids).
![](images/NumberBoidsComparison.png)


A graph comparing performance across different numbers of threads per block.
![](images/BlockSizeComparison.png)


A graph comparing the uniform grid's cell width, relative to the neighbor search distance.
![](images/CellWidthComparison.png)

Performance Questions and Answers:
-----------------------
*For each implementation, how does changing the number of boids affect performance? Why do you think this is?*

**Naive**: Increasing the number of boids slows down the simulation. This makes sense because every boid has to search through every other boid, so the algorithm takes more time as the count grows.

**Uniform Grid**: Surprisingly, performance gets better as the number of boids increases. I've heard that GPUs can perform faster when larger blocks of threads are used. My guess is that this explains my results.

**Coherent**: Similar to the uniform grid, performance improves as the number of boids increases.

*For each implementation, how does changing the block count and block size affect performance? Why do you think this is?*

For all implementations, changing the block size did not significantly affect performance. I think this is because we need many more than 1024 threads, the maximum block size. Therefore, the blocks must be queued even at the maximum block size, and we cannot go any faster.
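
As a rough illustration of how block size maps to block count, the sketch below computes the launch configuration with ceiling division. It assumes one thread per boid, which is common for this kind of simulation but is an assumption here, and the helper names `blocksForBoids` and `threadsLaunched` are hypothetical.

```cpp
#include <cassert>

// Number of blocks needed so that every boid gets one thread (ceiling division).
int blocksForBoids(int numBoids, int blockSize) {
    assert(numBoids >= 0 && blockSize > 0);
    return (numBoids + blockSize - 1) / blockSize;
}

// Total threads launched; threads with an index >= numBoids would exit early.
int threadsLaunched(int numBoids, int blockSize) {
    return blocksForBoids(numBoids, blockSize) * blockSize;
}
```

For the 5,000-boid runs, this gives 79 blocks at a block size of 64, 40 blocks at 128, 20 blocks at 256, and 5 blocks at 1024. The total number of threads stays close to 5,000 in every case; only the way they are grouped changes.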

*For the coherent uniform grid: did you experience any performance improvements with the more coherent uniform grid? Was this the outcome you expected? Why or why not?*

No, I did not see any performance improvements. In fact, the coherent uniform grid was sometimes slower. This was not what I expected: I thought that lining up the boids more contiguously in memory would mean faster access to them. Seeing my results, my guess is that the extra steps to reshuffle the position/velocity data were too slow, and the contiguous memory access could not make up for the time they cost.
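
The reshuffle step mentioned above can be sketched on the host as follows. This is an illustrative C++ version under stated assumptions, not the project's code: `sortByCell` and `gather` are hypothetical names, and each boid's grid cell index is assumed to be precomputed in `cellOfBoid`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

struct Vec3 { float x, y, z; };

// Returns boid indices ordered by grid cell, so boids sharing a cell are adjacent.
std::vector<int> sortByCell(const std::vector<int>& cellOfBoid) {
    std::vector<int> order(cellOfBoid.size());
    std::iota(order.begin(), order.end(), 0);  // 0, 1, 2, ...
    std::stable_sort(order.begin(), order.end(),
                     [&](int a, int b) { return cellOfBoid[a] < cellOfBoid[b]; });
    return order;
}

// The extra pass of the coherent variant: copy data into cell-sorted order so
// that later reads of a cell's boids touch contiguous memory.
std::vector<Vec3> gather(const std::vector<Vec3>& data, const std::vector<int>& order) {
    assert(data.size() == order.size());
    std::vector<Vec3> sorted(data.size());
    for (std::size_t i = 0; i < order.size(); ++i) sorted[i] = data[order[i]];
    return sorted;
}
```

The non-coherent grid would skip `gather` and read `data[order[i]]` on every neighbor lookup instead. The coherent variant pays for one `gather` per buffer (position and velocity) each frame, which is the overhead suspected above.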

*Did changing cell width and checking 27 vs 8 neighboring cells affect performance? Why or why not? Be careful: it is insufficient (and possibly incorrect) to say that 27-cell is slower simply because there are more cells to check!*

No, changing the cell width did not affect performance. Even though we check more neighboring cells with a smaller width, we check fewer boids in each cell. In other words, a large cell is more likely to contain boids that are too far away to influence the current boid, while a small cell is less likely to contain such boids.
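
The trade-off can be made concrete by comparing the volume each search covers. The sketch below assumes the 8-cell search uses a cell width of twice the search radius R and the 27-cell search uses a cell width equal to R; that pairing is an assumption, and `searchedVolume` is a hypothetical helper.

```cpp
#include <cassert>

// Volume of the cube of cells scanned, in units of R^3 (R = search radius).
// cellWidthInR: cell width divided by R.
// cellsPerAxis: 2 for the 8-cell search, 3 for the 27-cell search.
double searchedVolume(double cellWidthInR, int cellsPerAxis) {
    assert(cellWidthInR > 0.0 && cellsPerAxis > 0);
    double side = cellWidthInR * cellsPerAxis;
    return side * side * side;
}
```

Under these assumptions, the 8-cell search covers a volume of 64 R^3 while the 27-cell search covers 27 R^3. If the boids are spread roughly evenly, the 27-cell search therefore tests fewer candidate boids, even though it visits more cells.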
Binary file added images/10K.gif
Binary file added images/5K.gif
Binary file added images/BlockSizeComparison.png
Binary file added images/CellWidthComparison.png
Binary file added images/Default_5000Boids_128BlockSize.png
Binary file added images/NumberBoidsComparison.png
Binary file added images/Screenshot.PNG
21 changes: 21 additions & 0 deletions performance/output_10000boids_128blocks.txt
@@ -0,0 +1,21 @@
Naive, VIS
Average fps: 184.358

Naive, NO VIS
Average fps: 233.616

Uniform Grid, VIS
Average fps: 604.607

Uniform Grid, NO VIS
Average fps: 1264.15

Uniform Grid, Coherent, VIS
Average fps: 562.452

Uniform Grid, Coherent, NO VIS
Average fps: 1261.27

Uniform Grid, Coherent, VIS
Average fps: 584.581

18 changes: 18 additions & 0 deletions performance/output_2500boids_128blocks.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
Naive, VIS
Average fps: 477.27

Naive, NO VIS
Average fps: 1014.99

Uniform Grid, VIS
Average fps: 438.218

Uniform Grid, NO VIS
Average fps: 798.167

Uniform Grid, Coherent, VIS
Average fps: 440.226

Uniform Grid, Coherent, NO VIS
Average fps: 763.431

6 changes: 6 additions & 0 deletions performance/output_5000boids_1024blocks.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
Naive, VIS
Average fps: 305.67

Uniform Grid, Coherent, VIS
Average fps: 454.431

54 changes: 54 additions & 0 deletions performance/output_5000boids_128block.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
Naive, VIS
Average fps: 357.328

Naive, NO VIS
Average fps: 554.034

Uniform Grid, VIS
Average fps: 440.004

Uniform Grid, NO VIS
Average fps: 801.031

Uniform Grid, Coherent, VIS
Average fps: 430.916

Uniform Grid, Coherent, NO VIS
Average fps: 759.633

Naive, VIS
Average fps: 340.335

Naive, NO VIS
Average fps: 550.725

Uniform Grid, VIS
Average fps: 428.152

Uniform Grid, NO VIS
Average fps: 813.542

Uniform Grid, Coherent, VIS
Average fps: 476.203

Uniform Grid, Coherent, NO VIS
Average fps: 763.405

Naive, VIS
Average fps: 342.638

Naive, NO VIS
Average fps: 547.646

Uniform Grid, VIS
Average fps: 435.811

Uniform Grid, NO VIS
Average fps: 761.638

Uniform Grid, Coherent, VIS
Average fps: 448.705

Uniform Grid, Coherent, NO VIS
Average fps: 779.181

18 changes: 18 additions & 0 deletions performance/output_5000boids_128blocks_1cellWidth.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
Naive, VIS
Average fps: 345.707

Uniform Grid, VIS
Average fps: 446.372

Uniform Grid, Coherent, VIS
Average fps: 449.218

Naive, NO VIS
Average fps: 545.644

Uniform Grid, NO VIS
Average fps: 747.858

Uniform Grid, Coherent, NO VIS
Average fps: 781.189

24 changes: 24 additions & 0 deletions performance/output_5000boids_128blocks_take2.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
Naive, VIS
Average fps: 337.955

Uniform Grid, VIS
Average fps: 460.023

Uniform Grid, Coherent, VIS
Average fps: 424.042

Naive, NO VIS
Average fps: 543.884

Uniform Grid, NO VIS
Average fps: 860.128

Uniform Grid, Coherent, NO VIS
Average fps: 786.732

Uniform Grid, NO VIS
Average fps: 831.391

Uniform Grid, Coherent, VIS
Average fps: 430.894

18 changes: 18 additions & 0 deletions performance/output_5000boids_256blocks.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
Naive, NO VIS
Average fps: 539.576

Uniform Grid, NO VIS
Average fps: 714.914

Uniform Grid, Coherent, NO VIS
Average fps: 797.649

Naive, VIS
Average fps: 354.235

Uniform Grid, VIS
Average fps: 440.193

Uniform Grid, Coherent, VIS
Average fps: 444.447

18 changes: 18 additions & 0 deletions performance/output_5000boids_64blocks.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
Naive, NO VIS
Average fps: 541.066

Uniform Grid, NO VIS
Average fps: 790.12

Uniform Grid, Coherent, NO VIS
Average fps: 794.772

Naive, VIS
Average fps: 356.021

Uniform Grid, VIS
Average fps: 437.014

Uniform Grid, Coherent, VIS
Average fps: 451.163
