26 commits
0f6c9fe
Naive Flocking Achieved :)
risia Sep 6, 2018
a3cb0aa
Grid Neighbor Search added, need to generalize cell search range
risia Sep 8, 2018
f86d137
Generalized search range based on rule search radii
risia Sep 8, 2018
5863fec
Coherent Grid functional. Thrust had mem errors when new buff. alloc.…
risia Sep 8, 2018
d1dcb73
Update README.md
risia Sep 9, 2018
bd8b3f8
Code cleanup, and still debugging thrust bad_alloc for N between 5153…
risia Sep 9, 2018
a206dd9
Merge branch 'master' of https://github.com/risia/Project1-CUDA-Flocking
risia Sep 9, 2018
6ed2d5c
Fixed pos swap code typo/error
risia Sep 9, 2018
2fbb6f6
Update README.md
risia Sep 9, 2018
bc60ca9
Update README.md
risia Sep 9, 2018
d4d0c53
Added screenshots & gif of coherent grid
risia Sep 9, 2018
8c5fc1a
Merge branch 'master' of https://github.com/risia/Project1-CUDA-Flocking
risia Sep 9, 2018
feb29c2
Update README.md
risia Sep 9, 2018
848900d
Update README.md
risia Sep 9, 2018
62406de
Added average fps output fro easier perf. measurement
risia Sep 9, 2018
86ca749
Merge branch 'master' of https://github.com/risia/Project1-CUDA-Flocking
risia Sep 9, 2018
92f665b
Fixed running out of registers in scattered vel. update when block si…
risia Sep 10, 2018
98ca089
Added change to coherent as well & fixed typo
risia Sep 10, 2018
4e73d2c
Performance Plots
risia Sep 10, 2018
45252b3
Update README.md
risia Sep 10, 2018
ed63a26
Update README.md
risia Sep 10, 2018
94c7fee
Added my performance spreadsheet for reference
risia Sep 10, 2018
97b0db3
Fixed 'obvious' error in array size
risia Sep 11, 2018
3ca6a2a
Update README.md
risia Sep 11, 2018
c1ed862
Update README.md
risia Sep 11, 2018
85c0a0c
Update README.md
risia Sep 11, 2018
Binary file added Project1 Performance.xlsx
Binary file not shown.
56 changes: 50 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,55 @@
**University of Pennsylvania, CIS 565: GPU Programming and Architecture,
Project 1 - Flocking**

* (TODO) YOUR NAME HERE
* (TODO) [LinkedIn](), [personal website](), [twitter](), etc.
* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab)
* Angelina Risi
* [LinkedIn](https://www.linkedin.com/in/angelina-risi)
* Tested on: Windows 10, i7-6700HQ @ 2.60GHz 8GB, GTX 960M 4096MB (Personal Laptop)

### (TODO: Your README)
**Images:**
Uniform Coherent Grid Under Default Conditions:
![Early in Simulation](/images/CoherentGridSim1.PNG)
![Late in Simulation](/images/CoherentGridSim3.PNG)

Animation at 15000 Boids (for increased FPS)
![Animated Coherent Grid](/images/uniformcoherentgrid.gif)

Include screenshots, analysis, etc. (Remember, this is public, so don't put
anything here that you don't want to share with the world.)
**Features:**
* Naive flocking: each boid iterates over all other boids to adjust its velocity
* Grid-based search for influencing boids
  - The region is divided into cells, and before the search the boids are sorted by containing cell using Thrust
  - The search dynamically adjusts its cell search bounds based on the search radius
    * The search is not limited to a fixed number of cells when rule distances, grid resolution, etc. are modified
* Coherent grid-based search
  - In addition to sorting boids by cell, their position and velocity data are reordered to match
    * Allows more direct access to data, potentially improving performance
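
The sort-then-reorder idea behind the coherent grid can be sketched on the CPU. This is a minimal illustration, not the project's code: `std::stable_sort` stands in for the Thrust key sort, and the names `Vec3`, `sortBoidsByCell`, and `gatherInCellOrder` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

struct Vec3 { float x, y, z; };

// Returns boid indices ordered by the grid cell each boid falls in.
// cellOfBoid[i] is the (assumed precomputed) cell index of boid i.
std::vector<int> sortBoidsByCell(const std::vector<int>& cellOfBoid) {
    std::vector<int> order(cellOfBoid.size());
    std::iota(order.begin(), order.end(), 0);  // 0, 1, 2, ...
    // stable_sort keeps boids that share a cell in their original order.
    std::stable_sort(order.begin(), order.end(),
                     [&](int a, int b) { return cellOfBoid[a] < cellOfBoid[b]; });
    return order;
}

// Coherent step: copy per-boid data into cell order so that boids in the
// same cell sit next to each other in memory.
std::vector<Vec3> gatherInCellOrder(const std::vector<Vec3>& data,
                                    const std::vector<int>& order) {
    assert(data.size() == order.size());
    std::vector<Vec3> sorted(data.size());
    for (std::size_t i = 0; i < order.size(); ++i) {
        sorted[i] = data[order[i]];
    }
    return sorted;
}
```

In the scattered grid, only `order` is produced and every lookup goes through it (`data[order[i]]`); the coherent grid pays for one extra gather per buffer so that later reads are contiguous.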

**Known Issues:**
* ~~Boid counts in the range of N = 5153 through N = 10112 cause CUDA launch failure or Thrust bad_alloc errors~~
  - ~~Cause unknown, as boid counts above and below this range behave correctly~~
  - ~~May be some sort of setup issue; needs testing on another machine, but the code appears correct~~
    * ~~Tried changing the architecture to sm_50 (which should be supported by the graphics card) and re-running CMake, but the CUDA launch still failed~~
  - ~~Does NOT occur in the Naive simulation, probably since it doesn't use Thrust~~
* ~~After decreasing the cell width to the max of the rule distances, the program crashed with an illegal memory access at 5000 objects (triggered at buffer reset?) and with a runtime error at higher values, but 4000 objects works fine and is used for performance analysis~~

As of the most recent commit, these errors are resolved. They were caused by using the wrong variable to size the grid start/end buffers.
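
The README does not say which variable was wrong, so the following is only a sketch of the sizing constraint under one assumption: that the start/end buffers are indexed by cell and therefore need one entry per grid cell, not one per boid. The names `CellRanges`, `findCellRanges`, and `cellCount` are hypothetical.

```cpp
#include <cassert>
#include <vector>

struct CellRanges {
    std::vector<int> start;  // first sorted slot belonging to each cell, -1 if empty
    std::vector<int> end;    // last sorted slot belonging to each cell, -1 if empty
};

// sortedCells: cell index of every boid, already sorted ascending.
// cellCount:   total number of grid cells (assumed parameter).
CellRanges findCellRanges(const std::vector<int>& sortedCells, int cellCount) {
    // Both buffers are sized by the number of CELLS, because they are
    // indexed by cell. Sizing them by boid count overflows whenever
    // a cell index exceeds the number of boids.
    CellRanges r{std::vector<int>(cellCount, -1), std::vector<int>(cellCount, -1)};
    const int n = static_cast<int>(sortedCells.size());
    for (int i = 0; i < n; ++i) {
        const int c = sortedCells[i];
        assert(c >= 0 && c < cellCount);
        if (i == 0 || sortedCells[i - 1] != c) r.start[c] = i;      // run begins
        if (i == n - 1 || sortedCells[i + 1] != c) r.end[c] = i;    // run ends
    }
    return r;
}
```

A mismatch of this kind would only misbehave for some combinations of boid count and cell count, which is one plausible reason a bug like this can appear in a limited range of N.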

**Performance Analysis:**

All three implementations were benchmarked by averaging the FPS over a run of approximately 1 minute. The baseline configuration for comparison is 5000 boids, a block size of 128, and a cell width of twice the search radius. Visualization was disabled to get a more accurate measure of simulation performance, especially at boid counts where rendering a large number of particles would add significant cost.
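
One way to compute an average FPS over a run is total frames divided by total elapsed time, sketched below. How the project accumulates its average is not shown here, so the `FpsAverager` class is a hypothetical illustration only.

```cpp
#include <cassert>

// Accumulates frame times and reports frames / elapsed seconds.
class FpsAverager {
public:
    // Record one simulated frame that took frameSeconds to complete.
    void addFrame(double frameSeconds) {
        assert(frameSeconds >= 0.0);
        totalSeconds_ += frameSeconds;
        ++frames_;
    }

    // Average FPS over everything recorded so far; 0 before any time has elapsed.
    double averageFps() const {
        return totalSeconds_ > 0.0 ? static_cast<double>(frames_) / totalSeconds_ : 0.0;
    }

private:
    double totalSeconds_ = 0.0;
    long frames_ = 0;
};
```

Dividing total frames by total time weights every second of the run equally, whereas averaging per-frame FPS readouts would over-weight short frames.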

* Changing # of Boids:
Increasing the number of boids generally decreases performance. This is especially true for the naive implementation, whose processing time is proportional to N^2 because each boid checks all other boids. The grid-based searches, however, show anomalies: FPS dips at low boid counts, then rises before following a gentler decline. From 10000 boids up, FPS falls by a roughly constant amount each time the boid count doubles, which suggests a roughly logarithmic dependence on N, far less steep than the naive N^2 one. Because of the errors described above, I had difficulty collecting performance data for certain ranges of boid counts.
![FPS Graph w/ Change in Boid Count](/images/defaultFPS.PNG)
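
To make the N^2 cost of the naive approach concrete, here is a minimal CPU sketch of an all-pairs neighbor search that also counts how many distance checks it performs. It is an illustration only; `Boid` and `naiveNeighborCounts` are hypothetical names, and the real simulation applies the flocking rules instead of just counting neighbors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Boid { float x, y, z; };

// For each boid, counts how many OTHER boids lie within `radius`.
// distanceChecks receives the number of pairs examined, which is N * (N - 1).
std::vector<int> naiveNeighborCounts(const std::vector<Boid>& boids, float radius,
                                     std::size_t& distanceChecks) {
    assert(radius >= 0.0f);
    std::vector<int> counts(boids.size(), 0);
    distanceChecks = 0;
    const float r2 = radius * radius;  // compare squared distances, no sqrt needed
    for (std::size_t i = 0; i < boids.size(); ++i) {
        for (std::size_t j = 0; j < boids.size(); ++j) {
            if (i == j) continue;
            ++distanceChecks;
            const float dx = boids[i].x - boids[j].x;
            const float dy = boids[i].y - boids[j].y;
            const float dz = boids[i].z - boids[j].z;
            if (dx * dx + dy * dy + dz * dz < r2) ++counts[i];
        }
    }
    return counts;
}
```

At the baseline of 5000 boids this is 5000 * 4999 = 24,995,000 distance checks per step; doubling the boid count roughly quadruples that number.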

* Changing Block Size:
Changing the block size from the default of 128 to 256 and 1024 changes the number of threads in each block, which affects how per-block resources such as shared memory are divided among threads. Surprisingly, testing showed a slight decrease in performance as the block size increased; the cause is uncertain.
![FPS at Block Size 256](/images/block256FPS.PNG)
![FPS at Block Size 1024](/images/block1024FPS.PNG)
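
For reference, the block size determines how many blocks a one-thread-per-boid launch needs. The sketch below shows the usual ceiling division, assuming one thread per boid; the function name `blocksForBoids` is hypothetical.

```cpp
#include <cassert>

// Number of blocks needed so that numBoids threads fit, rounding up.
int blocksForBoids(int numBoids, int blockSize) {
    assert(numBoids >= 0 && blockSize > 0);
    return (numBoids + blockSize - 1) / blockSize;
}
```

At 5000 boids this gives 40 blocks of 128 threads, 20 blocks of 256, or 5 blocks of 1024, so larger blocks leave the GPU with fewer independent blocks to schedule.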

* Changing Cell Width:
At smaller boid counts, performance decreases when the cell width decreases, because more cells must be checked for each boid. Performance was tested at both the default cell width of twice the search radius and a cell width equal to the search radius. I could not test higher boid counts because of crashes I was unable to resolve.
![FPS Graph w/ Change in CellWidth](/images/cellWidthFPS.PNG)
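
The cell-width trade-off can be sketched by computing which cells a boid's search sphere overlaps along one axis. This is a minimal sketch under assumed names (`cellRangeOnAxis`, `maxCellsPerAxis`, `gridMin`, `cellsPerAxis`), not the project's kernel code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct CellRange { int lo, hi; };  // inclusive range of cell indices on one axis

// Cells along one axis overlapped by the interval [pos - radius, pos + radius].
// gridMin is the coordinate of the grid's lower edge on that axis.
CellRange cellRangeOnAxis(float pos, float radius, float gridMin,
                          float cellWidth, int cellsPerAxis) {
    assert(cellWidth > 0.0f && cellsPerAxis > 0 && radius >= 0.0f);
    int lo = static_cast<int>(std::floor((pos - radius - gridMin) / cellWidth));
    int hi = static_cast<int>(std::floor((pos + radius - gridMin) / cellWidth));
    // Clamp to the grid so the search never indexes outside it.
    lo = std::max(lo, 0);
    hi = std::min(hi, cellsPerAxis - 1);
    return {lo, hi};
}

// Worst-case number of cells per axis that a search of the given radius can touch.
int maxCellsPerAxis(float radius, float cellWidth) {
    assert(cellWidth > 0.0f && radius >= 0.0f);
    return static_cast<int>(std::ceil(2.0f * radius / cellWidth)) + 1;
}
```

With a cell width of twice the radius, a search touches at most 2 cells per axis (2^3 = 8 cells in 3D); with a cell width equal to the radius it touches at most 3 per axis (3^3 = 27 cells). Each cell is smaller and holds fewer boids, but there are more cell lookups per boid.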

The coherent grid performed significantly better than the scattered grid. While both greatly improve on the naive implementation at high boid counts, at the highest counts tested the coherent grid reached twice the FPS of the scattered grid. At low boid counts the naive solution performed better, possibly because the overhead of the extra sorting, buffers, and array accesses outweighs the gain from looping over fewer boids. This does not explain, though, why the grid solutions' FPS at 5000 boids is so low compared to their FPS at much higher counts.


Binary file added images/CoherentGridSim1.PNG
Binary file added images/CoherentGridSim2.PNG
Binary file added images/CoherentGridSim3.PNG
Binary file added images/CoherentGridSim4.PNG
Binary file added images/block1024FPS.PNG
Binary file added images/block256FPS.PNG
Binary file added images/cellWidthFPS.PNG
Binary file added images/defaultFPS.PNG
Binary file added images/uniformcoherentgrid.gif