From 257ab4220802753890ccb5a4eef012a23b7e12b5 Mon Sep 17 00:00:00 2001 From: pjewell Date: Tue, 21 Sep 2021 22:01:01 -0400 Subject: [PATCH 1/2] -ver1, project2 --- README.md | 111 ++++++++++- src/main.cpp | 109 +++++++++-- stream_compaction/common.h | 7 + stream_compaction/cpu.cu | 64 ++++++- stream_compaction/efficient.cu | 329 ++++++++++++++++++++++++++++++++- stream_compaction/efficient.h | 1 + stream_compaction/naive.cu | 80 +++++++- stream_compaction/thrust.cu | 44 ++++- 8 files changed, 697 insertions(+), 48 deletions(-) diff --git a/README.md b/README.md index 0e38ddb..d7351a0 100644 --- a/README.md +++ b/README.md @@ -3,12 +3,111 @@ CUDA Stream Compaction **University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 2** -* (TODO) YOUR NAME HERE - * (TODO) [LinkedIn](), [personal website](), [twitter](), etc. -* Tested on: (TODO) Windows 22, i7-2222 @ 2.22GHz 22GB, GTX 222 222MB (Moore 2222 Lab) +* Paul (San) Jewell + * [LinkedIn](https://www.linkedin.com/in/paul-jewell-2aba7379), [work website]( + https://www.biociphers.org/paul-jewell-lab-member), [personal website](https://gitlab.com/inklabapp), [twitter](https://twitter.com/inklabapp), etc. +* Tested on: (TODO) Linux pop-os 5.11.0-7614-generic, i7-9750H CPU @ 2.60GHz 32GB, GeForce GTX 1650 Mobile / Max-Q 4GB -### (TODO: Your README) +### GPU scan and stream compaction demonstration -Include analysis, etc. (Remember, this is public, so don't put -anything here that you don't want to share with the world.) +This repository contains some comparisons of basic algorithms compared between a CPU serial implementation, my GPU CUDA implementation, +and similar functions provided by the 'thrust' library. +I demonstrate usage of the naive, buffered, work-efficient, and chunked work-efficient scan algorithms as described in +[this link](https://developer.nvidia.com/gpugems/gpugems3/part-vi-gpu-computing/chapter-39-parallel-prefix-sum-scan-cuda). +The implementation is also broken down into a number of sub-functions for easier comprehension by the curious reader. + +### Performance comparison overview + +After manually testing a number of block sizes, I've found that a size of 64 works best for the efficient scan +implementation. (same as for boids)and 512 (the largest I can do) works best for the naive implementation. (this +pretty much makes sense as it is a largely processor-count-limited algorithm). Thrust does not allow specifying a +manual block size that I could find so I didn't do any testing there. + +![boidstats](img/chart1.png) + +For a first iteration, I was unable to run larger sizes without causing OOM on my local machine + (desktop completely crashed), so I tested up to 1mil only. Clearly something seems to have goofed in either + a single blocking line in my efficient implementation, or the measurement code location. I don't believe a simple + inefficiency would lead to the huge discrepancy between naive and efficient implementations here. The no-chunking + implementation should do the same amount of operations for the up and down sweep as the chunking implementation (ideal), + however, it will get the incorrect result. There is something very inefficient with the chunking that I've not yet been + able to work out. + +``` + **************** + ** SCAN TESTS ** + **************** + [ 29 18 29 17 11 3 19 31 3 49 25 31 31 ... 14 0 ] + ==== cpu scan, power-of-two ==== + elapsed time: 0.000686ms (std::chrono Measured) + [ 29 47 76 93 104 107 126 157 160 209 234 265 296 ... 23987 23987 ] + ==== cpu scan, non-power-of-two ==== + elapsed time: 0.000377ms (std::chrono Measured) + [ 29 47 76 93 104 107 126 157 160 209 234 265 296 ... 23950 23960 ] + passed + ==== naive scan, power-of-two ==== + elapsed time: 0.060032ms (CUDA Measured) + [ 29 47 76 93 104 107 126 157 160 209 234 265 296 ... 23987 23987 ] + ==== naive scan, non-power-of-two ==== + elapsed time: 0.055296ms (CUDA Measured) + [ 29 47 76 93 104 107 126 157 160 209 234 265 296 ... 23950 23960 ] + passed + ==== work-efficient scan no chunk, power-of-two ==== + elapsed time: 0.087648ms (CUDA Measured) + [ 29 47 76 93 104 107 126 157 160 209 234 265 296 ... 23987 23987 ] + passed + ==== work-efficient scan no chunk, non-power-of-two ==== + elapsed time: 0.041152ms (CUDA Measured) + [ 29 47 76 93 104 107 126 157 160 209 234 265 296 ... 23950 23960 ] + passed + ==== work-efficient scan, power-of-two ==== + elapsed time: 0.04512ms (CUDA Measured) + [ 29 47 76 93 104 107 126 157 160 209 234 265 296 ... 23987 23987 ] + passed + ==== work-efficient scan, non-power-of-two ==== + elapsed time: 0.058752ms (CUDA Measured) + [ 29 47 76 93 104 107 126 157 160 209 234 265 296 ... 23950 23960 ] + passed + ==== thrust scan, power-of-two ==== + elapsed time: 0.022528ms (CUDA Measured) + [ 29 47 76 93 104 107 126 157 160 209 234 265 296 ... 23987 23987 ] + passed + ==== thrust scan, non-power-of-two ==== + elapsed time: 0.021024ms (CUDA Measured) + [ 29 47 76 93 104 107 126 157 160 209 234 265 296 ... 23950 23960 ] + passed + + ***************************** + ** STREAM COMPACTION TESTS ** + ***************************** + [ 1 2 1 3 1 1 3 1 3 3 3 3 3 ... 0 0 ] + [ 1 2 1 3 1 1 3 1 3 3 3 3 3 ... 2 2 ] + ==== cpu compact without scan, power-of-two ==== + elapsed time: 0.002584ms (std::chrono Measured) + [ 1 2 1 3 1 1 3 1 3 3 3 3 3 ... 2 3 ] + passed + ==== cpu compact without scan, non-power-of-two ==== + elapsed time: 0.001953ms (std::chrono Measured) + [ 1 2 1 3 1 1 3 1 3 3 3 3 3 ... 2 2 ] + passed + ==== cpu compact with scan ==== + elapsed time: 0.004194ms (std::chrono Measured) + [ 1 2 1 3 1 1 3 1 3 3 3 3 3 ... 2 3 ] + passed + ==== cpu compact with scan, non-power-of-two ==== + elapsed time: 0.003411ms (std::chrono Measured) + [ 1 2 1 3 1 1 3 1 3 3 3 3 3 ... 2 2 ] + passed + ==== work-efficient compact, power-of-two ==== + elapsed time: 0.065216ms (CUDA Measured) + [ 1 2 1 3 1 1 3 1 3 3 3 3 3 ... 2 3 ] + passed + ==== work-efficient compact, non-power-of-two ==== + elapsed time: 0.063552ms (CUDA Measured) + [ 1 2 1 3 1 1 3 1 3 3 3 3 3 ... 2 2 ] + passed + sh: 1: pause: not found + + Process finished with exit code 0 +``` \ No newline at end of file diff --git a/src/main.cpp b/src/main.cpp index 896ac2b..a3de94a 100644 --- a/src/main.cpp +++ b/src/main.cpp @@ -13,7 +13,9 @@ #include #include "testing_helpers.hpp" -const int SIZE = 1 << 8; // feel free to change the size of array +//const int SIZE = 1 << 8; // feel free to change the size of array +//const int SIZE = 10000; // feel free to change the size of array +const int SIZE = 1000; const int NPOT = SIZE - 3; // Non-Power-Of-Two int *a = new int[SIZE]; int *b = new int[SIZE]; @@ -29,6 +31,7 @@ int main(int argc, char* argv[]) { genArray(SIZE - 1, a, 50); // Leave a 0 at the end to test that edge case a[SIZE - 1] = 0; + printArray(SIZE, a, true); // initialize b using StreamCompaction::CPU::scan you implement @@ -51,48 +54,102 @@ int main(int argc, char* argv[]) { printDesc("naive scan, power-of-two"); StreamCompaction::Naive::scan(SIZE, c, a); printElapsedTime(StreamCompaction::Naive::timer().getGpuElapsedTimeForPreviousOperation(), "(CUDA Measured)"); - //printArray(SIZE, c, true); - printCmpResult(SIZE, b, c); - - /* For bug-finding only: Array of 1s to help find bugs in stream compaction or scan - onesArray(SIZE, c); - printDesc("1s array for finding bugs"); - StreamCompaction::Naive::scan(SIZE, c, a); - printArray(SIZE, c, true); */ + printArray(SIZE, c, true); - zeroArray(SIZE, c); + zeroArray(NPOT, c); printDesc("naive scan, non-power-of-two"); StreamCompaction::Naive::scan(NPOT, c, a); printElapsedTime(StreamCompaction::Naive::timer().getGpuElapsedTimeForPreviousOperation(), "(CUDA Measured)"); - //printArray(SIZE, c, true); + printArray(NPOT, c, true); + printCmpResult(NPOT, b, c); + + + // For bug-finding only: Array of 1s to help find bugs in stream compaction or scan + +// onesArray(SIZE, a); +// zeroArray(SIZE, c); +// printDesc("1s array for finding bugs"); +// StreamCompaction::Naive::scan(NPOT, c, a); +// printArray(NPOT, c); + +// onesArray(NPOT, a); +// zeroArray(NPOT, c); +// printDesc("1s array for finding bugs"); +// StreamCompaction::Naive::scan(NPOT, c, a); +// printArray(NPOT, c); +// +// onesArray(SIZE, a); +// zeroArray(SIZE, c); +// printDesc("1s array for finding bugs"); +// StreamCompaction::Efficient::scan(SIZE, c, a); +// printArray(SIZE, c); + + +// onesArray(3, a); +// a[0] = 4; +// a[1] = 4; +// a[2] = 1; +// zeroArray(3, c); +// printDesc("1s array for finding bugs"); +// StreamCompaction::Efficient::scanplain(3, c, a); +// printArray(3, c); +// +// onesArray(SIZE, a); +// zeroArray(SIZE, c); +// printDesc("1s array for finding bugs"); +// StreamCompaction::Efficient::scan(SIZE, c, a); +// printArray(SIZE, c); +// +// onesArray(SIZE, a); +// zeroArray(SIZE, c); +// printDesc("1s array for finding bugs"); +// StreamCompaction::Efficient::scan(SIZE, c, a); +// printArray(SIZE, c); + + zeroArray(SIZE, c); + printDesc("work-efficient scan no chunk, power-of-two"); + StreamCompaction::Efficient::scanplain(SIZE, c, a); + exclusive2inclusive(SIZE, c, a[SIZE-1]); + printElapsedTime(StreamCompaction::Efficient::timer().getGpuElapsedTimeForPreviousOperation(), "(CUDA Measured)"); + printArray(SIZE, c, true); + printCmpResult(SIZE, b, c); + + zeroArray(SIZE, c); + printDesc("work-efficient scan no chunk, non-power-of-two"); + StreamCompaction::Efficient::scanplain(NPOT, c, a); + exclusive2inclusive(NPOT, c, a[NPOT-1]); + printElapsedTime(StreamCompaction::Efficient::timer().getGpuElapsedTimeForPreviousOperation(), "(CUDA Measured)"); + printArray(NPOT, c, true); printCmpResult(NPOT, b, c); zeroArray(SIZE, c); printDesc("work-efficient scan, power-of-two"); StreamCompaction::Efficient::scan(SIZE, c, a); + exclusive2inclusive(SIZE, c, a[SIZE-1]); printElapsedTime(StreamCompaction::Efficient::timer().getGpuElapsedTimeForPreviousOperation(), "(CUDA Measured)"); - //printArray(SIZE, c, true); + printArray(SIZE, c, true); printCmpResult(SIZE, b, c); zeroArray(SIZE, c); printDesc("work-efficient scan, non-power-of-two"); StreamCompaction::Efficient::scan(NPOT, c, a); + exclusive2inclusive(NPOT, c, a[NPOT-1]); printElapsedTime(StreamCompaction::Efficient::timer().getGpuElapsedTimeForPreviousOperation(), "(CUDA Measured)"); - //printArray(NPOT, c, true); + printArray(NPOT, c, true); printCmpResult(NPOT, b, c); zeroArray(SIZE, c); printDesc("thrust scan, power-of-two"); StreamCompaction::Thrust::scan(SIZE, c, a); printElapsedTime(StreamCompaction::Thrust::timer().getGpuElapsedTimeForPreviousOperation(), "(CUDA Measured)"); - //printArray(SIZE, c, true); + printArray(SIZE, c, true); printCmpResult(SIZE, b, c); zeroArray(SIZE, c); printDesc("thrust scan, non-power-of-two"); StreamCompaction::Thrust::scan(NPOT, c, a); printElapsedTime(StreamCompaction::Thrust::timer().getGpuElapsedTimeForPreviousOperation(), "(CUDA Measured)"); - //printArray(NPOT, c, true); + printArray(NPOT, c, true); printCmpResult(NPOT, b, c); printf("\n"); @@ -104,7 +161,16 @@ int main(int argc, char* argv[]) { genArray(SIZE - 1, a, 4); // Leave a 0 at the end to test that edge case a[SIZE - 1] = 0; + //a[NPOT - 1] = 0; +// +// a[0] = 0; +// a[1] = 2; +// a[2] = 1; +// a[3] = 2; +// a[4] = 1; + printArray(SIZE, a, true); + printArray(NPOT, a, true); int count, expectedCount, expectedNPOT; @@ -133,18 +199,25 @@ int main(int argc, char* argv[]) { printArray(count, c, true); printCmpLenResult(count, expectedCount, b, c); + zeroArray(SIZE, c); + printDesc("cpu compact with scan, non-power-of-two"); + count = StreamCompaction::CPU::compactWithScan(NPOT, c, a); + printElapsedTime(StreamCompaction::CPU::timer().getCpuElapsedTimeForPreviousOperation(), "(std::chrono Measured)"); + printArray(count, c, true); + printCmpLenResult(count, expectedNPOT, b, c); + zeroArray(SIZE, c); printDesc("work-efficient compact, power-of-two"); count = StreamCompaction::Efficient::compact(SIZE, c, a); printElapsedTime(StreamCompaction::Efficient::timer().getGpuElapsedTimeForPreviousOperation(), "(CUDA Measured)"); - //printArray(count, c, true); + printArray(count, c, true); printCmpLenResult(count, expectedCount, b, c); - zeroArray(SIZE, c); + zeroArray(NPOT, c); printDesc("work-efficient compact, non-power-of-two"); count = StreamCompaction::Efficient::compact(NPOT, c, a); printElapsedTime(StreamCompaction::Efficient::timer().getGpuElapsedTimeForPreviousOperation(), "(CUDA Measured)"); - //printArray(count, c, true); + printArray(count, c, true); printCmpLenResult(count, expectedNPOT, b, c); system("pause"); // stop Win32 console from closing on exit diff --git a/stream_compaction/common.h b/stream_compaction/common.h index d2c1fed..c2ad78d 100644 --- a/stream_compaction/common.h +++ b/stream_compaction/common.h @@ -30,6 +30,13 @@ inline int ilog2ceil(int x) { return x == 1 ? 0 : ilog2(x - 1) + 1; } +inline void exclusive2inclusive(int n, int* scanned, int lastvalue){ + for(int i=0; i #include "cpu.h" - +#include "testing_helpers.hpp" #include "common.h" namespace StreamCompaction { @@ -20,6 +20,25 @@ namespace StreamCompaction { void scan(int n, int *odata, const int *idata) { timer().startCpuTimer(); // TODO + /* + * out[0] = in[0]; // assuming n > 0 + for (int k = 1; k < n; ++k) + out[k] = out[k – 1] + in[k]; + */ + if(n < 1) return; + + // inclusive + odata[0] = idata[0]; + for(int i=1; i0; --i){ +// odata[i] = odata[i-1]; +// } +// odata[0] = 0; + timer().endCpuTimer(); } @@ -30,9 +49,17 @@ namespace StreamCompaction { */ int compactWithoutScan(int n, int *odata, const int *idata) { timer().startCpuTimer(); - // TODO + + int numNonZero = 0; + for(int i=0; i #include "common.h" #include "efficient.h" +#include + +#define checkCUDAErrorWithLine(msg) checkCUDAErrorFn(msg, __FILE__, __LINE__) +#define blockSize 64 namespace StreamCompaction { namespace Efficient { @@ -12,15 +16,267 @@ namespace StreamCompaction { return timer; } + __global__ void kernEfficientScanUp(int N, int d, int* odata) { + int index = (blockIdx.x * blockDim.x) + threadIdx.x; + if (index >= N) return; + + int power_plus = (int) powf(2.0f, (float) d+1); + if (index % power_plus != 0) return; + + int power_ident = power_plus / 2; + //int power_ident = (int) powf(2.0f, (float) d); + + odata[index + power_plus - 1] += odata[index + power_ident - 1]; + + + } + + __global__ void kernEfficientScanReset(int N, int* odata) { + // should not be needed in the end + int index = (blockIdx.x * blockDim.x) + threadIdx.x; + if (index != 0) return; + odata[N-1] = 0; + } + + + + + __global__ void kernEfficientMoveLastValue(int N, int block_idx, const int* odata, int* auxArray) { + // should not be needed in the end + int index = (blockIdx.x * blockDim.x) + threadIdx.x; + if (index != 0) return; + auxArray[block_idx] = odata[N-1]; + + + } + + __global__ void kernEfficientScanDown(int N, int d, int* odata) { + int index = (blockIdx.x * blockDim.x) + threadIdx.x; + if (index >= N) return; + + int power_plus = (int) powf(2.0f, (float) d+1); + if (index % power_plus != 0) return; + + int power_ident = power_plus / 2; + //int power_ident = (int) powf(2.0f, (float) d); + + int t = odata[index + power_ident - 1]; + + odata[index + power_ident - 1] = odata[index + power_plus - 1]; + odata[index + power_plus - 1] += t; + + + } + + __global__ void kernAddToAll(int N, int block_idx, const int* auxArr, int* odata) { + int index = (blockIdx.x * blockDim.x) + threadIdx.x; + if (index >= N) return; + + odata[index] += auxArr[block_idx]; + } + + void _scan(int N, int* dev_odata) { + dim3 fullBlocksPerGrid((N + blockSize - 1) / blockSize); + + + + for(int d=0;d<=ilog2ceil(N)-1;d++){ + kernEfficientScanUp<<>>(N, d, dev_odata); + + } + + kernEfficientScanReset<<>>(N, dev_odata); + + for(int d=ilog2ceil(N)-1;d>=0;d--){ + kernEfficientScanDown<<>>(N, d, dev_odata); + + + } + + + } + + void _scanBlock(int N, int* dev_odata, int block_idx, int* aux_arr) { + dim3 fullBlocksPerGrid((N + blockSize - 1) / blockSize); + + int offset = N*block_idx; + + for(int d=0;d<=ilog2ceil(N)-1;d++){ + kernEfficientScanUp<<>>(N, d, dev_odata + offset); + + } + + cudaMemcpy(aux_arr + block_idx, dev_odata + offset + N - 1, sizeof(int), cudaMemcpyDeviceToDevice); + //cudaMemset(dev_odata + offset + N -1 , 0, sizeof(int)); + kernEfficientScanReset<<>>(N, dev_odata + offset); + + for(int d=ilog2ceil(N)-1;d>=0;d--){ + kernEfficientScanDown<<>>(N, d, dev_odata + offset); + + + } + + + } + + + int getNextPower(int _N){ + int N = 1; + while(N < _N){ + N *= 2; + } + return N; + } + /** * Performs prefix-sum (aka scan) on idata, storing the result into odata. */ - void scan(int n, int *odata, const int *idata) { - timer().startGpuTimer(); - // TODO - timer().endGpuTimer(); + + + void _chunkedScanAutomatic(int N, int *odata){ + + + dim3 fullBlocksPerGrid((N + blockSize - 1) / blockSize); + + // divide array into blocks + const int num_per_block = 1024; + int num_blocks = ceil(N / (double) num_per_block); + int num_scan_blocks = getNextPower(num_blocks); + + int* dev_auxarr; + cudaMalloc((void**)&dev_auxarr, num_scan_blocks * sizeof(int)); + checkCUDAErrorWithLine("cudaMalloc dev_auxarr failed!"); + + + for(int block_idx=0; block_idx>>(num_per_block, block_idx, dev_auxarr, odata + offset); + } + + cudaFree(dev_auxarr); + } + void _chunkedScan(int N, int *odata, int *auxarr, int num_per_block, int num_blocks, int num_scan_blocks){ + + + dim3 fullBlocksPerGrid((N + blockSize - 1) / blockSize); + + + + + + + for(int block_idx=0; block_idx>>(num_per_block, block_idx, auxarr, odata + offset); + } + + + + } + + + void scan(int _N, int *odata, const int *idata) { + + + int N = getNextPower(_N); + + + //dim3 fullBlocksPerGrid((N + blockSize - 1) / blockSize); + + // divide array into blocks + const int num_per_block = 1024; + int num_blocks = ceil(N / (double) num_per_block); + int num_scan_blocks = getNextPower(num_blocks); + + int* dev_auxarr; + cudaMalloc((void**)&dev_auxarr, num_scan_blocks * sizeof(int)); + checkCUDAErrorWithLine("cudaMalloc dev_auxarr failed!"); + + // copy data to gpu buffer + int* dev_odata; + cudaMalloc((void**)&dev_odata, N * sizeof(int)); + checkCUDAErrorWithLine("cudaMalloc dev_odata failed!"); + + cudaMemset(dev_odata, 0, sizeof(int) * N); + cudaMemcpy(dev_odata, idata, sizeof(int) * _N, cudaMemcpyHostToDevice); + + timer().startGpuTimer(); + + _chunkedScan(N, dev_odata, dev_auxarr, num_per_block, num_blocks, num_scan_blocks); + + + timer().endGpuTimer(); + + cudaMemcpy(odata, dev_odata, sizeof(int) * _N, cudaMemcpyDeviceToHost); + + cudaDeviceSynchronize(); + cudaFree(dev_odata); + cudaFree(dev_auxarr); + + } + + void scanplain(int _N, int *odata, const int *idata) { + + + int N = getNextPower(_N); + + dim3 fullBlocksPerGrid((N + blockSize - 1) / blockSize); + + + // copy data to gpu buffer + int* dev_odata; + //int* dev_idata; + cudaMalloc((void**)&dev_odata, N * sizeof(int)); + checkCUDAErrorWithLine("cudaMalloc dev_odata failed!"); + + cudaMemset(dev_odata, 0, sizeof(int) * N); + cudaMemcpy(dev_odata, idata, sizeof(int) * _N, cudaMemcpyHostToDevice); + + timer().startGpuTimer(); + + _scan(N, dev_odata); + + timer().endGpuTimer(); + + cudaMemcpy(odata, dev_odata, sizeof(int) * _N, cudaMemcpyDeviceToHost); + + cudaDeviceSynchronize(); + cudaFree(dev_odata); + } + + + __global__ void kernBooleanAssigner(int N, int* odata, const int* idata) { + int index = (blockIdx.x * blockDim.x) + threadIdx.x; + if (index >= N) return; + + odata[index] = idata[index] == 0 ? 0 : 1; + + } + + __global__ void kernScatter(int N, int* odata, const int* idata, const int* boolarr, const int* scanres ) { + int index = (blockIdx.x * blockDim.x) + threadIdx.x; + if (index >= N) return; + + if(!boolarr[index]) return; + + odata[scanres[index]] = idata[index]; + + + } + + /** * Performs stream compaction on idata, storing the result into odata. * All zeroes are discarded. @@ -30,11 +286,66 @@ namespace StreamCompaction { * @param idata The array of elements to compact. * @returns The number of elements remaining after compaction. */ - int compact(int n, int *odata, const int *idata) { - timer().startGpuTimer(); - // TODO - timer().endGpuTimer(); - return -1; + int compact(int _N, int *odata, const int *idata) { + + int N = getNextPower(_N); + + dim3 fullBlocksPerGrid((N + blockSize - 1) / blockSize); + + // copy data to gpu buffer + int* dev_odata; + int* dev_idata; + int* dev_1bool; + int* dev_2scanres; + int numElements = -1; + + + cudaMalloc((void**)&dev_odata, N * sizeof(int)); + checkCUDAErrorWithLine("cudaMalloc dev_odata failed!"); + cudaMalloc((void**)&dev_idata, N * sizeof(int)); + checkCUDAErrorWithLine("cudaMalloc dev_idata failed!"); + cudaMalloc((void**)&dev_1bool, N * sizeof(int)); + checkCUDAErrorWithLine("cudaMalloc dev_1bool failed!"); + cudaMalloc((void**)&dev_2scanres, N * sizeof(int)); + checkCUDAErrorWithLine("cudaMalloc dev_2scanres failed!"); + + cudaMemset(dev_odata, 0, sizeof(int) * N); + cudaMemset(dev_idata, 0, sizeof(int) * N); + cudaMemset(dev_1bool, 0, sizeof(int) * N); + cudaMemset(dev_2scanres, 0, sizeof(int) * N); + + cudaMemcpy(dev_idata, idata, sizeof(int) * _N, cudaMemcpyHostToDevice); + //cudaMemcpy(dev_odata, odata, sizeof(int) * N, cudaMemcpyHostToDevice); + + timer().startGpuTimer(); + + kernBooleanAssigner<<>>(N, dev_1bool, dev_idata); + cudaMemcpy(dev_2scanres, dev_1bool, sizeof(int) * _N, cudaMemcpyDeviceToDevice); + + + _chunkedScanAutomatic(_N, dev_2scanres); + + + kernScatter<<>>(N, dev_odata, dev_idata, dev_1bool, dev_2scanres); + cudaMemcpy(&numElements, dev_2scanres + (_N-1), sizeof(int), cudaMemcpyDeviceToHost); + + if(_N % 2 != 0){ + numElements++; + } + + cudaDeviceSynchronize(); + timer().endGpuTimer(); + + cudaMemcpy(odata, dev_odata, sizeof(int) * _N, cudaMemcpyDeviceToHost); + + cudaDeviceSynchronize(); + cudaFree(dev_odata); + cudaFree(dev_idata); + cudaFree(dev_1bool); + cudaFree(dev_2scanres); + + + return numElements; } } } diff --git a/stream_compaction/efficient.h b/stream_compaction/efficient.h index 803cb4f..2499d5e 100644 --- a/stream_compaction/efficient.h +++ b/stream_compaction/efficient.h @@ -7,6 +7,7 @@ namespace StreamCompaction { StreamCompaction::Common::PerformanceTimer& timer(); void scan(int n, int *odata, const int *idata); + void scanplain(int n, int *odata, const int *idata); int compact(int n, int *odata, const int *idata); } diff --git a/stream_compaction/naive.cu b/stream_compaction/naive.cu index 4308876..22f690d 100644 --- a/stream_compaction/naive.cu +++ b/stream_compaction/naive.cu @@ -2,6 +2,11 @@ #include #include "common.h" #include "naive.h" +#include + +#define checkCUDAErrorWithLine(msg) checkCUDAErrorFn(msg, __FILE__, __LINE__) +#define blockSize 512 + namespace StreamCompaction { namespace Naive { @@ -12,14 +17,81 @@ namespace StreamCompaction { return timer; } // TODO: __global__ + __global__ void kernNaiveScan(int N, int d, int* odata, const int* idata) { + int index = (blockIdx.x * blockDim.x) + threadIdx.x; + if (index >= N) return; + +// int power = 1; +// if(d>1){ +// for(int i=0; i= power){ + + odata[index] = idata[index-power] + idata[index]; + + + }else{ + odata[index] = idata[index]; + } + + + } + + int getNextPower(int _N){ + int N = 1; + while(N < _N){ + N *= 2; + } + return N; + } /** * Performs prefix-sum (aka scan) on idata, storing the result into odata. */ - void scan(int n, int *odata, const int *idata) { - timer().startGpuTimer(); - // TODO - timer().endGpuTimer(); + void scan(int _N, int *odata, const int *idata) { + + int N = getNextPower(_N); + + dim3 fullBlocksPerGrid((N + blockSize - 1) / blockSize); + + // copy data to gpu buffer + int* dev_odata; + int* dev_idata; + cudaMalloc((void**)&dev_odata, N * sizeof(int)); + checkCUDAErrorWithLine("cudaMalloc dev_odata failed!"); + cudaMalloc((void**)&dev_idata, N * sizeof(int)); + checkCUDAErrorWithLine("cudaMalloc dev_idata failed!"); + + cudaMemcpy(dev_idata, idata, sizeof(int) * _N, cudaMemcpyHostToDevice); + cudaMemcpy(dev_odata, idata, sizeof(int) * _N, cudaMemcpyHostToDevice); + + + timer().startGpuTimer(); + + for(int d=1;d<=ilog2ceil(N);d++){ + int* tmp = dev_idata; + dev_idata = dev_odata; + dev_odata = tmp; + + kernNaiveScan<<>>(N, d, dev_odata, dev_idata); + + cudaDeviceSynchronize(); + + } + + cudaDeviceSynchronize(); + timer().endGpuTimer(); + + cudaMemcpy(odata, dev_odata, sizeof(int) * _N, cudaMemcpyDeviceToHost); + + cudaDeviceSynchronize(); + cudaFree(dev_odata); + cudaFree(dev_idata); } } } diff --git a/stream_compaction/thrust.cu b/stream_compaction/thrust.cu index 1def45e..6e2685f 100644 --- a/stream_compaction/thrust.cu +++ b/stream_compaction/thrust.cu @@ -6,6 +6,9 @@ #include "common.h" #include "thrust.h" +#define checkCUDAErrorWithLine(msg) checkCUDAErrorFn(msg, __FILE__, __LINE__) +#define blockSize 128 + namespace StreamCompaction { namespace Thrust { using StreamCompaction::Common::PerformanceTimer; @@ -17,12 +20,41 @@ namespace StreamCompaction { /** * Performs prefix-sum (aka scan) on idata, storing the result into odata. */ - void scan(int n, int *odata, const int *idata) { - timer().startGpuTimer(); - // TODO use `thrust::exclusive_scan` - // example: for device_vectors dv_in and dv_out: - // thrust::exclusive_scan(dv_in.begin(), dv_in.end(), dv_out.begin()); - timer().endGpuTimer(); + void scan(int N, int *odata, const int *idata) { + + dim3 fullBlocksPerGrid((N + blockSize - 1) / blockSize); + + // copy data to gpu buffer + int* dev_odata; + int* dev_idata; + cudaMalloc((void**)&dev_odata, N * sizeof(int)); + checkCUDAErrorWithLine("cudaMalloc dev_odata failed!"); + cudaMalloc((void**)&dev_idata, N * sizeof(int)); + checkCUDAErrorWithLine("cudaMalloc dev_idata failed!"); + + cudaMemcpy(dev_odata, odata, sizeof(int) * N, cudaMemcpyHostToDevice); + cudaMemcpy(dev_idata, idata, sizeof(int) * N, cudaMemcpyHostToDevice); + + + timer().startGpuTimer(); + + thrust::device_ptr dev_thrust_odata(dev_odata); + thrust::device_ptr dev_thrust_idata(dev_idata); + + // TODO use `thrust::exclusive_scan` + // example: for device_vectors dv_in and dv_out: + // thrust::exclusive_scan(dv_in.begin(), dv_in.end(), dv_out.begin()); + thrust::inclusive_scan(dev_thrust_idata, dev_thrust_idata+N, dev_thrust_odata); + + //cudaDeviceSynchronize(); + timer().endGpuTimer(); + + cudaMemcpy(odata, dev_odata, sizeof(int) * N, cudaMemcpyDeviceToHost); + + cudaDeviceSynchronize(); + cudaFree(dev_odata); + cudaFree(dev_idata); + } } } From 1dbd7e7ec99fd1e9ef4613d4e1bd1139bc11eb12 Mon Sep 17 00:00:00 2001 From: pjewell Date: Tue, 21 Sep 2021 22:01:22 -0400 Subject: [PATCH 2/2] -added chart image --- img/chart1.png | Bin 0 -> 28173 bytes 1 file changed, 0 insertions(+), 0 deletions(-) create mode 100644 img/chart1.png diff --git a/img/chart1.png b/img/chart1.png new file mode 100644 index 0000000000000000000000000000000000000000..cb51c9c7c8ec72a91c9af5e47fb0a93ec7d1718f GIT binary patch literal 28173 zcmd43c|4VE+ctcnQb{yODN#uwWR6UQlraez7L_4088fRSrN}(bnXRSFma)++v&;>q zWh`UnZ$DJmbwBU(J@3EobN%k?`rW5>j>mZ%+rIDHzHR4|t4gxeI~jIj7)E{Nvh+0! zBgetN_1m|>U;2wE8!*hG?TYk;>-VB3`*z&iKeV>-(`w3mH|-YD<2&~YkUiS|;HeKi zJNtf`>!l%2;}g}B)n(Sx=;XCmIQDDf<+M4pex_`-+WJ$jnYo&Cdu@8Wq-QV7%D{B4 zxsi=kU|&ST)Ji+S$-%*4q3cSu`;6!(QRT$WaWO?{B|5kPFU7=P6rB`YofH_xoNx&f|HHYp*AF~~< zH(7W+w-$Hr+Y?^cpzV0|wbN|EmC3fACom?4le=Oy=Y%~ut+N*w^Ltk1*fWa*9u?8n zZ+GzC{U$9u@#>wycn5x_*Y;D*R(c)giq#?RA!U!{pq_T~CM);Z!@|G5JnSP}3R+zm z*a%&k9f_LF8Yve@cPiI&nG~3{o%=wqS9F5|OSm*jesOfNa6s_>`0{vm*dacn%Q`wb z?fTud_B|GbHErpthM!;EN)cdf2@VUBzjiH9$9-wYQX{#hz-qv<+_Z0UL@cn~u{=M2 zx8&S0JF;VQB?D_pyG4IUF4gI+UmKyU5O@DTr*S?1{dINq+9ZE*Q}XV@0e21)6O)uE zQh@By0DXV4jtSejX4$$I$E04`jx@5qTAOQCw&}fF6+y^W7GHXw7iiv{cg1;Of=yX$ zK6r3=IC6y1{nytNj@cJdy9272Jah^ydRQe^x;`1y2D8n~#%=UYwBogeRQ>$@!%oHC zySTpGSy5}>iPOj|G5h)LgUx{3Vnk?|n9D+VM@eROdG{6tH%rw4Uxy%4xpsj$&wSgW zH7Pl)!$tS-`GU}SozwVF>8Y`%T4iI`Zrm{LuY*_0)Jon}?&cy4cRBjD4ATZ3YvdrmOXOUOLVSDN^OKtEWz={)q^bWNCMUNfSerO**AhV zgO`8a6Sbd8>aowdTIsc8-=M$8YSPk8MiINT)nDKJQ&v=@-aVz`PzVbu@G70D<=}Om zGuF&CdcGtnL3$Zf|G3aSo=c~wsdVAXn}ltAoX@IkE@Tr{E2h z=(zs;FkHSZZg9{bLdYtX&5WZ@MV3)5vvIl9wo$lhi%egtVwB*pfMxlT(G^9-kdo>0 zk}K0PC5?>|8?Lit*&|Z#>hn8w&E8&0Rf-uCwR+CzGSyq0a$bq4^8mM7%<|}XL#&u~ ztlNSnsX6KK4tM8%u96AfC3u2ySHIX5q5gaCwwx4OTy_=9zCU|pVufbdb^Nt}^xBVi zG&B7!(^s6Go!j;G+?SH~)6v!NeG=7gEZcTc%yF54lF{yJuDkk!wRLi1;fwUN$Cqzq zGZ}YB?%B2WV-+Tcxz>LEU2GOlT4=6Yylzsbu~1Z5zO}ZoKCr>Z$s{Z+JS?UD@Z*E6 zRD+%ryU(uZ-23_{g|2RVK5}g?1;4+0IgyiP>~+;qz6%OfVkVi7EjthFn`~^M`AjtL zEo#mSG`Z1AGbJhJGCy`QA|gW3oxA(^%av_!pKkN2*jSsMY1hz;INQ`a`;t>$j$6yv z$2<2*5bKpr0QLZW>}Sb zw+k|`KX~oBWHWn9%(yAuv?qUFO2Rx}?S!PZ!9-ukWOqIv=c#T1qk)tJW&cCQjHI4I zE44E66=V9yg3>LclooDek|v9K8|!O|;Z5-gOjD=gTyvJLS~j{gXP%qnRJqR_kGsVw zru?SqYGPk$_oGz23LPEYH60!Eqpci@X5Iv~O!Wvi)7_1oUuFh_lm(0Zp2tjkk1q1j z7G3|cl0CoMc2&PoNL7$)2Oru9=z*YZaF{8)o&%nUI+aIs=0W#Q&%T}FhpI`2d>5LW;bSOW3n?{G7 zmM}3wNdLV8y|%#$+%s9*s!l#x+boizp1R=2sGvo=aXQfte=ISb z>E-gVES#outNxgdO2u+}w!w6NSqc?r^z_N#st)zc(o)gZnDjmqvXduI&WoR-wIj#h zBDhdZ~1-MLvI<-S~GAld*_9v1}CEcD;D^Y!}}v zt?ZjrO$7Eyt`@Gi(ArO*gzg`2YDv%-CsqEeY`1&!YpP^=uvJ+CAK4+9-PlIrw$lmR zb})ILqxX23*UV=TR`wr9$?dwmD_PR)^9vZA1_OpOb-$+0W^EnyT(oZEi7yYWxJos= zLdDF#9VfY+M0Za_ceJx&J%5hWDr;HvbsOK!m6hoVA&olH<0vIX+VCWKN`vuW^6tHf zwvHD4rCmBd9Yzx#tFQd9>PQ+6uTd<8|h+oGz zyvh2IF=tEXj!u^yL$B>Kwpa|crZd=?M-*Hh+fuN+)^y<~1Fxy>h4e*xxgCtBvsKTG zE~=$QZPz);;J28Kd*OUCYa5f>T4nr(_4dE_>|^q(+2q)q+=Vimt0f6`rD2w>9MyI& zY$_>eL~nk?k;=PwrWjgmZ7q%)8;dn(7Yp^?LQU5FHkqQtzSjF@_N(jNau~a%$|GBAW*Bbb@aW7%pRvjTAv zrQOY$I^r4af~<}LV#jE!I$c?p34J@GWA81?H%GPcxRQcyH);!Kv&*&vzKxpgH#D@) zR&_Kb6@DCjpW>X!cD6X9W$M=XNt;$%%ec1}-UYWjQGAHke>I|=9o&|FlEKBY!M3wE zF63r;1*YZKv=w{a#Ick&L?Wfe34Sk}+u80iUDlaru=OH6#_zrPbF!OVbZlb_q3?r; z(m3QA0{iO8C11gO2+rs-d^FtpQ&Z%Zt`*j#^S>-hlgy;XLF(|b*J&h%*RXc~At6hh z_J!3|*F)zWva)o`v_u>xP6wS*4h=XVE4X|R!z9n;*EZR=Dhkz1^_3*w`4rn;5EdF* z3(1_}xA#v_2(78B3pys{ec95IUL}ZIgg$+i6l~cHFL`qD9>c zTd59QHZ{$7c8Kp|FVoqZp1TeUDi}3H8rIH>37WL9R1HA7>%)oXwKMsWQCAznaq`ru z)NVnG4_XE05qV~vg#IeXcHiRFya~(u4;+Z>I8&0B$Ddo4P-Pm)cSi|tarf@sZ}ZMl z+P5&C?+_2ebjzHJanu;s(~PE&fPiCLw{C4OfP0kZ<>Nc-nw63u^|+%uS>M}(td&f%Gd@@3?-Q`%aR!5U#2@{ctWxfWi@Oo)W5s%cPbq}o ze9%!J!Ivx_#$~kOeH0#hd#XA;aqr%}FX5m5DwoA6PAYmX^BH}3;ID0CF!L};GwJ`` z`Tzg(szimJIQ%J&8O^p;yOn4;m1x~ zdiDasq$*eO;9itE=2;E&iwZ*BaY)QLZ*XX+22ynvAt4P|0(}G#PvdLl`CqH{o&UA| zosO-R{>u3HptA6FGH*eU1Ym{u1jqU{b$)^{Oe)t}8h?Ibu?vA%ZWp@q1?>BW5ac zdeT+6yj68?YM+n8eRVEnCP(&dl>#bsL_6zMv+Ocf#{sv(Lkb>^!Xx2Pej~Nj)h`}A zcwo9R{d@VDG2R(%O2{vOg>H5pYin3s`8HnMu3fU1=4A+#&n-Aj8w7O|!nIwa?U`YR zzPA%_=OcIxpOPLt^Y!KPM+y1z{P>8v8kuy7)KS_l>m)z79|QjKbXQ21IOu7U64~u0 zyK<|%vgmd1z41ziYUgd9QH;m>Qeb%a=_^;R@LF{91NcQpXx;8FbuyCMfo=D#e^VYG z@{J02a(#uaM0)pfa+Co1m&CN4xNfyyQ3I&x-#-TvVz=CcKz z<8h+v9;@PtXgi|jq_TwF_S%Fn=jcKtpNH$KONq;+kYfZe8mOI=jdE+&6Cy6?OXi{k-6DfpjPq;B!S z7cC8C&X~_>bWnba6+S1`->e;@)!YtGY5Ku8E9X?tzrW`^SC0tg!dB2d1=kiXS!&}~ zjA+F$^r(4qq1r=v{Zs7}lJB98mOLGjl~-FYO;d>iae_hI{`R18EdfmWD7tcBnG26V z6*}=bbm5TNzIMuH7r(pNT-fp5=x&qmMW`t?9C#xCAAd4_t+p^{hiAZ!@7vtPJVs6F zmMFdF7`j*Xb|U=_9F_t3h-6xDw-&Kwg{VP;?RkQ(el?i;8|Px!zVl~&j!JT(D?Vz; zSC1gbVD?a4NS@yBVHl!XY_msl^F3~Lk*BA^QKW4S+ne&?jB=8OCNdyjF5 z35onF4U9sb^pfC!MW($(dJxPk+Gco+roAWY@Hvg2KFmm9W9UtXp4K!8Q6-k3M?vqJ z^VUBDDUMFz1p^r&OvKvIj}2i*mQ5JFvP|@R!#OmG$|vo}jBm6(A}sd1C<7oL?J9B1eqU8pXHnS4 z3RR7O|Io-t)XZ3WW-@H?I?Wgy61I6+C$O!_lrH;Ktw~i8+$#PeL|~vNsvptoo(5_t zKvh{xP0+IUbk$LpthuQ^?d3VriFc1_!k+9qRXf_8bdsA}*?ZTazT(mzEEND>u^KV z&_wR1(5$R8HUAA{Wvjf}lglTu+n=0d2jyL(D^}m6N?dL(_x6x@|KWr1RSqoR41}Xn zr`eZq$E?{oyv4y`UAHbvL$eHh4Zg6SeaDU+ zye`3{*yjUsWV89>+216x1x1#tcK85CFxRxHqNe`@Q8|DMCSyiB0Dm= z^T*jwOrBq)aWElY;W{in9u@vgw7D6K~)e{DDdMV#5SPy_|cd{<0h@!Mb?L-Wc*a{D}!oYH9H?*7$00Eo@d29uQpbE^Gpa_ zKJFjZ9114Q&Usm;+s2Y9zd$8m6dC?jIV?Ff8ceEXmt zq0WVIqKkV6XT7S8)#=P2i=RheAvIxZchb=|`zWR*22bCXYU4P6VUFb1mr3QRj?ibG zP0J`Gs03Zm7hO<%frH*;h`i|{3~QP)$uzV69-JDz$>rcPg|#}x+v6X>RjX%KyY-P9;=aO^SdQzlKpRFS5G5%}DtU2Nx0DB8a`M$U}pVigFy+($W&I!^925 zi`h+lRzWQJ30Xh8^aAYowdUh65jTUg2xM7Yr9UpRv9YBS_0-i_-InJ9I0!`-<>ciJ zfBpE12=)J-*y6GTJbk`$;>CT$%d$~`v4`zgm;Xz-qPH|0Cc8nq@DxfXi}`-gC7$SEn! z4~gbIGV97Qw7gCu$-+Kh5CU(U_O{Al;`)!r64hxcN%fyO)qr%?B2KilwA4FbX!;v| zv?bgAI&BwDMn+}^sPZ}uJvX&{&J!o3cJADnF(BNwSlL;$AO!V~hIr?PAF^Rl(WgF`&Ht+YyUM{rk(8~~Wo_oQjn&Q(r zj$a-sv?E=Ex>gTALc=JM*C9x)V{bu8D-${Zh)J(}#O}#un~@ickrLgQwBhJ7FnjfC zz77w!Q3t+!$r2%G5%~-NAJr8; zAiu5iEf|W7KE+z3_=cao^#C~an!n(8eXpbeo33Sus_obS4Hn79F4dX)NnCqmp})+f z?(HKgMgp~`dV)cSn$jdmR}7QJ!{}g51}6(bxp1Ql*u`hCz%l^K6y)hW&!TzUb;;}x zGI(u;$t#B@>MB}J#70O!+A5f^Jk)>MIh$o8r?EIj#+-p5!w#>ufEMUIn6B>&W-Q8k z)W_ZB95-h${Qh3hg(Ix-6J6LhG|2@n(tCbZFa6BkTDz6`8JzpJcFoQ|Xqw2BKqYC; zp=AIoz8E6D3LpTpgOmI}XetdL;rUrD-0BY`+(4%%%qVUG%q>74+LFIBfe-Czs5cr3 z(bfylQgVX5K#@M6w+k)s9k5bgU(qfu9a^`!O$mFx0x*!Ps{Ua^X$k_Lr!vnt;d-9Hy~iL=+M}`5YMcSZvKS36C&n@ zlDTjI09u!S2QOuFPska&-Z494jqn0|IR6WtDgd05LXugzso1s%7YLEKnjQ7~Hv2_y137Iqj0%kFqZ@0QfI{kS}@GqiS+pu5MIV>2XIQi;^SfQa9VP)Q*$;9D$&bCzDxJ^f^yAId#RO&bdtYikuuTnK@*O!|q?g27oe4 z)~j&O_v5o0ZPM!Dw{Fj8l=RYG^Qm5UQt_WWujAKn@Qa+luOw@e!Q_7Max>#uVAGoQ0Q1aLQVeO$oF7#G2 zp!h~?OvJ%_F4V66lQt6Aeg~c}IjA_lntbnxn$m43a;Bg}^U+;^QltzPHMSk5Gd~B) z(~_t>>tIDC1{K&iEq?brYQbJmPkWB47Bn+R$GxZlh}d%<7S?hw!rDOu4dTh;Sx{QT?8Xjn?izk4Bq$f% z)n{~(I8=^K$;gHTPX>*Q;L{d=QnaWJCf=7V{AI7}G(_hF?d2ip^agr)Z6S5#rXpI{ zrsibWFTD{@ikE3=m%n}g-mnIp&pM!@25CHgJl^e*d__S?^yhoiD)HfP z{lTz`)#P65chsDOQ#WN{1>^pd@v-{7new~3iTc^A-~4S4_onLi@OLe=Og6YbN3D6A z>9WPzxb?YHhs0frUM45kjAvJHCCU1S!fU9V@6@weSo!3!enQH77tav&VZrJU4P7?J z-n);XYD((Lrq0^}V{ez*f_DCBm%JVBAWnqE`uPzXC z_v0aI@xq*(Yf!(uRTGF;LCBQ+UNE*Akq?Ot>5gKjs6s7`f8Hg>>)gGwA(*ztRJ6Q5 z!fw7YV+AQ3945P;RDCyT8Mo}tt9eFNGMeZcIVZ{-jNZKb-;9j;jEMah%Xx>1v-*X+ zyu1v(j_-Z;AihdByK{>GwGJ@J*6rU#UX&*$CU(r$*4Ea~dS3mWx>IuQ$3JU<`3%rv zcL)luW9g~&@%KuKZAWS|)G`MNB~!eM&=nraxIbr4>MhRg+JK~l&|ToNXt@N?h04FL zP82>S=*g1)q-Sp1-F{Vd>!*$7tgql&CL^oH-9*p6=wE3pYsiq5lY2^HJpvS3u}7C- zSi}MDGUuNQJyu~rZHkQQ_qonC%m8V2$xnRQm^4sP0sLZ@0$p>a2Dk7~uxg~5MEtSM zOSuL?Zd*r}UbF5WV)jX-ps|W(x-CUUgPXm~RHfvQeIp){#t6%|3@?&j1X_mKc|Z5-+8vC2nX{6Ou&?@)V^ zw4R;GET`=%&@^cJ@q+Ohed!b|SJ%~_RYvMdp7$vzBlGvkK0Bx~ZEIi5gsUf(mg?-? zx3t|W_tz7Ore(TeO~VvlpCfv1>^7ZC$M60;{*arlxMcm&t0+?>mMN@QQ~c=1*zi||0B&85611pa`_nr@0S*@W_58SPx#y=#RSV3!jz2!?oVMephOSdG;>4hR z%ststD0E*Q+8Bzcc;YUb+nYeuaA2V+->89M6Se^z!$A46uOQf)drdZ8@g3@@4_2S> z-{q)=Nhd+g-SzJAtus*Z1WRlzeX@agUkCD+46y|e=JaiODc4%YHavUk>TVd3htc*O znlm$X+;xgVeJt5!3y~&tsskPyWu&KVAMbkt>TRPWQ9R#aHrn!P9+W6`K#L86Xelts zPb8!JyPn5d{tkTv(h@(Q`9gEY9&TW7&tr?7-;3W4D_d=;u+n@|%{bzlrN?zmYhoa@ zbzrfq`vmWxX~3RBuPiN;fQiD(L5+^t$EC~D9vd8}2?;N+_t`+pE9l*+jLha6%1fjo z`55Scq}4b=3tUa~E8$g0scSH2CCEoPhH+}M9`#RctP=3*1B9@F2^pCUFGSt(-|a0ThdSBvqgK9IXJctwzVmDV9hD4MiP)L@@+5xR?v`W? zh73d=uCGmdM0Id5b_ziCjaay$ZnNwV099hQZ7YF3Zg{Npy<6!0S*}fb-S)O$1c_6)>Yx5I!%;%EKlDcp)6%5Do zZP^m*8V2t=kM~K@tIyTGjm)Eyvm%2u%rd`tC{0~y=zz>gv;nOa!+Gb^KmVOuZc&n0C-*jX!Uj#SZJ38z4X2Y5-U8} zvYy*~NQ94ogoGYoCr746wqV^J(z6dhwCr+;!MrX!nWPa^W`bH0yI{j&w>EJ{!q(u{ z_{@wA-_jUr3R|eCCTBK|*S{0F65Gfd9Ts*cl&K|TEH^jzovsje%!o;P`s=`1&+3uO zqb+K^EA!*0KsU+Hsp3Es*H7ERTHngXX4)d7psK3cAo4=4DwuTUJ#Z|%8rR5;4Kp5b zq4ZH)_F@=8EtykJ#JGGooN%?f*fy1gl{IocSA6}KJ|9^oVLT1@ltY6M934G2o?}84 zMxfpiAHp`kdQD{%h+SSE@xXiXGU_z@yB77=R*ozm4p`6-82_)j3a z#gc&t!|A!qwetQ4PGu<_nd05m)*YI6t^mwNx$Vk(b|OKZvPoiNb@rA=OmOi2;<1cO zh0&JIR=VMQDwQzimJf3I$#qJdy)BcN)*%-TdS4w`W~()te6dgT)dfN;)gy5p<)}<- zzt{q(Up>nkB^V< zUhifEahvA&IbyX5Zmef_rDk7G`H8N{)uHKf<-p{+6B`35`@UrgQKprd|TD?!$Rsk$`Z=<$EFz1JA{Aq z0==dzSH-GLZk#Uv0)l^_pBnDk(S_fMbL%+Mfgh1O#ojsV2x*|eii_v4EsI^*&yR;^ zKw+!f4SANCLYQEb_fx?WR*?L%EMaR^X|rBXg-WF;MBH8v+_5^NO|(ll$sm~gM@yLA zRY=>wzvABeW~o7+g@6sK-1d#;g8s#0xT3Sa*2ZhYKWxkjJ-{2kHD!r$CzbT++G}{+ z8NX8ld>J!}cZx+KNQ9c#s>COD1VJHQm*HmKVdaNUw((ACwN@s??O~7x?W~CXa$qepI-S<$I~&6F8EJZ>E0c9W6U_Ha#(&?h)mHr{7YZf~(h*(SoTwyc|mJ(X-&n$O(3J#U8&eZW5H!;sr2tluU*8f`S z>n5i6z!iFz=>l;#^`YZblP&!z>IHGZOB=@Gi@mlD(@xYlNIU#=oJq;N_kP@LwR!1b zY*rWQkN{GtyTgkWx($T;@{A6jZ?XO`-rk|km5g&+2(4C@SZmAkBE7cX@HS?IiZ9c` z<=5B8igjA5#+O^1O~34HyE1U6;Zq|Btp`elsC6>53oVBe&r+LpKS<4cENi0dx9yBe zxL3(r2BKxjq+sjY_RkEyk@fjbdC6p#JMTq<>_x|8ZJw1ho+HJOWA`P##S#kg+fbjU zkkq5}N}z3?*tzA)Sw?BS@^<31@W(P@ygNhI1R00BNAcGC9Fipmr%R^ATl-QxBqf#x zgN&wRs1GuP^S>a8GrXM3m1N3oy{p?zM}O7!X#$jZZ@m~-9B_?XXZ;j3ZZv<|mn`2a zz@Nf#gLvDwOh)mcPJ=V$LMyRHqGXQmIycvL^z{zmzq{ zq?G&M;yyO*6EEU&PL5ds>fiBser(pj69D-TkQft{!wJ?ef>=-kxPF zOnIko+9K&)eW0T$xgoP#7LUh?!~)CXrdEUV7!|lb{@r>0@$PjB&r5XN1+}rU1T~5a z#6=QR|=L>eYRita!rKsMS-Bn*_LA`$ry-J5c~OVh&8=S=(88mm9KI)@eOv! zH|hKa6oJ?#?B{KHxyBK#_Iet8JC;ZY* zMh-mQ$$4w}?($BN?uQ17%TAeiN0I!!z55n;aFNyam2rX_WAPVoyDyvkYRTleJ^b`2 zmY|aJmOb@sEJvaV$KWv%I~NKWk#olKT*7jzn642^d{*X=j2jz&3miT zpO;z_lR2FPQlB(crYERm>$V*e6={E@8<95O+N91zjU!IP__?`&ZdUQ?!&ykEvgaOO zlM3y};=6u#DP_wArB(}RSK71P4J{o++#cvYzY(N+IeAvF<0a02p}06Nv&CI6duNt# z`wQGnx?t}utXJ^k?hT!-nP)3tQEOGD6~`%6%yr#~Ywyk;UQrQIx##Jp?_F{PNkH~4 zL^QUpbO6yVeGrPpj-)>wYsR-g|KOh{CCGDYw~LQdhOlByEXe~mhOPt8Y>b11$9evd z(&Y^KGoxRm%bYJ#Vq;zoB9hSLyastz>e);|&yEj}$ABKsc8S1cMjK9c<4%2h*F$8oBE zJyQ&Ae)D$I5uzNi-%n!wUTOvwFTkN`86wsN>7n9G|5!xMtt!)E;>KEFaMO%ayPfEd zf729P(vJqAfncbp)GS($aRR|4=zH05wCPxy{`fC_oIrAEX=%sId3)h6o)iY3V#V@9 z$;KcDZLgn&Qbf>9y5Fu9~W>$3F3j(K;A9?*+1*?nmd*k#6*utRk9mnga-=clVRtqUa>!_Qra(mkZ&wl@G>T*3@-B^?7-LLf@Yg)~Kb}gWeyP|i(u}wv` z24T=@)6umx zsM&-h_CX&tMG2J~I<^f0Hk0|Cks_mU-gf3Or zPH%MPotoUp*VEKIcHY{pyKGKMz2ql%^Wp6xnlA2t{}Lm|9oeXk@)2Sl>u}X+Wpd3NRQY4 zaw4Q_?V(6d8{ur)8Rn2)4oM`d!RSb~!=ZA>Dg%Rq4IZRBcy!1tHisTbUg>Z=LoG2M z)B5HiWoL$!{xq@|rJTQ^mnrLibXb5tx;-@k>IEa5f$FSF+R9ejUDWyv$04@wo1IP3 zZvq1iE65njoCxc^5Fj)U0*N^4?Y_nIE{k0sP}C{kL{?m(2+OaU^TH$_%BcfcUaYkv zdwN6t{m%BdJ}6I$M`F$1K6L+Oq7hE8Ok`T zFb>6)V9kw{4zqQPOfqRADACylm$@x6(Yb!7-CmOdlaEd5rg^?UA${fRlNNE6(LN3$ zZewx&`G;32BBqiq$Z7M=@tj{f(=wHdQEzq7(7F3)2_z?RJo=Z~&*cOui)*6Za&5Oj zM^iZACW*2PvLva?zq2GqQcJRFRKe%b4_nDBy7STNrmxLADx?%8-cQcAy;cyJ$~2O%1*|dshzAK%UI$O)KJv&D z;kFF46Dt{PO_E6~l;skNnjcaK)3;vDWn7;;ke`{E|4R5>3dBhR3?F(t}3gZ0=$Vu3r#z8;7d$Pyb0j^ z>e;2hi~vDVE&Nm&qlB<}IVw(sAWfjDfC(m57oeW&3nbVRpAT14|TN zXBxzMk8|!DD+5ui?<4hz6BIkh;zJV+)rAFm7Do*0Mw~C+v%mZ)kaO!m+tO#hM5e*2 zw+)=`)F>;XJNOH4*ZWHMSlahMyJvnapBUFFJLf&@pZUALpj5wNl4f{0`&k=_k$-UX zu6?|-+k9H)2dRgb&wdXiw~G$lz1qRkgnFJP%`Ks&k*$Y~zqENKX>j*D^jH{WP?hnj zjHiB05oaJ?@;hSkQAZ4no2+T=q6ELxmMKkYS!$EfXu(Cd_Ln1i{!TwX&_|^o$f{9jb9sB zF`W9s85YRU;8UKHT)Nc&! z2YJia#0yfXc{g6)eiT7Grfy>0*^xgI>(ZF#K;qvUC1fQ?rc`ohe(ZY8yG&uI#`2yJ zUcc56(L9W`Wrkmv^W2xWqWI8T_hr28b(!y3b8ng!@!yN`s5DZKH4{uHQ`$I=EjqW^SBnGr4Ur$y{BbK%ArNGcra*QYCzof8QH>5hDKv&LX4JMSNQ5>1HyH--U zD<8X>daZ}fIvx74O~gk6b~wjAdn32wI70kzatLuMKsRwGHqQ?FWYsZU?%kM{?<>8s zv{|ju=p(@JCKpgyw5%XC>vl5PlxG+KO?L5R(6p$7n}?a#WK-{Un=)XqTSM~1!YG{~ z*l7g}W1izk=HqgyB2CHn?RG>U-|L~8cZQ(2H%d>a^oo!?3+=h%s;E7O)Xec$90&ua zTXQMvzMf*@es@P9zyT+9IMjYA!V}5673AkW>-(!&P&qW;p*9y;O&0b&kO}P3fI{e% z(Tl369h=fJuik}pit#n!I#^HAU@~xwqMnSVnz#0o|WltPA2J z(YE!=o?$qLh@@$1JUw)y4A=|u?VONSI1-KIZm_W(-P!Qt#@$z0I{q^?#xcwGfSS`W_Mb*XTt$LUU#t&wuLm3`HUs(M`d`x{c1>hoj9Xv&>{S`vs<&G-9^d zsKBL?V|u804c)H>XBSyU`{31qjkP#Lz28dbMzVCtLAKYNHZcy-F7uMrCM$u3&1G+O z9l(bI2a>Y_?@eF1{GOa^)I1kWFYv7p=def6%QA zC8FhfuR#`)dt-WzOX~r)9&MIXi68R93s2)77l$e);ryRGd6EMr($<_!p17!a0FS)d zb-=pcB0OtzvlG2>v$Cup+T>;rFtAbWvK&Z3KHN>3n%rnA4 zo}Ob^cOci|&4aCf1&fQ6yACo#{u=}apPH9ff_k&KJUu~2AzW60Zv3ndu_>AA1qxC# z9}inq%X61q&vyg8b6}9_n%wD4!;NNx6lU$~)*yp8_y`b>6!tr3_S)2O+JN3dqeu5Y zg^PP6|20rJ6pCY@a-<{bdaT)Z72Qo*9oO5)m=7ngWtG10PkKeasA)A)_AXi=9$U zQQ+aIr~Dse-Cos;bu#XA^gVZ|LlfMR#s?02SxfDcf5`KXpAt(9nGzN3w zh($O7EW=<{|NghzVTz@(S{qT^}f0s7A z3{c`&>us)9de74!GW#OF@-l5xU5tH&CH6(!W%0JD65<&+K+yJuP}|o5Q57+anB+m| z?-ylO7p$9@7F^bO3R2eJ=i4+hkVs%EVeW-l+pG4^bVgUx;DNdKej5mN9Z{#Nb<+;A z9khnn;{GAud(c^WDTPFnNEcaD)n6M@3%pG;m^LcKKFEV;5=4j$a4f_}^Wowvi4XyH)sE@?j;4H-EI0#9r=8kJ3Xd*p8s&Y#n< zBO#3#R4Ujx^z@pB<5@P)560f_hOxRshveqwn}_}b3Y$8^bE*HCE$}>oLxmUA35ezL zK^B`vHk1QwI#;a8;9hLyI<#$Kx^?3 zyx?u{g1z*fSOko3x|1*+1nKT4IqU_)+WQs%c&O>n9Az&7&5XoQDU`o-9|7)T8OsAH>0bQj$yL9M#Mx1vfnD-IYlqA;dfQ8av zIhDH1?fM%ragWJq*-@mx&>>$?153xgUCl_L69@>yw>NIxO8D)F;p;I&Uf-B1JmBAQ zjN^NTC;qcx1HXPyrq6vGu1G2>ItQ_;(op3P5w)z)2SyB<6J>@o|8XGxF#wy1dHmaM z^4tcCH!Jf8c)6;bM)XE4-E`hFSzv&o_#3Y;mai|H6$358!o#Bi%OnIQ@48Oj8)TqAQ$8bRaSVnT%@1TMu^P&A~ zB>~?a3OiqT(Ud$5(qUC15?=ezq8a-y!u;7bHzUowDrZB^Jg-0avibin{HNv2b34-T zI-e|%A=v*hGGKciKq5W|OwAB7icnz3k!o_Z>y*s=rkBBo1umCx2NsBt9~X8NZTyGn z?DyWr|Cr7;6_Xb;K?J?9r$tW{A5G`Zow`e=mjn<7Fl?6bQloM+|Lr zC>fI+zh?V)+T)R34^&za&;#9}li@hs558=Dt%99YR3|~|_VL*v{iBEf{Sd;*AXY`2 z+-}Kn)!+6A?81|brgk7Iix;}hGN-KIOFCN!$n-c|AE8ek9#de=2~I_lDLG#69}_@R zq_o;-4)mIHk?zf?SG~}(FXN9-W+9S49{B+!vxH|~DZkI%D3t9_w3yM~u_4H?4+P#0 zbXd_D@Q^^=ax6^OIh`c9E6>%?z~J(Q3m5X&>~`HSGBU!Cj2FF%+*Eb@2#1u&s83`N zUA{#Wo=xG{jvRnv7dtoB7bOtz4gw1sq^cyGZ0h2FR1AHD3fT$Qzv)1`|BwMq{EnyQ zxS1Nw? zajb;+B9|B?G^F5#A&v~`SnJiEI3QENDoNQRuq*dY6zMptLO8RgrY3w#Oaa(TUvzM0 zOFOIq6`4Eb#(!!7`rZmlHyP(V5k(-Tc+LzuG< zQt>*daCwcqypf>NWWM*Wyt@ryN$J1xZgnv7G2dIbFo)cD%*HTZ-*EgkKVXVz*CR3e z17n^t!o7#9&d_>@JOC){f~D9`@Q|gOol0mmNpy-IIxhy1mVV86G!D4WN@7rA+*G`;4qP0zUT z9ypX3Vv)D48OfEvMF^uEZ;9M^o>u{C(zHn;f6_Z`ZM!_nrlC z#rxY;LAacf3^sapSma143LoVPjTOuExEcIEeiH_aH6_=znI8nBS5$QeoD0~+GjKR~ z_+Elb1Y{~-Ond0h++YSP4X;uCL0gaiBcXhrK_2j-KYx_;6$MN=8Qf?~rCQh@pJ3+e z-#$U5>r_6j|9BxhPt(~aWQsp}^e82{j*FAC!93dp%q{julK}eo&wY*GkdyVVSZyEc z%lS&>-Y><*#nm{;j#+Egdl5i$#di91T7d<8X)F-5TxJ8mzY=pllQ1dSb_2Osj^WCb ziY_|yybynhOe}EF?=Qxb=~}hxMZ+gu@P+61`mp_{=?6=^4!iwNACGCD%pjZE^ca;% zZ;FXA>2$zD_)u72Y^-H1RNFIuHDCMpw5BSDAz+&U8#E}k73PL)7r=huDrG+e>l3t`HrSJ3cC4>d!kCn^!mYq_X3&xIKSGMdOT{$M+ zYdOE{U*olN;>88YK8%*w;V}dnk7WG~XI~9c{R<(K%hl38BUmSs1m( zV^{YLiubf_zLyob-{5;x%nTM@7Cms$T+6Ng%@!W(53o6sE*Y+s&HA}}fPOL=mtM{; zdx+EP`+p3=}& zMB($NxdEO)YN@Rdf9|vrn;vWh+=kg)ct&k9ciw%-`S)Z@BYdn#yi9`2meEQ?56Lwu z^_el+JW`_7HH>DXaj!OUz*%FMkDrq^xb6|S#if?l4M$H|8HgHZf&BB6bm8N9_&6O>)GF;+U$h&>mkJ7wL@!tox+8f(~urIFfkiy^f_xNA&K!bQ(2?ueKwKKk% zm$v)(;Rt@P6L7EI@B!Ay4ijqS6Cb!~PR&kz&f=@{)tSMoj3by8oT;iwQYilsJ-OY4 zWc|7MB|Wl}qYEgdq7LWkU%UoIx>aul)th_GF|e@`bSy&ie^IMl!!?^7Q~pY zeplvoB=jZitJsB0>+4cFdako8m;FDbKdXh!deNvct8hHY@#SP+j9cA*@3g}CuHct@ zpNc(We2+y2j}_uEbiNwGt9A++tsn;p9eBmxVRL{vbCC;Dki1O(!M(s};c&7ppKrQ^ z4+f0!JOi749HBd!?PP|V5~6gm-CV!WbkrnpbSSvAASI&T#oMlWy)RPAVm{sq$j0=(_^#%gK?CvfKuoDXK4Ps znqgpcy4XkZW_l(6(~D|oosEzq^zl3$eXB!00gREAi)@C5`-M~=TVXH&OMFxT zxvpb$v#D71FfCOIqDYkXu^OY{R`U$Ojnm*?xmab=mo=B1tZuN?HGUy-?YK(Bz!itl z^hF$gn;U-^v#E(Kc!u;BC5;Bgk4feEVrb^P`AD{BjQU|^uBX|U7Lotw32)< z@F^_M#e5;{lP5SQe>Xl526`VW?02h`VUDj3K;_1(n{$XJDbZ??YdZ>N?wPzr7|G*W_2KO)`Q z$qZW+m@LKToTH@SG$?|b_D?QLT28B)(Z8!vOb9sMVVYuk3<(w(@h}%vBuQ4W&Uu)J zDI1%iEz#`3^Nc1+-Mb3MFG|9Gm$r%X1zik-jE-@2s$an_C*kR_QSKaWANtmV5DQNZ+^o%D3w2@Y4;*OHFFdG)KUlaf@4Ah0BBn`49IMWNW=Ay^W7_uoz2cNXnrVCHEmVR$n z;hkSWhbuLB5(Jlh*ppY@nJ&>&;}%JAR%fw zEeeB8wy{32(d=>L<)d@#vcMWF#E~K{BaCOa{3phcEvu_z#-TbAICu$;e`5FR3JSRm zJAGTn9&AghlYiyX!A_OPHg}Os}D$`PrTF#W^!3a`^{*`bM2JQkBvi!IIA{Ota zos^d6 z(TTWT{hdBEF{I2EA3-|2#y~jIq_4CQ>aXaMJ#Kn%hz^yF!=U zkQaUybr1=B-prql+^Un!meuhIjkz_np+D#AF+cW-az}nLgAT zu3Z?^R1Dp&^W4E}R{MRuJCSSUcw*Gore35g0nz6=Tzt*E-K1$%uVN*2n6SieB+P_W z+*K~29jZAaqb2odDo2^kv2b`zxpZ~X(izK+$k5iDP{dWV1@GXH-&^!!D6m2}E-&bC zW-chM!UQ9OG&Kz7+hM{D-2TbUNIG^OEt1kJ1L*WkDr%@3CC(Yt?^$@_jHC zRAOUm8wh>ufbS2fUVJr!CH5bYZ(Z{BU4u_kf|<3*W!=VMA3oVrytK<^aF6U!bEtGz z0$m%&3y~YFjqLyVrSiipJ|F~+>fCo@XyTBRQnEf)_##?oW6Z1MqNE%evh*Y+VgB%* z$aQ9w`Us;pjFQ6cAKB+a?!6h=!k#TMonMSfo3p=v=(1lVS+Eu@uZ*?9n01Tx zlz3z4zk3RkGK~`$%C3o%LKuCuAF4k@6=nXpnh|m}=;PYT;sG1hvSgjM{_S9kdzByM zRfW6d(}0q6gt5~7hIQJYf^|W*v0QDeLP)1LSUjGVtiR59meVlyW*Qd0py(dulN@4= zt7?8G{JM2Ci%l-+(Zi>x@nO3!>v3uIN6dm5Xl+>^ODpd7g7dM??vFO5(VgS3S9>uO zB`$0eQb03o{ap*onF%;!o&uIsrLk>u{{qG5f-5XIK|s2F{MF3CUg%4$t2d-Q8`Wtd zO~G61|9jQe13t($l#4kbYjBF)Ee$G-?pPK|U#CUxwRjLQl;2JxXNika%bAPPBqnGH zY)F4-eHZr@Av@mTQN0o&5MY%m098zWh}BKv*)N+ihMhb5`l-(K+f4t%Exb^!l78rH zVE3scXui%wTCSi2JP+?{43QCgregZ<%Fkp~slnWeY$D|f5Fv)oIZKxkRQxwsxkE9J zZZ2SaUuo18-2U26JK43;nJah}0JgrQzy1Tt$;oMGU|`@nRSCri+`L&chW&%VZztrG zxZ4lM;JZrNw?7)0se!kBkKObCFs-Eb{+r*h z)()Lhb82eec+XO^34yRGTWL5a5)2P6k3n$bjkP);$l#rJ58?zu=ye>&H=5;RB_qQ$ z*5EoduPp(x4f#d9QO!)+Dm;%#fA}~ggj_73>6}WDZM6Aup%F=uISxekY#4FNXY0hS zm>ADyjorVEQs|1T=&qN^8iLn55dfv;?aCb}ZQ%AVJ|J_Q1AtKvkK?l)Ao#2WbVRNY zw~0b)Vfil7&)Q+qxHBC2XC9RAq6ns2o+nRq)^Cm3Z?q`;s9iI|cefjh=Fdlw2X?Zo zCz8ME&t{Dyyk=C=<_-&CNJ*C%NtY5N@l}QR9YoHC#WG{ot2(I2eos=`RsF-Z#YuE% zq&P&nL+2OQEyrYmt5?4Q{sgs5CG`(^7d4Cj-fr^H4;f6hjY{|1P|AFWH3qS>p9`w* zpPaSqfHo7`kG(23>s>p}YwGqnZC`Mlp0&cx1Q|=8$#MA(4q&?i<0G=J~^3U}?Ad|Gdw80E}u6YVgz`xCJ@$@?o>%E1fz0 z&FZaB-YJ_Uo@ND4Wc?rH1~(p^5>C)g|CTN#`}-I25}aqq(NFJfG5nFRl8=nRT~kE2 zu#=_z4El^zke~aOe`zA7gq(ts^j98fLWG20D)4#Z!WHn&pEYgay)N$SPn=ltp`AqL z+u~a_10R~tAj~CueHz|X;I2~g1P~i|#LJ`HLxa@=D&~eA$uS z@6b4yPdvNnp<7dY&?+S}XvOo?7wre6|2JFx6^Q5*Cq z>30>&^W)_@*XRP~`r}FqVqo)m4h{!2@JoO5mcK>1VbQ(#oNtpxruft zdhqGZfApEGo(}v!EHf<=6O)?D5mzta5nk)~?S_Pst;?F!^O9#tcp~9#4B!uefsMil z1fmBdEv`{peIRV~3lyRMvY}IT?Uo;Z7d3B??~tm@=Kk7G$nfHE!r#vB4}!_n4SCOo znX!6NbLIE6w6uPQGyz?$`uWecWJPxEX_=NP&$TOr>H8xA?z+@XrC^8!e?WP%nkx8qT#6~O8Yj2-|z0%-#$JT7ittybB* zYrzRRp9BNPS+&oK2~~gu@p@U{_;5JU3L@^k*j;iMJ`Y5Dsc4J|U`xDvyf;rQRXWlH z2G}dqYz*M@z;6stVLzC)Ms4fB<*QUXC}?nDNOf>*eFwqmMPL=p0Qx4?V+NQ!Ae6tr zLd77&b4hC@$}U})05W86&K;yh^NtzSp9p&8^nf^v7#uXA5RjL*q~ZqwiUnZ>s6*6@ zN;Zkj00$d2!iDMv3?y(DRc?h)`JVuF&>NHwTx3+%7ZIQbvIhf*Uxj)= zWR?I%TXWR-?5D>O)Bf`bODKS+a8t>p&jKw?;6wm!_)5cYi5D=A@-KnVt?_`LEvxU06^yGo2Rxl`Vf~-tFQ~;+pgj7UrIuhM?i((-0;ZA)2?K` zchf}=&Q8=iJu(;osGdBg3)frI;kz{jS)bCfvz^yqD`_w30V2H72hb{B6M!Q$9R!!G zLk~9}fWVC3!WUNkTc6dOE{v>6yG{(@jZI9l**|8_f`AaK;;rYx3TVZ;H*b^y>)g@- zdNHu#-6}_KkaeTCQ$*(}YJVA%N-JRyoqU-KB2n)~&p-x)BPG-#!Fp2W2_iBTyoNl? zXK5zG(=szn2-wR)z$XJ*9krI-Fz0Hmjnly_)r#s`Ml=Fu;!YC~SU`&)j4-Tq=-^xN zz=$2aa`Rb*$wqoH zuaWb4wQ&y_J`{j(>ZnDs^XXkwwq+?eW@iXv^Hy5I>BwqK52Wi>B zSrlV6Rfq%lVL@Qb0T=$+NACNSVlrkag!_>_&zBIu>@g;knqH;jh0Uf?Sj@9yiXxP8M})#voC0s|TAJUbqCNCSqSb>2EHWY8OE fqyN_}%MRhsTI3@6mWiI=R3Mt_dTNy_w&DK;v}?{Z literal 0 HcmV?d00001