|
1 | | -# `CRR Binomial Tree` Sample |
| 1 | +# FPGA support was removed from the Intel® oneAPI Toolkits starting 2025.1 |
2 | 2 |
|
3 | | -The `CRR Binomial Tree` sample demonstrated a Cox-Ross-Rubinstein (CRR) binomial tree model using five Greeks for American option pricing and exercising in the form of a field programmable gate array (FPGA)-optimized reference design. |
| 3 | +Deprecation Notice: The Intel® oneAPI DPC++/C++ Compiler integrated support for Altera FPGA is now deprecated and will be removed with the compiler's release in the first quarter of 2025. Altera* will continue to provide FPGA support through their dedicated FPGA software development tools. Existing customers can continue to use the Intel® oneAPI DPC++/C++ Compiler 2025.0 release which supports FPGA development and is available through Linux* package managers such as APT, YUM/DNF, or Zypper. Additionally, customers with an active support license can access the Intel® oneAPI DPC++/C++ Compiler 2025.0 via their customer support account. |
4 | 4 |
|
5 | | -| Optimized for | Description |
6 | | -|:--- |:--- |
7 | | -| What you will learn | How to implement a Cox-Ross-Rubinstein (CRR) binomial tree for an FPGA |
8 | | -| Time to complete | ~1 hr (excluding compile time) |
9 | | -| Category | Reference Designs and End to End |
| 5 | +Find FPGA samples for compiler versions earlier than 2025.0 by selecting the appropriate [tag](https://github.com/oneapi-src/oneAPI-samples/tags) in this repository. |
| 6 | +Find FPGA samples for 2025.0 and subsequent patches in the new Altera [hls-samples](https://github.com/altera-fpga/hls-samples) git repository. |
10 | 7 |
|
11 | | - |
12 | | -## Purpose |
13 | | - |
14 | | -This sample implements the Cox-Ross-Rubinstein (CRR) binomial tree model that is used in the finance field for American exercise options with five [Greeks](https://en.wikipedia.org/wiki/Greeks_(finance)) (delta, gamma, theta, vega, and rho). The code demonstrates how to model all possible asset price paths using a binomial tree. |
15 | | - |
16 | | -## Prerequisites |
17 | | - |
18 | | -This sample is part of the FPGA code samples. |
19 | | -It is categorized as a Tier 4 sample that demonstrates a reference design. |
20 | | - |
21 | | -```mermaid |
22 | | -flowchart LR |
23 | | - tier1("Tier 1: Get Started") |
24 | | - tier2("Tier 2: Explore the Fundamentals") |
25 | | - tier3("Tier 3: Explore the Advanced Techniques") |
26 | | - tier4("Tier 4: Explore the Reference Designs") |
27 | | -
|
28 | | - tier1 --> tier2 --> tier3 --> tier4 |
29 | | -
|
30 | | - style tier1 fill:#0071c1,stroke:#0071c1,stroke-width:1px,color:#fff |
31 | | - style tier2 fill:#0071c1,stroke:#0071c1,stroke-width:1px,color:#fff |
32 | | - style tier3 fill:#0071c1,stroke:#0071c1,stroke-width:1px,color:#fff |
33 | | - style tier4 fill:#f96,stroke:#333,stroke-width:1px,color:#fff |
34 | | -``` |
35 | | - |
36 | | -Find more information about how to navigate this part of the code samples in the [FPGA top-level README.md](/DirectProgramming/C++SYCL_FPGA/README.md). |
37 | | -You can also find more information about [troubleshooting build errors](/DirectProgramming/C++SYCL_FPGA/README.md#troubleshooting), [using Visual Studio Code with the code samples](/DirectProgramming/C++SYCL_FPGA/README.md#use-visual-studio-code-vs-code-optional), [links to selected documentation](/DirectProgramming/C++SYCL_FPGA/README.md#documentation), etc. |
38 | | - |
39 | | -| Optimized for | Description |
40 | | -|:--- |:--- |
41 | | -| OS | Ubuntu* 20.04 <br> RHEL*/CentOS* 8 <br> SUSE* 15 <br> Windows* 10, 11 <br> Windows Server* 2019 |
42 | | -| Hardware | Intel® Agilex® 7, Arria® 10, and Stratix® 10 FPGAs |
43 | | -| Software | Intel® oneAPI DPC++/C++ Compiler |
44 | | - |
45 | | -> **Note**: Even though the Intel DPC++/C++ oneAPI compiler is enough to compile for emulation, generating reports and generating RTL, there are extra software requirements for the simulation flow and FPGA compiles. |
46 | | -> |
47 | | -> For using the simulator flow, Intel® Quartus® Prime Pro Edition and one of the following simulators must be installed and accessible through your PATH: |
48 | | -> - Questa*-Intel® FPGA Edition |
49 | | -> - Questa*-Intel® FPGA Starter Edition |
50 | | -> - ModelSim® SE |
51 | | -> |
52 | | -> When using the hardware compile flow, Intel® Quartus® Prime Pro Edition must be installed and accessible through your PATH. |
53 | | -> |
54 | | -> :warning: Make sure you add the device files associated with the FPGA that you are targeting to your Intel® Quartus® Prime installation. |
55 | | -
|
56 | | -> **Note**: You'll need a large FPGA part to be able to fit this design |
57 | | -
|
58 | | -### Performance |
59 | | - |
60 | | -Performance results are based on testing as of May 14, 2024. |
61 | | - |
62 | | -> **Note**: Refer to the [Performance Disclaimers](/DirectProgramming/C++SYCL_FPGA/README.md#performance-disclaimers) section for important performance information. |
63 | | -
|
64 | | -| Device | Configuration | Throughput |
65 | | -|:--- |:--- |:--- |
66 | | -| Intel® FPGA SmartNIC N6001-PL | Outer unroll: 1; Inner unroll: 64 | 329 assets/s |
67 | | - |
68 | | - |
69 | | -## Key Implementation Details |
70 | | - |
71 | | -### Design Inputs |
72 | | - |
73 | | -This design reads inputs from the `ordered_inputs.csv` file. The input parameters are listed in the table. |
74 | | - |
75 | | -| Input | Description |
76 | | -|:--- |:--- |
77 | | -| `n_steps` | Number of time steps in the binomial tree. The maximum `n_steps` in this design is **8189**. |
78 | | -| `cp` | -1 or 1 represents put and call options, respectively. |
79 | | -| `spot` | Spot price of the underlying price. |
80 | | -| `fwd` | Forward price of the underlying price. |
81 | | -| `strike` | Exercise price of the option. |
82 | | -| `vol` | Percent volatility that the design reads as a decimal value. |
83 | | -| `df` | Discount factor to option expiry. |
84 | | -| `t` | Time, in years, to the maturity of the option. |
85 | | - |
86 | | -### Design Outputs |
87 | | - |
88 | | -This design writes outputs to the `ordered_outputs.csv` file. The outputs are: |
89 | | - |
90 | | -| Output | Description |
91 | | -|:--- |:--- |
92 | | -| `value` | Option price |
93 | | -| `delta` | Measures the rate of change of the theoretical option value with respect to changes in the underlying asset's price. |
94 | | -| `gamma` | Measures the rate of change in the `delta` with respect to changes in the underlying price. |
95 | | -| `vega` | Measures sensitivity to volatility. |
96 | | -| `theta` | Measures the sensitivity of the derivative's value to the passage of time. |
97 | | -| `rho` | Measures sensitivity to the interest rate. |
98 | | - |
99 | | -### Design Correctness |
100 | | - |
101 | | -This design tests the optimized FPGA code's correctness by comparing its output to a golden result computed on the CPU. |
102 | | - |
103 | | -### Design Performance |
104 | | - |
105 | | -This design measures the FPGA performance to determine how many assets can be processed per second. |
106 | | - |
107 | | -### Additional Design Information |
108 | | - |
109 | | -#### Source Code Explanation |
110 | | - |
111 | | -| File | Description |
112 | | -|:--- |:--- |
113 | | -| `main.cpp` | Contains both host code and SYCL* kernel code. |
114 | | -| `CRR_common.hpp` | Header file for `main.cpp`. Contains the data structures needed for both host code and SYCL* kernel code. |
115 | | - |
116 | | - |
117 | | -#### Compiler Flags Used |
118 | | - |
119 | | -| Flag | Description |
120 | | -|:--- |:--- |
121 | | -|`-Xshardware` | Target FPGA hardware (as opposed to FPGA emulator) |
122 | | -|`-Xsdaz` | Denormals are zero |
123 | | -|`-Xsrounding=faithful` | Rounds results to either the upper or lower nearest single-precision numbers |
124 | | -|`-Xsparallel=2` | Uses 2 cores when compiling the bitstream through Quartus® |
125 | | -|`-Xsseed=2` | Uses seed 2 during Quartus®, yields slightly higher f<sub>MAX</sub> |
126 | | - |
127 | | -#### Preprocessor Define Flags |
128 | | - |
129 | | -| Flag | Description |
130 | | -|:--- |:--- |
131 | | -|`-DSET_OUTER_UNROLL=<N>` | Sets the value for the constant OUTER_UNROLL to N, controls the number of CRRs that can be processed in parallel. The default value is 1 for all target platforms. |
132 | | -|`-DSET_INNER_UNROLL=<N>` | Sets the value for the constant INNER_UNROLL to N, controls the degree of parallelization within the calculation of 1 CRR. The default value is 64 for all target platforms. |
133 | | -|`-DSET_OUTER_UNROLL_POW2=<N>` | Sets the value for the constant OUTER_UNROLL_POW2 to N, controls the number of memory banks. The default value is 1 for all target platforms. |
134 | | - |
135 | | -> **Note**: The `Xsseed` values differ depending on the board being targeted. You can find more information about the unroll factors in `/src/CRR_common.hpp`. |
136 | | -
|
137 | | -## Build the `CRR Binomial Tree` Sample |
138 | | - |
139 | | -> **Note**: When working with the command-line interface (CLI), you should configure the oneAPI toolkits using environment variables. |
140 | | -> Set up your CLI environment by sourcing the `setvars` script located in the root of your oneAPI installation every time you open a new terminal window. |
141 | | -> This practice ensures that your compiler, libraries, and tools are ready for development. |
142 | | -> |
143 | | -> Linux*: |
144 | | -> - For system wide installations: `. /opt/intel/oneapi/setvars.sh` |
145 | | -> - For private installations: ` . ~/intel/oneapi/setvars.sh` |
146 | | -> - For non-POSIX shells, like csh, use the following command: `bash -c 'source <install-dir>/setvars.sh ; exec csh'` |
147 | | -> |
148 | | -> Windows*: |
149 | | -> - `C:\"Program Files (x86)"\Intel\oneAPI\setvars.bat` |
150 | | -> - Windows PowerShell*, use the following command: `cmd.exe "/K" '"C:\Program Files (x86)\Intel\oneAPI\setvars.bat" && powershell'` |
151 | | -> |
152 | | -> For more information on configuring environment variables, see [Use the setvars Script with Linux* or macOS*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-linux-or-macos.html) or [Use the setvars Script with Windows*](https://www.intel.com/content/www/us/en/develop/documentation/oneapi-programming-guide/top/oneapi-development-environment-setup/use-the-setvars-script-with-windows.html). |
153 | | -
|
154 | | -### On Linux* |
155 | | - |
156 | | -1. Change to the sample directory. |
157 | | -2. Configure the build system for the Agilex® 7 device family, which is the default. |
158 | | - |
159 | | - ``` |
160 | | - mkdir build |
161 | | - cd build |
162 | | - cmake .. |
163 | | - ``` |
164 | | - |
165 | | - > **Note**: You can change the default target by using the command: |
166 | | - > ``` |
167 | | - > cmake .. -DFPGA_DEVICE=<FPGA device family or FPGA part number> |
168 | | - > ``` |
169 | | - > |
170 | | - > Alternatively, you can target an explicit FPGA board variant and BSP by using the following command: |
171 | | - > ``` |
172 | | - > cmake .. -DFPGA_DEVICE=<board-support-package>:<board-variant> |
173 | | - > ``` |
174 | | - > The build system will try to infer the FPGA family from the BSP name. |
175 | | - > If it can't, an extra option needs to be passed to `cmake`: `-DDEVICE_FLAG=[A10|S10|Agilex7]` |
176 | | - > **Note**: You can poll your system for available BSPs using the `aoc -list-boards` command. The board list that is printed out will be of the form |
177 | | - > ``` |
178 | | - > $> aoc -list-boards |
179 | | - > Board list: |
180 | | - > <board-variant> |
181 | | - > Board Package: <path/to/board/package>/board-support-package |
182 | | - > <board-variant2> |
183 | | - > Board Package: <path/to/board/package>/board-support-package |
184 | | - > ``` |
185 | | - > |
186 | | - > You will only be able to run an executable on the FPGA if you specified a BSP. |
187 | | -
|
188 | | -3. Compile the design. (The provided targets match the recommended development flow.) |
189 | | -
|
190 | | - 1. Compile for emulation (fast compile time, targets emulated FPGA device). |
191 | | - ``` |
192 | | - make fpga_emu |
193 | | - ``` |
194 | | - 2. Generate the HTML performance report. |
195 | | - ``` |
196 | | - make report |
197 | | - ``` |
198 | | - The report resides at `<project name>/reports/report.html`. |
199 | | -
|
200 | | - 3. Compile for FPGA hardware (longer compile time, targets FPGA device). |
201 | | - ``` |
202 | | - make fpga |
203 | | - ``` |
204 | | -
|
205 | | -### On Windows* |
206 | | -
|
207 | | -1. Change to the sample directory. |
208 | | -2. Configure the build system for the Agilex® 7 device family, which is the default. |
209 | | - ``` |
210 | | - mkdir build |
211 | | - cd build |
212 | | - cmake -G "NMake Makefiles" .. |
213 | | - ``` |
214 | | -
|
215 | | - > **Note**: You can change the default target by using the command: |
216 | | - > ``` |
217 | | - > cmake -G "NMake Makefiles" .. -DFPGA_DEVICE=<FPGA device family or FPGA part number> |
218 | | - > ``` |
219 | | - > |
220 | | - > Alternatively, you can target an explicit FPGA board variant and BSP by using the following command: |
221 | | - > ``` |
222 | | - > cmake -G "NMake Makefiles" .. -DFPGA_DEVICE=<board-support-package>:<board-variant> |
223 | | - > ``` |
224 | | - > The build system will try to infer the FPGA family from the BSP name. |
225 | | - > If it can't, an extra option needs to be passed to `cmake`: `-DDEVICE_FLAG=[A10|S10|Agilex7]` |
226 | | - > **Note**: You can poll your system for available BSPs using the `aoc -list-boards` command. The board list that is printed out will be of the form |
227 | | - > ``` |
228 | | - > $> aoc -list-boards |
229 | | - > Board list: |
230 | | - > <board-variant> |
231 | | - > Board Package: <path/to/board/package>/board-support-package |
232 | | - > <board-variant2> |
233 | | - > Board Package: <path/to/board/package>/board-support-package |
234 | | - > ``` |
235 | | - > |
236 | | - > You will only be able to run an executable on the FPGA if you specified a BSP. |
237 | | -
|
238 | | -3. Compile the design. (The provided targets match the recommended development flow.) |
239 | | -
|
240 | | - 1. Compile for emulation (fast compile time, targets emulated FPGA device). |
241 | | - ``` |
242 | | - nmake fpga_emu |
243 | | - ``` |
244 | | - 2. Generate the HTML performance report. |
245 | | - ``` |
246 | | - nmake report |
247 | | - ``` |
248 | | - The report resides at `<project name>.a.prj/reports/report.html`. |
249 | | -
|
250 | | - 3. Compile for FPGA hardware (longer compile time, targets FPGA device). |
251 | | - ``` |
252 | | - nmake fpga |
253 | | - ``` |
254 | | -> **Note**: If you encounter any issues with long paths when compiling under Windows*, you may have to create your 'build' directory in a shorter path, for example `c:\samples\build`. You can then run cmake from that directory, and provide cmake with the full path to your sample directory, for example: |
255 | | -> |
256 | | -> ``` |
257 | | - > C:\samples\build> cmake -G "NMake Makefiles" C:\long\path\to\code\sample\CMakeLists.txt |
258 | | -> ``` |
259 | | -## Run the `CRR Binomial Tree` Program |
260 | | -
|
261 | | -### On Linux |
262 | | -
|
263 | | -1. Run the sample on the FPGA emulator (the kernel executes on the CPU). |
264 | | - ``` |
265 | | - ./crr.fpga_emu <input_file> [-o=<output_file>] |
266 | | - ``` |
267 | | - where: |
268 | | - - `<input_file>` is an **optional** argument to specify the input data file name. The default input file is `/data/ordered_inputs.csv`. |
269 | | - - `-o=<output_file>` is an **optional** argument to specify the name of the output file. The default name of the output file is `ordered_outputs.csv`. |
270 | | -2. Run the sample on the FPGA simulator. |
271 | | - ``` |
272 | | - CL_CONTEXT_MPSIM_DEVICE_INTELFPGA=1 ./crr.fpga_sim <input_file> [-o=<output_file>] |
273 | | - ``` |
274 | | -3. Run the sample on the FPGA device (only if you ran `cmake` with `-DFPGA_DEVICE=<board-support-package>:<board-variant>`). |
275 | | - ``` |
276 | | - ./crr.fpga <input_file> [-o=<output_file>] |
277 | | - ``` |
278 | | -
|
279 | | -### On Windows |
280 | | -
|
281 | | -1. Run the sample on the FPGA emulator (the kernel executes on the CPU). |
282 | | - ``` |
283 | | - crr.fpga_emu.exe <input_file> [-o=<output_file>] |
284 | | - ``` |
285 | | - where: |
286 | | - - `<input_file>` is an **optional** argument to specify the input data file name. The default input file is `/data/ordered_inputs.csv`. |
287 | | - - `-o=<output_file>` is an **optional** argument to specify the name of the output file. The default name of the output file is `ordered_outputs.csv`. |
288 | | -2. Run the sample on the FPGA simulator. |
289 | | - ``` |
290 | | - set CL_CONTEXT_MPSIM_DEVICE_INTELFPGA=1 |
291 | | - crr.fpga_sim.exe <input_file> [-o=<output_file>] |
292 | | - set CL_CONTEXT_MPSIM_DEVICE_INTELFPGA= |
293 | | - ``` |
294 | | -
|
295 | | -> **Note**: Hardware runs are not supported on Windows. |
296 | | -
|
297 | | -## Example Output |
298 | | -
|
299 | | -``` |
300 | | -Running on device: ofs_n6001 : Intel OFS Platform (ofs_ec00000) |
301 | | - |
302 | | -============= Correctness Test ============= |
303 | | -Running analytical correctness checks... |
304 | | -CPU-FPGA Equivalence: PASS |
305 | | - |
306 | | -============= Throughput Test ============= |
307 | | - Avg throughput: 329.5 assets/s |
308 | | -``` |
309 | | -
|
310 | | -## License |
311 | | -
|
312 | | -Code samples are licensed under the MIT license. See [License.txt](/License.txt) for details. |
313 | | -
|
314 | | -Third party program Licenses can be found here: [third-party-programs.txt](/third-party-programs.txt). |
| 8 | +This specific sample can be found in the hls-samples repository [here](https://github.com/altera-fpga/hls-samples/ReferenceDesigns/crr/README.md). |