Bump to AWS FPGA 1.4.19 #39

davidbiancolin · 2021-06-15T21:38:50Z

There were two main workarounds to get this bump to work:

1. Demote a route message (Route 35-1) from Error to Critical Warning.
   https://forums.aws.amazon.com/thread.jspa?threadID=338653&tstart=0
   In a follow-up PR, we should handle this after the fact by collecting blacklisted critical warnings.
2. Add dummy loads on all CL clocks.
   https://forums.aws.amazon.com/thread.jspa?threadID=338735&tstart=0
   Initially this appeared not to work, but once I added an extra load to the PLL reference clock, the error appears to have gone away.
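
The after-the-fact check mentioned in the first workaround could look roughly like the sketch below. It is only an illustration of the idea, not what this PR or a follow-up implements: it assumes Vivado log lines of the form `CRITICAL WARNING: [<ID>] <text>`, and the names `BLACKLISTED_IDS` and `find_blacklisted_warnings` as well as the sample log lines are made up for the example.

```python
import re

# Assumption: IDs that should fail the build even though Vivado only reports
# them as critical warnings. Route 35-1 is the one discussed in this PR.
BLACKLISTED_IDS = {"Route 35-1"}

# Assumption: log lines look like "CRITICAL WARNING: [Route 35-1] <text>".
CRITICAL_WARNING_RE = re.compile(r"^CRITICAL WARNING: \[([^\]]+)\]\s*(.*)$")


def find_blacklisted_warnings(log_lines, blacklist=BLACKLISTED_IDS):
    """Return (message_id, text) pairs for blacklisted critical warnings."""
    hits = []
    for line in log_lines:
        match = CRITICAL_WARNING_RE.match(line.strip())
        # Compare the full ID so that "Route 35-1" never matches "Route 35-16".
        if match and match.group(1) in blacklist:
            hits.append((match.group(1), match.group(2)))
    return hits


# Hypothetical log excerpt, written by hand for this example.
sample_log = [
    "INFO: [Route 35-16] Router Completed Successfully",
    "CRITICAL WARNING: [Route 35-1] Design is not completely routed.",
    "CRITICAL WARNING: [Timing 38-282] The design failed to meet the timing requirements.",
]
hits = find_blacklisted_warnings(sample_log)
```

A build script could call this on the implementation log and refuse to generate an AGFI when `hits` is non-empty.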


* New functionality:
    * Improved AFI load times for pipelined accelerator designs. For more details please see [Amazon FPGA image (AFI) pre-fetch and caching features](./hdk/docs/load_times.md).
* Ease of Use features:
    * [Improved SDK Error messaging](./sdk/userspace/fpga_libs/fpga_mgmt/fpga_mgmt.c)
    * [Improved documentation](./hdk/docs/IPI_GUI_Vivado_Setup.md#switching-between-hdk-and-hlx-flows) to help with transition from [HLX to HDK command line flows](https://forums.aws.amazon.com/thread.jspa?threadID=302718&tstart=0) and vice versa
    * Incorporates feedback from [aws-fpga Issue 458](aws/aws-fpga#458) by making the ```init_ddr``` function, used in design simulations to initialize DDR, more generic by moving out ATG deselection logic to a new ```deselect_atg_hw``` task
* Bug Fixes:
    * Fixed Shell simulation model (sh_bfm) issue on PCIM AXI read data channel back pressure which was described in HDK 1.4.8 Errata.
    * Fixed HDK simulation example which [demonstrates DMA and PCIM traffic in parallel](./hdk/cl/examples/cl_dram_dma/verif/tests/test_dma_pcim_concurrent.sv).

Release V1.4.10

* New functionality:
    * SDK now sorts the slots in DBDF order. Any scripts or integration maintainers should note that the slot order will be different from previous versions and should make any updates accordingly.
* Bug Fixes:
    * Fixes a bug in the [Automatic Traffic Generator (ATG)](./hdk/cl/examples/cl_dram_dma/design/cl_tst.sv). In SYNC mode, the ATG did not wait for write response transaction before issuing read transactions.
    * Released [Xilinx runtime(XRT) version 2018.3.3.2](https://github.com/Xilinx/XRT/releases/tag/2018.3.3.2) to fix the following error: `symbol lookup error: /opt/xilinx/xrt/lib/libxrt_aws.so: undefined symbol: uuid_parse!` discussed in this [forum post](https://forums.aws.amazon.com/thread.jspa?messageID=899474&#899474).
    * This release fixes a bug wherein concurrent AFI load requests on two or more slots resulted in a race condition which sometimes resulted in Error: `(20) pci-device-missing`
    * This release fixes a issue with coding style of logic which could infer a latch during synthesis in [sde_ps_acc module](./hdk/cl/examples/cl_sde/design/sde_ps_acc.sv) within cl_sde example

Release 1.4.11

* FPGA developer kit now supports Xilinx SDx/Vivado 2019.1
    * To upgrade, use Developer AMI v1.7.0 on the AWS Marketplace. The Developer Kit scripts (hdk_setup.sh or sdaccel_setup.sh) will detect the tool version and update the environment based on requirements needed for Xilinx 2019.1 tools.
* New functionality:
    * Added a developer resources section that provides guides on how to setup your own GUI Desktop and compute cluster environment.
    * Developers can now ask for AFI limit increases via the AWS Support Center Console
        * Create a case to increase your `EC2 FPGA` service limit from the console.
    * HLx IPI flow updates
        * HLx support for AXI Fast Memory mode.
        * HLx support for 3rd party simulations.
        * HLx support for changes in shell and AWS IP updates(e.g. sh_ddr).
* Bug Fixes:
    * Documentation fixes in the Shell Interface Specification
    * Fixes for forum questions
        * Unable to compile aws_v1_0_vl_rfs.sv in Synopsys VCS
        * Use fpga_mgmt init in HLx runtime
    * New XRT versions added to the XRT Installation Instructions to fix segmentation faults when using xclbin instead of awsxclbin files.
* Deprecations:
    * Removed GUI Setup scripts from AMI v1.7.0 onwards.


* Update path
  Attempting to change directories into this path (i.e., `/home/centos/src/project_data/aws-fpga/SDAccel/examples/xilinx_2017.4/getting_started/host/helloworld_ocl`) draws an error because the `xilinx_2017.4` directory is now empty, and `xilinx` is a symbolic link that points to the latest release (i.e., `/home/centos/src/project_data/aws-fpga/SDAccel/examples/xilinx_2019.1`). Additionally, `helloworld_ocl` is no longer a sub-directory of `host`; it's now a sub-directory of `hello_world`, so this part of the path should be updated as well.
* Fix unset flags
  The alphanumeric part of these flags (i.e., `f`) is preceded by an en dash when it should be a hyphen; using the former instead of the latter results in errors (``-bash: unset: `–f': not a valid identifier``).
* Update path to workspace directory
  This change is consistent with 48c3048, and the basename of the image, but has a different filename extension because of the transparency.

* Update gui_fig_2.png and gui_fig_3.png to SDx 2019.1
  SDx's welcome screen has stratified the option (read: clickable button) to `Create SDx Project` into two new ones: `Create Platform Project` and `Create Library Project`. These changes are rendered in Figure 2, while Figure 3 captures cosmetic differences to the user interface.
* Add Figure 2
* Add Figure 3


The old link to the Vitis examples in the README.md led to https://github.com/aws/aws-fpga/blob/master/Vitis/examples/xilinx_2019.2, which returns a 404 error. The new link is the result of going to https://github.com/aws/aws-fpga/blob/master/Vitis/examples and then entering the xilinx_2019.2 folder.


* Add Vitis Debug document (#601)
    * Create Debug_OpenCL_Kernel.md
    * Update and rename Debug_OpenCL_Kernel.md to Debug_Vitis_Kernel.md
* Updated the shell interface spec to reflect current shell (#603)
    * Updated the shell interface spec to reflect current shell and pointed to the DDR Data Retention doc
    * Update hdk/docs/AWS_Shell_Interface_Specification.md
* Enhance DDR Model Build qualifiers in hdk_setup.sh script. (#604)
    * Enhance DDR Model Build qualifiers in hdk_setup.sh script.
    * Enhance the DDR model build's lock file creation+check to not rely on external tools.
* Update Virtual_JTAG_XVC.md (#606)
* AR73068 patching (#608)
    * Added patching mechanism for Vivado AR73068
    * Updated supported versions
* Added dma range error to interrupt status register metrics (#591)
    * added dma range error to interrupt status register metrics
    * updated tests to match change to output
* Fixing test_fpga_tools to accomodate dma range error addition. (#609)
    * Fixed the lines where we expect ` clock group c `


* FPGA developer kit now supports Xilinx Vivado/Vitis 2020.1
* Updated Vitis examples to include usage of Vitis Libraries.
* Added documentation and examples to show Xilinx Alveo design migration to F1.
* Removed support for Xilinx toolsets 2017.4, 2018.2 and 2018.3.


* Added peek and poke for the BAR1 interface
* Added peek and poke for the SDA and PCIS AXI slave interfaces.
  Now all the 4 slave interface have a specific set of peek/poke functions callable from C. The new cl_peek_ocl and cl_poke_ocl are actually the same as the original cl_peek/cl_poke, which are still needed because they are used internally to configure the DDR controllers and test circuits.

Co-authored-by: Joost Hoozemans <[email protected]>


davidbiancolin · 2021-06-15T21:40:17Z

hdk/cl/developer_designs/cl_firesim/design/cl_firesim.sv

@@ -55,6 +55,46 @@ logic rst_extra1_n_sync;
  assign cl_sh_id1[31:0] = `CL_SH_ID1;


+// Clock Region Partitioning Workaround


Clock Partitioning Workaround

davidbiancolin · 2021-06-15T21:40:54Z

hdk/cl/developer_designs/cl_firesim/build/scripts/create_dcp_from_cl.tcl


 # Promote the following critical warnings to errors to prevent AGFI generation
 # Design not completely routed
-set_msg_config -id {Route 35-1} -new_severity "ERROR"
+#set_msg_config -id {Route 35-1} -new_severity "ERROR"


Router Error Workaround


timsnyder-siv

In general, it's awesome to have this. I had one question about whether you're checking for partially routed nets yet in the TCL. I'm fine with that being a follow-on though if you want to get this merged.

timsnyder-siv · 2021-06-16T16:33:37Z

FAQs.md

+Yes, the Xilinx UltraScale+ FPGA devices used on the F1 instances have a maximum power limit that must be maintained.
+If a loaded AFI consumes maximum power, the F1 instance will automatically gate the input clocks provided to the AFI in order to prevent errors within the FPGA. 
+Developers are provided warnings when power (Vccint) is greater than 85 watts. Above that level, the CL is in danger of being clock gated.  


@davidbiancolin I'm reading through the whole diff just for my education. I know this is just upstream bumping. However, have you ever run into the power limit on a FireSim design? Does the FireSim driver pay attention or know whether it has been clock-gated so that it can report that to the user in some way? It might be useful to somehow indicate this in the FMR calculation output.

Nope. I figure our designs being as slow as they are helps considerably, plus there's just a lot of grey silicon in the SoCs we tend to simulate. This could change with instance multithreading, but if we're still running at < 1/4th the nominal frequency that high-performance FPGA designs run at, I think it's unlikely to be a problem.

timsnyder-siv · 2021-06-16T17:09:52Z

conftest.py

@@ -23,6 +23,8 @@




It's interesting that Amazon seems to use pytest to drive running tests of the aws-fpga stuff.

timsnyder-siv · 2021-06-16T17:14:02Z

developer_resources/DCV_with_ParallelCluster.md

+     * [Launch Vivado](#launch-vivado)
+     * [ParallelCluster Configuration](#pcluster-config)
+     * [Building a DCP On ParallelCluster Using SGE](#building-a-dcp-on-parallelcluster-using-sge)
+     * [Building a DCP On ParallelCluster Using Slurm](#building-a-dcp-on-parallelcluster-using-slurm)


It's cool that they added docs on how to use AWS ParallelCluster for FPGA development. Using AWS ParallelCluster to add support to the FireSim manager for running with a cluster that manages provisioning of build and runtime nodes and doles them out might be a way to develop that support before asking IT to make them available in a cluster in the datacenter.

Admittedly I don't know a thing about ParallelCluster; I'll have to look into it.

timsnyder-siv · 2021-06-16T17:29:08Z

hdk/cl/developer_designs/cl_firesim/build/scripts/create_dcp_from_cl.tcl


 # Promote the following critical warnings to errors to prevent AGFI generation
 # Design not completely routed
-set_msg_config -id {Route 35-1} -new_severity "ERROR"
+#set_msg_config -id {Route 35-1} -new_severity "ERROR"


Did the final answer from AWS come to you as a direct message? Could you update https://forums.aws.amazon.com/thread.jspa?threadID=338653&tstart=0 and paste a link to that here? Have you implemented the TCL checks for design not completely routed in this PR so that we won't hit it or are you going to do that in a follow on?

timsnyder-siv · 2021-06-16T17:29:35Z

hdk/cl/developer_designs/cl_firesim/build/scripts/create_dcp_from_cl.tcl

@@ -199,6 +208,11 @@ switch $strategy {
    }
 }

+# Biancolin: Disable phys_opt to temporarily workaround around a clock paritioning issue
+# See https://forums.aws.amazon.com/thread.jspa?threadID=338735&tstart=0


Thanks for putting a link to the forum post. :)

timsnyder-siv · 2021-06-16T17:47:29Z

hdk/cl/developer_designs/cl_firesim/design/cl_firesim.sv

+logic clk_extra_a1_reg;                          //Extra clock A1 (phase aligned to "A" clock group)
+logic clk_extra_a2_reg;                          //Extra clock A2 (phase aligned to "A" clock group)
+logic clk_extra_a3_reg;                          //Extra clock A3 (phase aligned to "A" clock group)
+logic clk_extra_b0_reg;                          //Extra clock B0 (phase aligned to "B" clock group)
+logic clk_extra_b1_reg;                          //Extra clock B1 (phase aligned to "B" clock group)
+logic clk_extra_c0_reg;                          //Extra clock C0 (phase aligned to "B" clock group)
+logic clk_extra_c1_reg;                          //Extra clock C1 (phase aligned to "B" clock group)
+
+always_ff @(posedge clk_extra_a1) begin
+    clk_extra_a1_reg <= 1'b1;
+end
+
+always_ff @(posedge clk_extra_a2) begin
+    clk_extra_a2_reg <= 1'b1;
+end
+
+always_ff @(posedge clk_extra_a3) begin
+    clk_extra_a3_reg <= 1'b1;
+end
+
+always_ff @(posedge clk_extra_b0) begin
+    clk_extra_b0_reg <=  1'b1;
+end
+
+always_ff @(posedge clk_extra_b1) begin
+    clk_extra_b1_reg <= 1'b1;
+end
+
+always_ff @(posedge clk_extra_c0) begin
+    clk_extra_c0_reg <= 1'b1;
+end
+
+always_ff @(posedge clk_extra_c1) begin
+    clk_extra_c1_reg <= 1'b1;
+end


Obvi this is working right now but I'm surprised that Vivado doesn't optimize these unused, constant flops out of the design. Or, maybe you also have a dont_touch put on them somewhere in the TCL I haven't seen yet?

No don't touches. I read elsewhere that it worked without them, so I didn't bother.

davidbiancolin · 2021-06-16T18:31:41Z

The operative part of our private message:

I believe it is narrowed down to this particular call from your log:

set_msg_config -id {Route 35-1} -new_severity "ERROR"

The current theory is that set_msg_config is wildcarding this ID to include the 'finish' ID which is 35-16. We haven't been able to pinpoint where this is coming from in AWS scripts, is it possible that you added this particular command to your run? You also may be able to remove it if you modified the create_dcp_from_cl.tcl file.
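
To make the theory in the message concrete, here is a small sketch of how a prefix-style match on `Route 35-1` would also catch `Route 35-16`, while an exact match would not. This only models the suspected behavior; it does not claim to reproduce how `set_msg_config` matches IDs internally, and the helper names and the list of IDs are made up for the illustration.

```python
def matches_as_prefix(pattern, message_id):
    """Model of the suspected behavior: the pattern acts like 'pattern*'."""
    return message_id.startswith(pattern)


def matches_exactly(pattern, message_id):
    """Model of the intended behavior: only the identical ID matches."""
    return pattern == message_id


# Hypothetical IDs; 35-1 and 35-16 are the two named in the thread.
message_ids = ["Route 35-1", "Route 35-16", "Route 35-2"]

prefix_hits = [m for m in message_ids if matches_as_prefix("Route 35-1", m)]
exact_hits = [m for m in message_ids if matches_exactly("Route 35-1", m)]
```

Under the prefix model, promoting `Route 35-1` to an error would also promote the router's `35-16` message, which would be consistent with the build failing even though routing finished.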

AWSaalluri and others added 30 commits April 17, 2019 16:19

Patching HDK 1.4.8

eca10db

1. Fixed XRT installation instruction for 2018.3
2. artifact update for SDAccel helloworld_ocl_runtime example
3. updating tool version tables in README.md

Remove duplicate line (#468)

32833e3

Correct path for helloworld example (#466)

ba941a1

In xilinx_2019.1 @ 0ec1aef (and not previous versions) the hello world example is located in $SDACCEL_DIR/examples/xilinx/getting_started/hello_world/helloworld_ocl/ rather than $SDACCEL_DIR/examples/xilinx/getting_started/host/helloworld_ocl/

Fixed missing extern C declaration (#467) (#473)

c9dcecb

RC v1.4.12 (#474)

1f67d8e

* Added supported versions for BJS AMI's (#589)
* Added release notes
* Added re:Invent 19 workshop link

Release v1.4.13 (#478)

5e9f4cb

* Added Xilinx 2019.2 toolset support
* Enabled Vitis Runs
* Updated XRT link for 2019.1
* Update ERRATA.md
* Add errata that CL cannot connect shell generated clock directly to BUFG in CL.

Co-authored-by: AWSkhanasif <[email protected]>

Update Errata and documentation (#481)

0b840c6

* Added link to Xilinx AR# 73360
* Updated README to point to Vitis as the initial start point for new development

Fixes to scripts based on customer feedback (#484)

4f7d97d

* Fixes for prepare_new_cl.sh
* Update to FAQ
* Early exit if running sdaccel_setup on a vitis instance
* Changes per feedback from CSA

fix typo in README of CL_SDE example (#488)

61f6e2e

If my understand is correct, SDE IP use three interfaces of shell, including PCIS, PCIM and OCL, instead of two.

Fixes to create_vitis_sfi script (#598) (#487)

b220124

* Fixes to create_vitis_sfi script
* Fixed based on feedback

Release v1.4.14 (#489)

c61dbf5

* Added a new platform file to fix DDR bandwidth issue
* Add Vitis Debug document
* Updated broken link in the Testing doc

Merge pull request #25 from firesim/dev

65deb47

FireSim 1.10.0 Release (Dev -> Master PR)

Update XRT_installation_instructions.md (#493)

0a7e06b

Update README.md (#495)

8762e6f

Fix table error in Overview of Development Tools

Upgrade DDR IP and regenerate outputs to fix AR73068 issue (#500)

7a6093a

* Add upgrade ip changes to the init.tcl file
* Updated the cl_dram_dma public AFI

Removed duplicate FAQ questions (#503)

56f37f1

Q: Can I delete an AFI? and Q: Can I share an AFI with other AWS accounts?

Add syntax highlight + link on Markdowns (#497)

8b5c86b

* Add code syntax highlight
  Add syntax highlight to C and Python codes
* Add hyperlink on XDMA README

xdma: driver update (#498)

288bee0

* xdma: driver update
* xdma: add back F1 specific device ids
* xdma: update license header check
* keep enable_credit_mp disabled
* back out experimental aio code

Update Setup_AWS_CLI_and_S3_Bucket.md (#504)

cbd4e77

Release v1.4.17 (#505)

ef05306

* Updated XDMA Driver to allow builds on newer kernels
* Updated documentation on Alveo U200 to F1 platform porting
* Added Vitis 2019.2 Patching for AR#73068

Fixed failing cl_hello_world_vhdl example (#510)

bc6ff26

Co-authored-by: Joost Hoozemans <[email protected]>

deeppat and others added 13 commits January 11, 2021 14:02

* Added AL2 XRT updates (#512)

aeda393

* Updated XRT installation instructions
* Added AL2 Kernel

Merge pull request #29 from firesim/dev

b4544da

FireSim 1.11 Release (Dev -> Master)

Release v1.4.18 (#514)

4750aac

* Fixed the broken links pointing to the AXI interface specifications
* Enable Xilinx 2020.2 tools
* Updated FAQ on how to request an AFI limit increase

Merge tag 'v1.4.18' into dev

3dabfaa

Release v1.4.19 (#515)

f063204

* Bug Fix release to fix CDC path in flop_ccf.sv
* Updated Errata with details

Suppress same messages as the dma_dram example

48130d2

Report clock utilization

86023a3

Merge remote-tracking branch 'upstream/master' into aws-1.4.18-bump

1ea0db5

Demote Route 35-1 back to CW

8490605

Merge pull request #32 from firesim/dev

e2a9752

FireSim 1.12 Release (Dev -> Master) Tracking PR

Workaround clock-partitioning error by disabling phys opt

96fc945

Merge remote-tracking branch 'origin/master' into aws-1.4.18-bump

080cdf0

Re-enable post-placement phys_opt

7ed2a98

If we run into clock partitioning problems again, we'll need to disable it

davidbiancolin commented Jun 15, 2021


timsnyder-siv approved these changes Jun 16, 2021


timsnyder-siv mentioned this pull request Jun 16, 2021

Bump to AMI 1.10 / AWS FPGA 1.4.19 / Vivado 2020 firesim/firesim#788

Merged

10 tasks

davidbiancolin merged commit c3c590a into dev Jun 17, 2021

davidbiancolin deleted the vivado-2020-bump2 branch June 17, 2021 22:05

davidbiancolin mentioned this pull request Jun 17, 2021

Bump to 1.4.18 upstream aws-fpga #36

Closed


13 participants