Commit bb6ae56

Docker Image Availability: Verify on Docker Hub that the CUDA 13.0.1 image referenced in your Dockerfile (nvidia/cuda:13.0.1-cudnn-devel-ubuntu24.04) exists. CUDA 13.0 was released in August 2025, and image availability may vary.

ROCm 7.1 Preview: AMD has released a ROCm 7.1 preview (October 2025). Consider testing with this version for early access to new features.

Breaking Changes:
- CUDA 13.0 dropped support for the Maxwell, Pascal, and Volta architectures.
- HIP 7.0 introduced behavioral changes (see AMD documentation).
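The architecture drop noted under Breaking Changes can be checked before moving a machine to the new toolchain. The following is a minimal sketch, not part of this commit: it assumes that removing Maxwell (5.x), Pascal (6.x), and Volta (7.0, 7.2) leaves Turing (compute capability 7.5) as the lowest supported target, and the names `MIN_CUDA13_CAPABILITY`, `parse_capability`, and `supported_by_cuda13` are hypothetical helpers introduced for illustration.

```python
# Assumption: with Maxwell (5.x), Pascal (6.x), and Volta (7.0/7.2) dropped,
# the lowest compute capability CUDA 13.0 targets is Turing, 7.5.
MIN_CUDA13_CAPABILITY = (7, 5)


def parse_capability(text: str) -> tuple[int, int]:
    """Turn a string such as '8.6' into a (major, minor) tuple."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def supported_by_cuda13(capability: str) -> bool:
    """Return True if the capability is at or above the assumed minimum."""
    return parse_capability(capability) >= MIN_CUDA13_CAPABILITY


if __name__ == "__main__":
    for cap in ("5.2", "6.1", "7.0", "7.5", "8.6"):
        status = "supported" if supported_by_cuda13(cap) else "dropped in CUDA 13.0"
        print(f"compute capability {cap}: {status}")
```

On recent drivers, the capability string can usually be read with `nvidia-smi --query-gpu=compute_cap --format=csv,noheader`. Treat that query field as an assumption to confirm against your installed driver.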
1 parent 4ce87e0 commit bb6ae56

12 files changed: 49 additions & 42 deletions

.gitignore

Lines changed: 4 additions & 1 deletion
@@ -162,4 +162,7 @@ pmc_perf_*
 *.pid
 
 # Memory dumps
-*.dmp
+*.dmp
+
+.cursor/
+AGENTS.md

README.md

Lines changed: 30 additions & 26 deletions
Large diffs are not rendered by default.

modules/module1/README.md

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ After completing this module, you will be able to:
 
 ### Prerequisites
 - NVIDIA GPU with CUDA support OR AMD GPU with ROCm support
-- CUDA Toolkit 13.0+ or ROCm 7.0+ (Docker images provide CUDA 13.0.1 and ROCm 7.0.1)
+- CUDA Toolkit 13.0+ or ROCm 7.0+ (Docker images provide CUDA 13.0.1 and ROCm 7.0)
 - C/C++ compiler (GCC, Clang, or MSVC)
 
 Tip: You can skip native installs by using our Docker environment (recommended):

modules/module1/examples/README.md

Lines changed: 3 additions & 3 deletions
@@ -43,13 +43,13 @@ This directory contains practical examples that accompany Module 1 of the GPU Pr
 
 ### For CUDA Examples
 - NVIDIA GPU with compute capability 5.0+
-- NVIDIA drivers 550+ recommended
-- CUDA Toolkit 12.0+ (Docker uses CUDA 12.9.1)
+- NVIDIA drivers 580+ recommended
+- CUDA Toolkit 13.0+ (Docker uses CUDA 13.0.1)
 - GCC/Clang compiler
 
 ### For HIP Examples
 - AMD GPU with ROCm support OR NVIDIA GPU
-- ROCm 6.0+ (for AMD) or CUDA 12.0+ (for NVIDIA backend)
+- ROCm 7.0+ (for AMD) or CUDA 13.0+ (for NVIDIA backend)
 - HIP compiler (hipcc)
 
 ## Quick Start

modules/module2/README.md

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@ After completing this module, you will be able to:
 
 ### Prerequisites
 - NVIDIA GPU with CUDA support OR AMD GPU with ROCm support
-- CUDA Toolkit 13.0+ or ROCm 7.0+ (Docker images provide CUDA 13.0.1 and ROCm 7.0.1)
+- CUDA Toolkit 13.0+ or ROCm 7.0+ (Docker images provide CUDA 13.0.1 and ROCm 7.0)
 - C/C++ compiler (GCC, Clang, or MSVC)
 
 Recommended: use our Docker dev environment

modules/module2/examples/README.md

Lines changed: 3 additions & 3 deletions
@@ -61,8 +61,8 @@ Comprehensive memory bandwidth optimization techniques:
 ## Building and Running Examples
 
 ### Prerequisites
-- CUDA Toolkit 12.0+ (for CUDA examples)
-- ROCm 6.0+ (for HIP examples)
+- CUDA Toolkit 13.0+ (for CUDA examples)
+- ROCm 7.0+ (for HIP examples)
 - Compatible GPU (NVIDIA or AMD)
 - C++17 compatible compiler
 
@@ -108,4 +108,4 @@ rocprof --stats ./build/02_memory_coalescing_hip
 
 ## Notes
 
-These examples are designed to be educational and performance-oriented. Use the provided Docker environment for consistent toolchains (CUDA 12.9.1, ROCm latest). Binaries are emitted to the `build/` directory by the Makefile.
+These examples are designed to be educational and performance-oriented. Use the provided Docker environment for consistent toolchains (CUDA 13.0.1, ROCm 7.0). Binaries are emitted to the `build/` directory by the Makefile.

modules/module4/examples/Makefile

Lines changed: 1 addition & 1 deletion
@@ -529,7 +529,7 @@ help_extended:
 @echo ""
 @echo "Requirements:"
 @echo " - CUDA Toolkit 10.0+ (for CUDA examples)"
-@echo " - HIP/ROCm 4.0+ (for HIP examples)"
+@echo " - HIP/ROCm 7.0+ (for HIP examples)"
 @echo " - Compute Capability 5.0+ (3.5+ for dynamic parallelism)"
 @echo " - Multi-GPU system recommended for full testing"
 @echo " - OpenMP support for parallel host code"

modules/module5/README.md

Lines changed: 1 addition & 1 deletion
@@ -191,7 +191,7 @@ rocprof --version # AMD ROCm Profiler
 ```
 
 **Minimum Requirements:**
-- CUDA Toolkit 12.0+ or HIP/ROCm 6.0+
+- CUDA Toolkit 13.0+ or HIP/ROCm 7.0+
 - Compute Capability 6.0+ (recommended for full feature support)
 - Profiling tools installed and properly configured
 - Sufficient GPU memory for performance testing (4GB+ recommended)

modules/module6/README.md

Lines changed: 2 additions & 2 deletions
@@ -17,7 +17,7 @@ By completing this module, you will:
 ## Prerequisites
 
 **Recommended Requirements:**
-- CUDA Toolkit 12.0+ or ROCm 6.0+
+- CUDA Toolkit 13.0+ or ROCm 7.0+
 
 ### Core Content
 - **content.md** - Comprehensive guide covering all fundamental parallel algorithm patterns
@@ -169,7 +169,7 @@ make system_info
 ```
 
 **Minimum Requirements:**
-- CUDA Toolkit 11.0+ or ROCm 5.0+
+- CUDA Toolkit 13.0+ or ROCm 7.0+
 - Compute Capability 6.0+ recommended
 - 8GB+ GPU memory for large dataset examples
 - C++14 compatible compiler

modules/module7/README.md

Lines changed: 1 addition & 1 deletion
@@ -170,7 +170,7 @@ rocm-smi --showproductname
 ```
 
 **Recommended Requirements:**
-- CUDA Toolkit 12.0+ or ROCm 6.0+
+- CUDA Toolkit 13.0+ or ROCm 7.0+
 - Compute Capability 7.0+ (Tensor Cores for applicable algorithms)
 - 16GB+ GPU memory for large-scale problems
 - Multi-GPU setup recommended for distributed algorithms
