Commit b196bcb

Merge pull request #6159 from dotlogix/Features/ComputeShader_Tutorial
Adding example code for compute shaders
2 parents fc6ea5e + 245544f commit b196bcb

2 files changed: +248 -0 lines changed

Lines changed: 247 additions & 0 deletions
@@ -0,0 +1,247 @@
.. _doc_compute_shaders:

Using compute shaders
=====================


Think of compute shaders as blocks of code that are executed on the GPU for any purpose we want.
Compute shaders are independent of the graphics pipeline and have very little fixed functionality.
Contrast this with fragment shaders, which are used specifically for assigning a color to a fragment in a render target.
The big benefit of compute shaders over code executed on a CPU is the high degree of parallelization that GPUs provide.

Because compute shaders are independent of the graphics pipeline, we don't have any user-defined inputs or outputs
(like a mesh going into the vertex shader or a texture coming out of a fragment shader). Instead, compute shaders
make changes directly to memory stored on the GPU, which we can read from and write to using scripts.

How they work
-------------

Compute shaders can be thought of as a mass of small computers called work groups.
Much like supercomputers, they are aligned in rows and columns but also stacked on top of each other,
essentially forming a 3D array of them.

When creating a compute shader we can specify the number of work groups we wish to use.
Keep in mind that these work groups are independent of each other and therefore cannot depend on the results from other work groups.

In each work group we have another 3D array of threads called invocations, but unlike work groups, invocations can communicate with each other. The number of invocations in each work group is specified inside the shader.
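
To make the relationship between work groups and invocations concrete, here is a minimal CPU-side sketch in GDScript (it does not run on the GPU). It models a single axis and assumes the standard GLSL relationship ``gl_GlobalInvocationID = gl_WorkGroupID * gl_WorkGroupSize + gl_LocalInvocationID``. The function names are hypothetical and only used for illustration.

.. code-block:: gdscript

    # Hypothetical helper: global index of one invocation along a single axis.
    func global_invocation_index(work_group_id: int, local_size: int, local_id: int) -> int:
        return work_group_id * local_size + local_id

    # Hypothetical helper: every global index produced by a 1D dispatch.
    func list_global_indices(work_groups: int, local_size: int) -> Array[int]:
        var indices: Array[int] = []
        for group in range(work_groups):
            for local in range(local_size):
                indices.append(global_invocation_index(group, local_size, local))
        return indices

With 5 work groups of 2 invocations each, ``list_global_indices(5, 2)`` yields the indices 0 through 9, so every invocation gets a distinct index.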

So now let's work with a compute shader to see how it really works.

Creating a ComputeShader
------------------------

To begin using compute shaders, create a new text file called "compute_example.glsl". When you write compute shaders in Godot, you write them in GLSL directly. The Godot shader language is based on GLSL, so if you are familiar with normal shaders in Godot the syntax below will look somewhat familiar.

Let's take a look at this compute shader code:

.. code-block:: glsl

    #[compute]
    #version 450

    // Invocations in the (x, y, z) dimension
    layout(local_size_x = 2, local_size_y = 1, local_size_z = 1) in;

    // A binding to the buffer we create in our script
    layout(set = 0, binding = 0, std430) restrict buffer MyDataBuffer {
        double data[];
    }
    my_data_buffer;

    // The code we want to execute in each invocation
    void main() {
        // gl_GlobalInvocationID.x uniquely identifies this invocation across all work groups
        my_data_buffer.data[gl_GlobalInvocationID.x] *= 2.0;
    }

This code takes an array of doubles, multiplies each element by 2 and stores the results back in the buffer array.

To continue, copy the code above into your newly created "compute_example.glsl" file.

Create a local RenderingDevice
------------------------------

To interact with and execute a compute shader we need a script. Go ahead and create a new script in the language of your choice and attach it to any Node in your scene.

Now, to execute our shader we need a local :ref:`RenderingDevice <class_RenderingDevice>`, which can be created using the :ref:`RenderingServer <class_RenderingServer>`:

.. tabs::
 .. code-tab:: gdscript GDScript

    # Create a local rendering device.
    var rd := RenderingServer.create_local_rendering_device()

 .. code-tab:: csharp

    // Create a local rendering device.
    var rd = RenderingServer.CreateLocalRenderingDevice();

After that we can load the newly created shader file "compute_example.glsl" and create a precompiled version of it:

.. tabs::
 .. code-tab:: gdscript GDScript

    # Load GLSL shader
    var shader_file := load("res://compute_example.glsl")
    var shader_spirv: RDShaderSPIRV = shader_file.get_spirv()
    var shader := rd.shader_create_from_spirv(shader_spirv)

 .. code-tab:: csharp

    // Load GLSL shader
    var shaderFile = GD.Load<RDShaderFile>("res://compute_example.glsl");
    var shaderBytecode = shaderFile.GetSpirv();
    var shader = rd.ShaderCreateFromSpirv(shaderBytecode);


Provide input data
------------------

As you might remember, we want to pass an input array to our shader, multiply each element by 2, and get the results.

To pass values to a compute shader we need to create a buffer. We are dealing with an array of doubles, so we will use a storage buffer for this example.
A storage buffer takes an array of bytes and allows the CPU to transfer data to and from the GPU.

So let's initialize an array of doubles and create a storage buffer:

.. tabs::
 .. code-tab:: gdscript GDScript

    # Prepare our data. We use doubles in the shader, so we need 64 bit.
    var input := PackedFloat64Array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    var input_bytes := input.to_byte_array()

    # Create a storage buffer that can hold our double values.
    # Each double has 8 bytes (64 bit), so 10 x 8 = 80 bytes.
    var buffer := rd.storage_buffer_create(input_bytes.size(), input_bytes)

 .. code-tab:: csharp

    // Prepare our data. We use doubles in the shader, so we need 64 bit.
    var input = new double[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    var inputBytes = new byte[input.Length * sizeof(double)];
    Buffer.BlockCopy(input, 0, inputBytes, 0, inputBytes.Length);

    // Create a storage buffer that can hold our double values.
    // Each double has 8 bytes (64 bit), so 10 x 8 = 80 bytes.
    var buffer = rd.StorageBufferCreate((uint)inputBytes.Length, inputBytes);
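
The byte count passed to the storage buffer follows directly from the element size. As a small sanity check, the sketch below uses only ``PackedFloat64Array.to_byte_array()`` and ``PackedByteArray.to_float64_array()`` to confirm the size and to show that the conversion to bytes is lossless. The function names are hypothetical.

.. code-block:: gdscript

    # Hypothetical helper: number of bytes a storage buffer needs for these doubles.
    func byte_size_of(values: PackedFloat64Array) -> int:
        return values.to_byte_array().size()

    # Hypothetical helper: convert to bytes and back, as an upload followed by a readback would.
    func round_trip(values: PackedFloat64Array) -> PackedFloat64Array:
        return values.to_byte_array().to_float64_array()

For the ten doubles used in this tutorial, ``byte_size_of`` returns 80, matching the 10 x 8 bytes noted in the comments above.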

With the buffer in place we need to tell the rendering device to use this buffer.
To do that we will need to create a uniform (like in normal shaders) and assign it to a uniform set which we can pass to our shader later.

.. tabs::
 .. code-tab:: gdscript GDScript

    # Create a uniform to assign the buffer to the rendering device
    var uniform := RDUniform.new()
    uniform.uniform_type = RenderingDevice.UNIFORM_TYPE_STORAGE_BUFFER
    uniform.binding = 0 # this needs to match the "binding" in our shader file
    uniform.add_id(buffer)
    var uniform_set := rd.uniform_set_create([uniform], shader, 0) # the last parameter (the 0) needs to match the "set" in our shader file

 .. code-tab:: csharp

    // Create a uniform to assign the buffer to the rendering device
    var uniform = new RDUniform
    {
        UniformType = RenderingDevice.UniformType.StorageBuffer,
        Binding = 0
    };
    uniform.AddId(buffer);
    var uniformSet = rd.UniformSetCreate(new Array<RDUniform> { uniform }, shader, 0);


Defining a compute pipeline
---------------------------

The next step is to create a set of instructions our GPU can execute.
We need a pipeline and a compute list for that.

The steps we need to do to compute our result are:

1. Create a new pipeline.
2. Begin a list of instructions for our GPU to execute.
3. Bind our compute list to our pipeline.
4. Bind our buffer uniform to our pipeline.
5. Execute the logic of our shader.
6. End the list of instructions.

.. tabs::
 .. code-tab:: gdscript GDScript

    # Create a compute pipeline
    var pipeline := rd.compute_pipeline_create(shader)
    var compute_list := rd.compute_list_begin()
    rd.compute_list_bind_compute_pipeline(compute_list, pipeline)
    rd.compute_list_bind_uniform_set(compute_list, uniform_set, 0)
    rd.compute_list_dispatch(compute_list, 5, 1, 1)
    rd.compute_list_end()

 .. code-tab:: csharp

    // Create a compute pipeline
    var pipeline = rd.ComputePipelineCreate(shader);
    var computeList = rd.ComputeListBegin();
    rd.ComputeListBindComputePipeline(computeList, pipeline);
    rd.ComputeListBindUniformSet(computeList, uniformSet, 0);
    rd.ComputeListDispatch(computeList, xGroups: 5, yGroups: 1, zGroups: 1);
    rd.ComputeListEnd();

Note that we are dispatching the compute shader with 5 work groups in the x-axis, and one in each of the others.
Since we have 2 local invocations in the x-axis (specified in our shader), 10 compute shader invocations will be launched in total.
If you read or write to indices outside of the range of your buffer, you may access memory outside of your shader's control or parts of other variables, which may cause issues on some hardware.
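
The dispatch size above is hard-coded for 10 elements. A minimal sketch of deriving it instead, assuming a 1D dispatch and a hypothetical ``LOCAL_SIZE_X`` constant that is kept in sync with the shader by hand: round the group count up so every element is covered, and compute how many invocations would land past the end of the buffer. The function names are hypothetical as well.

.. code-block:: gdscript

    # Assumption: must be kept equal to local_size_x in the shader by hand.
    const LOCAL_SIZE_X := 2

    # Hypothetical helper: smallest work group count that covers every element.
    func work_groups_for(element_count: int, local_size: int = LOCAL_SIZE_X) -> int:
        return ceili(element_count / float(local_size))

    # Hypothetical helper: invocations that would index past the end of the buffer.
    func excess_invocations(element_count: int, local_size: int = LOCAL_SIZE_X) -> int:
        return work_groups_for(element_count, local_size) * local_size - element_count

For the 10 elements in this tutorial the result is 5 work groups with no excess. For 11 elements it would be 6 work groups and 1 excess invocation, which the shader would have to skip or the buffer would have to be padded for.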


Execute a compute shader
------------------------

After all of this we are almost done.
We still need to execute our pipeline; everything we did so far was only definition, not execution.

To execute our compute shader we just need to submit the pipeline to the GPU and wait for the execution to finish:

.. tabs::
 .. code-tab:: gdscript GDScript

    # Submit to GPU and wait for sync
    rd.submit()
    rd.sync()

 .. code-tab:: csharp

    // Submit to GPU and wait for sync
    rd.Submit();
    rd.Sync();

Ideally, you would not synchronize the RenderingDevice right away, as this causes the CPU to wait for the GPU to finish working. In our example we synchronize immediately because we want our data available for reading right away. In general, you will want to wait at least a few frames before synchronizing so that the GPU is able to run in parallel with the CPU.

Congratulations, you created and executed a compute shader. But wait, where are the results now?

Retrieving results
------------------

You may remember from the beginning of this tutorial that compute shaders don't have inputs and outputs; they simply change memory. This means we can retrieve the data from the buffer we created earlier in this tutorial.
The shader read from our array and stored the data in the same array again, so our results are already there.
Let's retrieve the data and print the results to our console.

.. tabs::
 .. code-tab:: gdscript GDScript

    # Read back the data from the buffer
    var output_bytes := rd.buffer_get_data(buffer)
    var output := output_bytes.to_float64_array()
    print("Input: ", input)
    print("Output: ", output)

 .. code-tab:: csharp

    // Read back the data from the buffer
    var outputBytes = rd.BufferGetData(buffer);
    var output = new double[input.Length];
    Buffer.BlockCopy(outputBytes, 0, output, 0, outputBytes.Length);
    // Join the elements so the values are printed rather than the array's type name
    GD.Print("Input: ", string.Join(", ", input));
    GD.Print("Output: ", string.Join(", ", output));
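
Since the shader only doubles each element, the values read back can be checked against the same computation done on the CPU. The following is a minimal sketch of such a check. The function names are hypothetical, and the exact comparison relies on doubling being exact for floating-point values.

.. code-block:: gdscript

    # Hypothetical helper: what the shader is expected to produce, computed on the CPU.
    func cpu_reference(values: PackedFloat64Array) -> PackedFloat64Array:
        var doubled := PackedFloat64Array()
        for value in values:
            doubled.append(value * 2.0)
        return doubled

    # Hypothetical helper: true if the GPU output matches the CPU reference exactly.
    func matches_reference(input: PackedFloat64Array, output: PackedFloat64Array) -> bool:
        return output == cpu_reference(input)

Calling ``matches_reference(input, output)`` with the arrays from this tutorial is one way to confirm that the dispatch covered every element.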

Conclusion
----------

Working with compute shaders is a little cumbersome to start, but once you have the basics working in your program you can scale up the complexity of your shader without making many changes to your script.

tutorials/shaders/index.rst

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ Shaders
 your_first_shader/index
 shader_materials
 visual_shaders
+compute_shaders
 screen-reading_shaders
 converting_glsl_to_godot_shaders
 shaders_style_guide
