drm/v3d: Address race-condition between per-fd GPU stats and fd release #6973
Merged
pelwell merged 4 commits into raspberrypi:rpi-6.12.y from mairacanal:v3d/downstream/queue-lock-v2 on Jul 25, 2025
+83 −89
Conversation
Instead of storing a pointer to the DRM file data, store a pointer directly to the private V3D file struct.

Signed-off-by: Maíra Canal <[email protected]>
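A minimal sketch of this change in kernel-style C. The `file_priv` field name comes from the later patches in this series and the cover letter's mention of `job->file`; the surrounding struct layout is illustrative, not the literal diff.

```c
struct v3d_job {
	struct drm_sched_job base;
	struct v3d_dev *v3d;

	/* Before: a pointer to the generic DRM file, dereferenced as
	 * job->file->driver_priv whenever per-fd stats were updated. */
	/* struct drm_file *file; */

	/* After: point straight at the driver's per-fd state, so the
	 * stats path no longer needs the extra driver_priv hop. */
	struct v3d_file_priv *file_priv;
};
```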
Instead of storing the queue's active job in four different variables, store the active job inside the queue's state. This way, it's possible to access all active jobs using an index based on `enum v3d_queue`.

Signed-off-by: Maíra Canal <[email protected]>
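A sketch of the refactor under the same assumptions: one `active_job` slot per queue replaces the four dedicated pointers, so any queue's job becomes reachable by `enum v3d_queue` index.

```c
struct v3d_queue_state {
	struct drm_gpu_scheduler sched;

	/* Replaces the former bin_job, render_job, tfu_job and csd_job
	 * pointers that lived directly in struct v3d_dev. */
	struct v3d_job *active_job;
};

struct v3d_dev {
	/* ... */
	struct v3d_queue_state queue[V3D_MAX_QUEUES];
};

/* Any queue's active job is now addressable by index: */
static struct v3d_job *v3d_active_job(struct v3d_dev *v3d, enum v3d_queue q)
{
	return v3d->queue[q].active_job;
}
```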
Each V3D queue works independently and all the dependencies between the jobs are handled through the DRM scheduler. Therefore, there is no need to use one single lock for all queues. Using it creates unnecessary contention between different queues that can operate independently. Replace the global spinlock with per-queue locks to improve parallelism and reduce contention between different V3D queues (BIN, RENDER, TFU, CSD). This allows independent queues to operate concurrently while maintaining proper synchronization within each queue.

Signed-off-by: Maíra Canal <[email protected]>
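A sketch of the new locking scheme: each queue carries its own spinlock, so the completion path for one queue no longer contends with the others. The `queue_lock` name follows the branch name, and the helper below is illustrative, not the actual patch.

```c
struct v3d_queue_state {
	/* ... */
	struct v3d_job *active_job;
	spinlock_t queue_lock;	/* protects active_job (and its file_priv) */
};

/* For example, finishing a render job now touches only the
 * V3D_RENDER lock; BIN, TFU and CSD proceed under their own locks. */
static void v3d_render_job_done_sketch(struct v3d_dev *v3d)
{
	struct v3d_queue_state *queue = &v3d->queue[V3D_RENDER];

	spin_lock(&queue->queue_lock);
	/* ... update stats and signal the job's fence ... */
	queue->active_job = NULL;
	spin_unlock(&queue->queue_lock);
}
```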
When the file descriptor is closed while a job is still running, there's a race condition between the job completion callback and the file descriptor cleanup. This can lead to accessing freed memory when updating per-fd GPU stats, such as in the following example:

```
[56120.512903] Unable to handle kernel paging request at virtual address 0000330a92b9688a
[56120.520881] Mem abort info:
[56120.523687]   ESR = 0x0000000096000005
[56120.527454]   EC = 0x25: DABT (current EL), IL = 32 bits
[56120.532785]   SET = 0, FnV = 0
[56120.535847]   EA = 0, S1PTW = 0
[56120.538995]   FSC = 0x05: level 1 translation fault
[56120.543891] Data abort info:
[56120.546778]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[56120.552289]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[56120.557362]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[56120.562690] user pgtable: 16k pages, 47-bit VAs, pgdp=0000000023f54000
[56120.569239] [0000330a92b9688a] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[56120.577975] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
CPU: 0 UID: 1000 PID: 1497409 Comm: mpv Not tainted 6.12.37-ncvm5+ #1
Hardware name: Raspberry Pi 5 Model B Rev 1.0 (DT)
pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : v3d_job_update_stats+0x64/0x168 [v3d]
lr : v3d_job_update_stats+0x40/0x168 [v3d]
sp : ffffc00080003e60
x29: ffffc00080003e60 x28: ffff800002860000 x27: 0000000000000000
x26: 0000000000000000 x25: ffff800002860000 x24: ffff800002630800
x23: ffff800060786000 x22: 0000330a933c31fb x21: 0000000000000001
x20: 0000330a92b96302 x19: ffff800060786b10 x18: 0000000000000000
x17: ffffaf90506a0000 x16: ffffd06fce57c360 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
x11: 0000000000000000 x10: 0000000000000000 x9 : ffffd06f5d0fec40
x8 : 0000000000000000 x7 : 0000000000000000 x6 : 000002978dbd535a
x5 : 00ffffffffffffff x4 : 0000000000000015 x3 : 0000300001fddf88
x2 : 0000000000000020 x1 : 0000000000010001 x0 : 0000330a92b96872
Call trace:
 v3d_job_update_stats+0x64/0x168 [v3d]
 v3d_irq+0x118/0x2e0 [v3d]
 __handle_irq_event_percpu+0x60/0x220
```

Fix such an issue by protecting all accesses to `job->file_priv` with the queue's lock. With that, we can clear `job->file_priv` before the V3D per-fd structure is freed and ensure that `job->file_priv` exists during the per-fd GPU stats updates.

Fixes: e1bc3a1 ("drm/v3d: Avoid NULL pointer dereference in `v3d_job_update_stats()`")
Signed-off-by: Maíra Canal <[email protected]>
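A sketch of the fix itself: on fd release, walk every queue and detach the dying per-fd state from any still-active job under that queue's lock; the IRQ-time stats update takes the same lock and can then rely on `job->file_priv` being either valid or NULL. The helper name `v3d_detach_file_priv` is hypothetical, chosen for illustration.

```c
/* Called from the fd-close path, before struct v3d_file_priv is freed. */
static void v3d_detach_file_priv(struct v3d_dev *v3d,
				 struct v3d_file_priv *file_priv)
{
	unsigned long irqflags;
	enum v3d_queue q;

	for (q = 0; q < V3D_MAX_QUEUES; q++) {
		struct v3d_queue_state *queue = &v3d->queue[q];

		/* _irqsave: the same lock is taken from the v3d IRQ
		 * handler, where the per-fd stats are updated. */
		spin_lock_irqsave(&queue->queue_lock, irqflags);
		if (queue->active_job &&
		    queue->active_job->file_priv == file_priv)
			queue->active_job->file_priv = NULL;
		spin_unlock_irqrestore(&queue->queue_lock, irqflags);
	}
}
```

With this in place, `v3d_job_update_stats()` can check `job->file_priv` under the queue's lock: either the fd is still open and the pointer is valid, or it has already been cleared and the per-fd update is skipped, which closes the use-after-free window shown in the oops above.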
popcornmix approved these changes on Jul 25, 2025
Changes look sane, and a test of kodi, neverball and webgl aquarium worked fine.
I also built a kernel out of this branch and submitted it to Mesa CI to run the full CTS testing suite. No regressions were identified. Thanks for your review!
popcornmix added a commit to raspberrypi/firmware that referenced this pull request on Aug 1, 2025:

kernel: drm/v3d: Address race-condition between per-fd GPU stats and fd release
See: raspberrypi/linux#6973

kernel: pwm: rp1: use pwmchip_get_drvdata() instead of container_of()
See: raspberrypi/linux#6972
popcornmix added a commit to raspberrypi/rpi-firmware that referenced this pull request on Aug 1, 2025:

kernel: drm/v3d: Address race-condition between per-fd GPU stats and fd release
See: raspberrypi/linux#6973

kernel: pwm: rp1: use pwmchip_get_drvdata() instead of container_of()
See: raspberrypi/linux#6972
This patch series was initially motivated by a race condition (exposed in PATCH 4/4) where we lacked synchronization for `job->file` access. This led to use-after-free issues when a file descriptor was closed while a job was still running. However, beyond fixing this specific race, the series introduces broader improvements to active job management and locking. While PATCH 1/4 and 2/4 are primarily code refactors, PATCH 3/4 introduces changes to the locking scheme. Previously, all queues shared the same spinlock, which caused unnecessary contention during high GPU usage across different queues. PATCH 3/4 allows queues to operate more independently.
[1] Upstream series: https://lore.kernel.org/dri-devel/[email protected]/T/
Closes #6958