
Conversation

mitchellxh (Contributor) commented Oct 10, 2025

Summary

  • avoid retaining per-image probability tensors on the GPU during prediction
  • detach probabilities to CPU and drop intermediate GPU tensors after each batch
  • prevents large runs (~7k images) from exceeding the available 32 GB of VRAM

Fixes #145.
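
For reference, a minimal sketch of the per-batch pattern described above follows. The loop structure, encoder call, and dataloader are illustrative assumptions about the surrounding code rather than the project's actual API; only the detach-to-CPU step corresponds to this change.

import torch

@torch.no_grad()
def predict_probabilities(model, dataloader, txt_features, device="cuda"):
    # Keep only CPU copies of the per-batch probabilities so GPU memory
    # stays bounded by the batch size instead of growing with the run.
    all_probs = []
    for images in dataloader:
        img_features = model.encode_image(images.to(device))  # assumed encoder call
        probs = model.create_probabilities(img_features, txt_features)  # name taken from the snippet discussed below
        all_probs.append(probs.detach().cpu())  # move each result off the GPU
        # img_features and probs are reassigned on the next iteration, so the
        # previous GPU tensors become unreferenced and their memory is reused
    return torch.cat(all_probs)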

hlapp (Member) commented Oct 10, 2025

@mitchellxh thanks so much for your contribution!! Would you mind creating an issue that documents the problem you're seeing without this change? We can then link this PR to it.

Depending on the nature of the problem, we might also want to add a test case.

hlapp (Member) commented Oct 11, 2025

@mitchellxh many thanks for posting the issue! Is the del step required, or would the following also work:

probs = self.create_probabilities(img_features, txt_features)
probs = probs.detach().cpu()

hlapp (Member) commented Oct 11, 2025

And for img_features, did you see that the variable going out of scope does not release the GPU memory, necessitating explicit deletion of the object?

mitchellxh (Contributor, Author) commented

@hlapp the deletion step is not needed!

img_features can be kept in check by setting a batch size appropriate for the available VRAM.
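
To illustrate both points with a small, self-contained example (the sizes below are made up): once the name is reassigned for the next batch, the previous tensor is unreferenced and its GPU memory returns to PyTorch's caching allocator, so no explicit del is needed, and peak usage is governed by the batch size.

import torch

batch_size = 64      # chosen to fit the available VRAM
feature_dim = 512    # illustrative size only

for step in range(3):
    # Reassigning the name drops the reference to the previous batch's
    # tensor, so its GPU memory is released back to the allocator.
    img_features = torch.randn(batch_size, feature_dim, device="cuda")
    print(f"step {step}: {torch.cuda.memory_allocated() / 1e6:.1f} MB allocated")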

hlapp (Member) commented Oct 11, 2025

I've simplified the fix to what I understand is strictly necessary. @mitchellxh if you have time, could you check that the result is still equivalent in outcome to your initial patch?

hlapp requested a review from Copilot on October 11, 2025 at 17:30
Copilot AI (Contributor) left a comment


Pull Request Overview

This PR fixes a GPU memory management issue by releasing GPU tensors after prediction to prevent VRAM accumulation during large batch processing.

  • Adds explicit tensor detachment from GPU to CPU after probability computation
  • Prevents GPU memory exhaustion when processing large image sets (~7k images)
  • Includes explanatory comment for the memory management fix

hlapp (Member) left a comment


Again, thanks @mitchellxh for your contribution. I'll wait a bit for your 👍🏻 to my simplification before merging to ensure I haven't missed or misunderstood anything.

mitchellxh (Contributor, Author) commented

Confirmed! VRAM usage remains stable with your simplified fix.
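
One way to spot-check this on a large run (not part of the patch; the predict call below is a hypothetical placeholder) is to read PyTorch's peak-memory counter around the prediction:

import torch

torch.cuda.reset_peak_memory_stats()
# ... run the full prediction over the image set here, e.g.
# results = predictor.predict(image_paths)   # hypothetical call
peak_gb = torch.cuda.max_memory_allocated() / 1e9
print(f"peak GPU memory during prediction: {peak_gb:.2f} GB")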

hlapp merged commit 412ed1f into Imageomics:main on Oct 13, 2025


Development

Successfully merging this pull request may close these issues.

GPU OOM during prediction
