perf: Prevent GPU OOM and Improve Processing Efficiency in estimate_pose module #157

tankchenggeng · 2025-08-27T10:09:01Z

📝 What does this PR do?

This PR introduces a major refactor of the estimate_pose module in video2motion.py to address critical memory and performance limitations:

Generator for Lazy Loading: Replaces the list used for loading images to the device (GPU) with a generator.
Chunk Processing: Implements a chunking mechanism to process data_chunks in smaller, manageable batches.

🚫 Motivation and Context

GPU Memory Overflow (OOM): The original code loaded all video frames into GPU memory at once ([frame.to(device) for frame in all_frames]), causing CUDA Out Of Memory errors with longer videos.
Exponential Time Complexity: Processing time in the original algorithm scaled poorly with input size, making long videos impractical to process.

With these changes, peak resource usage depends on the processing batch size rather than on the total video length.
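To illustrate the difference between the two loading patterns, here is a minimal sketch. It is not the code from video2motion.py: `eager_transfer`, `lazy_transfer`, and `fake_to_device` are hypothetical names, and `fake_to_device` stands in for `frame.to(device)` so the example runs without a GPU.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

Frame = TypeVar("Frame")
OnDevice = TypeVar("OnDevice")


def eager_transfer(frames: Iterable[Frame],
                   to_device: Callable[[Frame], OnDevice]) -> List[OnDevice]:
    """List pattern: every frame is transferred before any is consumed."""
    return [to_device(frame) for frame in frames]


def lazy_transfer(frames: Iterable[Frame],
                  to_device: Callable[[Frame], OnDevice]) -> Iterator[OnDevice]:
    """Generator pattern: a frame is transferred only when it is requested."""
    return (to_device(frame) for frame in frames)


# Hypothetical stand-in for frame.to(device) that records each transfer.
transferred: List[int] = []


def fake_to_device(frame: int) -> int:
    transferred.append(frame)
    return frame


lazy = lazy_transfer(range(1000), fake_to_device)
transfers_before_iteration = len(transferred)  # no frame has been moved yet
first_frame = next(lazy)
transfers_after_first = len(transferred)       # exactly one frame has been moved
```

The generator only bounds memory if the consumer does not collect every yielded item into a list again; results that must be kept should be moved off the device or reduced as they are produced.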

🔧 Changes Made

File Modified: video2motion.py

Specific Changes:

Generator Expression:
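- func(images_crop): Replaced the list comprehension with a generator expression so frames are moved to the device on demand rather than all at once.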
Chunk Processing Logic:
- Refactored the processing loop to break down the input data_chunks into smaller, fixed-size batches.
- This prevents the algorithm from processing excessively large amounts of data in a single operation, changing the time complexity from exponential to near-linear for the overall video processing task.
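The chunking logic described above can be sketched roughly as follows. This is an assumption about the general shape, not the actual loop in video2motion.py: `iter_chunks`, `process_in_chunks`, `process_batch`, and the default `chunk_size` are hypothetical.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def iter_chunks(items: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def process_in_chunks(items: Iterable[T],
                      process_batch: Callable[[List[T]], List[R]],
                      chunk_size: int = 32) -> List[R]:
    """Run process_batch on one chunk at a time and concatenate the results."""
    results: List[R] = []
    for chunk in iter_chunks(items, chunk_size):
        # Only this chunk is handed to the (hypothetical) batch function.
        results.extend(process_batch(chunk))
    return results


chunks = list(iter_chunks(range(7), 3))
doubled = process_in_chunks(range(5), lambda batch: [x * 2 for x in batch], chunk_size=2)
```

With a fixed chunk size b, a video of n frames needs about n / b calls, each with a cost that depends only on b, so the total grows roughly linearly with n regardless of how the per-batch cost scales with batch size.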

🧪 How Has This Been Tested?

Functional Regression Test:
- Ran the script on a short video (<5s) and verified the output matches the original behavior.
Performance & Stress Test:
- Processed a long video (>5 minutes) that previously caused an OOM error. The script now completes successfully with manageable memory usage and processing time.
- Monitored GPU memory usage via nvidia-smi. The memory footprint is now stable and significantly lower throughout the entire process, instead of climbing until a crash.
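For anyone who wants to repeat the memory check, one possible way to sample GPU memory from a script is sketched below. It is not part of this PR; the helper names are hypothetical, and it assumes `nvidia-smi` is on the PATH and reports a plain integer (MiB) per GPU for the `memory.used` query.

```python
import subprocess
from typing import List

# Query used memory per GPU as bare integers in MiB, one line per GPU.
QUERY = ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"]


def parse_memory_used(output: str) -> List[int]:
    """Parse nvidia-smi output into one integer (MiB) per non-empty line."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def query_memory_used() -> List[int]:
    """Run nvidia-smi once and return used memory in MiB for each GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_used(result.stdout)
```

Calling `query_memory_used()` at regular intervals while the script runs, for example once per processed chunk, gives a series of readings that should stay flat if memory is bounded by the batch size.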

✅ Checklist

My code follows the established code style of this project.
I have thoroughly tested my changes, both for correct functionality and performance improvements.
This change is a refactor of existing code and does not require documentation updates.

perf(estimate_pose): replace list with generator and implement chunki…

9facd06

…ng for memory and speed
