@qclayssen (Collaborator) commented Aug 19, 2025

Addresses PCGR's limitation in processing hypermutated samples with over 500,000 variants. The current threshold requires filtering, which can introduce discrepancies.

Key Changes:

  1. Chunked VCF Processing:

    • Removed variant downsampling.
    • Split input VCF files into chunks of 450,000 variants.
    • Process chunks in parallel to bypass the variant limit and retain data integrity.
  2. Parallel Processing:

    • Improved efficiency and reduced runtime through parallel chunk processing.
  3. Logging Enhancements:

    • Enhanced logging for better monitoring and error tracking.
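
The chunk-and-parallelise approach described above can be sketched roughly as follows. This is a minimal illustration only and not the code in this PR: `split_vcf_lines`, `count_records`, and `process_in_parallel` are hypothetical names, the per-chunk step is a stand-in that merely counts records, and the use of threads rather than separate processes is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor

# Chunk size taken from the PR description (450,000 variants per chunk)
MAX_SOMATIC_VARIANTS = 450_000


def split_vcf_lines(lines, chunk_size=MAX_SOMATIC_VARIANTS):
    """Yield chunks of VCF lines; each chunk repeats the full header
    and holds at most chunk_size variant records."""
    header = []
    records = []
    for line in lines:
        if line.startswith("#"):
            # Header lines (## meta and #CHROM) are copied into every chunk
            header.append(line)
            continue
        records.append(line)
        if len(records) == chunk_size:
            yield header + records
            records = []
    if records:
        # Final, possibly smaller, chunk
        yield header + records


def count_records(chunk):
    """Hypothetical stand-in for the real per-chunk annotation step."""
    return sum(1 for line in chunk if not line.startswith("#"))


def process_in_parallel(lines, chunk_size=MAX_SOMATIC_VARIANTS, workers=4):
    """Process every chunk concurrently; results keep chunk order."""
    chunks = split_vcf_lines(lines, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_records, chunks))
```

Because every chunk carries the header, each one remains a valid VCF on its own and no variant is dropped, which is the property the description relies on when it replaces downsampling with chunking.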

@qclayssen force-pushed the feature/hypermutation branch from 367015f to 7386015 August 26, 2025 05:48
@qclayssen self-assigned this Aug 26, 2025
@qclayssen added the enhancement label Aug 26, 2025
@qclayssen requested a review from scwatts August 26, 2025 06:08
@qclayssen marked this pull request as ready for review August 26, 2025 06:12
## Variation selection (annotation) ##
######################################
-MAX_SOMATIC_VARIANTS = 500_000
+MAX_SOMATIC_VARIANTS = 450_000

Update PR description to match this new change

@scwatts (Member) commented Aug 29, 2025

I've pushed a small commit to get the tests working again - please take a look and see whether this aligns with your expectations wrt your changes in the PR

@scwatts (Member) commented Aug 29, 2025

Approved, please merge at your discretion!

@qclayssen merged commit d792557 into release/0.3.0 Aug 29, 2025
2 checks passed
qclayssen added a commit that referenced this pull request Sep 15, 2025
qclayssen added a commit that referenced this pull request Oct 2, 2025
qclayssen added a commit that referenced this pull request Oct 2, 2025