Skip to content

Conversation

@holmeso
Copy link
Contributor

@holmeso holmeso commented Aug 5, 2025

Description

This pull request introduces several improvements and bug fixes to the qsnp pipeline, primarily focused on memory management during BAM processing, code modernization, and test cleanup. The most significant update is the addition of a safeguard to limit memory usage when encountering large numbers of poor-quality reads at the same genomic position. Additionally, the codebase is modernized with Java 8+ idioms, and tests are updated for clarity and conciseness.

Pipeline memory and performance improvements:

  • Added logic in the Producer class to limit the number of failed filter records per start position to 1000, preventing memory issues when many poor-quality reads map to the same location (Pipeline.java). [1] [2]

Code modernization and cleanup:

  • Replaced explicit type arguments with the diamond operator (<>) and updated collection usage to use Java 8+ idioms, such as List::toList instead of stream().collect(Collectors.toList()) (Pipeline.java, StandardPipelineTest.java). [1] [2] [3] [4]
  • Updated deprecated or verbose code, such as using getFirst() instead of get(0) for lists and simplifying exception messages (Pipeline.java). [1] [2]

Test improvements and cleanup:

  • Updated assertions in tests to use more descriptive and modern JUnit methods (PipelineTest.java). [1] [2]
  • Removed unused imports and deprecated or ignored test methods, streamlining the test codebase (PipelineTest.java, StandardPipelineTest.java). [1] [2] [3]

Documentation and naming consistency:

  • Improved parameter naming and documentation for clarity in method signatures and comments (Pipeline.java). [1] [2]

These changes collectively improve the pipeline's robustness, maintainability, and code clarity.) were we now keep the bases of the failed filter reads rather than just a count in the Accumulator object.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

No new unit tests, but I ran this against the 47 failed SDFTM runs in cromwell prod and they all succeeded.

Are WDL Updates Required?

Not specifically, but the adamajava version used in the various workflows will need to be updated once this change has been released

@holmeso
Copy link
Contributor Author

holmeso commented Aug 5, 2025

fyi, copilot wrote the bulk of this PR summary

@newellf
Copy link
Contributor

newellf commented Aug 5, 2025

Like the PR summary by (mostly!) copilot

@holmeso holmeso merged commit 2664248 into master Aug 5, 2025
1 check passed
@holmeso holmeso deleted the qsnp_unfiltered_read_bug branch August 5, 2025 01:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants