Skip to content

Conversation

@mike-w-wilson
Copy link
Contributor

This unfurls and runs gnomad_methods validate_release_t on the v4 exomes release HT. It depends on broadinstitute/gnomad_methods#625 and there is some room for improvements but I just wanted to get it in. It writes out the HT with unfurled annotations that will move to the next step.

@mike-w-wilson
Copy link
Contributor Author

Loggers seem to not be working for me right now

@mike-w-wilson
Copy link
Contributor Author

Loggers are a gnomad_methods issue: https://the-tgg.slack.com/archives/C6BD1M01E/p1697463923095569

@mike-w-wilson
Copy link
Contributor Author

Copy link
Contributor

@jkgoodrich jkgoodrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is looking good, I only looked at it for logic, which is totally fine at this point.

I think we might want to add in some other checks, like a check for length of globals being the expected length, and maybe counts being the expected counts. Maybe check for any uses of 'pop' or 'oth' in the globals?

Are there any other checks that we thing should be done on the globals? @ch-kr ?

Then only a few small suggestions

@mike-w-wilson
Copy link
Contributor Author

I've updated this and it depends on broadinstitute/gnomad_methods#630. In the prepare_ht_for_validation we select a subset of fields needed for VCF export and this gets checkpointed. I didnt want to add the freqs and fafs to this since they are so massive and are not needed downstream. For this reason, I put the global checks before that instead of calling it in gnomad methods validate_release_t.

@mike-w-wilson
Copy link
Contributor Author

Copy link
Contributor

@jkgoodrich jkgoodrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only very small changes. I love seeing the output from the new checks, thank you so much for adding!!

NEW_SITE_FIELDS = [
"monoallelic",
"only_het",
"transmitted_singleton",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add sibling singletons

from gnomad_qc.v4.resources.basics import get_logging_path
from gnomad_qc.v4.resources.release import release_sites, validated_release_ht

# TODO: Check global lengths match the fields they reference
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# TODO: Check global lengths match the fields they reference

"release_ht": [
"gs://gnomad-tmp/gnomad.exomes.v4.0.qc_data/release/gnomad.exomes.sites.test.updated_101723.ht"
]
}, # release_sites().ht()},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is just for testing, but don't forget to change it back

return vep_csq_header


def check_globals_for_retired_terms(ht: hl.Table) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thank you!!

@mike-w-wilson
Copy link
Contributor Author

Back to you @jkgoodrich -- agree on seeing the output of the globals and len checks as confirmations, good call on adding those

Copy link
Contributor

@jkgoodrich jkgoodrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two small things then lgtm

Copy link
Contributor

@jkgoodrich jkgoodrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mike-w-wilson mike-w-wilson merged commit 496af45 into main Oct 19, 2023
@mike-w-wilson mike-w-wilson deleted the mw/validity_check_v4_exomes branch February 22, 2024 13:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants