validity check code of VEP annotations in protein-coding genes #548
KoalaQin
commented
May 27, 2023
- Downloaded the Ensembl 101 and 105 gene intervals, parsed them into Hail format, and stored them as reference data.
- Added code that counts, for the interval of each protein-coding gene, the total number of variants and the number of variants annotated with the "protein-coding" biotype. This shows whether a gene is not covered by the gnomAD release data or whether there might be an annotation issue.
- Confirmed no issues in the VEP 105 annotations for the v4 exomes and genomes. In the v3 genomes, however, VEP 101 mistakenly annotated 4 protein-coding genes as having no protein-coding variants.
- ~50-70 genes were not covered by the release data, maybe due to low coverage in these gene regions?
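The check described above can be sketched in plain Python to make the logic concrete. This is only an illustration, not the Hail code in this PR: the function names `count_variants_per_gene` and `classify_genes`, the gene names, and the toy data are all hypothetical, and the sketch assumes each variant carries the set of VEP biotypes from its transcript consequences. It also ignores overlapping genes, where a `protein_coding` annotation could belong to a neighbouring gene.

```python
def count_variants_per_gene(gene_intervals, variants):
    """Count total and protein-coding-annotated variants per gene interval.

    gene_intervals: {gene: (contig, start, end)}, coordinates inclusive.
    variants: iterable of (contig, position, set_of_biotypes).
    Returns {gene: (total_variants, protein_coding_variants)}.
    """
    counts = {gene: [0, 0] for gene in gene_intervals}
    for contig, pos, biotypes in variants:
        for gene, (g_contig, start, end) in gene_intervals.items():
            if contig == g_contig and start <= pos <= end:
                counts[gene][0] += 1
                if "protein_coding" in biotypes:
                    counts[gene][1] += 1
    return {gene: tuple(c) for gene, c in counts.items()}


def classify_genes(counts):
    """Split genes into 'not covered' and 'possible annotation issue'."""
    # No variants at all in the interval: gene is not covered by the data.
    not_covered = sorted(g for g, (total, _) in counts.items() if total == 0)
    # Variants present, but none annotated as protein-coding: suspicious.
    suspicious = sorted(
        g for g, (total, pc) in counts.items() if total > 0 and pc == 0
    )
    return not_covered, suspicious


# Hypothetical toy data, for illustration only.
genes = {
    "GENE_A": ("chr1", 100, 200),
    "GENE_B": ("chr1", 300, 400),
    "GENE_C": ("chr2", 50, 80),
}
variants = [
    ("chr1", 150, {"protein_coding"}),
    ("chr1", 160, {"lncRNA"}),
    ("chr1", 350, {"lncRNA"}),
]
counts = count_variants_per_gene(genes, variants)
not_covered, suspicious = classify_genes(counts)
```

In this toy example `GENE_C` has no variants (not covered), while `GENE_B` has a variant but none with the `protein_coding` biotype, which is the pattern that would point to an annotation problem.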
Some initial thoughts for modifications
Put the count functions inside: gnomad_qc/v4/annotations/generate_variant_qc_annotations.py
Some suggestions to simplify the code and maybe make it just a little faster.
@jkgoodrich I made the changes you suggested, back to you for a 2nd round, thx!
Just a few more things
Just one more small thing then it LGTM
LGTM
resources imported with: