KoalaQin
Contributor

  1. Downloaded the Ensembl 101 and 105 gene intervals, parsed them into Hail format, and stored them as reference data.
  2. Added code that counts, for the interval of each protein-coding gene, the total number of variants and the number of variants annotated with the "protein-coding" biotype. This shows whether a gene is not covered by the gnomAD release data or whether there might be an annotation issue.
  3. Confirmed no issues in the VEP 105 annotation for the v4 exomes and genomes, but 4 genes in the v3 genomes were mistakenly annotated by VEP 101 as having no protein-coding variants despite being protein-coding genes.
  4. ~50-70 genes were not covered by the release data, possibly because of low coverage in these gene regions?
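
To make the check in points 2-4 concrete, here is a minimal sketch of the counting logic in plain Python. It is not the Hail code from this PR: the names `GeneInterval`, `Variant`, `count_variants_per_gene`, and `classify_genes` are hypothetical, the toy data is invented for illustration, and the biotype string `"protein_coding"` is an assumption about how the annotation is spelled.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneInterval:
    """Hypothetical stand-in for one protein-coding gene interval."""
    gene: str
    contig: str
    start: int  # inclusive
    end: int    # inclusive


@dataclass(frozen=True)
class Variant:
    """Hypothetical stand-in for one annotated variant."""
    contig: str
    pos: int
    biotypes: frozenset  # biotypes seen across the variant's annotations


def count_variants_per_gene(intervals, variants):
    """Return {gene: (total variants, protein-coding variants)} per interval."""
    by_contig = defaultdict(list)
    for v in variants:
        by_contig[v.contig].append(v)
    for vs in by_contig.values():
        vs.sort(key=lambda v: v.pos)
    positions = {c: [v.pos for v in vs] for c, vs in by_contig.items()}

    counts = {}
    for iv in intervals:
        pos = positions.get(iv.contig, [])
        lo = bisect_left(pos, iv.start)
        hi = bisect_right(pos, iv.end)
        in_interval = by_contig.get(iv.contig, [])[lo:hi]
        # Assumed biotype spelling; adjust to match the real annotation.
        n_pc = sum("protein_coding" in v.biotypes for v in in_interval)
        counts[iv.gene] = (len(in_interval), n_pc)
    return counts


def classify_genes(counts):
    """Split genes into 'not covered' and 'possible annotation issue'."""
    not_covered = sorted(g for g, (n, _) in counts.items() if n == 0)
    suspect = sorted(g for g, (n, pc) in counts.items() if n > 0 and pc == 0)
    return not_covered, suspect


# Toy data, invented for illustration only.
intervals = [
    GeneInterval("GENE_A", "chr1", 100, 200),
    GeneInterval("GENE_B", "chr1", 300, 400),
    GeneInterval("GENE_C", "chr2", 10, 20),
]
variants = [
    Variant("chr1", 150, frozenset({"protein_coding"})),
    Variant("chr1", 180, frozenset({"lncRNA"})),
    Variant("chr1", 250, frozenset({"protein_coding"})),  # outside all intervals
    Variant("chr1", 350, frozenset({"lncRNA"})),
]

counts = count_variants_per_gene(intervals, variants)
not_covered, suspect = classify_genes(counts)
```

In this toy run, a gene with zero variants lands in `not_covered` (the situation described in point 4), while a gene that has variants but none with the protein-coding biotype lands in `suspect` (the situation described in point 3).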

@KoalaQin KoalaQin requested a review from jkgoodrich May 27, 2023 02:21
@KoalaQin KoalaQin self-assigned this May 27, 2023
Contributor

@jkgoodrich jkgoodrich left a comment

Some initial thoughts for modifications

@KoalaQin
Contributor Author

KoalaQin commented Jun 1, 2023

Put the count functions inside gnomad_qc/v4/annotations/generate_variant_qc_annotations.py.

@KoalaQin KoalaQin requested a review from jkgoodrich June 5, 2023 01:16
Contributor

@jkgoodrich jkgoodrich left a comment

Some suggestions to simplify the code and maybe make it just a little faster.

@KoalaQin KoalaQin requested a review from jkgoodrich June 15, 2023 18:43
@KoalaQin
Contributor Author

@jkgoodrich I made the changes you suggested; it's back to you for a 2nd round, thanks!

Contributor

@jkgoodrich jkgoodrich left a comment

Just a few more things

@KoalaQin KoalaQin requested a review from jkgoodrich June 20, 2023 14:49
Contributor

@jkgoodrich jkgoodrich left a comment

Just one more small thing then it LGTM

@KoalaQin KoalaQin requested a review from jkgoodrich June 20, 2023 18:24
Contributor

@jkgoodrich jkgoodrich left a comment

LGTM

@KoalaQin KoalaQin merged commit becc882 into main Jun 20, 2023
@KoalaQin
Contributor Author

KoalaQin commented Jun 20, 2023

resources imported with:
hailctl dataproc submit qh1 /Users/heqin/PycharmProjects/gnomad_methods/gnomad/resources/import_resources.py grch38.ensembl_interval.101

hailctl dataproc submit qh1 /Users/heqin/PycharmProjects/gnomad_methods/gnomad/resources/import_resources.py grch38.ensembl_interval.105

@KoalaQin KoalaQin deleted the qh/valid_vep branch August 23, 2023 16:19