Add import code for GTEx v10 RSEM #742
Just 2 things, but looks good
from gnomad.utils.vep import vep_or_lookup_vep


def _import_gtex_rsem(gtex_path: str, meta_path: str, **kwargs) -> hl.MatrixTable:
Is this exactly the same as the one in grch37? If so, we should probably just import it from there
It is. I was just wondering whether you used this function to create that MT, since it seems to produce the same structure as I see in the MT.
Also, importing this underscore-named function gives a warning: access to a protected member * of a module.
Yeah, don't worry about the warning, or remove the _. Let me confirm, but I think it's the same function.
I did this
import hail as hl
from gnomad.resources.grch37.reference_data import _import_gtex_rsem

try:
    # Reuse the MatrixTable if it has already been written.
    mt = hl.read_matrix_table("gs://gnomad/resources/grch38/gtex/gtex_rsem_v10.mt")
except Exception:
    # Otherwise import from the GTEx v10 source files and checkpoint the result.
    mt = _import_gtex_rsem(
        gtex_path="gs://gnomad/resources/gtex/v10/GTEx_Analysis_2022-06-06_v10_RSEMv1.3.3_transcripts_tpm.txt.bgz",
        meta_path="gs://gnomad/resources/gtex/v10/GTEx_Analysis_v10_Open_Access_Reduced_Annotations_SampleAttributesDS.txt.bgz",
        min_partitions=1000,
    ).checkpoint("gs://gnomad/resources/grch38/gtex/gtex_rsem_v10.mt", _read_if_exists=True)
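For reference, the read-if-present, otherwise-import pattern in the snippet above can be sketched without Hail. This is only an illustration: `read_or_build`, `reader`, `builder`, and `writer` are hypothetical names, not part of gnomad or Hail, and an in-memory dict stands in for the bucket.

```python
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")


def read_or_build(
    path: str,
    reader: Callable[[str], T],
    builder: Callable[[], T],
    writer: Callable[[T, str], T],
) -> T:
    """Return the resource at `path`; if reading fails, build it and write it."""
    try:
        return reader(path)
    except Exception:
        # Reading failed (for example, the resource does not exist yet),
        # so fall back to the expensive import and persist the result.
        return writer(builder(), path)


# Stand-ins for storage and the import function (assumptions for this sketch).
store: Dict[str, dict] = {}
build_calls: List[int] = []


def reader(path: str) -> dict:
    return store[path]  # raises KeyError when the resource is missing


def writer(value: dict, path: str) -> dict:
    store[path] = value
    return value


def builder() -> dict:
    build_calls.append(1)  # record how often the expensive import runs
    return {"rows": 3}


first = read_or_build("gtex_rsem_v10.mt", reader, builder, writer)
second = read_or_build("gtex_rsem_v10.mt", reader, builder, writer)
```

The second call reads the stored value, so the import runs only once. In the snippet above, `checkpoint(..., _read_if_exists=True)` appears to cover the same case on its own, so the surrounding `try`/`except` may be redundant there.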
gnomad/utils/filtering.py
Outdated
"This Gencode CDS interval filter does not filter by transcript! Please see the"
" documentation for more details to confirm it's being used as intended."
)
if padding:
Maybe put this in another PR since it's not related to the resource
Ah, sorry, this might be from my work on the toolbox. I wanted to modify this function at first, but then Riley told me about their function for getting the CDS. I didn't look into the difference at the time, but I have since found the reason for it and commented on your toolbox PR.
LGTM!
This copies the import function used for GRCh37 v7, modified slightly to match the current paths.