Contributor

@KoalaQin KoalaQin commented Dec 9, 2024

This copies the import function used for GRCh37 v7, with minor modifications to match the current path.

Contributor

@jkgoodrich jkgoodrich left a comment

Just 2 things, but looks good

from gnomad.utils.vep import vep_or_lookup_vep


def _import_gtex_rsem(gtex_path: str, meta_path: str, **kwargs) -> hl.MatrixTable:
Contributor

Is this exactly the same as the one in grch37? If so, we should probably just import it from there

Contributor Author

It is. I was just wondering whether you used this function to create that MT; its output seems to have the same structure as what I see in the MT.

Contributor Author

Also, I copied it because importing this underscore-prefixed function gives a warning: "access to a protected member * of a module".

Contributor

Yeah, don't worry about the warning, or remove the _. Let me confirm, but I think it's the same function

Contributor

I did this:

import hail as hl
from gnomad.resources.grch37.reference_data import _import_gtex_rsem

try:
    # Reuse the MatrixTable if it has already been written.
    mt = hl.read_matrix_table("gs://gnomad/resources/grch38/gtex/gtex_rsem_v10.mt")
except Exception:
    # Otherwise import from the raw GTEx v10 files and checkpoint the result.
    mt = _import_gtex_rsem(
        gtex_path="gs://gnomad/resources/gtex/v10/GTEx_Analysis_2022-06-06_v10_RSEMv1.3.3_transcripts_tpm.txt.bgz",
        meta_path="gs://gnomad/resources/gtex/v10/GTEx_Analysis_v10_Open_Access_Reduced_Annotations_SampleAttributesDS.txt.bgz",
        min_partitions=1000,
    ).checkpoint("gs://gnomad/resources/grch38/gtex/gtex_rsem_v10.mt", _read_if_exists=True)
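
For anyone reading along, the snippet above follows a "read it if it exists, otherwise build and save it" pattern. A minimal Hail-free sketch of the same control flow is below, using a local JSON file as a stand-in for the MatrixTable. `load_or_build`, `expensive_import`, and the file name are hypothetical names for illustration only; they are not part of gnomad or Hail.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_or_build(cache_path: Path, build: Callable[[], dict]) -> dict:
    """Return the cached result if present; otherwise build it and write it out."""
    try:
        # Fast path: a previous run already wrote the result.
        return json.loads(cache_path.read_text())
    except FileNotFoundError:
        # Slow path: build from the raw inputs, then persist for next time.
        result = build()
        cache_path.write_text(json.dumps(result))
        return result


calls = []  # records how many times the expensive step actually runs


def expensive_import() -> dict:
    # Hypothetical stand-in for the real import function.
    calls.append(1)
    return {"n_transcripts": 3}


cache = Path(tempfile.mkdtemp()) / "gtex_stub.json"
first = load_or_build(cache, expensive_import)   # builds and writes the cache
second = load_or_build(cache, expensive_import)  # reads the cache, no rebuild
```

The sketch catches only `FileNotFoundError` so that unrelated failures still surface; which exception to catch in the Hail version depends on what `hl.read_matrix_table` raises for a missing path, and that is not confirmed here.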

"This Gencode CDS interval filter does not filter by transcript! Please see the"
" documentation for more details to confirm it's being used as intended."
)
if padding:
Contributor

Maybe put this in another PR since it's not related to the resource

Contributor Author

@KoalaQin KoalaQin Dec 18, 2024

Ah, sorry, this might be from my work on the toolbox. I wanted to modify this function at first, but then Riley told me about their function for getting the CDS. I didn't look into the difference at the time, but I have since found the reason for it and commented on your toolbox PR.

@KoalaQin KoalaQin requested a review from jkgoodrich December 18, 2024 16:06
Contributor

@jkgoodrich jkgoodrich left a comment

LGTM!

@KoalaQin KoalaQin merged commit 4b4a33c into main Dec 18, 2024
5 checks passed
@KoalaQin KoalaQin deleted the qh/gtex_v10_resources branch December 18, 2024 16:26