Skip to content

Conversation

nawatts
Copy link
Contributor

@nawatts nawatts commented Apr 26, 2021

This adds support for multiple resource buckets in order to provide access to resources hosted on different cloud providers through our resources framework (#255, #263, #264).

This adds a GnomadPublicResource resource class and subclasses for each resource type (GnomadPublicTableResource, GnomadPublicMatrixTableResource, etc). When reading data, GnomadPublicResource rewrite their path attribute based on the value of gnomad.resources.config.gnomad_public_resource_configuration.source.

Thus, to read resources from gcp-public-data--gnomad instead of gnomad-public-requester-pays,

from gnomad.resources.config import gnomad_public_resource_configuration, GnomadPublicResourceSource
from gnomad.resources.grch37 import gnomad

gnomad_public_resource_configuration.source = GnomadPublicResourceSource.GOOGLE_CLOUD_PUBLIC_DATASETS

ds = gnomad.public_release("exomes").ht()

Alternatively, a custom path can be used (for example, if someone has created their own copy of the gnomAD public buckets).

from gnomad.resources.config import gnomad_public_resource_configuration, GnomadPublicResourceSource
from gnomad.resources.grch37 import gnomad

gnomad_public_resource_configuration.source = "gs://my-bucket/gnomad-resources"

ds = gnomad.public_release("exomes").ht()

return f"gs://gcp-public-data--gnomad{relative_path}"

return (
f"{resource_source.rstrip('/')}{relative_path}" # pylint: disable=no-member
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't figure out why Pylint insisted this was an error. The tests show that this works.

Pylint also only complained about this when running in the PR checks. It passed when running Pylint locally.

@nawatts nawatts requested review from a team and jkgoodrich and removed request for a team April 26, 2021 21:06
Copy link
Contributor

@jkgoodrich jkgoodrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly questions for my knowledge.

@nawatts nawatts requested a review from jkgoodrich May 5, 2021 17:14
Copy link
Contributor

@jkgoodrich jkgoodrich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just one comment, but lgtm. Thank you for answering all my questions and adding all the extra tests!

assert ds == read_block_matrix.return_value


def gnomad_public_resource_test_parameters(path: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add return type

@nawatts nawatts force-pushed the resources-buckets-take-two branch from 92fe502 to 842768c Compare May 5, 2021 18:12
@nawatts
Copy link
Contributor Author

nawatts commented May 5, 2021

Had to rebase to resolve merge conflicts between the from typing import ... added in bc1b495 and the module docstring added in #372.

@nawatts nawatts merged commit aaccd4a into master May 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants