Merged
5 changes: 4 additions & 1 deletion gnomad/resources/grch38/gnomad.py
@@ -221,7 +221,10 @@
"""

POPS_STORED_AS_SUBPOPS = TGP_POPS + HGDP_POPS
POPS_TO_REMOVE_FOR_POPMAX = {"asj", "fin", "oth", "ami", "mid", "remaining"}
POPS_TO_REMOVE_FOR_POPMAX = {
    "v3": {"asj", "fin", "mid", "oth", "ami", "remaining"},
    "v4": {"asj", "fin", "oth", "ami", "remaining"},
Comment on lines +225 to +226
Contributor

Either way this is fine. Just wondering: does v3 have "remaining", and does v4 have "oth"? I don't remember in which steps the name changes happened.

Contributor Author

@mike-w-wilson, Jan 24, 2024

Yeah, this is a bit annoying. v4 genomes use "remaining" while v3 genomes used "oth", but the code needs both for backwards compatibility, and also because the v4 genomes use the v3 set, since we still have fewer than 1000 mid samples in the v4 genomes. This is why we need a resources overhaul.

Contributor Author

I kept "oth" in v4 as a safety measure here because of the joint_frequency work. That's also why "ami" is in v4 even though v4 exomes do not have ami samples.

}
"""
Populations that are removed before popmax calculations.
"""
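A minimal sketch of how a caller might use the version-keyed set after this change. The constant is copied from the diff; the helper `pops_for_popmax` and the example population list are hypothetical and are not part of the gnomad package.

```python
# Copied from the diff above.
POPS_TO_REMOVE_FOR_POPMAX = {
    "v3": {"asj", "fin", "mid", "oth", "ami", "remaining"},
    "v4": {"asj", "fin", "oth", "ami", "remaining"},
}


def pops_for_popmax(pops, version):
    """Hypothetical helper: drop populations excluded from popmax for `version`."""
    if version not in POPS_TO_REMOVE_FOR_POPMAX:
        raise ValueError(f"Unknown version: {version!r}")
    excluded = POPS_TO_REMOVE_FOR_POPMAX[version]
    # Keep the input order; only filter out excluded labels.
    return [pop for pop in pops if pop not in excluded]


# Illustrative population labels only, not an authoritative gnomAD list.
example_pops = ["afr", "amr", "asj", "eas", "fin", "mid", "nfe", "remaining", "sas"]

v3_pops = pops_for_popmax(example_pops, "v3")  # "mid" is removed
v4_pops = pops_for_popmax(example_pops, "v4")  # "mid" is kept
```

Because both "oth" and "remaining" appear in each set, the same lookup works whichever label a given dataset uses.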
5 changes: 4 additions & 1 deletion gnomad/utils/vcf.py
@@ -39,7 +39,10 @@
Quality histograms used in VCF export.
"""

FAF_POPS = ["afr", "amr", "eas", "nfe", "sas"]
FAF_POPS = {
    "v3": ["afr", "amr", "eas", "nfe", "sas"],
    "v4": ["afr", "amr", "eas", "mid", "nfe", "sas"],
}
"""
Global populations that are included in filtering allele frequency (faf) calculations. Used in VCF export.
"""
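Since `FAF_POPS` changes from a list to a dict keyed by version, code that iterated over it directly would now see the version keys rather than population labels. A short sketch of the difference, assuming callers select a version explicitly; `get_faf_pops` is a hypothetical helper, not a function in the gnomad package.

```python
# Copied from the diff above.
FAF_POPS = {
    "v3": ["afr", "amr", "eas", "nfe", "sas"],
    "v4": ["afr", "amr", "eas", "mid", "nfe", "sas"],
}

# Iterating over the dict itself yields the version keys, not populations.
keys_seen = list(FAF_POPS)


def get_faf_pops(version):
    """Hypothetical helper: return a copy of the faf populations for `version`."""
    try:
        # Return a copy so callers cannot mutate the shared constant.
        return list(FAF_POPS[version])
    except KeyError:
        raise ValueError(f"No FAF_POPS entry for version {version!r}") from None


v4_faf_pops = get_faf_pops("v4")
# Populations present in v4 but not in v3.
added_in_v4 = [pop for pop in v4_faf_pops if pop not in FAF_POPS["v3"]]
```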