This tool uses the DBSCAN algorithm to automatically cluster release groups into tiers based on their Golden Popcorn Performance Index (GPPI) values. It generates a structured JSON output that can be used with Radarr custom format creation scripts.
The Golden Popcorn Performance Index (GPPI) measures how likely a release group is to produce high-quality "Golden Popcorn" encodes. This script:
- Takes a JSON file containing release group GPPI values
- Uses DBSCAN to find natural clusters/tiers in the data
- Outputs a JSON file with tiered group assignments
You do not need to assign tiers manually or choose the number of clusters in advance; the algorithm discovers the natural tiers in your data.
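As a rough illustration of why no cluster count is needed: with min_samples set to 1 (the value shown in the sample metadata further down), DBSCAN on one-dimensional data is equivalent to sorting the values and starting a new cluster wherever the gap between neighbours exceeds epsilon. The sketch below uses that equivalence in plain Python instead of calling scikit-learn, so it is not the script's actual code. The function name `tiers_by_gap`, the numbering of tiers from the highest GPPI downwards, and the group names other than EbP, DON, and HiDt are assumptions made for illustration.

```python
def tiers_by_gap(gppi, eps):
    """Assign tiers by splitting sorted GPPI values at gaps larger than eps.

    Matches 1-D DBSCAN only when min_samples == 1 (no noise points).
    Tier 1 holds the highest values (an assumption based on the sample output).
    """
    ranked = sorted(gppi.items(), key=lambda item: item[1], reverse=True)
    tiers = {}
    tier = 0
    previous = None
    for name, value in ranked:
        # Start a new tier at the first value and after every gap wider than eps.
        if previous is None or previous - value > eps:
            tier += 1
        tiers[name] = tier
        previous = value
    return tiers


# EbP, DON and HiDt come from the sample input; GroupX and GroupY are made up.
sample = {"EbP": 412.45, "DON": 350.55, "HiDt": 227.22,
          "GroupX": 215.0, "GroupY": 90.0}
print(tiers_by_gap(sample, eps=70.0))
```

With a larger min_samples value the equivalence no longer holds, because sparse points can then be labelled as noise instead of forming their own tier.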
- Python 3.6+
- Required packages:
  - numpy
  - pandas
  - scikit-learn
Install dependencies with:
pip install numpy pandas scikit-learn
./dbscan.py input_file.json --resolution 1080p --type Quality
This will:
- Read GPPI values from input_file.json
- Cluster the groups using DBSCAN with default parameters
- Save the results to 1080p Quality.json
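The output name in this example appears to be built from the --resolution and --type values. A minimal sketch of that naming, assuming the pattern "<resolution> <type>.json" and a default output directory of the current directory (the helper name `output_path` is hypothetical, and the real script may differ):

```python
from pathlib import Path


def output_path(resolution, group_type, output_dir="."):
    """Build '<resolution> <type>.json' inside output_dir (assumed default: cwd)."""
    return Path(output_dir) / f"{resolution} {group_type}.json"


print(output_path("1080p", "Quality"))
print(output_path("2160p", "Efficient", "tiers"))
```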
The input file should be a JSON object with release group names as keys and their GPPI values as numeric values:
{
"EbP": 412.45,
"DON": 350.55,
"HiDt": 227.22,
...
}
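If you generate the input file yourself, it can help to check it before clustering. The following is a small validation sketch, not part of the script; the function name `load_gppi` and the decision to reject non-numeric values with a ValueError are assumptions.

```python
import json


def load_gppi(text):
    """Parse the input JSON and check that it maps group names to numbers."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object of group name -> GPPI value")
    for name, value in data.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"GPPI for {name!r} is not numeric: {value!r}")
    return {name: float(value) for name, value in data.items()}


groups = load_gppi('{"EbP": 412.45, "DON": 350.55, "HiDt": 227.22}')
print(max(groups, key=groups.get))
```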
The output file contains:
- Metadata about the clustering
- Statistics for each tier
- A tiered_groups array of objects, each with name and tier properties
{
"metadata": {
"total_groups": 37,
"total_tiers": 5,
"resolution": "1080p",
"type": "Quality",
"algorithm": "DBSCAN",
"eps_value": 49.37,
"eps_factor": 0.12,
"min_samples": 1
},
"tier_statistics": {
"tier_1": {
"count": 2,
"min_gppi": 350.55,
"max_gppi": 412.45,
"avg_gppi": 381.5
},
...
},
"tiered_groups": [
{
"name": "EbP",
"tier": 1
},
{
"name": "DON",
"tier": 1
},
...
]
}
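The tier_statistics block can be recomputed from the GPPI values and the tier assignments, which is a handy sanity check on an output file. This sketch assumes the keys follow the "tier_<n>" pattern shown above and that averages are rounded to two decimals; the function name `tier_statistics` is hypothetical.

```python
from collections import defaultdict


def tier_statistics(gppi, tiered_groups):
    """Compute count/min/max/avg GPPI per tier, keyed as 'tier_<n>'."""
    values_by_tier = defaultdict(list)
    for entry in tiered_groups:
        values_by_tier[entry["tier"]].append(gppi[entry["name"]])
    stats = {}
    for tier in sorted(values_by_tier):
        values = values_by_tier[tier]
        stats[f"tier_{tier}"] = {
            "count": len(values),
            "min_gppi": min(values),
            "max_gppi": max(values),
            # Rounding to 2 decimals is an assumption about the output format.
            "avg_gppi": round(sum(values) / len(values), 2),
        }
    return stats


gppi = {"EbP": 412.45, "DON": 350.55}
tiered = [{"name": "EbP", "tier": 1}, {"name": "DON", "tier": 1}]
print(tier_statistics(gppi, tiered))
```

For the two tier 1 groups in the sample, this reproduces the count, minimum, maximum, and average listed under tier_1.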
positional arguments:
input_file Input JSON file with GPPI values
required arguments:
--resolution {SD,720p,1080p,2160p}
Resolution for the output
--type {Quality,Efficient}
Type of release groups
optional arguments:
--output-dir OUTPUT_DIR
Directory for output JSON file
--eps EPS Epsilon factor (as proportion of data range) for DBSCAN
--min-samples MIN_SAMPLES
Minimum samples parameter for DBSCAN
--optimize Automatically optimize epsilon to get 3-8 clusters
--verbose Print detailed information about clusters
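For reference, an argparse setup that would accept the arguments listed above could look like the sketch below. It is reconstructed from the help text only: the description string and every default value are assumptions, since the script's real defaults are not documented here.

```python
import argparse


def build_parser():
    """Build a parser matching the documented arguments (defaults are guesses)."""
    parser = argparse.ArgumentParser(
        description="Cluster release groups into tiers with DBSCAN")
    parser.add_argument("input_file", help="Input JSON file with GPPI values")
    parser.add_argument("--resolution", required=True,
                        choices=["SD", "720p", "1080p", "2160p"],
                        help="Resolution for the output")
    parser.add_argument("--type", required=True,
                        choices=["Quality", "Efficient"],
                        help="Type of release groups")
    # Assumed defaults: the real script may use different values.
    parser.add_argument("--output-dir", default=".",
                        help="Directory for output JSON file")
    parser.add_argument("--eps", type=float, default=None,
                        help="Epsilon factor (proportion of data range)")
    parser.add_argument("--min-samples", type=int, default=None,
                        help="Minimum samples parameter for DBSCAN")
    parser.add_argument("--optimize", action="store_true",
                        help="Automatically optimize epsilon")
    parser.add_argument("--verbose", action="store_true",
                        help="Print detailed information about clusters")
    return parser


args = build_parser().parse_args(
    ["input.json", "--resolution", "1080p", "--type", "Quality", "--eps", "0.15"])
print(args.input_file, args.resolution, args.type, args.eps)
```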
To automatically find an epsilon value that gives between 3 and 8 clusters:
./dbscan.py input.json --resolution 1080p --type Quality --optimize --verbose
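How the script searches for a suitable epsilon is not described here. One plausible approach, sketched below under stated assumptions, is to try a grid of epsilon factors and keep the first one that yields 3 to 8 clusters. The grid (0.05 to 0.25), the function names, and the gap-based cluster count (valid for one-dimensional data with min_samples of 1) are all assumptions, and the values after the first three are made up.

```python
def count_clusters(values, eps):
    """Number of 1-D clusters when splitting sorted values at gaps > eps."""
    ordered = sorted(values)
    if not ordered:
        return 0
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > eps)


def find_eps_factor(values, low=3, high=8, candidates=None):
    """Return the first candidate factor giving low..high clusters, else None."""
    if candidates is None:
        # Hypothetical search grid: 0.05, 0.06, ..., 0.25.
        candidates = [step / 100 for step in range(5, 26)]
    data_range = max(values) - min(values)
    for factor in candidates:
        if low <= count_clusters(values, factor * data_range) <= high:
            return factor
    return None


# The first three values come from the sample input; the rest are invented.
values = [412.45, 350.55, 227.22, 215.0, 90.0, 80.0, 10.0]
print(find_eps_factor(values))
```

Returning None when no candidate fits leaves it to the caller to fall back to a manual --eps value.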
You can control the clustering sensitivity with the --eps parameter:
./dbscan.py input.json --resolution 1080p --type Quality --eps 0.15
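The --eps value is a factor, not an absolute distance. It is specified as a proportion of the data range (maximum GPPI minus minimum GPPI), and the resulting distance is what the output metadata reports as eps_value: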
- Lower values (0.05-0.10): More tiers, finer granularity
- Higher values (0.15-0.25): Fewer tiers, broader categories
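The arithmetic behind the factor is simple enough to check by hand. In the sample metadata, an eps_factor of 0.12 and an eps_value of 49.37 are consistent with a data range of roughly 411. The sketch below shows the same scaling on hypothetical values (only 412.45 appears in the sample input); the function name `absolute_eps` is an assumption.

```python
def absolute_eps(values, eps_factor):
    """Scale the factor by the data range (max - min) to get DBSCAN's eps."""
    return eps_factor * (max(values) - min(values))


# Hypothetical values with a data range of about 400.
values = [412.45, 300.0, 150.0, 12.45]
for factor in (0.08, 0.12, 0.2):
    print(factor, round(absolute_eps(values, factor), 2))
```

Because the distance scales with the range, the same factor produces a different absolute epsilon for each resolution's data set.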
Add the --verbose flag to see detailed information about the discovered tiers:
./dbscan.py input.json --resolution 1080p --type Quality --verbose
- Start with --optimize --verbose to see what the automatic optimization discovers
- If you want more tiers, use a smaller epsilon (e.g., --eps 0.08)
- If you want fewer tiers, use a larger epsilon (e.g., --eps 0.2)
- Look at the GPPI distribution to understand natural groupings in your data
The output of this script is designed to work directly with Radarr custom format creation scripts. The JSON structure provides a tiered_groups array that can be used to create custom formats for different quality tiers.
- Generate GPPI values for release groups at a specific resolution
- Run this script to automatically cluster them into tiers
- Use the resulting JSON with your custom format creation script
- Import the custom formats into Radarr
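As an illustration of the third step, a custom format script might turn tiered_groups into one release group pattern per tier. The sketch below only builds the regular expressions; how they are placed into a Radarr custom format is left to your own script. The function name `tier_patterns`, the anchored alternation pattern, and the tier assigned to HiDt are assumptions.

```python
import json
import re
from collections import defaultdict


def tier_patterns(result):
    """Map each tier to a regex matching any of its release group names."""
    names_by_tier = defaultdict(list)
    for entry in result["tiered_groups"]:
        names_by_tier[entry["tier"]].append(entry["name"])
    return {
        # re.escape guards against group names containing regex metacharacters.
        tier: r"^(" + "|".join(re.escape(name) for name in names) + r")$"
        for tier, names in sorted(names_by_tier.items())
    }


# EbP and DON in tier 1 match the sample output; HiDt in tier 2 is made up.
result = json.loads("""
{"tiered_groups": [{"name": "EbP", "tier": 1},
                   {"name": "DON", "tier": 1},
                   {"name": "HiDt", "tier": 2}]}
""")
for tier, pattern in tier_patterns(result).items():
    print(tier, pattern)
```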
This approach ensures that your quality tiers are based on natural groupings in the GPPI data rather than arbitrary assignments.