Hyperparameter optimization
To understand how a parameter can be automatically tuned via Bayesian optimization, let's look at the following example configuration:
...
# Optimizer params
optimizer/hyperparameter.class_to_tune = @Adam
optimizer/hyperparameter.weight_decay = 1e-6
optimizer/hyperparameter.lr = (1e-5, 3e-4)
# Encoder params
model/hyperparameter.class_to_tune = @LSTMNet
model/hyperparameter.num_classes = %NUM_CLASSES
model/hyperparameter.hidden_dim = (32, 256)
model/hyperparameter.layer_dim = (1, 3)
tune_hyperparameters.scopes = ["model", "optimizer"] # defines the scopes that the random search runs in
tune_hyperparameters.n_initial_points = 5 # number of random points to initialize the Gaussian process
tune_hyperparameters.n_calls = 30 # number of iterations to find the best set of hyperparameters
tune_hyperparameters.folds_to_tune_on = 2 # number of folds to use to evaluate set of hyperparameters
In this example, we have the two scopes model and optimizer; the scopes take care of adding the parameters only to the pertinent classes. For each scope, a class_to_tune needs to be set to the class it represents, in this case LSTMNet and Adam respectively.
We can add whichever parameters we want to these classes using the following syntax:
tune_hyperparameters.scopes = ["<scope>", ...]
<scope>/hyperparameter.class_to_tune = @<SomeClass>
<scope>/hyperparameter.<param> = ['list', 'of', 'possible', 'values']
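To make the search itself more concrete, here is a rough sketch of the kind of Gaussian-process optimization these settings describe, written with scikit-optimize. This is not YAIB's actual implementation: the dimension names and the dummy objective below are only placeholders for training a model on the tuning folds.

```python
import math

from skopt import gp_minimize
from skopt.space import Integer, Real

# Search space mirroring the gin bindings above: a tuple such as (1e-5, 3e-4)
# becomes a continuous range, and integer tuples become integer ranges.
search_space = [
    Real(1e-5, 3e-4, prior="log-uniform", name="lr"),
    Integer(32, 256, name="hidden_dim"),
    Integer(1, 3, name="layer_dim"),
]

def objective(params):
    lr, hidden_dim, layer_dim = params
    # Placeholder for "train on `folds_to_tune_on` folds and return the
    # validation loss"; any scalar to be minimized works for the sketch.
    return abs(math.log10(lr) + 4) + abs(hidden_dim - 128) / 128 + 0.1 * layer_dim

result = gp_minimize(
    objective,
    search_space,
    n_initial_points=5,   # random points to initialize the Gaussian process
    n_calls=30,           # total number of evaluated hyperparameter sets
    random_state=0,
)
print("best hyperparameters:", result.x)
```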
If we run experiments and want to overwrite the model configuration, this can be done easily:
include "configs/tasks/Mortality_At24Hours.gin"
include "configs/models/LSTM.gin"
optimizer/hyperparameter.lr = 1e-4
model/hyperparameter.hidden_dim = [100, 200]
This configuration, for example, overwrites the lr parameter of Adam with a concrete value, while it only specifies a different search space for hidden_dim of LSTMNet to run the random search on.
The same holds true for the command line. Setting the following flag would achieve the same result (make sure to separate the parameters with spaces only):
-hp optimizer/hyperparameter.lr=1e-4 model/hyperparameter.hidden_dim='[100,200]'
There is an implicit hierarchy, independent of where the parameters are added (model.gin, experiment.gin, or the CLI -hp flag):
LSTM.hidden_dim = 8 # always takes precedence
model/hyperparameter.hidden_dim = 6 # second most important
model/hyperparameter.hidden_dim = (4, 6) # only evaluated if the others aren't found in gin configs and CLI
The hierarchy CLI -hp > experiment.gin > model.gin is only important for bindings on the same "level" shown above.
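One way to picture this precedence is the following small sketch. It is not YAIB's code, just an illustration of the ordering; the kind and source labels are made up for the example.

```python
# Illustrative sketch of the implicit precedence described above; not YAIB's actual code.
# Stronger kinds win first; within the same kind, the source decides the winner.
KIND_ORDER = {"class_binding": 0, "concrete_hyperparameter": 1, "search_space": 2}
SOURCE_ORDER = {"cli": 0, "experiment.gin": 1, "model.gin": 2}

def resolve(candidates):
    """Pick the winning binding for one parameter from (kind, source, value) tuples."""
    kind, source, value = min(
        candidates, key=lambda c: (KIND_ORDER[c[0]], SOURCE_ORDER[c[1]])
    )
    return kind, value  # a "search_space" winner is handed to the optimizer

# A concrete value in model.gin beats a search space given on the CLI,
# because the concrete binding sits on a higher "level".
print(resolve([
    ("search_space", "cli", (4, 6)),
    ("concrete_hyperparameter", "model.gin", 6),
]))  # -> ('concrete_hyperparameter', 6)
```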
Hyperparameters were chosen to be mostly identical to the HiRID benchmark, to improve comparability and reproducibility. However, we have chosen to allow for continuous ranges of hyperparameters in some cases, to improve the performance and functionality of YAIB. Log-uniform means that the parameters are sampled according to the reciprocal distribution:
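For a range (a, b) with 0 < a < b, the standard density of the reciprocal distribution is

$$f(x; a, b) = \frac{1}{x\,(\ln b - \ln a)}, \qquad a \le x \le b,$$

so intervals of equal width on the log scale receive equal probability mass, which suits parameters whose sensible values span several orders of magnitude (such as learning rates).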