This code simulates our experiments with the ensemble-based learning approach proposed in the "Ensemble-based Federated Learning" paper submitted to HPE TechCon 2020.
Our approach trains models in a federated way using a model-agnostic technique. In the paper, we address an open problem: the possibility of federated learning (FL) clients training heterogeneous models, with a particular focus on non-neural models (i.e. classical ML algorithms).
We restricted our experiments to cross-silo federated settings, but we are not limited to them: cross-device federation can also be investigated by scaling up the number of nodes (which can be extremely high!).
In our collaborative training routine there are two main events:
- Clients send their models to a server responsible for maintaining and managing an ensemble model.
- Clients receive an ensemble model and judge whether it is better to continue using their own model or use the shared ensemble model.
There are thus two levels of model updates: local and global. The global model refers to the ensemble model, and the local models to the client/node models. In our simulations, both local and global models are updated whenever new data arrives at the local nodes, but different updating policies are possible. For example, new data arrives at different clients at different frequencies, so the frequency of local model updates may differ as well. The global model, in turn, could be updated whenever the performance of any local model changes. The point is that this is easily customizable according to the desired strategy. Because the models are simpler/lightweight, updates can be frequent: arriving data can trigger continuous updates and tweaking, which requires atomic access to keep updates consistent. A minimal sketch of this two-level update loop is given below.
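The sketch below only illustrates the idea under stated assumptions: the class, method, and attribute names are hypothetical and do not correspond to the actual code in this repository, and models are assumed to expose scikit-learn-style fit/score methods.

```python
# Illustrative sketch only -- names are hypothetical, not this repository's API.
from threading import Lock


class LocalClient:
    """A node that trains a local model and decides whether to adopt the ensemble."""

    def __init__(self, model, X_val, y_val):
        self.model = model              # locally trained (possibly non-neural) model
        self.X_val, self.y_val = X_val, y_val
        self._lock = Lock()             # atomic access: data may arrive continuously

    def local_update(self, X_new, y_new):
        """Local-level update: retrain whenever new data arrives at this node."""
        with self._lock:
            self.model.fit(X_new, y_new)
        return self.model               # sent to the server that maintains the ensemble

    def on_ensemble_received(self, ensemble):
        """Global-level update: keep the local model or switch to the shared
        ensemble, whichever scores better on the local validation split."""
        with self._lock:
            if ensemble.score(self.X_val, self.y_val) > self.model.score(self.X_val, self.y_val):
                self.model = ensemble
```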
We recommend using a test environment such as venv, virtualenv or any other to install dependencies.
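For example, using venv (the environment name .venv is arbitrary):

python3 -m venv .venv
source .venv/bin/activate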
pip3 install -r requirements.txt
To execute the whole experiment, the following options can be set.
| Option | Description | Default setting |
|---|---|---|
| --rounds | Communication rounds | 10 |
| --epochs | Client training epochs | 5 |
| --clients | Number of participating clients/nodes. For configurations with more than seven clients (i.e. --clients > 7), model types will repeat. | 3 |
| --data-path | Path to the training data | |
| --data-split | Train/validation/test data split | 0.7, 0.1, and 0.2, respectively |
| --dirichlet-alpha | Alpha value of the Dirichlet distribution used to split training data (see the sketch after this table) | 1 |
| --target-feature | Name of the target feature to predict | |
| --uniform | Distributes data uniformly over rounds. If not set, the distribution may be unbalanced, following the input data distribution. (Warning: as we undersample the data, this can compromise training, since there may not be enough data to distribute among the defined number of clients.) | False |
| --fedavg | Uses the Federated Averaging algorithm (FedAvg) | Ensemble-based Learning mode |
| --model_type | Defines which NN model will be used for the federated task | A |
| --lr | Learning rate for the optimizer function | 0.01 |
| --train_batch_size | Batch size for training the NN model | 32 |
| --test_batch_size | Batch size for testing/validating the NN model | 64 |
| --enable_grouping | Enables grouping of models of the same architecture | False |
| --model_alocation | Modifies the beginning of the circular list of model types assigned to clients | 0 |
| --seed | Random seed value | 1 |
| -v, --verbose | Sets verbose mode | False |
| --log-dir | Directory path to save experiment reports | |
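The --dirichlet-alpha option controls how unevenly the training data is split across clients: small alpha values produce very unbalanced shares, while large values approach a uniform split. The helper below is a hypothetical illustration of such a split using NumPy, not the repository's actual implementation.

```python
# Hypothetical illustration of Dirichlet-based partitioning; not the actual code.
import numpy as np


def dirichlet_partition(num_samples, num_clients, alpha, seed=1):
    """Split sample indices across clients with proportions drawn from a
    symmetric Dirichlet(alpha) distribution."""
    rng = np.random.default_rng(seed)
    proportions = rng.dirichlet(alpha * np.ones(num_clients))
    indices = rng.permutation(num_samples)
    split_points = (np.cumsum(proportions)[:-1] * num_samples).astype(int)
    return np.split(indices, split_points)


# e.g. 3 clients over 1000 samples with the default alpha of 1
shards = dirichlet_partition(1000, 3, alpha=1.0)
```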
The following examples use the default setting: 3 clients with normally distributed training data over 10 rounds.
FedEL mode.
python3 main.py --data-path datasets/shelter.csv --log-dir ./reports/shelter --verbose
FedAvg mode.
python3 main.py --fedavg --data-path datasets/shelter.csv --log-dir ./reports/shelter --verbose
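Other options from the table can be combined as needed; the flag values below are illustrative only.

python3 main.py --clients 5 --rounds 20 --epochs 10 --seed 42 --data-path datasets/shelter.csv --log-dir ./reports/shelter --verbose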
