This repository contains scripts for network simulation using the CODES discrete-event simulation framework. Specifically, all scripts here run the binary model-net-mpi-replay, which simulates the behaviour of one or multiple jobs running on an HPC network.
- individual-scripts/ - Scripts to run one experiment at a time
- mpi_replay/ - Python 3.13 script to run a battery of tests
- visualizing_jobs/ - Visualizes each iteration time for all jobs in an experiment
Each experiment directory contains:
- conf/ - Configuration files for network topology and simulation parameters
- results/ - Output from simulation runs (logs, statistics, performance data)
- A Python or shell script to run experiments with specific parameters
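The layout above can be checked before launching a run. The following is a minimal sketch, not part of this repository: the helper name check_experiment_dir is hypothetical, and it assumes that results/ may not exist until the first run, so only conf/ and a run script are checked.

```python
from pathlib import Path


def check_experiment_dir(experiment_dir):
    """Return a list of problems found in an experiment directory.

    Hypothetical helper: it only checks for conf/ and for at least one
    .sh or .py file at the top level of the directory.
    """
    root = Path(experiment_dir)
    if not root.is_dir():
        return ["not a directory"]

    problems = []
    if not (root / "conf").is_dir():
        problems.append("missing conf/")

    # A run script is assumed to be any top-level .sh or .py file.
    scripts = [p for p in root.iterdir()
               if p.is_file() and p.suffix in (".sh", ".py")]
    if not scripts:
        problems.append("no .sh or .py run script")
    return problems
```

An empty list means the directory has the expected pieces. Anything else is a short description of what is missing.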
Feel free to copy individual scripts (or their entire subdirectory) and modify them to run new scenarios. Within mpi_replay/, you should be able to make a copy of run_mpi_surrogacy_experiments.py to run a series of experiments.
- individual-scripts/dfly-1056/ - Dragonfly topology experiments with 1,056 nodes
- individual-scripts/dfly-72/ - Dragonfly topology experiments with 72 nodes
- individual-scripts/dfly-8448/ - Dragonfly topology experiments with 8,448 nodes
- individual-scripts/torus-64/ - Torus topology experiments with 64 nodes
Once you have got some results, you can visualize how long each iteration took. Simply run:
python3 visualizing_jobs/print-iterations.py path-to/results/exp-XXX/experiment-name/iteration-logs/
For command line help, run python3 visualizing_jobs/print-iterations.py --help.
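When a results folder holds many experiments, it can help to list every iteration-logs directory before visualizing them one by one. This is a small sketch under the assumption that results follow the path pattern shown above (results/exp-XXX/experiment-name/iteration-logs/). The function name find_iteration_logs is hypothetical.

```python
import sys
from pathlib import Path


def find_iteration_logs(results_root):
    """Return every directory named 'iteration-logs' below results_root, sorted."""
    return sorted(p for p in Path(results_root).rglob("iteration-logs")
                  if p.is_dir())


if __name__ == "__main__":
    # Print one ready-to-run visualization command per directory found.
    root = sys.argv[1] if len(sys.argv) > 1 else "results"
    for log_dir in find_iteration_logs(root):
        print(f"python3 visualizing_jobs/print-iterations.py {log_dir}/")
```

The sketch only prints commands. It does not call print-iterations.py itself.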
Run experiments using the provided script. It assumes that you have compiled CODES using the CODES-compile-instructions.sh script (available at https://github.com/codes-org/codes) and have downloaded this repo into the same directory where that script resides:
bash run-experiment.sh path-to-experiment/script.sh
# or in the case of mpi_replay
bash run-experiment.sh mpi_replay/run_mpi_surrogacy_experiments.py
# or, to pass arguments to your experiment script, append them after it
bash run-experiment.sh path-to-experiment/script.sh --argument some-file.txt --other-arg
Results are automatically stored in path-to-experiment/results/.
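The invocations above can also be driven from Python, for example to launch several experiment scripts in a loop. This is a minimal sketch that only mirrors the command lines shown above. The helper names build_command and run_experiment are hypothetical, and it assumes the working directory is the one containing run-experiment.sh.

```python
import subprocess


def build_command(experiment_script, *script_args, runner="run-experiment.sh"):
    """Build the argument list: bash <runner> <experiment script> [args...]."""
    return ["bash", runner, str(experiment_script), *map(str, script_args)]


def run_experiment(experiment_script, *script_args):
    """Run one experiment and raise CalledProcessError if the runner fails."""
    return subprocess.run(build_command(experiment_script, *script_args),
                          check=True)
```

For instance, `run_experiment("mpi_replay/run_mpi_surrogacy_experiments.py")` would be equivalent to the second command above. Passing the arguments as a list avoids shell quoting issues with file names.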
If you want to run an experiment with sbatch, you can use the script run-sbatch.sh instead of run-experiment.sh. The run-sbatch.sh script runs the experiments in a different folder from the one containing the script, because on systems where sbatch is needed, data is often stored in a different folder from the one the script is launched from. Under this new folder, the script creates a results/ subfolder, just as run-experiment.sh does.
These scripts require the CODES simulation framework to be built and configured. They currently work with commit 73cdbd5 of CODES.
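To confirm that a local CODES checkout is at the commit mentioned above, something like the following sketch could be used. It is not part of this repository: the function names are hypothetical, the location of the CODES checkout is an assumption you must supply, and the only external command used is `git rev-parse HEAD`.

```python
import subprocess

EXPECTED_COMMIT = "73cdbd5"  # abbreviated commit hash named in this README


def commit_matches(full_sha, expected_prefix=EXPECTED_COMMIT):
    """Return True if a full commit hash starts with the abbreviated one."""
    return full_sha.strip().lower().startswith(expected_prefix.lower())


def current_commit(codes_dir):
    """Return the HEAD commit hash of the git checkout at codes_dir."""
    result = subprocess.run(
        ["git", "-C", str(codes_dir), "rev-parse", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

For example, `commit_matches(current_commit("path-to/codes"))` returns False when the checkout is at a different commit, which may explain unexpected behaviour.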