Skip to content

Conversation

efim-a-efim
Copy link

@efim-a-efim efim-a-efim commented Sep 4, 2025

Description of changes

With this changes, users will be able to specify additional Feature values for ComputeResources. Those Feature-s will be safely added to the Slurm nodes specification for the partition.

Rationale: In many cases, especially with complex cluster setups, it's crucial to have additional Feature-s that can be used to filter nodes for the jobs.

Tests

Added the following tests:

  • test/unit/slurm/test_slurm_config_generator.py::test_generate_slurm_config_with_custom_features - tests config generation with custom features specified. Corresponding test data is in test/unit/slurm/test_generate_slurm_config_with_custom_features.

NOTE: test data for many tests were changed because the new features generation algorithm requires all features in Feature=... in config files to be sorted. So many test files were affected, but the only change that were made to them is features ordering.

References

This PR has a complimentary PR in aws-parallelcluster repo: aws/aws-parallelcluster#6962

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Copy link
Contributor

@demartinofra demartinofra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing with this change. I left a few minor comments if you could take a look.

self.disable_multithreading = compute_resource_config["DisableSimultaneousMultithreading"]
self.custom_settings = compute_resource_config.get("CustomSlurmSettings", {})
# If Features are in CustomSlurmSettings, fetch them and remove from CustomSlurmSettings
self.custom_features = self.custom_settings.pop("Feature", "")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of mutating the custom_settings object by removing features I would keep custom_settings aligned with what is injected from config and filter out features when building the custom += f" {param}={value}" string as part of the nodename rendering.

Comment on lines +91 to +93
# this is a simple workaround for empty Features: split will give [''] and we'll discard it
# as well as all empty strings.
features.discard('')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

default to None when reading the Feature string and skip the splitting so that you don't need to discard ''?

Comment on lines +95 to +97
# remove "system" features from configured features, so that they will never interfere
for feat in ('static', 'dynamic', 'gpu', 'efa') :
features.discard(feat)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could consider leaving these unaltered - if for any reason the user passes them in we could just honor them as overrides

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants