Merge Release 2.4.0 #338
Merged
The custom node package can be passed as a URL pointing to the node archive. Signed-off-by: Luca Carrogu <[email protected]>
This aligns cfn_postinstall_args with the cfn_preinstall_args variable. Signed-off-by: Enrico Usai <[email protected]>
This mount point is wrong when the customer uses multiple EBS volumes, because cfn_shared_dir then contains the comma-separated list of mount points. Furthermore, the same action is already performed a few lines below in the same script, by splitting on commas. Signed-off-by: Enrico Usai <[email protected]>
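The comma-splitting described in this commit can be sketched as follows. This is a minimal illustration, not the cookbook's code: the function name `split_shared_dirs` and the sample paths are hypothetical, and only the idea that cfn_shared_dir holds a comma-separated list comes from the commit message.

```python
from typing import List


def split_shared_dirs(cfn_shared_dir: str) -> List[str]:
    """Split a comma-separated cfn_shared_dir value into individual mount points.

    Hypothetical helper: surrounding whitespace is trimmed and empty
    entries are dropped, which is an assumption about the desired behavior.
    """
    return [part.strip() for part in cfn_shared_dir.split(",") if part.strip()]


if __name__ == "__main__":
    # With multiple EBS volumes the value is a list, not a single path.
    for mount_point in split_shared_dirs("/shared,/scratch,/data"):
        print(mount_point)
```

Treating the raw value as a single path only works in the one-volume case, which is why handling each mount point separately is the safer choice.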
This is the latest version with Python 2.6 support Signed-off-by: Luca Carrogu <[email protected]>
Signed-off-by: Luca Carrogu <[email protected]>
Signed-off-by: Luca Carrogu <[email protected]>
The jobwatcher now retrieves this value dynamically from the stack parameters Signed-off-by: Francesco De Martino <[email protected]>
Signed-off-by: Francesco De Martino <[email protected]>
libpmi is now shipped in a separate Slurm package (see https://bugs.schedmd.com/show_bug.cgi?id=4511), so it needs to be installed explicitly. This will solve aws/aws-parallelcluster#1008. Signed-off-by: Luca Carrogu <[email protected]>
The patch lets the script continue even when the parted command returns the following error: "Error: Partition(s) Y on /dev/XXX have been written, but we have been unable to inform the kernel of the change, probably because it/they are in use. As a result, the old partition(s) will remain in use. You should reboot now before making further changes." Signed-off-by: Luca Carrogu <[email protected]>
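One way to tolerate only this specific parted failure is sketched below. It is an assumption-laden illustration, not the patch itself: the helper names are hypothetical, and matching on a substring of the quoted message is one possible approach that the commit does not confirm.

```python
import subprocess
from typing import Sequence

# Substring taken from the error quoted in the commit message.
# Matching on it is an assumption about how the error could be detected.
TOLERATED_MESSAGE = "unable to inform the kernel of the change"


def is_tolerated_parted_failure(returncode: int, output: str) -> bool:
    """Return True when parted failed only because the kernel was not informed."""
    return returncode != 0 and TOLERATED_MESSAGE in output


def run_parted(args: Sequence[str]) -> str:
    """Run parted (hypothetical wrapper) and raise on any non-tolerated failure."""
    result = subprocess.run(["parted", *args], capture_output=True, text=True)
    output = result.stdout + result.stderr
    if result.returncode != 0 and not is_tolerated_parted_failure(
        result.returncode, output
    ):
        raise RuntimeError(f"parted failed: {output}")
    return output
```

Every other non-zero exit still raises, so unrelated partitioning errors are not silently ignored.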
This patch avoids network service restart failures when a configuration file for an old network interface (no longer present in the current instance launch) is found. Signed-off-by: Luca Carrogu <[email protected]>
Signed-off-by: Luca Carrogu <[email protected]>
This fixes the issue with torque on centos 7 Signed-off-by: Francesco De Martino <[email protected]>
…d-init Signed-off-by: Francesco De Martino <[email protected]>
Signed-off-by: Francesco De Martino <[email protected]>
Related bug https://bugs.centos.org/view.php?id=13836#c33128 Signed-off-by: Francesco De Martino <[email protected]>
Signed-off-by: Francesco De Martino <[email protected]>
Skip the test if jq is not installed, because on a custom AMI it is installed at bootstrap time (inside the CloudFormation user data). Signed-off-by: Luca Carrogu <[email protected]>
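The skip condition can be sketched as a simple check for the binary on PATH. This is an illustrative sketch only: the cookbook's tests are not written this way, and the function name `jq_available` and its injectable `which` parameter are hypothetical.

```python
import shutil
from typing import Callable, Optional


def jq_available(which: Callable[[str], Optional[str]] = shutil.which) -> bool:
    """Return True when a jq executable can be found on PATH.

    The lookup function is injectable so the check can be exercised
    without depending on the machine running it.
    """
    return which("jq") is not None


if __name__ == "__main__":
    if jq_available():
        print("jq found: run the jq test")
    else:
        print("jq not found: skip the jq test")
```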
Signed-off-by: Francesco De Martino <[email protected]>
The issue is reported in chef/bento#609. By using a custom chef URL instead of the default one (https://www.chef.io/chef/install.sh), we are able to avoid the error "dpkg: error: dpkg status database is locked by another process". Signed-off-by: Luca Carrogu <[email protected]>
The SGE installation folder is mounted from the master node. The installation is done when "cfn_node_type" is "MasterServer" (at runtime) or "nil" (at packer time). Signed-off-by: Luca Carrogu <[email protected]>
The Slurm installation folder is mounted from the master node. The installation is done when "cfn_node_type" is "MasterServer" (at runtime) or "nil" (at packer time). Signed-off-by: Luca Carrogu <[email protected]>
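The install condition shared by the SGE and Slurm commits can be sketched as a single predicate. This is a hedged illustration in Python rather than the recipes' own code: the function name is hypothetical, Ruby's nil is modeled as None, and "ComputeFleet" is used only as an example of a non-master value.

```python
from typing import Optional


def should_install_scheduler(cfn_node_type: Optional[str]) -> bool:
    """Decide whether the scheduler should be installed on this node.

    Install at packer time (cfn_node_type unset, modeled here as None)
    or at runtime on the master node. Other nodes mount the installation
    folder from the master instead.
    """
    return cfn_node_type is None or cfn_node_type == "MasterServer"


if __name__ == "__main__":
    # "ComputeFleet" is an illustrative non-master value.
    for node_type in (None, "MasterServer", "ComputeFleet"):
        print(node_type, should_install_scheduler(node_type))
```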
SlurmdTimeout is the interval, in seconds, that the Slurm controller waits for slurmd to respond before setting that node's state to DOWN. It is reduced here to react faster to failing nodes. Signed-off-by: Francesco De Martino <[email protected]>
Move the supervisord start to the end of the user data, in a finalize chef recipe. This solves the problem of the nodewatcher being started before the chef recipes and the post_install script had finished, which caused the idle time to be computed incorrectly. Signed-off-by: Francesco De Martino <[email protected]>
PATH is normally set in the cfn user data. To keep the chef recipes independent of the user data, I'm setting PATH explicitly for this command. Signed-off-by: Francesco De Martino <[email protected]>
…it exceeded" Doc https://docs.aws.amazon.com/sdkforruby/api/Aws/ConfigService/Client.html Signed-off-by: Luca Carrogu <[email protected]>
This adds support for Ubuntu in the China Northwest region (cn-northwest-1). Signed-off-by: Luca Carrogu <[email protected]>
Signed-off-by: Luca Carrogu <[email protected]>
Signed-off-by: Luca Carrogu <[email protected]>
Signed-off-by: Francesco De Martino <[email protected]>
Signed-off-by: Sean Smith <[email protected]>
Signed-off-by: Luca Carrogu <[email protected]>
Installs the EFA drivers. See [1]. [1] https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/efa-start.html#efa-start-enable Signed-off-by: Sean Smith <[email protected]>
Once the EFA package is installed, it is not possible to install the openmpi-devel package. Make the installation conditional on the OS and region. Signed-off-by: Luca Carrogu <[email protected]>
Signed-off-by: Luca Carrogu <[email protected]>
Signed-off-by: Francesco De Martino <[email protected]>
* Fetch the installer during AMI build (or at runtime with a custom AMI)
* Install only on compute nodes
Signed-off-by: Sean Smith <[email protected]>
Signed-off-by: Sean Smith <[email protected]>
Signed-off-by: Francesco De Martino <[email protected]>
This reverts commit 8ffcf39.
This reverts commit 14b02a2.
This builds the RPMs into the AMI, then sets the limits at runtime. Signed-off-by: Sean Smith <[email protected]>
Update the base_config recipe to attempt the FSx filesystem mount unconditionally, rather than restricting it to alinux/centos. This supports custom Ubuntu AMIs with FSx extensions installed. This is a no-op change in the default parallelcluster configuration, as the client also verifies OS compatibility during configuration validation. Tidy the common EFS mount call from the master/compute recipes into base_config alongside the FSx mount.
Signed-off-by: Sean Smith <[email protected]>
Signed-off-by: Sean Smith <[email protected]>
Signed-off-by: Luca Carrogu <[email protected]>
Signed-off-by: Luca Carrogu <[email protected]>
Signed-off-by: Luca Carrogu <[email protected]>
Signed-off-by: Francesco De Martino <[email protected]>
* Sets the max_memory ulimit on the master when EFA is enabled Signed-off-by: Sean Smith <[email protected]>
sean-smith
approved these changes
Jun 11, 2019
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.