@himani2411 himani2411 commented Mar 5, 2024

Description of changes

[DFSM] Use .login_nodes_keys_sync_file during the Init and Update phases of the cluster

Bug:
Introduced in #2671 and #2672

The file /opt/parallelcluster/shared_login_nodes/.login_nodes_keys_sync_file, which we create as part of the cluster sync, is never updated when we stop and start the cluster.
This file needs to be updated; otherwise, any new login nodes launched after a cluster update go through the Init phase and wait for the file's content to match the latest version.
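
To illustrate the wait described above, here is a minimal sketch, assuming a login node polls the sync file until its content equals the expected cluster config version. The function name wait_for_sync_file, the retry count, the delay, and the temporary path are hypothetical and used only for illustration; the cookbook itself uses a Chef bash resource with the comparison shown in this PR's diff.

```shell
# Hypothetical helper (not from the cookbook): poll until the sync file
# holds the expected version, giving up after a fixed number of attempts.
wait_for_sync_file() {
  local path="$1" expected="$2" retries="${3:-5}" delay="${4:-1}"
  local attempt=1
  while [ "$attempt" -le "$retries" ]; do
    # The file must exist and its content must equal the expected version.
    if [ -f "$path" ] && [ "$(cat "$path")" = "$expected" ]; then
      return 0
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  return 1
}

# Demo with a temporary file standing in for .login_nodes_keys_sync_file.
demo_dir="$(mktemp -d)"
sync_file="$demo_dir/.login_nodes_keys_sync_file"
echo "old-version" > "$sync_file"

# A stale file never matches, so the wait gives up once retries are exhausted.
if wait_for_sync_file "$sync_file" "new-version" 2 0; then
  stale_result="matched"
else
  stale_result="timed out"
fi

# Once the file is rewritten with the new version, the wait succeeds.
echo "new-version" > "$sync_file"
if wait_for_sync_file "$sync_file" "new-version" 2 0; then
  fresh_result="matched"
else
  fresh_result="timed out"
fi

echo "stale: $stale_result, fresh: $fresh_result"
rm -rf "$demo_dir"
```

The stale case is the situation this PR targets: if the file keeps the old content after a stop and start, a newly launched login node can only time out.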

Tests

  • Unit Tests
  • test_create_disable_sudo_access_for_default_user and test_dynamic_file_systems_update [ONGOING]

develop #2678

References

  • Link to impacted open issues.
  • Link to related PRs in other packages (i.e. cookbook, node).
  • Link to documentation useful to understand the changes.

Checklist

  • Make sure you are pointing to the right branch.
  • If you're creating a patch for a branch other than develop add the branch name as prefix in the PR title (e.g. [release-3.6]).
  • Check all commits' messages are clear, describing what and why vs how.
  • Make sure to have added unit tests or integration tests to cover the new/modified code.
  • Check if documentation is impacted by this change.

Please review the guidelines for contributing and Pull Request Instructions.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

codecov bot commented Mar 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.48%. Comparing base (d028b5e) to head (d4826a2).

Additional details and impacted files
@@             Coverage Diff              @@
##           release-3.9    #2677   +/-   ##
============================================
  Coverage        76.48%   76.48%           
============================================
  Files               22       22           
  Lines             2220     2220           
============================================
  Hits              1698     1698           
  Misses             522      522           
Flag Coverage Δ
unittests 76.48% <ø> (ø)

@himani2411 himani2411 changed the title [DFSM]Using cluster-config-version file to be used during Init and Up… [release-3.9][DFSM]Using .login_nodes_keys_sync_file to be used during Init and Update phase of the clusters Mar 5, 2024
default['cluster']['scheduler_queue_name'] = nil

default['cluster']['head_node_home_path'] = '/home'
default['cluster']['shared_dir_compute'] = node['cluster']['shared_dir']

code "[[ \"$(cat #{path})\" == \"#{cluster_config_version}\" ]] || exit 1"
Chef::Log.info("Wait for synchronization file at #{path} to exist")
file path do
action :touch

@gmarciani gmarciani Mar 5, 2024

Is the touch action expected to fail if the file does not exist?
Asking because the Linux touch command succeeds even if the file does not exist.
If it fails on nonexistence, then we are good.
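
For reference, a small sketch of the Linux touch behavior mentioned above, run against a temporary path (the path and variable names are hypothetical). It covers only the shell command; whether the Chef file resource's :touch action behaves the same way is the open question in this comment.

```shell
demo_dir="$(mktemp -d)"
missing="$demo_dir/not_there_yet"

# Before: the file does not exist.
if [ -e "$missing" ]; then before="exists"; else before="missing"; fi

# touch creates the missing file and exits 0, so it cannot act as an
# existence check on its own.
touch "$missing"
touch_status=$?

# After: the file now exists.
if [ -e "$missing" ]; then after="exists"; else after="missing"; fi

echo "before=$before touch_status=$touch_status after=$after"
rm -rf "$demo_dir"
```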

@himani2411 himani2411 force-pushed the wip/release-3.9 branch 2 times, most recently from f3e162d to 3a36bf7 Compare March 5, 2024 15:27
bash "Wait for synchronization file at #{path} to be written for version #{cluster_config_version}" do
code "[[ \"$(cat #{path})\" == \"#{cluster_config_version}\" ]] || exit 1"
bash "Wait for synchronization file at #{path} to exist" do
code "[[ -e #{path} ]]; echo $? || exit 1"

Contributor

Minor: Can we do:

code "[[ -e #{path} ]] || exit 1"
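
A short sketch of why the two forms differ, assuming the code string is run by bash as in the Chef bash resource. The temporary path and variable names are hypothetical. In the original form, `echo $?` always succeeds, so `|| exit 1` never fires and the script exits 0 even when the file is missing.

```shell
demo_dir="$(mktemp -d)"
path="$demo_dir/.sync_file"   # deliberately not created yet

# Original form: the exit status is that of `echo`, which is 0,
# so a missing file goes undetected.
bash -c '[[ -e "$1" ]]; echo $? || exit 1' _ "$path" > /dev/null
original_status=$?

# Suggested form: the status of the test itself decides the outcome.
bash -c '[[ -e "$1" ]] || exit 1' _ "$path"
suggested_missing=$?

# With the file present, the suggested form succeeds.
touch "$path"
bash -c '[[ -e "$1" ]] || exit 1' _ "$path"
suggested_present=$?

echo "original=$original_status missing=$suggested_missing present=$suggested_present"
rm -rf "$demo_dir"
```

Only the suggested form returns a non-zero status for a missing file, which is what a retrying wait needs in order to try again.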

Contributor Author

Sure

hanwen-pcluste previously approved these changes Mar 5, 2024
@himani2411 himani2411 merged commit 11c0436 into aws:release-3.9 Mar 5, 2024