Skip to content

Conversation

hgreebe
Copy link
Contributor

@hgreebe hgreebe commented Aug 6, 2025

Description of changes

  • Remove device name and ip address from required parameters of configure_nw_interfaces.sh
  • Device name and ip address do not exist for efa-only nw interfaces.
  • This adds a check that is device name and ip address are not included, then it is an efa-only interface so the rest of the script is skipped.
  • We use ec2 instance metadata to get the parameters like ip address, network card index etc. This instance metadata does not have interface_type so we can not use it to see if it is efa-only. This limitation is why we are validating that it is efa-only using device_name and ip addess.

Tests

  • Ran 'test_multiple_nics` integration test

References

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

fi

# Check if this is an EFA-only interface (no device name or IP)
if [ -z "${DEVICE_NAME}" ] || [ -z "${DEVICE_IP_ADDRESS}" ]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where it is documented that efa-only interfaces do not have a device name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't find official documentation but when you try to run this script with an efa-only interface type it has the following input:

{"DEVICE_NAME"=>"", "DEVICE_NUMBER"=>"1", "NETWORK_CARD_INDEX"=>"1", "GW_IP_ADDRESS"=>"192.168.112.1", "DEVICE_IP_ADDRESS"=>nil, "CIDR_PREFIX_LENGTH"=>"20", "NETMASK"=>"255.255.240.0", "CIDR_BLOCK"=>"192.168.112.0/20", "MAC"=>"02:6e:6f:a7:24:8d"}

So it doesn't input a device name or ip address.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is ok for EFA only, but removing the check on DEVICE_NAME and DEVICE_IP_ADDRESS (lines above) we are accepting invalid values for the other types of interface.

Before your change we required all network interfaces to have both DEVICE_NAME and DEVICE_IP_ADDRESS. Now we don't check anymore. Removing the check makes sense for EFA only interfaces, but it does not for the other types.

A more robust logic should be: require all variables are set, unless both DEVICE_NAME and DEVICE_IP_ADDRESS are unset

@gmarciani
Copy link
Contributor

this change contributes to supporting GB200 instance types. Let' make sure to capture an entry in the changelog


# If one of these is missing but not both, it is an invalid configuration
if [ -z "${DEVICE_NAME}" ] || [ -z "${DEVICE_IP_ADDRESS}" ]; then
echo "One or more environment variables missing"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please specify the name of variables that may be missing.
This script does not have -x, so if we see from the logs the message 'One or more environment variables missing' we do not know if it is the first or second check that is failing.

@hgreebe hgreebe force-pushed the develop branch 2 times, most recently from 9fe8c05 to 81cf5a0 Compare August 8, 2025 14:24
@@ -1,18 +1,29 @@
#!/bin/sh

set -ex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for aligning all Os specific scripts.
Since we are doing it, please use the same execution options: set -ex everywhere to facilitate troubleshooting in case of issues

@hgreebe hgreebe enabled auto-merge (squash) August 8, 2025 15:50
@hgreebe hgreebe merged commit 581530a into aws:develop Aug 8, 2025
28 of 30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants