HA Kubernetes Cluster Not Using Auto-Generated Public IP in Apache CloudStack 4.21.0.0 #11642
@hodie-aurora there is a similar upstream issue logged. As a workaround, pass the following flag when executing kubectl commands: kubectl --insecure-skip-tls-verify=true
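For example, assuming the kube.conf downloaded from the CloudStack UI is saved locally as ./kube.conf (the path is just an assumption here), the workaround looks like this:

```sh
# Workaround only: skip TLS certificate verification for these invocations.
# This does not fix the underlying certificate/endpoint mismatch.
kubectl --kubeconfig ./kube.conf --insecure-skip-tls-verify=true cluster-info
kubectl --kubeconfig ./kube.conf --insecure-skip-tls-verify=true get nodes
```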
@weizhouapache Since #11579 can reproduce this issue, can you confirm whether this is a bug introduced in CloudStack 4.21.0.0? If yes, I'd like to know when it might be fixed: will it be in 4.22, or in a minor patch release? If it is fixed in the 4.21 series, will the release packages be updated? Thank you.
@weizhouapache Following up on my previous comment, I believe that using kubectl --insecure-skip-tls-verify=true only works around the symptom of the access problem and doesn't resolve the root cause. The fundamental issue appears to be that during cluster initialization the Kubernetes API server is configured to point to the internal IP of a single control node VM (e.g., 10.1.0.219:6443) instead of the auto-generated public IP. If the cluster were properly set up to use the public IP (through the load balancer), the kubectl access problems would be resolved naturally, and the cluster would be truly highly available, meaning it could tolerate the failure of up to (but fewer than) half of the control nodes without the entire cluster going down. Is my understanding of the root cause correct? Thank you for any confirmation or additional insights!
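To check that hypothesis, these are the two checks I have in mind (assuming the control nodes follow the standard kubeadm layout and the downloaded kubeconfig is saved as ./kube.conf; both are assumptions on my part):

```sh
# 1) Which endpoint does the downloaded kubeconfig actually point at?
kubectl --kubeconfig ./kube.conf config view --minify \
  -o jsonpath='{.clusters[0].cluster.server}'; echo

# 2) From inside the cluster: which controlPlaneEndpoint was kubeadm initialized with?
#    (kubeadm stores its ClusterConfiguration in the kubeadm-config ConfigMap.)
kubectl -n kube-system get configmap kubeadm-config \
  -o jsonpath='{.data.ClusterConfiguration}' | grep -i controlPlaneEndpoint
```

If both show an internal control node IP rather than the public IP, that would match the behaviour described above.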
Environment
CloudStack Version: Upgraded to 4.21.0.0
Kubernetes Template: setup-v1.33.1-calico-x86_64.iso
Setup: Using VPC with a subnet, creating HA K8s cluster without specifying an external load balancer IP (expecting auto-generation of public IP with port forwarding and load balancing)
Description
Previously, in older versions of CloudStack, I successfully created HA Kubernetes clusters using the following workflow:
1. Create a VPC.
2. Create a subnet within the VPC.
3. Deploy an HA K8s cluster without filling in the external load balancer IP. This would automatically generate a public IP, configure port forwarding, and set up load balancing.
4. On a control node, running kubectl cluster-info would show the cluster endpoint pointing to the auto-generated public IP.
5. Downloading the kube.conf from the K8s cluster page in the CloudStack UI worked normally for remote access.
This behavior was logical and worked as expected.
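For reference, this is roughly how I would drive the same workflow from the CLI with CloudMonkey (cmk); the IDs below are placeholders and the parameter names are from memory, so please verify them against the 4.21 API documentation:

```sh
# Sketch only: create an HA CKS cluster without specifying an external LB IP,
# expecting CloudStack to acquire a public IP and create the load balancing and
# port forwarding rules itself. All <...> values are placeholders.
cmk create kubernetescluster \
  name=ha-demo \
  zoneid=<zone-uuid> \
  kubernetesversionid=<k8s-version-uuid> \
  serviceofferingid=<offering-uuid> \
  networkid=<vpc-tier-uuid> \
  controlnodes=3 \
  size=3
```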
However, after upgrading to CloudStack 4.21.0.0, I'm encountering an issue with the same workflow:
The public IP is still auto-generated.
Port forwarding and load balancing rules are created successfully (as shown in the UI).
But when I run kubectl cluster-info on a control node inside the cluster, the IP points to one of the internal control node IPs (e.g., 10.1.0.219:6443) instead of the public IP.
Attempting to use the downloaded kube.conf fails, likely due to certificate verification issues or inability to connect to the server via the public IP.
Running kubectl get pods -A also fails with TLS certificate verification errors: "Failed to verify certificate: x509: certificate is valid for [internal IPs], not [public IP]".
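One way to see which names and IPs the certificate served on the public endpoint actually contains (192.168.122.109 is the auto-generated public IP from my setup, shown in the screenshots below; substitute your own):

```sh
# Show the Subject Alternative Names of the certificate presented on the
# load-balanced public endpoint; these should explain the x509 error above.
echo | openssl s_client -connect 192.168.122.109:6443 2>/dev/null \
  | openssl x509 -noout -text | grep -A1 'Subject Alternative Name'
```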
Screenshots attached for reference:
kubectl cluster-info output showing internal control node IP.

CloudStack UI showing auto-generated public IP (192.168.122.109) with port forwarding rules (private port 22 to public ports 2222-2225 TCP, mapping to control nodes at 10.1.0.x) and load balancing setup (api-lb on port 6443 TCP, active, pointing to control nodes at 10.1.0.44, 10.1.0.133, 10.1.0.219).
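For completeness, the same rules can be cross-checked from the API side with CloudMonkey; <public-ip-uuid> below is a placeholder for the UUID of the acquired address:

```sh
# Confirm the CloudStack-side rules match what the UI shows (IDs are placeholders).
cmk list publicipaddresses ipaddress=192.168.122.109
cmk list loadbalancerrules publicipid=<public-ip-uuid>
cmk list portforwardingrules ipaddressid=<public-ip-uuid>
```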
I'm not sure whether this is due to a configuration error on my end, a bug in 4.21.0.0, or a new mechanism introduced in this version (e.g., changes to the CloudStack Kubernetes Service/CKS plugin, which I read has enhancements for flexible node configurations and hypervisor selection in 4.21).
Questions
Is this a configuration issue? If so, what should I check or troubleshoot? For example:
Network settings in the VPC/subnet?
Kubernetes template compatibility with 4.21?
Any specific flags or options during cluster creation?
Certificate generation or API server config? (A node-side check for this is sketched right after these questions.)
Is this due to new features in 4.21? From what I've seen in the release notes and blogs (e.g., ShapeBlue's deep dive), CKS has been updated for better adaptability, including separate templates for worker/control/etcd nodes. If there's a new required step for HA public IP handling, what is the correct procedure to ensure the cluster uses the auto-generated public IP externally?
Workarounds or Fixes? Has anyone else encountered this? Any patches or config tweaks recommended?
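On the certificate/API-server point specifically, this is the node-side check I have in mind (paths assume the standard kubeadm layout that CKS nodes appear to use; adjust if your template differs):

```sh
# Run on a control node: list the IP/DNS SANs baked into the API server
# certificate, and the endpoint settings the kube-apiserver was started with.
sudo openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text \
  | grep -A1 'Subject Alternative Name'
sudo grep -E 'advertise-address|secure-port' /etc/kubernetes/manifests/kube-apiserver.yaml
```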
Additional Request: Tutorials
On a related note, does anyone have recommendations for comprehensive video or illustrated tutorials on using CloudStack? The official docs (docs.cloudstack.apache.org) cover basics but lack detailed walkthroughs for features like this K8s integration. Official full-series videos/articles would be ideal, but unofficial ones are welcome too.