Changes to the master cloud tags are removing them from Load Balancer and leaving the cluster unreachable #9862

@marianomirabelli

1. What kops version are you running? The command kops version will display
this information.

We are using kops version 1.17.0.

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

We are using Kubernetes 1.16.9.

3. What cloud provider are you using?

We are using AWS.

4. What commands did you run? What is the simplest way to reproduce this issue?

We use kops together with Terraform, so we run the following sequence of commands.

First, we generate the kops-spec.yml file:

kops toolbox template --name ${cluster-name} --values cluster-vars.json --template cluster-template.yml --format-yaml > kops-spec.yml

Then, we replace the state in the S3 bucket:

kops replace -f kops-spec.yml --state ${bucket-name} --kops-state --name ${cluster-name} --force

We execute the kops update as follows:

kops update cluster ${cluster-name} --state=${bucket-name} --out=terraform/ --target=terraform

Then we navigate to the terraform folder and execute:

terraform plan

Finally, we do:

terraform apply
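
The sequence above can be collected into one shell function. This is a sketch under assumptions, not the reporter's actual script. The function name and its two arguments are hypothetical, and the kops state store is assumed to be passed as a single value. Note that in bash a literal `${cluster-name}` is parsed as "the value of `cluster`, or `name` if it is unset", so the placeholders are written here as ordinary variables.

```shell
# Sketch only: apply_cluster_changes and its arguments are illustrative names,
# not something defined by kops or by the original report.
# The body runs in a subshell, so `set -e` and `cd` do not leak to the caller.
apply_cluster_changes() (
  set -euo pipefail
  cluster_name="$1"   # e.g. my.example.com
  kops_state="$2"     # e.g. s3://my-kops-state-bucket (assumed single value)

  # 1. Render the cluster spec from the template.
  kops toolbox template --name "$cluster_name" \
    --values cluster-vars.json --template cluster-template.yml \
    --format-yaml > kops-spec.yml

  # 2. Replace the spec stored in the state bucket.
  kops replace -f kops-spec.yml --state "$kops_state" \
    --name "$cluster_name" --force

  # 3. Generate the Terraform configuration instead of applying directly.
  kops update cluster "$cluster_name" --state="$kops_state" \
    --out=terraform/ --target=terraform

  # 4. Review the plan, then apply it.
  cd terraform
  terraform plan
  terraform apply
)
```

It would be called as `apply_cluster_changes my.example.com s3://my-kops-state-bucket`. Because of `set -e`, a failure in any step stops the sequence before `terraform apply` runs.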

5. What happened after the commands executed?

When we change the master instance groups, for example by adding a new cloudLabel, the master nodes are removed from the load balancer. As a result, the cluster becomes unreachable when we try to run the rolling-update command.
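
One way to confirm this symptom after `terraform apply` is to query AWS directly. The sketch below assumes the API load balancer is a Classic ELB and that the AWS CLI is configured for the cluster's account and region. The function name is hypothetical, and the Auto Scaling group and load balancer names must be supplied by the caller, since the report does not state them.

```shell
# Sketch only: check_master_registration is an illustrative helper.
# Arguments: the master Auto Scaling group name and the API ELB name.
check_master_registration() {
  local asg_name="$1" elb_name="$2"

  # Which load balancers are still attached to the master Auto Scaling group?
  echo "Load balancers attached to $asg_name:"
  aws autoscaling describe-load-balancers \
    --auto-scaling-group-name "$asg_name" \
    --query 'LoadBalancers[].LoadBalancerName' --output text

  # Which instances does the load balancer know about, and in what state?
  echo "Instance health in $elb_name:"
  aws elb describe-instance-health \
    --load-balancer-name "$elb_name" \
    --query 'InstanceStates[].[InstanceId,State]' --output text
}
```

If the first query returns nothing for a master group, or the second lists no master instances, that would match the behavior described here.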

6. What did you expect to happen?

We expect that changing the master nodes through Terraform does not make the cluster unreachable for the rolling update or for subsequent kubectl operations.

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

kind: Cluster
metadata:
  creationTimestamp: null
  name: pipeline-test.test.almundo.io
spec:
  api:
    loadBalancer:
      type: Internal
      useForInternalApi: true
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: aws
  configBase: s3://pipeline-test.test.almundo.io--kops-state/pipeline-test.test.almundo.io
  etcdClusters:
  - etcdMembers:
    - instanceGroup: master-us-east-1a
      name: us-east-1a
    - instanceGroup: master-us-east-1b
      name: us-east-1b
    - instanceGroup: master-us-east-1c
      name: us-east-1c
    name: main
    version: 3.3.13
  - etcdMembers:
    - instanceGroup: master-us-east-1a
      name: us-east-1a
    - instanceGroup: master-us-east-1b
      name: us-east-1b
    - instanceGroup: master-us-east-1c
      name: us-east-1c
    name: events
    version: 3.3.13
  iam:
    allowContainerRegistry: true
    legacy: false
  kubelet:
    anonymousAuth: false
    authenticationTokenWebhook: true
    authorizationMode: Webhook
  kubernetesApiAccess:
  - 0.0.0.0/0
  kubernetesVersion: 1.16.8
  masterInternalName: api.internal.pipeline-test.test.almundo.io
  masterPublicName: api.pipeline-test.test.almundo.io
  networkCIDR: 10.5.0.0/16
  networkID: vpc-00e1e798cb482b198
  networking:
    calico:
      crossSubnet: true
      majorVersion: v3
      mtu: 8912
  nonMasqueradeCIDR: 100.64.0.0/10
  subnets:
  - cidr: 10.5.10.0/24
    id: subnet-049ff35a1c5756450
    name: utility-us-east-1a
    type: Utility
    zone: us-east-1a
  - cidr: 10.5.20.0/24
    id: subnet-069cadd0600707792
    name: utility-us-east-1b
    type: Utility
    zone: us-east-1b
  - cidr: 10.5.30.0/24
    id: subnet-07a637c1d8ae8de7e
    name: utility-us-east-1c
    type: Utility
    zone: us-east-1c
  - cidr: 10.5.110.0/24
    id: subnet-083e778d6dc8c03be
    name: us-east-1a
    type: Private
    zone: us-east-1a
  - cidr: 10.5.120.0/24
    id: subnet-0c30bbfa43ceb6dfd
    name: us-east-1b
    type: Private
    zone: us-east-1b
  - cidr: 10.5.130.0/24
    id: subnet-019ee23ee6a8efa90
    name: us-east-1c
    type: Private
    zone: us-east-1c
  topology:
    dns:
      type: Public
    masters: public
    nodes: public

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2020-09-02T14:37:46Z"
  generation: 5
  labels:
    kops.k8s.io/cluster: pipeline-test.test.almundo.io
  name: master-us-east-1a
spec:
  cloudLabels:
    bar: test2
    cluster: pipeline-test.test.almundo.io
    env: dv
    foo: test
    k8s-type: master
    solution: k8s
    zem: test3
  image: kope.io/k8s-1.16-debian-stretch-amd64-hvm-ebs-2020-07-20
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-east-1a
  role: Master
  subnets:
  - us-east-1a

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2020-09-02T14:37:46Z"
  generation: 5
  labels:
    kops.k8s.io/cluster: pipeline-test.test.almundo.io
  name: master-us-east-1b
spec:
  cloudLabels:
    bar: test2
    cluster: pipeline-test.test.almundo.io
    env: dv
    foo: test
    k8s-type: master
    solution: k8s
    zem: test3
  image: kope.io/k8s-1.16-debian-stretch-amd64-hvm-ebs-2020-07-20
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-east-1a
  role: Master
  subnets:
  - us-east-1b

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2020-09-02T14:37:46Z"
  generation: 5
  labels:
    kops.k8s.io/cluster: pipeline-test.test.almundo.io
  name: master-us-east-1c
spec:
  cloudLabels:
    bar: test2
    cluster: pipeline-test.test.almundo.io
    env: dv
    foo: test
    k8s-type: master
    solution: k8s
    zem: test3
  image: kope.io/k8s-1.16-debian-stretch-amd64-hvm-ebs-2020-07-20
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-us-east-1a
  role: Master
  subnets:
  - us-east-1c

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: "2020-09-02T14:37:46Z"
  generation: 1
  labels:
    kops.k8s.io/cluster: pipeline-test.test.almundo.io
  name: nodes
spec:
  cloudLabels:
    env: dv
    solution: k8s
  image: kope.io/k8s-1.16-debian-stretch-amd64-hvm-ebs-2020-07-20
  machineType: t3.medium
  maxSize: 3
  minSize: 2
  nodeLabels:
    kops.k8s.io/instancegroup: nodes
  role: Node
  subnets:
  - us-east-1a
  - us-east-1b
  - us-east-1c

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

The log files are attached to this issue:

kops-replace-output.txt
kops-update-output.txt
terraform-plan-log.txt

Metadata

    Labels

    lifecycle/stale: Denotes an issue or PR that has remained open with no activity and has become stale.
