service IPs and ports are not released when deleting a service via a finalizer-removing update #87603

@chrischdi

Description

What happened:

  • Created a service including a finalizer
  • Triggered deletion of the service
  • Removed finalizer
  • Service got deleted by the apiserver (no longer visible via kubectl)
  • Tried to create service again
  • Creation was denied: The Service "foo" is invalid: spec.ports[0].nodePort: Invalid value: 30003: provided port is already allocated
    The apiserver logged the following lines just before this happened:
    E0128 08:15:36.920788       1 repair.go:145] the node port 30003 for service foo/default is not allocated; repairing
    E0128 08:15:36.920837       1 repair.go:237] the cluster IP 10.0.0.81 for service foo/default is not allocated; repairing
    
  • After about 10 minutes I am able to create the service again; the apiserver logs the following lines while repairing the allocation:
    E0128 08:28:51.429642       1 repair.go:184] the node port 30003 appears to have leaked: cleaning up
    E0128 08:28:51.436350       1 repair.go:311] the cluster IP 10.0.0.81 appears to have leaked: cleaning up
    

What you expected to happen:

  • The service can be created again a few seconds after deletion (a quick way to check this is the retry loop sketched below)
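    A quick way to verify this (not part of the original report) is a retry loop around the service.yaml used in the repro below:

    until kubectl apply -f service.yaml; do
      echo "[$(date +%H:%M:%S)] creation still blocked, retrying"
      sleep 5
    done

    With the expected behaviour the first apply succeeds; with the bug the loop keeps failing until the apiserver's repair loop cleans up the leaked node port and cluster IP.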

How to reproduce it (as minimally and precisely as possible):

cd $(mktemp -d)
mkdir etcd
docker run -d -p 2379:2379 --name=kube-etcd -v $(pwd)/etcd:/tmp/ --rm k8s.gcr.io/etcd:3.3.15 /usr/local/bin/etcd --data-dir /tmp/etcd --advertise-client-urls=http://0.0.0.0:2379 --listen-client-urls=http://0.0.0.0:2379
docker run -d --net=host --name=kube-apiserver --rm k8s.gcr.io/kube-apiserver:v1.17.2 kube-apiserver --etcd-servers http://127.0.0.1:2379 --insecure-port 8080 --authorization-mode=RBAC

export KUBECONFIG=$(pwd)/kubeconfig
touch $KUBECONFIG
kubectl config set-cluster etcd-local --server=http://localhost:8080
kubectl config set-context etcd-local --cluster=etcd-local
kubectl config use-context etcd-local

cat <<EOF > service.yaml
apiVersion: v1
kind: Service
metadata:
  name: foo
  finalizers:
  - foo.bar/some-finalizer
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080
    nodePort: 30003
  selector:
    app: kuard
  type: NodePort

EOF

for i in {1..200}; do
  echo "[$(date +%Y-%m-%d-%H:%M:%S)] # $i"
  kubectl apply -f service.yaml
  kubectl delete svc foo --wait=false
  sleep 1
  kubectl patch svc foo --type='json' -p='[{"op":"remove","path":"/metadata/finalizers"}]'
  kubectl delete svc foo --ignore-not-found
  sleep 1
done
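
Optional, not part of the original repro: the allocator state is stored separately from the Service object in etcd, which is why it can outlive the deleted service. Assuming the apiserver's default /registry storage prefix and that etcdctl is available in the etcd image, the relevant keys can be listed against the kube-etcd container started above:

docker exec -e ETCDCTL_API=3 kube-etcd etcdctl --endpoints=http://127.0.0.1:2379 \
  get --prefix --keys-only /registry/ranges/
# expected keys (assumption): /registry/ranges/serviceips and /registry/ranges/servicenodeports
docker exec -e ETCDCTL_API=3 kube-etcd etcdctl --endpoints=http://127.0.0.1:2379 \
  get --prefix --keys-only /registry/services/specs/default/
# after the finalizer is removed, the foo service key should no longer be listed here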

Example output:

[2020-01-28-08:55:25] # 1                                                                        
service/foo unchanged                                                                                         
service "foo" deleted                                 
service/foo patched                                                      
...
[2020-01-28-08:58:21] # 77
service/foo created
service "foo" deleted
service/foo patched
[2020-01-28-08:58:23] # 78
service/foo created
service "foo" deleted
service/foo patched
[2020-01-28-08:58:26] # 79
The Service "foo" is invalid: spec.ports[0].nodePort: Invalid value: 30003: provided port is already allocated
Error from server (NotFound): services "foo" not found
Error from server (NotFound): services "foo" not found
[2020-01-28-08:58:28] # 80
The Service "foo" is invalid: spec.ports[0].nodePort: Invalid value: 30003: provided port is already allocated
Error from server (NotFound): services "foo" not found
Error from server (NotFound): services "foo" not found
[2020-01-28-08:58:30] # 81
The Service "foo" is invalid: spec.ports[0].nodePort: Invalid value: 30003: provided port is already allocated
Error from server (NotFound): services "foo" not found
Error from server (NotFound): services "foo" not found
...
[2020-01-28-09:07:23] # 5
The Service "foo" is invalid: spec.ports[0].nodePort: Invalid value: 30003: provided port is already allocated
Error from server (NotFound): services "foo" not found
Error from server (NotFound): services "foo" not found
[2020-01-28-09:07:26] # 6
service/foo created
service "foo" deleted
service/foo patched

Anything else we need to know?:

  • This does not always happen; it is flaky

  • The problem also gets auto-resolved by the apiserver after some time (but this can take about 10 minutes):
    E0128 08:01:24.562044 1 repair.go:300] the cluster IP 10.0.0.215 may have leaked: flagging for later clean up

  • Background for us here: we want to run a custom controller for Services of type LoadBalancer and want to use a finalizer. We hit this issue occasionally during development (a sketch of the finalizer-removal step is below).
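    A minimal sketch (not from the original report) of the cleanup step such a controller would perform, removing only its own finalizer instead of clearing the whole list; the JSON-patch index 0 is a hypothetical example and has to match the position of foo.bar/some-finalizer in metadata.finalizers:

    kubectl get svc foo -o jsonpath='{.metadata.finalizers}'
    kubectl patch svc foo --type='json' -p='[{"op":"remove","path":"/metadata/finalizers/0"}]'

    Once the last finalizer is removed the apiserver deletes the object, which is exactly the point at which the cluster IP and node port should be released.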

Environment:

  • Kubernetes version (use kubectl version):
    $ kubectl version
    Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.6", GitCommit:"72c30166b2105cd7d3350f2c28a219e6abcd79eb", GitTreeState:"clean", BuildDate:"2020-01-18T23:31:31Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.2", GitCommit:"59603c6e503c87169aea6106f57b9f242f64df89", GitTreeState:"clean", BuildDate:"2020-01-18T23:22:30Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"linux/amd64"}
    
  • Cloud provider or hardware configuration: none / locally reproducible / all are affected
  • OS (e.g: cat /etc/os-release):
    $ cat /etc/os-release
    NAME="Ubuntu"
    VERSION="18.04.3 LTS (Bionic Beaver)"
    ID=ubuntu
    ID_LIKE=debian
    PRETTY_NAME="Ubuntu 18.04.3 LTS"
    VERSION_ID="18.04"
    HOME_URL="https://www.ubuntu.com/"
    SUPPORT_URL="https://help.ubuntu.com/"
    BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
    PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
    VERSION_CODENAME=bionic
    UBUNTU_CODENAME=bionic
    
  • Kernel (e.g. uname -a):
    $ uname -a
    Linux 5.3.0-26-generic #28~18.04.1-Ubuntu SMP Wed Dec 18 16:40:14 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:

Metadata

Labels

  • kind/bug: Categorizes issue or PR as related to a bug.
  • priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
  • sig/network: Categorizes an issue or PR as relevant to SIG Network.
