Expected Behavior
As usually observed with the iptables data plane, I'd expect a typha restart (as in a regular typha rolling redeployment) not to have a significant negative impact on the ebpf data plane.
Current Behavior
Using the ebpf data plane in a larger cluster (280 nodes, 1900 services, 18k pods), a typha restart causes hanging traffic. It seems that while a calico-node is reconnecting (even though its output looks similar to what it shows with the iptables data plane), the ebpf data plane is reconfigured and cannot properly forward traffic until that is done.
Specifically, I'm seeing timeouts: our ingress controllers / gateways (traefik, istio) are not able to forward traffic to the workload pods.
In my last production test, I observed effects for up to 20m after the typha redeployment. Old typha pods took up to 5m to terminate (which is our termination grace period, though it seems all connections were handed off before termination). Typha metrics show that all connections were back up ~6m after the start of the shutdown, yet I still saw application impact for up to 20m after the restart.
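For reference, something like the following can be used to watch Typha's connection count during the rollout. This is only a sketch: it assumes Typha's Prometheus metrics are enabled on the default port 9093 and that `typha_connections_active` is the metric of interest; adjust namespace and port to your install.

```bash
# Watch Typha's active connection count while the rollout is in progress.
# Assumes Typha's Prometheus metrics are enabled on the default port 9093.
kubectl -n kube-system port-forward deployment/calico-typha 9093:9093 &
watch -n 5 'curl -s http://localhost:9093/metrics | grep typha_connections_active'
```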
Also noteworthy: I see a significant drop in conntrack table size during the typha reconnects, which I would not expect. Maybe this is even the core of the problem, i.e. that conntrack state is (partially?) dropped?
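A rough way to sample the BPF conntrack table size on a node before/during/after the rollout, assuming the `calico-node -bpf conntrack dump` subcommand is available in this version (the line count is only a proxy for the number of entries):

```bash
# Sample the BPF conntrack table size on one calico-node pod.
NODE_POD=$(kubectl get pods -n kube-system -l k8s-app=calico-node -o name | head -n1)
kubectl exec -n kube-system "${NODE_POD#pod/}" -- calico-node -bpf conntrack dump | wc -l
```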
Steps to Reproduce (for bugs)
kubectl rollout restart -n kube-system deployment calico-typha
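To surface the hanging traffic while reproducing, the restart can be combined with a continuous probe through the ingress/gateway path; the URL below is hypothetical and should be replaced with a service reached via your ingress controllers:

```bash
# Continuously probe a workload service through the ingress path (hypothetical URL),
# then restart Typha and wait for the rollout to complete.
while true; do
  curl -s -o /dev/null -m 2 -w '%{http_code} %{time_total}s\n' https://my-app.example.com/healthz
  sleep 1
done &
kubectl rollout restart -n kube-system deployment calico-typha
kubectl rollout status -n kube-system deployment calico-typha
```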
Your Environment
- Calico version: 3.30.1
- Calico dataplane: ebpf
- Orchestrator version: Kubernetes v1.31.8
- Operating System and version: FlatCar ContainerLinux 4152.2.3