Description
In an attempted dual ToR setup, with ECMP routes to some node loopback IPs and pod /26 blocks, we tried enabling IP-in-IP (to address a separate bootstrapping issue) and observed that most of the routes did not change to IP-in-IP routes:
ubuntu@lance-ipv6:~$ docker exec kind-worker ip r
default via 172.31.12.2 dev eth1 src 172.31.12.3
172.31.10.0/23 src 172.31.10.4
    nexthop dev eth0 weight 1
    nexthop dev eth1 weight 1
172.31.11.0/24 dev eth0 proto kernel scope link src 172.31.11.3
172.31.12.0/24 dev eth1 proto kernel scope link src 172.31.12.3
172.31.20.0/23 src 172.31.10.4
    nexthop via 172.31.11.1 dev eth0 weight 1
    nexthop via 172.31.12.1 dev eth1 weight 1
172.31.21.0/24 via 172.31.11.1 dev eth0
172.31.22.0/24 via 172.31.12.1 dev eth1
192.168.82.0/26 via 172.31.10.3 dev tunl0 proto bird onlink
192.168.110.128/26 proto bird
    nexthop via 172.31.11.1 dev eth0 weight 1
    nexthop via 172.31.12.1 dev eth1 weight 1
blackhole 192.168.162.128/26 proto bird
192.168.162.129 dev calif2ae67c76cf scope link
192.168.162.130 dev calic1faebfe4cf scope link
192.168.162.131 dev calib66283a5cca scope link
192.168.162.133 dev calid730704be43 scope link
192.168.195.192/26 proto bird
    nexthop via 172.31.11.1 dev eth0 weight 1
    nexthop via 172.31.12.1 dev eth1 weight 1
In fact, only the non-ECMP routes change to go via the IP-in-IP tunnel device tunl0.
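To make the split between tunnelled and non-tunnelled routes easier to check on other nodes, here is a rough Python sketch that groups the "proto bird" routes from `ip r` output into tunnelled, multipath, and other. The function names and the `SAMPLE` excerpt are mine, not part of any Calico or BIRD tooling, and the parsing assumes the plain-text `ip r` layout shown above (a header line followed by `nexthop` lines for ECMP routes).

```python
# Hypothetical helper, not part of Calico or BIRD: classify "proto bird"
# routes from plain `ip r` output by how they were programmed.

# Route type keywords that `ip route` may print before the prefix.
ROUTE_TYPES = {"blackhole", "unreachable", "prohibit", "throw", "local", "broadcast"}


def value_after(tokens, key):
    """Return the token following `key`, or None if it is absent."""
    if key in tokens:
        i = tokens.index(key)
        if i + 1 < len(tokens):
            return tokens[i + 1]
    return None


def parse_routes(text):
    """Group each route header line with the nexthop lines that follow it."""
    routes = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("nexthop"):
            if routes:
                routes[-1]["nexthops"].append(line)
            continue
        routes.append({"header": line, "nexthops": []})
    return routes


def classify_bird_routes(text, tunnel_dev="tunl0"):
    """Return prefixes of BIRD-programmed routes, grouped by kind."""
    result = {"tunnel": [], "multipath": [], "other": []}
    for route in parse_routes(text):
        tokens = route["header"].split()
        if value_after(tokens, "proto") != "bird":
            continue  # only routes programmed by BIRD are of interest
        prefix = tokens[1] if tokens[0] in ROUTE_TYPES else tokens[0]
        if route["nexthops"]:
            result["multipath"].append(prefix)
        elif value_after(tokens, "dev") == tunnel_dev:
            result["tunnel"].append(prefix)
        else:
            result["other"].append(prefix)
    return result


# Excerpt of the routing table from this issue.
SAMPLE = """\
192.168.82.0/26 via 172.31.10.3 dev tunl0 proto bird onlink
192.168.110.128/26 proto bird
nexthop via 172.31.11.1 dev eth0 weight 1
nexthop via 172.31.12.1 dev eth1 weight 1
blackhole 192.168.162.128/26 proto bird
192.168.162.129 dev calif2ae67c76cf scope link
192.168.195.192/26 proto bird
nexthop via 172.31.11.1 dev eth0 weight 1
nexthop via 172.31.12.1 dev eth1 weight 1
"""

print(classify_bird_routes(SAMPLE))
```

On the excerpt, the only prefix that lands in the "tunnel" group is 192.168.82.0/26; both ECMP /26 blocks land in "multipath", which matches what we see by eye.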
Reviewing the BIRD code shows why: it has no support for this case. There are two places where the code handles EA_KRT_TUNNEL, and in both places the handling is also conditional on RTD_ROUTER, which means "a route with a single path". In both places we would need to add corresponding code for RTD_MULTIPATH.
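As a toy illustration of that gating (not the actual BIRD code, which is C), the Python sketch below models a route with a destination type and an optional tunnel attribute. `Dest`, `Route`, and both predicate functions are hypothetical stand-ins for RTD_ROUTER, RTD_MULTIPATH, and EA_KRT_TUNNEL. It only shows which routes pass the condition; it says nothing about what the programmed nexthops should look like for a tunnelled multipath route, which is still to be worked out.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Dest(Enum):
    ROUTER = auto()     # stand-in for RTD_ROUTER (single path)
    MULTIPATH = auto()  # stand-in for RTD_MULTIPATH (ECMP)


@dataclass
class Route:
    prefix: str
    dest: Dest
    tunnel: Optional[str] = None  # stand-in for the EA_KRT_TUNNEL attribute


def tunnel_applied_today(route: Route) -> bool:
    """Model of the current gating: tunnel handling only for single-path routes."""
    return route.tunnel is not None and route.dest is Dest.ROUTER


def tunnel_applied_with_multipath(route: Route) -> bool:
    """Model of the gating once multipath routes are also handled."""
    return route.tunnel is not None and route.dest in (Dest.ROUTER, Dest.MULTIPATH)


routes = [
    Route("192.168.82.0/26", Dest.ROUTER, tunnel="tunl0"),
    Route("192.168.110.128/26", Dest.MULTIPATH, tunnel="tunl0"),
]
for r in routes:
    print(r.prefix, tunnel_applied_today(r), tunnel_applied_with_multipath(r))
```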
Does this matter?
Well, it depends on whether there are any use cases for ECMP routes while IP-in-IP is in use. The dual ToR work unconditionally added "merge paths on" to the BIRD config, which means that any TSEE deployment will program an ECMP route whenever it receives more than one possible path for a given prefix. We should perhaps make that conditional somehow.