Improvement of Egress IP on secondary interface of the OpenShift (OVN-K CNI) node

Published on 13 Jun 2024 by Vinu K

First of all, what is an EgressIP? An EgressIP ensures that traffic from one or more pods in one or more namespaces has a consistent source IP address when it reaches services outside the cluster network. It uses a namespaceSelector or podSelector to identify that traffic. The OVN-K documentation explains the traffic flow in depth. When the EgressIP was attached to a secondary interface of the OpenShift node, however, pods could not communicate with destinations outside that interface's subnet. The workaround was not practical, because it required modifying the node's rule table for the source IP of the pod.


What went wrong

Whenever a Pod with an EgressIP is created, a policy routing rule with priority 6000 is added on the node to which the EgressIP has been assigned. It can be listed with the command ip rule. The policy routing rule tells the system which table to consult to find the correct route. For example, the rule 6000: from 10.128.1.55 lookup 1134 tells the system to use table 1134 if the source address is 10.128.1.55. Hence, it is often referred to as source routing. However, the table does not appear in /etc/iproute2/rt_tables. Why? The /etc/iproute2/rt_tables file is a human-readable reference that maps numerical table IDs to names, and it is not updated automatically when a table is created dynamically by the ovnkube-controller. The command ip route show table 1134 shows the routes used for the Pod traffic.

[Figure: EgressIP]

Here comes the actual issue. The controller adds only a single any (default) route, without a gateway, to the rule table. With that route alone, traffic cannot reach a network (for example, the internet) that is not directly attached to the interface.

In 4.15.9, the following functions generate the route for the secondary link.

func getDefaultRouteForLink(link netlink.Link, v6 bool) *routemanager.RoutesPerLink {
	return &routemanager.RoutesPerLink{Link: link,
		Routes: []routemanager.Route{
			getDefaultRoute(link.Attrs().Index, v6),
		},
	}
}
func getDefaultRoute(linkIdx int, v6 bool) routemanager.Route {
	anyCIDR := defaultV4AnyCIDR
	if v6 {
		anyCIDR = defaultV6AnyCIDR
	}
	return routemanager.Route{
		Table:  getRouteTableID(linkIdx),
		Subnet: anyCIDR,
	}
}

The above generates the route:

default dev <interface>

See the logs from the ovnkube-controller.

I0613 03:59:46.272484    7647 egressip.go:381] Processing Egress IP foo
I0613 04:00:07.368172    7647 egressip.go:591] Adding pod egress IP status: {node.sno4.onp.local 192.168.124.99} for EgressIP: foo and pod: foo/pod/[10.128.0.208/23]
I0613 04:00:10.474732    7647 egressip.go:381] Processing Egress IP foo
I0613 04:00:10.480805    7647 egressip.go:542] Generating config for EgressIP foo IP 192.168.124.99 which is hosted by a non-OVN managed interface (name enp7s0)
I0613 04:00:10.535929    7647 route_manager.go:92] Route Manager: attempting to add routes for link: Route(s) for link name: "enp7s0", with 1 routes:  Route 1: "Table 1134 Subnet: 0.0.0.0/0"
I0613 04:00:10.536104    7647 route_manager.go:102] Route Manager: completed adding route: Route(s) for link name: "enp7s0", with 1 routes:  Route 1: "Table 1134 Subnet: 0.0.0.0/0"
I0613 04:00:10.537000    7647 route_manager.go:145] Route Manager: netlink route addition event: "{Ifindex: 134 Dst: <nil> Src: <nil> Gw: <nil> Flags: [] Table: 1134 Realm: 0}"

See the packet captures from the node when the Pod pings google.com and the gateway of the secondary interface (192.168.124.1). The former does not succeed, but the latter does.

15:29:33.393427 5fd314eac04272b P   IP 10.128.1.55 > 142.251.42.78: ICMP echo request, id 15, seq 1, length 64
15:29:33.393912 ovn-k8s-mp0 In  IP 10.128.1.55 > 142.251.42.78: ICMP echo request, id 15, seq 1, length 64
15:29:36.459963 ovn-k8s-mp0 Out IP 192.168.124.99 > 10.128.1.55: ICMP host 142.251.42.78 unreachable, length 92
15:29:36.460344 5fd314eac04272b Out IP 192.168.124.99 > 10.128.1.55: ICMP host 142.251.42.78 unreachable, length 92
16:08:51.191993 5fd314eac04272b P   IP 10.128.1.55 > 192.168.124.1: ICMP echo request, id 18, seq 1, length 64
16:08:51.192063 ovn-k8s-mp0 In  IP 10.128.1.55 > 192.168.124.1: ICMP echo request, id 18, seq 1, length 64
16:08:51.192088 enp7s0 Out IP 192.168.124.99 > 192.168.124.1: ICMP echo request, id 18, seq 1, length 64
16:08:51.192269 enp7s0 In  IP 192.168.124.1 > 192.168.124.99: ICMP echo reply, id 18, seq 1, length 64
16:08:51.192288 ovn-k8s-mp0 Out IP 192.168.124.1 > 10.128.1.55: ICMP echo reply, id 18, seq 1, length 64
16:08:51.192308 5fd314eac04272b Out IP 192.168.124.1 > 10.128.1.55: ICMP echo reply, id 18, seq 1, length 64

The fix

The fix landed in the 4.15.10 release, where getDefaultRouteForLink was replaced by generateRoutesForLink. The new function copies all routes associated with the interface from the main table into the new rule table. If the main table has no default route for the interface, it creates one.

The output below compares the two functions in the unfixed and fixed releases.

$ oc adm release info 4.15.9 --commits | grep 'ovn-kubernetes' | awk '{print $3}' | head -n 1
42b1cc427538a736f8c056171b4de7e6c6a366fb
$ git checkout 42b1cc427
HEAD is now at 42b1cc427 Merge pull request #2074 from tssurya/OCPBUGS-29599
$ git blame go-controller/pkg/node/controllers/egressip/egressip.go | grep -A6 'func getDefaultRouteForLink'
b64825622f (Martin Kennelly   2023-06-27 13:52:40 +0100 1086) func getDefaultRouteForLink(link netlink.Link, v6 bool) *routemanager.RoutesPerLink {
b64825622f (Martin Kennelly   2023-06-27 13:52:40 +0100 1087) 	return &routemanager.RoutesPerLink{Link: link,
b64825622f (Martin Kennelly   2023-06-27 13:52:40 +0100 1088) 		Routes: []routemanager.Route{
b64825622f (Martin Kennelly   2023-06-27 13:52:40 +0100 1089) 			getDefaultRoute(link.Attrs().Index, v6),
b64825622f (Martin Kennelly   2023-06-27 13:52:40 +0100 1090) 		},
b64825622f (Martin Kennelly   2023-06-27 13:52:40 +0100 1091) 	}
b64825622f (Martin Kennelly   2023-06-27 13:52:40 +0100 1092) }
$ oc adm release info 4.15.10 --commits | grep 'ovn-kubernetes' | awk '{print $3}' | head -n 1
feca446a2e3848f79e533bb28763b2d61074de6e
$ git checkout feca446a2
HEAD is now at feca446a2 Merge pull request #2094 from arghosh93/SDN-4544
$ git blame go-controller/pkg/node/controllers/egressip/egressip.go | grep -A8 'func generateRoutesForLink'
ced7a1c229 (Martin Kennelly   2024-01-29 10:08:36 +0000  620) func generateRoutesForLink(link netlink.Link, isV6 bool) ([]netlink.Route, error) {
ced7a1c229 (Martin Kennelly   2024-01-29 10:08:36 +0000  621) 	linkRoutes, err := netlink.RouteList(link, util.GetIPFamily(isV6))
ced7a1c229 (Martin Kennelly   2024-01-29 10:08:36 +0000  622) 	if err != nil {
ced7a1c229 (Martin Kennelly   2024-01-29 10:08:36 +0000  623) 		return nil, fmt.Errorf("failed to get routes for link %s: %v", link.Attrs().Name, err)
ced7a1c229 (Martin Kennelly   2024-01-29 10:08:36 +0000  624) 	}
ced7a1c229 (Martin Kennelly   2024-01-29 10:08:36 +0000  625) 	linkRoutes = ensureAtLeastOneDefaultRoute(linkRoutes, link.Attrs().Index, isV6)
ced7a1c229 (Martin Kennelly   2024-01-29 10:08:36 +0000  626) 	overwriteRoutesTableID(linkRoutes, getRouteTableID(link.Attrs().Index))
ced7a1c229 (Martin Kennelly   2024-01-29 10:08:36 +0000  627) 	return linkRoutes, nil
b64825622f (Martin Kennelly   2023-06-27 13:52:40 +0100  628) }

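The return type also changed. getDefaultRouteForLink returned a *routemanager.RoutesPerLink, a type from OVN-K's own routemanager package that groups a link with its routes, whereas generateRoutesForLink returns a []netlink.Route, the route type from github.com/vishvananda/netlink. The log format reflects this: each route is now printed with its destination, source and gateway.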

See the logs of ovnkube-controller.

I0613 06:10:42.946169    6486 egressip.go:426] Processing Egress IP foo
I0613 06:11:05.136532    6486 egressip.go:591] Adding pod egress IP status: {node.sno4.example.local 192.168.124.99} for EgressIP: foo and pod: foo/pod/[10.128.0.137/23]
I0613 06:11:09.176605    6486 route_manager.go:149] Route Manager: netlink route addition event: "{Ifindex: 143 Dst: 192.168.124.99/32 Src: 192.168.124.99 Gw: <nil> Flags: [] Table: 255 Realm: 0}"
I0613 06:11:09.185916    6486 route_manager.go:93] Route Manager: attempting to add route: {Ifindex: 143 Dst: 0.0.0.0/0 Src: 192.168.124.45 Gw: 192.168.124.1 Flags: [] Table: 1143 Realm: 0}
I0613 06:11:09.191761    6486 route_manager.go:110] Route Manager: completed adding route: {Ifindex: 143 Dst: 0.0.0.0/0 Src: 192.168.124.45 Gw: 192.168.124.1 Flags: [] Table: 1143 Realm: 0}
I0613 06:11:09.191837    6486 route_manager.go:149] Route Manager: netlink route addition event: "{Ifindex: 143 Dst: 0.0.0.0/0 Src: 192.168.124.45 Gw: 192.168.124.1 Flags: [] Table: 1143 Realm: 0}"
I0613 06:11:09.191870    6486 route_manager.go:93] Route Manager: attempting to add route: {Ifindex: 143 Dst: 192.168.124.0/24 Src: 192.168.124.45 Gw: <nil> Flags: [] Table: 1143 Realm: 0}
I0613 06:11:09.192082    6486 route_manager.go:110] Route Manager: completed adding route: {Ifindex: 143 Dst: 192.168.124.0/24 Src: 192.168.124.45 Gw: <nil> Flags: [] Table: 1143 Realm: 0}
I0613 06:11:09.195575    6486 route_manager.go:149] Route Manager: netlink route addition event: "{Ifindex: 143 Dst: 192.168.124.0/24 Src: 192.168.124.45 Gw: <nil> Flags: [] Table: 1143 Realm: 0}"

See the packet capture from the node when the Pod initiates a ping to google.com.

15:11:57.599484 5fd314eac04272b P   IP 10.128.1.55 > 142.250.183.110: ICMP echo request, id 14, seq 1, length 64
15:11:57.600066 ovn-k8s-mp0 In  IP 10.128.1.55 > 142.250.183.110: ICMP echo request, id 14, seq 1, length 64
15:11:57.600092 enp7s0 Out IP 192.168.124.99 > 142.250.183.110: ICMP echo request, id 14, seq 1, length 64
15:11:57.635614 enp7s0 In  IP 142.250.183.110 > 192.168.124.99: ICMP echo reply, id 14, seq 1, length 64
15:11:57.635630 ovn-k8s-mp0 Out IP 142.250.183.110 > 10.128.1.55: ICMP echo reply, id 14, seq 1, length 64
15:11:57.636298 5fd314eac04272b Out IP 142.250.183.110 > 10.128.1.55: ICMP echo reply, id 14, seq 1, length 64
