Exploring MetalLB's communication with routers
Published on 26 Mar 2024 by Vinu K

MetalLB is a must-have Operator in non-cloud OpenShift/Kubernetes environments. It facilitates the automated assignment of an external IP address to a Service so that it can be accessed from the outside world. In an IPv4 network, it attracts external traffic using either ARP or BGP. In this post, we discuss the use of BGP in detail. For better understanding, we can divide the topic into three parts: creating the lab environment, the flows of communication between MetalLB and a router, and tracing issues while configuring a BGP peer.


How to create a MetalLB + BGP lab environment?

Here, we are using OpenShift Network Playground (ONP) to create our lab environment. The lab environment architecture is shown below.

[Figure: Lab]

ONP provides multiple bridge interfaces. Here, we use the sno0 bridge for the VyOS instance.

[Figure: Interface]

Let’s create our SNO cluster.

onp sno4

Install and configure MetalLB.

cat << EOF | oc create -f -
apiVersion: v1
kind: Namespace
metadata:
  name: metallb-system
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: metallb-operator
  namespace: metallb-system
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: metallb-operator-sub
  namespace: metallb-system
spec:
  channel: stable
  name: metallb-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF
cat << EOF | oc create -f -
apiVersion: metallb.io/v1beta1
kind: MetalLB
metadata:
  name: metallb
  namespace: metallb-system
EOF
cat << EOF | oc create -f -
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  namespace: metallb-system
  name: onp-bgp-pool
spec:
  addresses:
    - 10.0.0.10-10.0.0.20
---
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  namespace: metallb-system
  name: onp-bgp-peer
spec:
  peerAddress: 192.168.126.3
  peerASN: 64512
  myASN: 64512
---
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: onp-bgp-advertisement
  namespace: metallb-system
spec:
  ipAddressPools:
    - onp-bgp-pool
  peers:
    - onp-bgp-peer
  communities:
    - 65535:65282
  aggregationLength: 32
  aggregationLengthV6: 128
  localPref: 100
EOF
cat << EOF | oc create -f -
apiVersion: v1
kind: Namespace
metadata:
  name: onp-metallb
---
apiVersion: v1
kind: Pod
metadata:
  name: echoserver
  namespace: onp-metallb
  labels:
    app: echoserver
spec:
  containers:
  - image: registry.k8s.io/echoserver:1.0
    imagePullPolicy: Always
    name: echoserver
    ports:
    - containerPort: 8080
    securityContext:
      runAsNonRoot: true
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      seccompProfile:
        type: RuntimeDefault
      capabilities:
        drop:
        - ALL
---
apiVersion: v1
kind: Service
metadata:
  name: echoserver
  namespace: onp-metallb
spec:
  ports:
  - port: 8080
    targetPort: 8080
    protocol: TCP
  selector:
    app: echoserver
  type: LoadBalancer
EOF
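
To make the addressing concrete, here is a minimal Python sketch that expands the onp-bgp-pool range and shows the prefix each address would take with aggregationLength: 32. The helper name pool_routes is hypothetical and is not part of MetalLB. It only illustrates the arithmetic, and MetalLB itself advertises only the addresses that are actually assigned to a Service.

```python
import ipaddress


def pool_routes(address_range: str, aggregation_length: int = 32) -> list[str]:
    """Expand a 'first-last' IPv4 range into the prefixes it could produce.

    Hypothetical helper for illustration only; not a MetalLB API.
    """
    first_text, last_text = address_range.split("-")
    first = ipaddress.IPv4Address(first_text.strip())
    last = ipaddress.IPv4Address(last_text.strip())
    routes = []
    for value in range(int(first), int(last) + 1):
        # strict=False masks the host bits, so a shorter length
        # collapses several addresses into one aggregated prefix.
        network = ipaddress.IPv4Network((value, aggregation_length), strict=False)
        if str(network) not in routes:
            routes.append(str(network))
    return routes


# Range taken from the IPAddressPool in this lab.
lab_routes = pool_routes("10.0.0.10-10.0.0.20", 32)
```

With a length of 32 every address stays its own host route, while a shorter length such as 24 would collapse the whole range into a single prefix.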

Next, install and configure VyOS. The NETWORK option selects the bridge. Use vyos as both the username and the password when prompted during the initial configuration.

onp vyos NETWORK=sno0

Configure the VyOS instance.

configure
set interfaces ethernet eth0 address 192.168.126.3/24
set protocols bgp 64512 parameters router-id 192.168.126.3
set protocols bgp 64512 neighbor 192.168.126.2 remote-as 64512
set protocols bgp 64512 neighbor 192.168.126.2 address-family ipv4-unicast
set service ssh
commit
save
exit

What are the flows of communication between MetalLB and a router?

MetalLB communicates with the external router through the FRR container in the speaker Pod. The speaker Pod uses the host network of the node. By default, the BGP session is established over TCP port 179. While connecting to the peer, the BGP session in the FRR container moves through six states: Idle, Connect, Active, OpenSent, OpenConfirm and Established. We will look at each one in detail.

[Figure: Connection]

[Figure: States]
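
The six states can be sketched as a small transition table. This is a simplified illustration loosely following RFC 4271, not FRR's implementation. The event names (start, tcp_established and so on) are labels chosen for this example, and many real events and timers are left out.

```python
# Simplified BGP session state machine (after RFC 4271); many events omitted.
# Event names are hypothetical labels used only for this sketch.
TRANSITIONS = {
    ("Idle", "start"): "Connect",
    ("Connect", "tcp_established"): "OpenSent",   # OPEN is sent here
    ("Connect", "tcp_failed"): "Active",
    ("Active", "tcp_established"): "OpenSent",
    ("Active", "connect_retry_expired"): "Connect",
    ("OpenSent", "open_received"): "OpenConfirm",
    ("OpenConfirm", "keepalive_received"): "Established",
    ("Established", "keepalive_received"): "Established",
    ("Established", "update_received"): "Established",
}


def next_state(state: str, event: str) -> str:
    """Return the next state; unlisted events (errors, NOTIFICATION) reset to Idle."""
    return TRANSITIONS.get((state, event), "Idle")


def run(events, state="Idle"):
    """Apply a sequence of events and return every state visited."""
    path = [state]
    for event in events:
        state = next_state(state, event)
        path.append(state)
    return path


happy_path = run(["start", "tcp_established", "open_received", "keepalive_received"])
```

In this sketch, a successful session walks Idle, Connect, OpenSent, OpenConfirm and Established, while any unexpected event drops the session back to Idle.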


How to trace an issue while configuring a BGP peer?

Here, we need to keep in mind the states mentioned previously. Suppose you have configured the BGP peer on both ends and are checking the state of the session. The notes below will help you interpret the status of the connection. They are based on what I have observed while working with it.

If the session stays in the Idle state, we can sniff the BGP packets to identify the wrong parameters. You will probably see a NOTIFICATION message that carries the error information.

A session that stays in the Connect state is something to check from within the speaker Pod or on the node, because at this stage the TCP connection to the peer has not yet completed.
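
Since the session runs over TCP port 179 by default, a quick reachability test can help when a session does not progress. The following is a minimal sketch using only the Python standard library. The helper name tcp_port_open is hypothetical, and it assumes Python is available wherever you run the check.

```python
import socket


def tcp_port_open(host: str, port: int = 179, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration. It only tests TCP reachability
    and does not speak BGP.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

In this lab, calling tcp_port_open("192.168.126.3") from the node would test the VyOS peer. A False result points to a routing, firewall or listener problem rather than to mismatched BGP parameters.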

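In the Active state, by contrast, BGP is still trying to acquire the TCP connection, typically by listening for the peer to connect. Once the TCP phase has completed, the BGP phase begins and the session moves to OpenSent. The BGP phase uses four message types: OPEN, UPDATE, NOTIFICATION and KEEPALIVE.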

OPEN message: This is the first exchange of BGP parameters. If the parameters match, the peer sends back an OPEN message with its own parameters.

[Figure: Open]

NOTIFICATION message: If the parameters do not match, the peer instead sends a NOTIFICATION message that carries the error information.

[Figure: Notification]

UPDATE message: This is the actual route exchange message.

[Figure: Update]

KEEPALIVE message: The peers send these to each other periodically so that a failed peer can be detected.

[Figure: Keepalive]
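
When looking at a packet capture, it helps to know how these four messages are told apart on the wire. The sketch below decodes the 19-byte BGP header and the fixed part of an OPEN message as laid out in RFC 4271. The function names are hypothetical, and the sample OPEN uses the ASN and router ID from this lab with an arbitrary hold time.

```python
import struct

MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}
MARKER = b"\xff" * 16   # every BGP message starts with 16 bytes of 0xff
HEADER_LEN = 19         # marker (16) + length (2) + type (1)


def parse_header(data: bytes):
    """Return (total_length, type_name) from a raw BGP message."""
    if len(data) < HEADER_LEN or data[:16] != MARKER:
        raise ValueError("not a BGP message header")
    length, type_code = struct.unpack("!HB", data[16:19])
    return length, MESSAGE_TYPES.get(type_code, f"UNKNOWN({type_code})")


def parse_open(data: bytes):
    """Decode the fixed fields of an OPEN message (optional parameters ignored)."""
    _, name = parse_header(data)
    if name != "OPEN" or len(data) < HEADER_LEN + 10:
        raise ValueError("not a complete OPEN message")
    version, my_as, hold_time, router_id, _opt_len = struct.unpack(
        "!BHH4sB", data[19:29]
    )
    return {
        "version": version,
        "my_as": my_as,
        "hold_time": hold_time,
        "router_id": ".".join(str(octet) for octet in router_id),
    }


# Hand-built sample messages; the hold time of 90 is an arbitrary example value.
keepalive = MARKER + struct.pack("!HB", 19, 4)
open_body = struct.pack("!BHH4sB", 4, 64512, 90, bytes([192, 168, 126, 3]), 0)
open_msg = MARKER + struct.pack("!HB", 29, 1) + open_body
```

A KEEPALIVE is just the bare header, which is why it always shows up as a 19-byte message in a capture.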

If BFD is enabled for fast failover, the communication happens over UDP port 3784. The packets are BFD control messages sent at the interval configured in the BFDProfile object.

[Figure: BFD]
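
For reading such a capture, the mandatory 24-byte section of a BFD control packet can be decoded as in the following sketch, which follows the layout in RFC 5880. The function name parse_bfd_control is hypothetical and the sample packet is hand-built with example values, not taken from this lab.

```python
import struct

BFD_STATES = {0: "AdminDown", 1: "Down", 2: "Init", 3: "Up"}


def parse_bfd_control(data: bytes):
    """Decode the mandatory section of a BFD control packet (RFC 5880)."""
    if len(data) < 24:
        raise ValueError("a BFD control packet is at least 24 bytes")
    (vers_diag, state_flags, detect_mult, length,
     my_disc, your_disc, min_tx, min_rx, min_echo_rx) = struct.unpack(
        "!BBBBIIIII", data[:24]
    )
    return {
        "version": vers_diag >> 5,            # top 3 bits
        "diag": vers_diag & 0x1F,             # low 5 bits
        "state": BFD_STATES[state_flags >> 6],  # top 2 bits
        "detect_mult": detect_mult,
        "length": length,
        "my_discriminator": my_disc,
        "your_discriminator": your_disc,
        # Intervals are carried on the wire in microseconds.
        "desired_min_tx_us": min_tx,
        "required_min_rx_us": min_rx,
        "required_min_echo_rx_us": min_echo_rx,
    }


# Example packet: version 1, state Up, multiplier 3, 300000 us intervals.
sample = struct.pack("!BBBBIIIII", 1 << 5, 3 << 6, 3, 24, 1, 2, 300000, 300000, 0)
fields = parse_bfd_control(sample)
```

The state field is the one to watch during failover tests: a session that is healthy reports Up in both directions.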


Did you like it? Feel free to send me an email with your feedback to [email protected]. You can also reach me on X. Thanks!