
In this post I will show and explain how to configure secondary networks in OVN-Kubernetes connected to the physical underlay of the cluster. The physical underlay will be configured using Kubernetes-NMState.

This blog post assumes you have the following software installed in your Kubernetes cluster: KubeVirt, multus-cni, OVN-Kubernetes, and Kubernetes-NMState.

KubeVirt will allow you to run virtualized workloads alongside pods in the same cluster, multus-cni will allow your workloads to have multiple network interfaces, OVN-Kubernetes will be the CNI implementing the network fabric for both the cluster default network and the secondary networks, and Kubernetes-NMState will be responsible for configuring the physical networks on the Kubernetes nodes.

Using a dedicated interface for the secondary networks

This option is available if you have more than one interface available in your nodes.

Let’s assume there is an extra interface available on each of the Kubernetes nodes, and the network admin wants to use a single OVS bridge to connect multiple physical networks - one on VLAN 10, another on VLAN 20. That scenario can be achieved by provisioning the following NNCP.

apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: ovs-br1-multiple-networks
spec:
  desiredState:
    interfaces:
      - name: ovsbr1
        description: ovs bridge with eth1 as a port allowing multiple VLANs through
        type: ovs-bridge
        state: up
        bridge:
          options:
            stp: true
          port:
            - name: eth1
              vlan:
                mode: trunk
                trunk-tags:
                  - id: 10
                  - id: 20
    ovs-db:
      external_ids:
        ovn-bridge-mappings: "physnet:breth0,tenantblue:ovsbr1,tenantred:ovsbr1"

Using a NodeNetworkConfigurationPolicy - a Kubernetes-NMState CRD - we are requesting an OVS bridge named ovsbr1 with the Spanning Tree Protocol (STP) enabled.

The interface eth1 is configured as a port of the aforementioned OVS bridge, and VLANs 10 and 20 are allowed on that port. Also, VLAN 0 is applied to untagged traffic, as per the Open vSwitch documentation.

Finally, the OVN bridge mapping API is configured to specify which OVS bridge is used for each physical network. ovn-controller then creates a patch port between the OVN integration bridge - br-int - and the OVS bridge you specify, and traffic will be forwarded to/from it with the help of a localnet port.

Referring to the NNCP shown above, we are configuring three physical networks:

  • physnet: cluster’s default north/south network. Attached to the cluster’s external bridge (breth0 in the NNCP above).
  • tenantblue: a secondary network whose configuration we will show below. Attached to ovsbr1.
  • tenantred: a secondary network whose configuration we will show below. Attached to ovsbr1.

After provisioning it, the network admin should ensure the policy was applied:

kubectl get nncp ovs-br1-multiple-networks
NAME                        STATUS      REASON
ovs-br1-multiple-networks   Available   SuccessfullyConfigured
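
You can also double check the result directly on one of the nodes - for instance over ssh, or from the Open vSwitch / ovnkube-node pod running on that node (how you reach it depends on your environment):

# the trunk configuration applied to the OVS bridge port
ovs-vsctl get Port eth1 vlan_mode trunks

# the bridge mappings ovn-controller will use on this node
ovs-vsctl get Open_vSwitch . external_ids:ovn-bridge-mappings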

Configuring the Gateway

I am assuming you’re using a virtualized infrastructure - i.e. your Kubernetes nodes are VMs running on a bare-metal machine. In my environment, those look like:

virsh list 
 Id   Name                      State
-----------------------------------------
 24   multi-homing-ctlplane-0   running
 25   multi-homing-ctlplane-1   running
 26   multi-homing-ctlplane-2   running
 27   multi-homing-worker-0     running
 28   multi-homing-worker-1     running

Quite probably your VMs will be interconnected via a Linux (or OVS) bridge. In my environment, the bridges look like:

ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 34:48:ed:f3:f6:b8 brd ff:ff:ff:ff:ff:ff
    altname enp24s0f0
    inet 10.46.41.66/24 brd 10.46.41.255 scope global dynamic noprefixroute eno1
       valid_lft 40430sec preferred_lft 40430sec
    inet6 2620:52:0:2e29:3648:edff:fef3:f6b8/64 scope global dynamic noprefixroute 
       valid_lft 2591976sec preferred_lft 604776sec
    inet6 fe80::3648:edff:fef3:f6b8/64 scope link noprefixroute 
       valid_lft forever preferred_lft forever
6: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:9d:c5:32 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
       valid_lft forever preferred_lft forever
65: secondary: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:9b:21:f1 brd ff:ff:ff:ff:ff:ff
    inet 192.168.123.1/24 brd 192.168.123.255 scope global secondary
       valid_lft forever preferred_lft forever

The secondary Linux bridge shown above (ifindex 65) will act as our gateway; we will need to create a VLAN interface on top of it for each of the tags we have configured in our NNCP:

ip link add link secondary name secondary.10 type vlan id 10
ip link add link secondary name secondary.20 type vlan id 20

ip addr add 192.168.180.1/24 dev secondary.10
ip addr add 192.168.185.1/24 dev secondary.20

ip link set secondary.10 up
ip link set secondary.20 up
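
These ip commands do not survive a reboot; if the bare-metal host is managed by NetworkManager, a roughly equivalent persistent configuration could look like this (a sketch - the connection names and addresses simply mirror the commands above):

nmcli connection add type vlan con-name secondary.10 ifname secondary.10 dev secondary id 10 ipv4.method manual ipv4.addresses 192.168.180.1/24
nmcli connection add type vlan con-name secondary.20 ifname secondary.20 dev secondary id 20 ipv4.method manual ipv4.addresses 192.168.185.1/24
nmcli connection up secondary.10
nmcli connection up secondary.20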

Defining the OVN-Kubernetes networks

Now that the physical underlay on the Kubernetes nodes is properly configured, we can define the OVN-Kubernetes secondary networks. For that, we will use the following NetworkAttachmentDefinitions - a Multus-CNI CRD.

---
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: tenantred
spec:
  config: |2
    {
            "cniVersion": "0.3.1",
            "name": "tenantred",
            "type": "ovn-k8s-cni-overlay",
            "topology": "localnet",
            "subnets": "192.168.180.0/24",
            "excludeSubnets": "192.168.180.1/32",
            "vlanID": 10,
            "netAttachDefName": "default/tenantred"
    }
---
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: tenantblue
spec:
  config: |2
    {
            "cniVersion": "0.3.1",
            "name": "tenantblue",
            "type": "ovn-k8s-cni-overlay",
            "topology": "localnet",
            "subnets": "192.168.185.0/24",
            "excludeSubnets": "192.168.185.1/32",
            "vlanID": 20,
            "netAttachDefName": "default/tenantblue"
    }

As you can see above, we are defining two networks whose names match the OVN bridge mappings of the NNCP defined in the previous section: tenantblue and tenantred.

Also notice how each of those networks uses a different VLAN, and that both IDs - 10 and 20 - are listed in the trunk configuration of the OVS bridge port.

It is also important to point out that each of those networks uses a different subnet, and that we had to exclude the gateway IP from the range - otherwise, OVN-Kubernetes could assign that IP address to one of the workloads.
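
Once both definitions are provisioned - e.g. with kubectl apply - you can quickly confirm they exist:

kubectl get net-attach-def tenantred tenantblue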

Defining the Virtual Machines

We will now define four Virtual Machines, all of them with a single network interface. Two of those VMs will be connected to the tenantblue network, while the other two will be connected to the tenantred network.

One VM connected to each of the aforementioned networks will be scheduled on each of the worker nodes - i.e. one tenantblue-connected VM will be scheduled on multi-homing-worker-0, and the other on multi-homing-worker-1. The same is true for the tenantred-connected VMs. A nodeSelector is used to implement this.

---
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: vm-red-1
spec:
  running: true
  template:
    spec:
      nodeSelector:
        kubernetes.io/hostname: multi-homing-worker-0
      domain:
        devices:
          disks:
            - name: containerdisk
              disk:
                bus: virtio
            - name: cloudinitdisk
              disk:
                bus: virtio
          interfaces:
          - name: physnet-red
            bridge: {}
        machine:
          type: ""
        resources:
          requests:
            memory: 1024M
      networks:
      - name: physnet-red
        multus:
          networkName: tenantred
      terminationGracePeriodSeconds: 0
      volumes:
        - name: containerdisk
          containerDisk:
            image: quay.io/kubevirt/fedora-with-test-tooling-container-disk:devel
        - name: cloudinitdisk
          cloudInitNoCloud:
            networkData: |
              version: 2
              ethernets:
                eth0:
                  dhcp4: true
            userData: |-
              #cloud-config
              password: fedora
              chpasswd: { expire: False }
---
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: vm-red-2
spec:
  running: true
  template:
    spec:
      nodeSelector:
        kubernetes.io/hostname: multi-homing-worker-1
      domain:
        devices:
          disks:
            - name: containerdisk
              disk:
                bus: virtio
            - name: cloudinitdisk
              disk:
                bus: virtio
          interfaces:
          - name: physnet-red
            bridge: {}
        machine:
          type: ""
        resources:
          requests:
            memory: 1024M
      networks:
      - name: physnet-red
        multus:
          networkName: tenantred
      terminationGracePeriodSeconds: 0
      volumes:
        - name: containerdisk
          containerDisk:
            image: quay.io/kubevirt/fedora-with-test-tooling-container-disk:devel
        - name: cloudinitdisk
          cloudInitNoCloud:
            networkData: |
              version: 2
              ethernets:
                eth0:
                  dhcp4: true
            userData: |-
              #cloud-config
              password: fedora
              chpasswd: { expire: False }
---
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: vm-blue-1
spec:
  running: true
  template:
    spec:
      nodeSelector:
        kubernetes.io/hostname: multi-homing-worker-0
      domain:
        devices:
          disks:
            - name: containerdisk
              disk:
                bus: virtio
            - name: cloudinitdisk
              disk:
                bus: virtio
          interfaces:
          - name: physnet-blue
            bridge: {}
        machine:
          type: ""
        resources:
          requests:
            memory: 1024M
      networks:
      - name: physnet-blue
        multus:
          networkName: tenantblue
      terminationGracePeriodSeconds: 0
      volumes:
        - name: containerdisk
          containerDisk:
            image: quay.io/kubevirt/fedora-with-test-tooling-container-disk:devel
        - name: cloudinitdisk
          cloudInitNoCloud:
            networkData: |
              version: 2
              ethernets:
                eth0:
                  dhcp4: true
            userData: |-
              #cloud-config
              password: fedora
              chpasswd: { expire: False }
---
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: vm-blue-2
spec:
  running: true
  template:
    spec:
      nodeSelector:
        kubernetes.io/hostname: multi-homing-worker-1
      domain:
        devices:
          disks:
            - name: containerdisk
              disk:
                bus: virtio
            - name: cloudinitdisk
              disk:
                bus: virtio
          interfaces:
          - name: physnet-blue
            bridge: {}
        machine:
          type: ""
        resources:
          requests:
            memory: 1024M
      networks:
      - name: physnet-blue
        multus:
          networkName: tenantblue
      terminationGracePeriodSeconds: 0
      volumes:
        - name: containerdisk
          containerDisk:
            image: quay.io/kubevirt/fedora-with-test-tooling-container-disk:devel
        - name: cloudinitdisk
          cloudInitNoCloud:
            networkData: |
              version: 2
              ethernets:
                eth0:
                  dhcp4: true
            userData: |-
              #cloud-config
              password: fedora
              chpasswd: { expire: False }

Now you can ping between the VMs on each network, and also access services on the underlay, as long as they are reachable via the gateway IP.
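
To verify connectivity, look up the address OVN-Kubernetes assigned to one VM and ping it - or the gateway - from the other. A quick sketch, assuming the single secondary interface is the first entry in the VMI status:

# address assigned to vm-red-2 on the tenantred network
kubectl get vmi vm-red-2 -o jsonpath='{.status.interfaces[0].ipAddress}'

# log into vm-red-1 (user/password: fedora/fedora) ...
virtctl console vm-red-1

# ... and, from inside the guest, ping the gateway on the bare-metal host
ping -c 3 192.168.180.1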

Sharing a single physical interface

Another option - pretty much the only one available if your Kubernetes nodes only feature a single physical interface - is to share that interface between the cluster default network and the secondary networks.

For this, you should provision the following NNCP:

apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: br-ex-multi-net
spec:
  desiredState:
    ovs-db:
      external_ids:
        ovn-bridge-mappings: "physnet:br-ex,externalvlan:br-ex,novlan:br-ex"

One thing to take into account is that some platforms - e.g. OpenShift - are really conservative with regard to the interface managed by OVN-Kubernetes. You should not - under any circumstance - try to reconfigure that interface.

Luckily, in Open vSwitch, a port without an explicit list of trunks allows all VLANs through; thus, we can get away with simply specifying which physical networks can tap into the underlay.
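
You can confirm this on a node by checking that the uplink port of br-ex has no trunks configured (a sketch - eth0 is just a placeholder for whatever interface br-ex encloses in your environment):

# list the ports attached to the OVN-Kubernetes external bridge
ovs-vsctl list-ports br-ex

# an empty trunks column ([]) means all VLAN tags are allowed through this port
ovs-vsctl get Port eth0 trunks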

In the NNCP above, we are configuring two additional physical networks - externalvlan and novlan - on top of the cluster default network (physnet), but you can add more. As their names suggest, one of them will encapsulate its traffic in a VLAN, while the other will not.

Defining the Gateway with a shared interface

The non-VLAN encapsulated traffic will traverse the trunk with VLAN tag 0, and will leave the OVS bridge without a tag, but you will need to create a VLAN interface for the externalvlan network - let’s say it uses VLAN ID 80. Let’s also assign it a subnet - 192.168.200.0/24, for instance.

ip link add link secondary name secondary.80 type vlan id 80

ip addr add 192.168.200.1/24 dev secondary.80

ip link set secondary.80 up

Finally, the Linux bridge on my bare-metal host connected to the extra interface of the Kubernetes nodes looks like:

ip a
...
65: secondary: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:9b:21:f1 brd ff:ff:ff:ff:ff:ff
    inet 192.168.123.1/24 brd 192.168.123.255 scope global secondary
       valid_lft forever preferred_lft forever
...

We need to keep its subnet in mind for the next step.

Defining the OVN-Kubernetes networks for shared bridge

These are the network definitions for the aforementioned networks:

---
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: underlay
spec:
  config: |2
    {
            "cniVersion": "0.3.1",
            "name": "externalvlan",
            "type": "ovn-k8s-cni-overlay",
            "topology":"localnet",
            "vlanID": 80,
            "subnets": "192.168.200.0/24",
            "excludeSubnets": "192.168.200.1/32",
            "netAttachDefName": "default/underlay"
    }
---
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: novlan
spec:
  config: |2
    {
            "cniVersion": "0.4.0",
            "name": "novlan",
            "type": "ovn-k8s-cni-overlay",
            "topology":"localnet",
            "subnets": "192.168.122.0/24",
            "excludeSubnets": "192.168.122.1/32",
            "netAttachDefName": "default/novlan"
    }

Defining the Virtual Machines

We will now define four Virtual Machines, all of them with a single network interface. Two of those VMs will be connected to the novlan network, whose traffic is not VLAN encapsulated, while the other two will be connected to the externalvlan network, which encapsulates traffic with the VLAN tag 80.

One VM connected to each of the aforementioned networks will be scheduled on each of the worker nodes - i.e. one novlan-connected VM will be scheduled on multi-homing-worker-0, and the other on multi-homing-worker-1. The same is true for the externalvlan-connected VMs. A nodeSelector is used to implement this.

---
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: vm-sharing-br-no-vlan-1
spec:
  running: true
  template:
    spec:
      nodeSelector:
        kubernetes.io/hostname: multi-homing-worker-0
      domain:
        devices:
          disks:
            - name: containerdisk
              disk:
                bus: virtio
            - name: cloudinitdisk
              disk:
                bus: virtio
          interfaces:
          - name: no-vlan
            bridge: {}
        machine:
          type: ""
        resources:
          requests:
            memory: 1024M
      networks:
      - name: no-vlan
        multus:
          networkName: novlan
      terminationGracePeriodSeconds: 0
      volumes:
        - name: containerdisk
          containerDisk:
            image: quay.io/kubevirt/fedora-with-test-tooling-container-disk:devel
        - name: cloudinitdisk
          cloudInitNoCloud:
            networkData: |
              version: 2
              ethernets:
                eth0:
                  dhcp4: true
            userData: |-
              #cloud-config
              password: fedora
              chpasswd: { expire: False }
---
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: vm-sharing-br-no-vlan-2
spec:
  running: true
  template:
    spec:
      nodeSelector:
        kubernetes.io/hostname: multi-homing-worker-1
      domain:
        devices:
          disks:
            - name: containerdisk
              disk:
                bus: virtio
            - name: cloudinitdisk
              disk:
                bus: virtio
          interfaces:
          - name: no-vlan
            bridge: {}
        machine:
          type: ""
        resources:
          requests:
            memory: 1024M
      networks:
      - name: no-vlan
        multus:
          networkName: novlan
      terminationGracePeriodSeconds: 0
      volumes:
        - name: containerdisk
          containerDisk:
            image: quay.io/kubevirt/fedora-with-test-tooling-container-disk:devel
        - name: cloudinitdisk
          cloudInitNoCloud:
            networkData: |
              version: 2
              ethernets:
                eth0:
                  dhcp4: true
            userData: |-
              #cloud-config
              password: fedora
              chpasswd: { expire: False }
---
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: vm-sharing-br-with-vlan-1
spec:
  running: true
  template:
    spec:
      nodeSelector:
        kubernetes.io/hostname: multi-homing-worker-0
      domain:
        devices:
          disks:
            - name: containerdisk
              disk:
                bus: virtio
            - name: cloudinitdisk
              disk:
                bus: virtio
          interfaces:
          - name: external-vlan
            bridge: {}
        machine:
          type: ""
        resources:
          requests:
            memory: 1024M
      networks:
      - name: external-vlan
        multus:
          networkName: underlay
      terminationGracePeriodSeconds: 0
      volumes:
        - name: containerdisk
          containerDisk:
            image: quay.io/kubevirt/fedora-with-test-tooling-container-disk:devel
        - name: cloudinitdisk
          cloudInitNoCloud:
            networkData: |
              version: 2
              ethernets:
                eth0:
                  dhcp4: true
            userData: |-
              #cloud-config
              password: fedora
              chpasswd: { expire: False }
---
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: vm-sharing-br-with-vlan-2
spec:
  running: true
  template:
    spec:
      nodeSelector:
        kubernetes.io/hostname: multi-homing-worker-1
      domain:
        devices:
          disks:
            - name: containerdisk
              disk:
                bus: virtio
            - name: cloudinitdisk
              disk:
                bus: virtio
          interfaces:
          - name: external-vlan
            bridge: {}
        machine:
          type: ""
        resources:
          requests:
            memory: 1024M
      networks:
      - name: external-vlan
        multus:
          networkName: underlay
      terminationGracePeriodSeconds: 0
      volumes:
        - name: containerdisk
          containerDisk:
            image: quay.io/kubevirt/fedora-with-test-tooling-container-disk:devel
        - name: cloudinitdisk
          cloudInitNoCloud:
            networkData: |
              version: 2
              ethernets:
                eth0:
                  dhcp4: true
            userData: |-
              #cloud-config
              password: fedora
              chpasswd: { expire: False }

Now you can ping between the VMs on each network, and also access services on the underlay, as long as they are reachable via the gateway IP.
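
As in the previous example, a quick way to validate the setup is to log into the VMs and ping the gateway addresses configured on the bare-metal host (a sketch using the VM names from the manifests above):

# VLAN 80 network: log into the VM, then ping the gateway we configured earlier
virtctl console vm-sharing-br-with-vlan-1
ping -c 3 192.168.200.1      # run inside the guest

# untagged network: its gateway sits on the 192.168.122.0/24 subnet
virtctl console vm-sharing-br-no-vlan-1
ping -c 3 192.168.122.1      # run inside the guest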

The VLANs and subnets must match what was configured in the previous sections:

Network Name   VLAN           Subnet
externalvlan   80             192.168.200.0/24
novlan         0 (untagged)   192.168.122.0/24

Also remember the gateway IP for those subnets must be excluded.

Conclusion

In this blog post we’ve seen how to configure secondary networks using an SDN network fabric (implemented by OVN-Kubernetes) connected to the physical underlay of the Kubernetes cluster. To simplify the underlay configuration, we have used Kubernetes-NMState, which provides a policy layer to apply a desired configuration across the nodes that compose your cluster.

Two comprehensive examples were shown: one for clusters with a spare network interface on each node, and another for when the nodes feature a single network interface that must be shared between the cluster default network and the secondary networks.
