Manual Network Setup for LXD Clustering


#1

Introduction

When using LXD as a Juju cloud, the shortest path to setting up an appropriate network bridge for machines in a LXD cluster is to use the MAAS GUI, as described in the guide here.

Obviously not everyone will be using MAAS, and some network setups will require finer-grained control. This post explores some of the options for manually configuring the network of LXD cluster hosts.

MAC VLAN

This looks like an attractive option initially: there is no need even to create a bridge; just tell LXD to use an existing physical interface during “lxd init”. MAC VLANs can also offer better performance than bridges, which broadcast traffic.

Unfortunately, the main shortcoming of MAC VLANs is that they cannot do “hairpinning”, where a guest communicates with its own host over the shared interface. This makes them unsuitable for use with Juju: the controller, running in a container, needs access to LXD on the host where it resides, which it cannot reach over a MAC VLAN.

Manual Bridging

Bridging remains the recommended way to set up networking for Juju LXD clouds. Several scenarios for this follow. I still used MAAS nodes as hosts when testing these, but I used the MAAS GUI only to set up the physical devices.

Auto-Assigned or Static IP, and Netplan

For a Bionic host, the contents of /etc/netplan/50-cloud-init.yaml should look the same whether the IP is static or the device is configured to receive an automatically assigned address:

# This file is generated from information provided by
# the datasource.  Changes to it will not persist across an instance.
# To disable cloud-init's network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
network:
    version: 2
    ethernets:
        enp0s25:
            addresses:
            - 10.0.0.9/24
            gateway4: 10.0.0.1
            match:
                macaddress: f4:4d:30:65:2c:b7
            mtu: 1500
            nameservers:
                addresses:
                - 10.0.0.1
                search:
                - maas
            set-name: enp0s25

First I did what the comment suggests and added a new configuration file to disable cloud-init’s network configuration. The contents of the netplan file were then changed to look like this:

network:
    version: 2
    ethernets:
        enp0s25:
            match:
                macaddress: f4:4d:30:65:2c:b7
            mtu: 1500
            set-name: enp0s25
    bridges:
        br0:
            interfaces: [enp0s25]
            nameservers:
                addresses:
                - 10.0.0.1
                search:
                - maas
            addresses:
            - 10.0.0.9/24
            gateway4: 10.0.0.1
            parameters:
                forward-delay: 0
                stp: false

This essentially creates a new bridge with our physical device added to it, and moves all nameserver, address and gateway configuration to the bridge.

Following this, simply run “sudo netplan apply”, then nominate “br0” as the existing bridge when running “lxd init”. You should then be able to run “juju add-cloud” using this host as part of an LXD cloud definition and bootstrap to it. Adding machines works fine, and the container addresses appear as expected under MAAS’ view of the subnet where they reside.
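The whole sequence for this scenario can be sketched as follows (a sketch only, assuming the device and bridge names from the configuration above):

```shell
# Stop cloud-init from regenerating the netplan config on reboot,
# as the comment in 50-cloud-init.yaml suggests.
echo 'network: {config: disabled}' | \
    sudo tee /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg

# Edit /etc/netplan/50-cloud-init.yaml as shown above, then apply it.
sudo netplan apply

# Confirm the bridge is up and now holds the address.
ip -4 addr show br0

# Initialise LXD, answering "yes" to using an existing bridge
# and giving "br0" as its name.
sudo lxd init
```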

Auto-Assigned or Static IP, and bridge-utils

On earlier versions of Ubuntu, run “sudo apt install bridge-utils”, then get the latest LXD version by replacing the deb package with the snap: “sudo apt purge lxd” followed by “sudo snap install lxd”.
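In command form, that preparation looks like this:

```shell
# Install the bridge management tools (provides brctl).
sudo apt install bridge-utils

# Replace the deb-packaged LXD with the latest snap version.
sudo apt purge lxd
sudo snap install lxd
```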

Bridging requires editing either /etc/network/interfaces or one of the files under /etc/network/interfaces.d. In the case of a Xenial cloud image, there is a file /etc/network/interfaces.d/50-cloud-init.cfg that looks like this:

# This file is generated from information provided by
# the datasource.  Changes to it will not persist across an instance.
# To disable cloud-init's network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
auto lo
iface lo inet loopback
    dns-nameservers 10.0.0.1
    dns-search maas

auto enp0s25
iface enp0s25 inet static
    address 10.0.0.9/24
    gateway 10.0.0.1
    mtu 1500

To bridge the enp0s25 device, the content becomes:

auto lo
iface lo inet loopback
    dns-nameservers 10.0.0.1
    dns-search maas

auto enp0s25
iface enp0s25 inet manual
    mtu 1500

auto br0
iface br0 inet static
    address 10.0.0.9/24
    gateway 10.0.0.1
    bridge_fd 15
    bridge_ports enp0s25
    bridge_stp off
    hwaddress f4:4d:30:65:2c:b7
    mtu 1500

Restart the machine, set up LXD and the new cloud as in the first example, and bootstrap away.
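After the restart, the bridge can be verified before initialising LXD (a sketch; “brctl” comes from the bridge-utils package installed earlier):

```shell
sudo reboot

# ...once the machine comes back up:
brctl show br0          # enp0s25 should be listed as a bridge port
ip -4 addr show br0     # the static address should now sit on the bridge

# Then initialise LXD against the existing bridge, as before.
sudo lxd init
```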

DHCP and bridge-utils

For Xenial and earlier Ubuntu cloud images, /etc/network/interfaces.d/50-cloud-init.cfg looks like this:

# This file is generated from information provided by
# the datasource.  Changes to it will not persist across an instance.
# To disable cloud-init's network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
auto lo
iface lo inet loopback
    dns-nameservers 10.0.0.1
    dns-search maas

auto enp0s25
iface enp0s25 inet dhcp
    mtu 1500

Upgrade LXD, install bridge-utils, and change the contents to look like this:

auto lo
iface lo inet loopback
    dns-nameservers 10.0.0.1
    dns-search maas

auto enp0s25
iface enp0s25 inet manual
    mtu 1500

auto br0
iface br0 inet dhcp
    bridge_fd 15
    bridge_ports enp0s25
    bridge_stp off
    hwaddress f4:4d:30:65:2c:b7
    mtu 1500

Restart, initialise LXD, add a new cloud, and bootstrap.

DHCP and Netplan

Using DHCP with Bionic on MAAS presented some difficulties. When the network device was configured for DHCP before deployment, the machine appeared to be assigned an address that MAAS was unaware of, so it could not be accessed.

Ultimately, the machine was provisioned with a static address, /etc/netplan/50-cloud-init.yaml was changed, and the machine was then accessed physically to check its newly assigned address. This is the Netplan configuration that worked:

network:
    version: 2
    ethernets:
        enp0s25:
            match:
                macaddress: f4:4d:30:65:2c:b7
            mtu: 1500
            set-name: enp0s25
    bridges:
        br0:
            interfaces: [enp0s25]
            dhcp4: true
            parameters:
                forward-delay: 15
                stp: false

#2

I’ve never been able to get a DHCP-enabled bridge configured on a Bionic cloud instance.


#3

Indeed, we are finding the same issue. This is under investigation.


#4

I’ve been playing with this for a few weeks now, testing several different deployment scenarios. It’s a really great vendor-neutral option, and I’ve seen great success with it. I have a question regarding deployment:

In a ‘remote’ deployment, which implies that the operator’s machine is outside the cloud, where is the controller expected to deploy? I haven’t managed to get this to work unless the machine I’m running “juju bootstrap” from is a member of the LXD cluster, and thus not remote.


#5

In Juju 2.5 you should be able to bootstrap to a remote LXD cluster from any machine that can reach the LXD API server. When you run “add-cloud” you have the option of LXD, and it asks you for the URL of the LXD host.

Enter the API endpoint url for the remote LXD server:

From there the Juju client will talk to the API of the Juju controller once it’s up. The key point is that the LXD cluster needs to be reachable from the operator’s machine so that Juju can make API calls.
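For a non-interactive version of this, the cloud definition can also be supplied from a file (a sketch; the cloud name and endpoint address are placeholders):

```shell
# Define a remote LXD cloud by its API endpoint.
cat > lxd-cloud.yaml <<'EOF'
clouds:
  my-lxd-cluster:
    type: lxd
    endpoint: https://10.0.0.9:8443
EOF

juju add-cloud my-lxd-cluster lxd-cloud.yaml
juju bootstrap my-lxd-cluster
```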


#6

To elaborate…

The controller goes into a container on the cloud, much as on any other substrate, which is why we must use bridges instead of MAC VLANs, as mentioned in the original post: that controller needs to access the LXD API, which for at least one of the cluster nodes will be served by the machine hosting it.


#7

I’m exclusively using bridges; however, these are defined on the LXD nodes as br0, and when I attempt to bootstrap the LXD cluster, it complains that lxdbr0 doesn’t exist. If I add that bridge, it then tries to address the node with the (host-only) lxdbr0 address.


#8

Can you create a container on the cluster independently of Juju?

The cluster needs a homogeneous environment, so the same bridge must exist on all of the cluster nodes, and the default LXD profile needs a NIC device with that bridge as its parent.
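Assuming the bridge is named br0 on every node, the profile can be inspected and fixed with something like:

```shell
# Inspect the default profile; it should contain a bridged NIC device.
lxc profile show default

# If the NIC device is missing or points at the wrong bridge, add one.
lxc profile device add default eth0 nic nictype=bridged parent=br0

# Sanity check: launch a container independently of Juju.
lxc launch ubuntu:18.04 test-net
lxc list test-net
```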


#9

Ah OK, that makes sense. It’s possible that the cluster isn’t working correctly. I wasn’t sure whether the controller container used the default profile; I’d figured it was shipping its own, with a hard-coded network device. I’ll give that a check now.