This post details the analysis, design and progress of the ongoing work on Juju networking spaces. Consider it a work in progress; additions and comments are welcome.
The intent is to:
- Materialise value from analysis done so far, by disseminating it to the team.
- Explore deficiencies, potential improvements and design decisions.
- Report development progress.
Juju stores addresses in the following locations:
- The machines collection has fields for addresses, sourced from the provider, and machineaddresses, sourced from the local agents.
- The ip.addresses collection has addresses related to entries in the linklayerdevices collection.
- CAAS addresses reside in the collections cloudservices and cloudcontainers.
- The controllers collection contains two documents with host/port entries for controller connection endpoints:
  - One with all available endpoints, suitable for use by clients.
  - Another with endpoints suitable for use by agents, which may be a proper subset of those for clients if a controller management space has been configured.
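To make the relationship between the two endpoint documents concrete, here is a minimal sketch. The `hostPort` type and the `agentEndpoints` function are hypothetical names invented for illustration, not Juju's actual types, and the sketch assumes each endpoint carries the name of its space.

```go
package main

import "fmt"

// hostPort is a hypothetical, simplified endpoint record (not a Juju type).
type hostPort struct {
	Host  string
	Port  int
	Space string // space name decorating the address; empty if unknown
}

// agentEndpoints returns the endpoints agents should use. With no
// management space configured, agents get the same endpoints as clients.
// Otherwise only endpoints in the management space are returned.
func agentEndpoints(all []hostPort, mgmtSpace string) []hostPort {
	if mgmtSpace == "" {
		return all
	}
	var filtered []hostPort
	for _, hp := range all {
		if hp.Space == mgmtSpace {
			filtered = append(filtered, hp)
		}
	}
	return filtered
}

func main() {
	all := []hostPort{
		{Host: "10.0.0.5", Port: 17070, Space: "mgmt"},
		{Host: "192.168.1.5", Port: 17070, Space: "public"},
	}
	fmt.Println(len(agentEndpoints(all, "")))     // no management space: all endpoints
	fmt.Println(len(agentEndpoints(all, "mgmt"))) // management space: a proper subset
}
```

The sketch deliberately ignores what should happen when the filter matches nothing; that policy is outside what this post describes.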
The machineaddresses field in the machines collection is populated via the machiner worker. It can be configured to clear addresses on start-up, which will cause machineaddresses to be nil. Otherwise addresses are updated from the results of a call to the standard library’s net.InterfaceAddrs function. These are only ever set upon worker start-up.
The addresses field is kept up-to-date via the instancepoller worker. It uses the provider implementation of instance.Addresses to source them. On MAAS these addresses are also decorated with a space name and provider space ID where known.
Whenever machine addresses are updated, the PreferredPublicAddress and/or PreferredPrivateAddress fields may be updated.
Link-Layer Device Addresses
The machiner worker also populates link-layer devices and addresses. Each time it runs, it interrogates all network devices on the machine, gathering detailed information (see params.NetworkConfig). It then calls SetObservedNetworkConfig on the provisioner API, where the provider network config is obtained and merged with the machine configuration before linklayerdevices and ip.addresses are populated.
Addresses in the cloudservices collection are updated by the caasunitprovisioner’s application worker, by asking k8s about the service (application) directly.
Addresses in the cloudcontainers collection are updated by the same worker when there is a cluster change event. The addresses are sourced from each unit’s pod.
API endpoints are set at bootstrap from the initial machine’s provider-sourced instance addresses.
The peergrouper worker maintains these entries.
Link-Layer Device Addresses
These addresses are used by the network/containerizer package to reason about container spaces, host devices and bridges when configuring networking for containers.
The controller API endpoints are watched by machine agents, which maintain a local configuration file with endpoints that can be used to communicate with controllers.
Identification of Spaces and Subnets
Spaces are identified by name and subnets by CIDR. This leads to:
- Issues associated with renaming a space.
- The inability to work with subnets in different networks that have the same CIDR.
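The second problem can be illustrated with a small sketch. The `subnet` type and its fields are hypothetical simplifications, not Juju's state types, and the IDs and network names are made up. Keying by CIDR makes two subnets from different networks indistinguishable, while keying by a unique ID keeps both.

```go
package main

import "fmt"

// subnet is a hypothetical, simplified record (not Juju's state type).
type subnet struct {
	ID        string // unique identifier
	CIDR      string
	NetworkID string // illustrative provider network identifier
}

// indexByCIDR keys subnets by CIDR. A later subnet silently replaces an
// earlier one that has the same CIDR.
func indexByCIDR(subnets []subnet) map[string]subnet {
	idx := make(map[string]subnet)
	for _, s := range subnets {
		idx[s.CIDR] = s
	}
	return idx
}

// indexByID keys subnets by unique ID, so equal CIDRs can coexist.
func indexByID(subnets []subnet) map[string]subnet {
	idx := make(map[string]subnet)
	for _, s := range subnets {
		idx[s.ID] = s
	}
	return idx
}

func main() {
	subnets := []subnet{
		{ID: "0", CIDR: "10.0.0.0/24", NetworkID: "net-a"},
		{ID: "1", CIDR: "10.0.0.0/24", NetworkID: "net-b"},
	}
	fmt.Println(len(indexByCIDR(subnets))) // 1: one subnet was lost
	fmt.Println(len(indexByID(subnets)))   // 2: both are retained
}
```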
Identification of spaces and subnets by unique IDs is the first task to be undertaken as part of the remodelling work.
Address filtering by space will then be changed to work via space IDs rather than names.
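A minimal sketch of filtering by space ID follows. The `space` and `address` types and the `inSpace` function are hypothetical and exist only to show the idea: because addresses reference the space's ID, renaming the space does not disturb the filter.

```go
package main

import "fmt"

// space and address are hypothetical, simplified types for illustration.
type space struct {
	ID   string
	Name string
}

type address struct {
	Value   string
	SpaceID string // reference to the space by ID, not by name
}

// inSpace returns the addresses whose SpaceID matches the given ID.
func inSpace(addrs []address, spaceID string) []address {
	var out []address
	for _, a := range addrs {
		if a.SpaceID == spaceID {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	sp := space{ID: "1", Name: "dmz"}
	addrs := []address{
		{Value: "10.0.0.5", SpaceID: "1"},
		{Value: "172.16.0.9", SpaceID: "2"},
	}

	// Renaming the space leaves address membership untouched.
	sp.Name = "edge"

	for _, a := range inSpace(addrs, sp.ID) {
		fmt.Println(sp.Name, a.Value)
	}
}
```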
Incomplete Device Address Information
When the provisioner API server receives network configuration gathered by the machiner, it gathers provider configuration. This is collected by the provider as network.InterfaceInfo, converted to params.NetworkConfig, merged with the machine-sourced data, then converted into state types for persistence.
At each of the three conversions some of the fidelity is lost.
In order to reconcile incoming link-layer device addresses with the correct subnets, we need to maintain and transport the provider IDs for subnet (and probably network) so that these can be used to relate addresses to the new subnet IDs.
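The sketch below shows why carrying the provider subnet ID through each conversion matters. All type and function names here are hypothetical, not Juju's. When the provider ID survives, the address resolves to exactly one subnet; when only the CIDR survives, resolution works only if that CIDR is unambiguous.

```go
package main

import "fmt"

// observedAddress is a hypothetical stand-in for an incoming device address.
type observedAddress struct {
	Value            string
	CIDR             string
	ProviderSubnetID string // empty if lost during conversion
}

// knownSubnet is a hypothetical stand-in for a stored subnet.
type knownSubnet struct {
	ID         string
	CIDR       string
	ProviderID string
}

// subnetIDFor resolves an address to a subnet ID. It prefers a provider ID
// match and falls back to the CIDR only when exactly one subnet has it.
func subnetIDFor(addr observedAddress, subnets []knownSubnet) (string, bool) {
	if addr.ProviderSubnetID != "" {
		for _, s := range subnets {
			if s.ProviderID == addr.ProviderSubnetID {
				return s.ID, true
			}
		}
	}
	match, count := "", 0
	for _, s := range subnets {
		if s.CIDR == addr.CIDR {
			match = s.ID
			count++
		}
	}
	if count == 1 {
		return match, true
	}
	return "", false
}

func main() {
	subnets := []knownSubnet{
		{ID: "0", CIDR: "10.0.0.0/24", ProviderID: "subnet-a"},
		{ID: "1", CIDR: "10.0.0.0/24", ProviderID: "subnet-b"},
	}
	withID := observedAddress{Value: "10.0.0.5", CIDR: "10.0.0.0/24", ProviderSubnetID: "subnet-b"}
	withoutID := observedAddress{Value: "10.0.0.5", CIDR: "10.0.0.0/24"}

	fmt.Println(subnetIDFor(withID, subnets))    // resolves via the provider ID
	fmt.Println(subnetIDFor(withoutID, subnets)) // ambiguous CIDR: not resolved
}
```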
Network spaces are supported by the MAAS and AWS providers.
Only MAAS currently supports controller configuration for juju-ha-space (used for Mongo replica-set communication in HA) and juju-mgmt-space (the management plane on which agents connect to controllers). This is because it is the only provider that decorates addresses in the machines collection with a space name.
After remodelling spaces and addresses, the intent is to:
- Make space support available to other providers.
- Detect and decorate provider-sourced machine addresses with known space IDs, so that the controller configuration options for spaces become generally available.
The API server logic for adding subnets pre-dates the auto-loading of spaces and subnets from the provider.
When adding a subnet, network info is queried to create a cache of spaces and subnets. Subnet data is gathered from the provider. Space data is gathered from state. The incoming request is compared to the cached data to ensure that the entities referred to exist according to the provider.
- Remove the cache logic. It overcomplicates add-subnet on the assumption that a user adds more than one subnet at a time.
- Replace the add-subnet command with new commands that allow linking and unlinking of subnets to spaces. Part of the add-subnet functionality is currently replaced by reload-spaces.
- Possibly in future work, investigate how we might allow manual addition of subnets for, say, the manual provider, or make auto-loading work there.
Spaces are Identified by ID
Spaces are now stored with a monotonically increasing ID in similar fashion to machines. Migration and upgrade steps are in place to handle this change.
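As an illustration of the ID scheme only, here is a minimal in-memory sketch of a monotonically increasing sequence. The `sequence` type is hypothetical, and both the starting value of 0 and the in-memory counter are assumptions of the sketch; it does not show how Juju persists sequences.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// sequence is a hypothetical in-memory ID generator. Starting at 0 is an
// assumption made for this sketch.
type sequence struct {
	mu   sync.Mutex
	next int
}

// Next returns the next ID as a string. IDs are never reused, so each call
// yields a value greater than every previous one.
func (s *sequence) Next() string {
	s.mu.Lock()
	defer s.mu.Unlock()
	id := s.next
	s.next++
	return strconv.Itoa(id)
}

func main() {
	var spaceIDs sequence
	fmt.Println(spaceIDs.Next()) // 0
	fmt.Println(spaceIDs.Next()) // 1
}
```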
Subnets are Identified by ID
As with spaces, subnets have a numeric ID. Migration and upgrade steps are in place, and the names package no longer validates subnet tags as a CIDR.