No Obvious Space for Container

manadart · 14 September 2018 10:10

More than one CI test fails with:
ERROR ('0/lxd/1', 'no obvious space for container "0/lxd/1", host machine has spaces: "ha-space", "space-0"')

This occurs intermittently, depending on which node is acquired on the finfolk machine.

At the time of writing there are 14 nodes (juju-qa-maas-node-31.maas through juju-qa-maas-node-41.maas). Of these, all have interfaces configured for the 10.0.30.0/24 subnet (space-0), but 3 also have a device in the 192.168.4.0/24 subnet (ha-space). These will be the ones blowing up tests if they happen to be acquired.
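As a rough illustration of how the affected nodes could be picked out, here is a minimal Python sketch. It assumes you already have each node's interface addresses to hand. The subnet-to-space mapping comes from the description above; the node names, addresses and function names are hypothetical.

```python
import ipaddress

# Subnet-to-space mapping as described in the post.
SUBNET_SPACES = {
    ipaddress.ip_network("10.0.30.0/24"): "space-0",
    ipaddress.ip_network("192.168.4.0/24"): "ha-space",
}


def spaces_for(addresses):
    """Return the sorted list of spaces that a node's addresses fall into."""
    found = set()
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        for subnet, space in SUBNET_SPACES.items():
            if ip in subnet:
                found.add(space)
    return sorted(found)


def ambiguous_nodes(inventory):
    """Return the names of nodes whose interfaces span more than one space."""
    return sorted(
        name for name, addrs in inventory.items() if len(spaces_for(addrs)) > 1
    )


# Hypothetical inventory: names and addresses are made up for illustration.
inventory = {
    "node-a": ["10.0.30.11"],
    "node-b": ["10.0.30.12", "192.168.4.12"],
}

print(ambiguous_nodes(inventory))
```

A node reported here has more than one candidate space, so a container placed on it without a binding or constraint would be expected to hit the "no obvious space" error.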

Now back in the day, Nicholas (balloons) informed our team that we could use these machines for testing. I myself created a new subnet and space. I suffixed the space name with “manadart” so it was obviously mine, and I have since deleted it.

I wonder if these issues are side-effects from someone’s space and subnet testing.

In any case, we have 2 obvious options:

1. Write tests to be explicit about the space they bind to when provisioning.
2. Make the finfolk networking homogeneous, with a single space.
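To make option 1 concrete, here is a hedged Python sketch of how a test might build Juju commands that name the space explicitly. The helper names are hypothetical and this is not the actual CI code; it only relies on the `spaces=` constraint and the `--bind` option of `juju deploy`.

```python
def space_constraint(space):
    """Format a Juju constraint string that pins provisioning to one space."""
    return f"spaces={space}"


def add_container_cmd(host_machine, space):
    """Build the argv for adding an LXD container on a host, pinned to a space."""
    return [
        "juju", "add-machine", f"lxd:{host_machine}",
        "--constraints", space_constraint(space),
    ]


def deploy_cmd(charm, space):
    """Build the argv for deploying a charm with all endpoints bound to a space."""
    return [
        "juju", "deploy", charm,
        "--bind", space,
        "--constraints", space_constraint(space),
    ]


# Example: a container on machine 0, restricted to space-0.
print(" ".join(add_container_cmd("0", "space-0")))
```

The resulting lists could be passed to `subprocess.run`; building them in small functions keeps the space name in one place if the environment changes.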

I lean towards option 2, because option 1 couples the CI tests more tightly to the testing environment. Also because the additional spaces look like someone’s testing arrangement that has not been reverted.

simonrichardson · 14 September 2018 13:22

I personally think option 2 sounds like the best path forward, as option 1 ties the CI tests to the finfolk environment, which only leaves us with potential tech debt if anything else changes there.

jameinel · 25 September 2018 09:53

Networking is one of those places where we can’t really help but be tied to the environment we are in. The intent of modeling networks is so that at deploy time you can associate the artifacts in your bundles with the location where they are being deployed (that is, which space you want for this particular traffic).

We could try to keep things as simple as possible, but then you won’t ever be able to test any of the advanced features (when you really do have multiple networks and want applications conversing on different networks).

In talking with you directly, it seems you did go ahead with option (1), to make it possible to add new tests without affecting existing tests.

manadart · 1 October 2018 15:55

When changes to the network-health-maas-2-2 job first landed, the test went green. Since then, all tests targeting finfolk have been failing with:

13:19:54 ERROR juju.cmd.juju.commands bootstrap.go:545 failed to bootstrap model: bootstrap instance started but did not change to Deployed state: instance "node-cee6f43c-71e8-11e5-aa2a-525400c43ce5" is started but not deployed

Watching the MAAS console while tests are running shows that the target node never gets out of the “deploying…” state, but one can provision a node, which comes live in about four minutes.

I will look into this when I can.

manadart · 15 October 2018 07:00

This issue is resolved for:

nw-network-health-maas-2-2, by using a bundle with space bindings.
nw-container-networking-maas-2-2, with a modified test script that accepts a space constraint.
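
For anyone wanting to do something similar, here is a minimal Python sketch of the general shape: a test script that takes an optional space argument and a bundle whose application carries a default binding. This is not the actual change to either job. The `--space` flag, the function names and the placeholder `ubuntu` charm are assumptions for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse a hypothetical --space option for a network test script."""
    parser = argparse.ArgumentParser(description="Hypothetical network test")
    parser.add_argument(
        "--space",
        default=None,
        help="Space to bind and constrain the deployment to.",
    )
    return parser.parse_args(argv)


def build_bundle(space, charm="ubuntu", units=1):
    """Return a bundle-shaped dict; bindings are added only if a space is given."""
    app = {"charm": charm, "num_units": units}
    if space:
        # An empty endpoint name sets the default binding for all endpoints.
        app["bindings"] = {"": space}
        app["constraints"] = f"spaces={space}"
    return {"applications": {charm: app}}


args = parse_args(["--space", "space-0"])
print(build_bundle(args.space))
```

Leaving the space optional means existing jobs that run on single-space nodes need no changes.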