Juju seems to be using the wrong subnet address

conjure-up
help-needed

#1

We recently updated our network infrastructure, so the server that runs Juju is now on a completely new IP subnet. Using MAAS, we can still commission and deploy new machines via a MAAS pod without problems. With conjure-up, however, juju bootstrap fails with ERROR failed to bootstrap model: cannot start bootstrap instance: 44hsbf: Unable to allocate static IP due to address exhaustion:

2019-09-01 21:26:15,525 [DEBUG] conjure-up/charmed-kubernetes - step.py:268 - Executing script: /home/<username>/.cache/conjure-up/charmed-kubernetes/steps/04_enable-cni/before-config
2019-09-01 21:26:15,575 [DEBUG] conjure-up/charmed-kubernetes - telemetry.py:17 - Showing screen: Configure Applications
2019-09-01 21:26:16,466 [DEBUG] conjure-up/charmed-kubernetes - maas.py:20 - Found endpoint: http://10.1.1.2:5240/MAAS/ for cloud: <server>-01
2019-09-01 21:26:16,490 [DEBUG] conjure-up/charmed-kubernetes - events.py:52 - Setting MAASConnected at conjureup/controllers/juju/configapps/gui.py:39
2019-09-01 21:26:17,799 [DEBUG] conjure-up/charmed-kubernetes - telemetry.py:17 - Showing screen: Bootstrapping Controller
2019-09-01 21:26:18,086 [INFO] conjure-up/charmed-kubernetes - common.py:74 - Bootstrapping Juju controller...
2019-09-01 21:26:18,087 [DEBUG] conjure-up/charmed-kubernetes - telemetry.py:32 - Juju Bootstrap: Started
2019-09-01 21:26:18,095 [DEBUG] conjure-up/charmed-kubernetes - juju.py:252 - bootstrap cmd: ['/snap/bin/juju', 'bootstrap', '<server>-01', 'conjure-up-<server>-01-d4d', '--default-model', 'conjure-charmed-kubernet-90f', '--config', 'image-stream=daily', '--credential', 'conjure-<server>-01-a3c']
2019-09-01 21:26:18,114 [DEBUG] conjure-up/charmed-kubernetes - events.py:52 - Awaiting Bootstrapped at conjureup/controllers/juju/bootstrap/gui.py:43
2019-09-01 21:26:46,848 [ERROR] conjure-up/charmed-kubernetes - common.py:60 - Error bootstrapping controller: ['Creating Juju controller "conjure-up-<server>-01-d4d" on <server>-01', 'Looking for packaged Juju agent version 2.6.5 for amd64', 'Launching controller instance(s) on <server>-01...', 'ERROR failed to bootstrap model: cannot start bootstrap instance: 44hsbf: Unable to allocate static IP due to address exhaustion.']
2019-09-01 21:26:46,852 [DEBUG] conjure-up/charmed-kubernetes - events.py:52 - Setting Error at conjureup/events.py:149
2019-09-01 21:26:46,872 [ERROR] conjure-up/charmed-kubernetes - events.py:161 - Unhandled exception in <Task finished coro=<BaseBootstrapController.run() done, defined at /snap/conjure-up/1055/lib/python3.6/site-packages/conjureup/controllers/juju/bootstrap/common.py:15> exception=BootstrapError('Unable to bootstrap (cloud type: maas)',)>
Traceback (most recent call last):
  File "/snap/conjure-up/1055/lib/python3.6/site-packages/conjureup/controllers/juju/bootstrap/common.py", line 21, in run
    await self.do_bootstrap()
  File "/snap/conjure-up/1055/lib/python3.6/site-packages/conjureup/controllers/juju/bootstrap/common.py", line 65, in do_bootstrap
    app.provider.cloud_type))
conjureup.errors.BootstrapError: Unable to bootstrap (cloud type: maas)
2019-09-01 21:26:46,890 [DEBUG] conjure-up/charmed-kubernetes - __init__.py:27 - Showing dialog for exception: Unable to bootstrap (cloud type: maas)
2019-09-01 21:26:57,959 [DEBUG] conjure-up/charmed-kubernetes - events.py:52 - Setting Shutdown at conjureup/events.py:145
2019-09-01 21:26:57,962 [DEBUG] conjure-up/charmed-kubernetes - events.py:52 - Received Shutdown at conjureup/events.py:176
2019-09-01 21:26:57,962 [INFO] conjure-up/charmed-kubernetes - events.py:180 - Shutting down
2019-09-01 21:26:57,962 [INFO] conjure-up/charmed-kubernetes - app_config.py:192 - Storing conjure-up state
2019-09-01 21:26:57,966 [INFO] conjure-up/charmed-kubernetes - app_config.py:207 - State saved
2019-09-01 21:26:57,967 [DEBUG] conjure-up/charmed-kubernetes - events.py:200 - Cancelling pending task: <Task pending coro=<BootstrapController.wait() running at /snap/conjure-up/1055/lib/python3.6/site-packages/conjureup/controllers/juju/bootstrap/gui.py:43> wait_for=<Future pending cb=[<TaskWakeupMethWrapper object at 0x7fb5fac4ae28>()]>>

Is there any way to update the juju bootstrap configuration so that it uses the right subnet?


#2

I’m pretty sure Juju just uses whatever the underlying MAAS gives it. Are you sure the IP of your MAAS controller didn’t change? Also, the error names a specific machine:

44hsbf: Unable to allocate static IP

Can you go to:
http://10.1.1.2:5240/MAAS/#/machine/44hsbf

And make sure that machine’s network interfaces aren’t set to a static address in the wrong subnet/space?

The other place to look would be ~/.local/share/juju/clouds.yaml and possibly credentials.yaml, but I wouldn’t expect either of them to hold configuration about which subnets to use for bootstrap. There might be something if a controller had already been bootstrapped, but since there isn’t a controller running yet, there isn’t much other information Juju could be using to decide what to do.
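In case it helps with finding the machine quickly: the ID in the error can be pulled out and turned into the MAAS UI link mentioned above. This is only a minimal sketch. The function name and the regex are my own, and it assumes the error text and URL pattern look exactly like the ones in this thread.

```python
import re

# Hypothetical helper; the error format and URL pattern are copied from this thread.
ERROR_RE = re.compile(r"cannot start bootstrap instance: (\w+): ")


def maas_machine_url(error_line, endpoint):
    """Return the MAAS UI URL for the machine named in a bootstrap error, or None."""
    match = ERROR_RE.search(error_line)
    if match is None:
        return None
    # Drop any trailing slash from the endpoint before appending the UI path.
    return f"{endpoint.rstrip('/')}/#/machine/{match.group(1)}"


error = ("ERROR failed to bootstrap model: cannot start bootstrap instance: "
         "44hsbf: Unable to allocate static IP due to address exhaustion.")
print(maas_machine_url(error, "http://10.1.1.2:5240/MAAS/"))
```

If the machine has already been released by the time you open the link, the page may not show anything useful, so this mostly helps when you can look right away.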


#3

Hi Jameinel,

Thanks for helping out. I can’t access that machine: the moment conjure-up fails at the Juju bootstrap step, it shows the log and the machine gets released/deleted.

clouds.yaml:

clouds:
  <server>-01:
    type: maas
    auth-types: [oauth1]
    endpoint: http://10.1.1.2:5240/MAAS/

#4

I changed the MAAS controller’s IP following https://maas.io/docs/troubleshooting#heading--need-to-reconfigure-server-ip-address


#5

One thing we know for sure is that if we revert the IP, conjure-up gets through bootstrap just fine, but we can’t get it to use 10.1.1.x.

However, when I use MAAS to manually create a VM on its pod, the VM is created successfully on 10.1.1.x.


#6

If I understand correctly, when I select my MAAS cloud in conjure-up and start, for example, conjure-up kubernetes, the choice of machine subnet should be left to MAAS. Or does juju bootstrap have an influence on that decision? If it does, how can I configure that?


#7

I thought the machine ids that MAAS handed out were static to a given machine. You might be able to just look through your registered nodes and see how they are configured (I’m guessing you don’t have 100s of nodes).

You can also use “juju bootstrap --keep-broken” (though I don’t know how you pass that in through conjure-up). That should prevent bootstrap from destroying the machine when bootstrap fails.
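Outside conjure-up, the command would look roughly like the one in your log with --keep-broken added. A small sketch that only builds and prints the command: "my-maas" and "test-controller" are placeholder cloud and controller names, and the helper function is hypothetical.

```python
import shlex


def bootstrap_cmd(cloud, controller, keep_broken=True):
    """Build a `juju bootstrap` command line; all names passed in are placeholders."""
    cmd = ["juju", "bootstrap", cloud, controller]
    if keep_broken:
        # Ask Juju not to tear down the instance when bootstrap fails.
        cmd.append("--keep-broken")
    return cmd


cmd = bootstrap_cmd("my-maas", "test-controller")
print(shlex.join(cmd))
# To actually run it, pass the list to subprocess.run(cmd, check=True).
```

Substitute the cloud name from your clouds.yaml for "my-maas" before running anything.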


#8

I don’t know how to do that, so I posted a new question: https://askubuntu.com/questions/1170215/how-can-i-pass-keep-broken-when-conjure-up-for-juju-bootstrap


#9

You could try just doing “juju bootstrap” without conjure-up, just to see if you can get there.


#10

I tried without conjure-up:

Creating Juju controller "test-controller" on <server>
Looking for packaged Juju agent version 2.6.5 for amd64
Launching controller instance(s) on <server>...
bootstrap failed but --keep-broken was specified.
This means that cloud resources are left behind, but not registered to
your local client, as the controller was not successfully created.
However, you should be able to ssh into the machine using the user "ubuntu" and
their IP address for diagnosis and investigation.
When you are ready to clean up the failed controller, use your cloud console or
equivalent CLI tools to terminate the instances and remove remaining resources.

See `juju kill-controller`.
ERROR failed to bootstrap model: cannot start bootstrap instance: h8f8fp: Unable to allocate static IP due to address exhaustion.

But the machine still doesn’t stay. I can’t find it in MAAS; it gets released immediately.


#11

I ran the command again and checked MAAS quickly before the machine was removed, and I can confirm it’s still using the old subnet.
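For anyone wanting to make the same check without eyeballing addresses, here is a rough sketch using Python’s standard ipaddress module. The /24 prefix for 10.1.1.x is an assumption (the mask isn’t stated above), and 192.0.2.10 is just a documentation address standing in for something on the old subnet.

```python
import ipaddress

# Assumption: the new 10.1.1.x network is a /24; adjust to the real prefix length.
NEW_SUBNET = ipaddress.ip_network("10.1.1.0/24")


def on_new_subnet(address, subnet=NEW_SUBNET):
    """Return True if the given IP address falls inside the new subnet."""
    return ipaddress.ip_address(address) in subnet


# 10.1.1.2 is the MAAS endpoint from clouds.yaml; 192.0.2.10 is a stand-in only.
for addr in ("10.1.1.2", "192.0.2.10"):
    print(addr, "is on the new subnet" if on_new_subnet(addr) else "is NOT on the new subnet")
```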


#12

That would be a MAAS declaration of the machine, not Juju. So you’ll need to find the associated Node in MAAS (is this from a KVM pod?), and update the definition in MAAS. If it is a pod from KVM, probably you need to update the Pod definition itself?


#13

Interesting. If that’s the case, though, a machine I compose myself does use the new subnet.

Yes, it is a KVM pod. How can I update its definition?


#14

Thanks a lot @jameinel!!! I moved the old subnet to another fabric and VLAN, and now bootstrap works. I’ll try conjure-up once it finishes and see how it goes!