Juju controller stuck | Notes suspended since cloud credential is not valid


#1

Hi everyone,

I have a controller that’s stuck and literally not doing anything.

The last thing I tried to do was destroy my openstack model, which got stuck while getting rid of the machines.

I then tried juju destroy-model openstack-devzero --force --no-wait. That also just resulted in “Waiting for the model to be removed”.

I now see that all my models are suspended.
When I run juju status on the controller, I get “suspended since cloud credential is not valid”.

Check out this output below:

root@maas-region-ctl:~# juju controllers
Use --refresh option with this command to see the latest information.

Controller                      Model       User   Access     Cloud/Region        Models  Nodes    HA  Version
SS-DD-azure.juju.devzero.co.za  controller  admin  superuser  azure/centralus          1      1  none  2.5.3  
aws.juju.devzero.co.za          controller  admin  superuser  aws/us-east-1            2      1  none  2.4.3  
google.juju.devzero.co.za       controller  admin  superuser  google/us-east1          3      1  none  2.4.3  
juju-controller                 controller  admin  superuser  maas.devzero.co.za       2      1  none  2.4.6  
juju.maas.devzero.co.za*        controller  admin  superuser                           8      -     -  2.6.3  

root@maas-region-ctl:~# juju models
Controller: juju.maas.devzero.co.za

Model              Cloud/Region        Type  Status     Machines  Cores  Units  Access  Last connection
awx-ansible        maas.devzero.co.za  maas  suspended         4      8  4      admin   29 minutes ago
controller*        maas.devzero.co.za  maas  suspended         1      4  -      admin   just now
default            maas.devzero.co.za  maas  suspended         0      -  -      admin   2019-05-09
devzero-openstack  maas.devzero.co.za  maas  suspended        18      -  24     admin   2 minutes ago
elastic-test       maas.devzero.co.za  maas  suspended         5     14  5      admin   5 minutes ago
ha-awx-ansible     maas.devzero.co.za  maas  suspended         4     16  4      admin   9 hours ago
kuber-core         maas.devzero.co.za  maas  suspended         4     14  8      admin   2019-05-23
openstack-devzero  maas.devzero.co.za  maas  suspended         0      -  -      admin   11 minutes ago

root@maas-region-ctl:~# 
root@maas-region-ctl:~# 
root@maas-region-ctl:~# 
root@maas-region-ctl:~# juju status 
Model       Controller               Cloud/Region        Version  SLA          Timestamp  Notes
controller  juju.maas.devzero.co.za  maas.devzero.co.za  2.6.3    unsupported  03:21:13Z  suspended since cloud credential is not valid

Machine  State    DNS              Inst id  Series  AZ       Message
0        started  165.117.214.185  pagta8   bionic  default  Deployed

#2

@dvnt,

This means that the cloud credential used by your models is no longer valid from the perspective of your cloud provider. For example, it could have expired.

First, make sure that your credential is valid on the cloud provider side, for example by changing the password. What exactly you need to do depends on what went wrong with your existing credential.

Then update that credential on your local client using ‘juju add-credential --replace’.
Once that is done, let your controller know the updated credential content using ‘juju update-credential’.

Note that the local and remote credentials need to have the same name for this update to succeed. If you do not know the name of the remote credential, first run ‘show-model’ for any of the suspended models. The credential name will be in the output.
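
The sequence described above can be sketched as a small shell function. This is only a sketch: the cloud and credential names are taken from output elsewhere in this thread, the `JUJU` variable and the function name are made up for illustration, and the flags are the ones quoted in this reply, so they may differ on other Juju versions. By default it only prints the commands (dry run); set `JUJU=juju` to actually run them.

```shell
# Dry run by default: commands are printed instead of executed.
JUJU="${JUJU:-echo juju}"
CLOUD="maas.devzero.co.za"        # cloud name as shown in this thread
CRED="admin-maas.devzero.co.za"   # credential name as shown in this thread

# refresh_credential MODEL (hypothetical helper name)
refresh_credential() {
  # 1. Look up which credential the suspended model uses.
  $JUJU show-model "$1"
  # 2. Replace the local copy of the credential with the now-valid content.
  $JUJU add-credential "$CLOUD" --replace
  # 3. Push the updated credential content to the controller.
  $JUJU update-credential "$CLOUD" "$CRED"
}

refresh_credential awx-ansible
```

Checking the printed commands before running them for real is a cheap way to confirm that the cloud and credential names match what ‘show-model’ reports.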


#3

Hey Anastasia

So I tried doing that, and this is what I got back:

root@maas-region-ctl:~# juju update-credential maas.devzero.co.za admin-maas.devzero.co.za
Credential valid for:
  awx-ansible
  kuber-core
  devzero-openstack
  ha-awx-ansible
  elastic-test
  controller
  default
Credential invalid for:
  openstack-devzero:
    couldn't find instance "xbsqgk" for machine 0
    couldn't find instance "cacsd7" for machine 1
    couldn't find instance "qbdggw" for machine 2
    couldn't find instance "efcxrt" for machine 3
    couldn't find instance "fekt8m" for machine 4
    couldn't find instance "cxbg6f" for machine 5
Controller credential "admin-maas.devzero.co.za" for user "admin" on cloud "maas.devzero.co.za" not updated: some models are no longer visible.
root@maas-region-ctl:~#

#4

@dvnt,

This means that we cannot update the controller credential because it is not valid for at least one model that uses it.

There are 2 things that you can do, depending on your requirements.

  1. The credential is not valid for the openstack-devzero model because, when Juju uses that credential, it cannot reach the cloud instances for the listed machines. This is most likely because the instances have been deleted via the cloud provider interface. If that is the case and you definitely don’t want to see these machines again, you need to manually force-remove them with ‘remove-machine [0,1…] --force’. Make sure you are on the right model. Note that this operation may fail depending on which Juju version you are using. However, once the machines are removed, updating the credential on the controller should succeed.

  2. You may change the cloud credential that openstack-devzero uses if you have another valid credential. Use the ‘set-credential’ command to do this. However, even if your new credential is valid, you may still have issues setting it on the model if the cloud instances for these machines have been removed. In that case, you need to either ‘remove-machine --force’ or set the credential for this model to “” using db surgery. Once this model no longer uses the credential you want to update, updating the credential on the controller for the other models should succeed.
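
Option 1 could look roughly like the sketch below. The machine numbers 0 to 5 and the model, cloud and credential names come from the error output earlier in this thread; the `JUJU` variable and the function name are made up for illustration. It prints the commands by default (dry run); set `JUJU=juju` to execute them, and note that the forced removal may still fail depending on the Juju version.

```shell
# Dry run by default: commands are printed instead of executed.
JUJU="${JUJU:-echo juju}"
MODEL="openstack-devzero"         # the model the credential is invalid for
CLOUD="maas.devzero.co.za"
CRED="admin-maas.devzero.co.za"

# force_remove_and_update (hypothetical helper name)
force_remove_and_update() {
  # Force-remove every machine whose cloud instance could not be found.
  for machine in 0 1 2 3 4 5; do
    $JUJU remove-machine -m "$MODEL" "$machine" --force
  done
  # With the machines gone, retry the credential update on the controller.
  $JUJU update-credential "$CLOUD" "$CRED"
}

force_remove_and_update
```

Passing `-m` explicitly on every call avoids removing machines from whichever model happens to be current.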


#5

@anastasia-macmood thanks for the suggestions!

Sorry for the long post, but I hope someone finds this useful someday.

:scream: I don’t know why this controller ended up being so unhappy.
In the state it was in, any command I issued was simply ignored, even with --force and --no-wait.

What I ended up doing was creating an additional credential for the maas cloud, then switching through each model and telling the controller to use the new credential:

root@maas-region-ctl:~# juju add-credential maas.devzero.co.za
Enter credential name: new

Using auth-type "oauth1".

Enter maas-oauth:

Credential "new" added locally for cloud "maas.devzero.co.za".

root@maas-region-ctl:~# juju set-credential -m openstack-devzero maas.devzero.co.za new
Did not find credential remotely. Looking locally...
Uploading local credential to the controller.
Changed cloud credential on model "openstack-devzero" to "new".
root@maas-region-ctl:~#

root@maas-region-ctl:~# juju status
Model              Controller               Cloud/Region        Version  SLA          Timestamp  Notes
openstack-devzero  juju.maas.devzero.co.za  maas.devzero.co.za  2.4.6    unsupported  12:29:17Z  attempt 5 to destroy model failed (will retry):  model not empty, found 2 machines (model not empty)

Machine  State    DNS              Inst id  Series  AZ       Message
3        stopped  165.117.214.184  efcxrt   bionic  default  Deployed

Finally the model was destroyed. However, all the other models were still suspended, so I cycled through each of them and set the “new” credential:

root@maas-region-ctl:~# juju models
Controller: juju.maas.devzero.co.za

Model              Cloud/Region        Type  Status     Machines  Cores  Units  Access  Last connection
awx-ansible        maas.devzero.co.za  maas  suspended         4      8  4      admin   9 hours ago
controller         maas.devzero.co.za  maas  suspended         1      4  -      admin   just now
default            maas.devzero.co.za  maas  suspended         0      -  -      admin   2019-05-09
devzero-openstack  maas.devzero.co.za  maas  suspended        18      -  24     admin   9 hours ago
elastic-test       maas.devzero.co.za  maas  suspended         5     14  5      admin   9 hours ago
ha-awx-ansible     maas.devzero.co.za  maas  suspended         4     16  4      admin   18 hours ago
kuber-core         maas.devzero.co.za  maas  suspended         4     14  8      admin   8 hours ago

root@maas-region-ctl:~# juju set-credential -m awx-ansible maas.devzero.co.za new
Found credential remotely, on the controller. Not looking locally...
Changed cloud credential on model "awx-ansible" to "new".
root@maas-region-ctl:~#

    ===snip ===

root@maas-region-ctl:~# juju models
Controller: juju.maas.devzero.co.za

Model           Cloud/Region        Type  Status     Machines  Cores  Units  Access  Last connection
awx-ansible     maas.devzero.co.za  maas  available         4      8  4      admin   9 hours ago
controller      maas.devzero.co.za  maas  available         1      4  -      admin   just now
default         maas.devzero.co.za  maas  available         0      -  -      admin   2019-05-09
elastic-test    maas.devzero.co.za  maas  available         5     14  5      admin   9 hours ago
ha-awx-ansible  maas.devzero.co.za  maas  available         4     16  4      admin   18 hours ago
kuber-core      maas.devzero.co.za  maas  available         4     14  8      admin   8 hours ago
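
The per-model switching above can be sketched as a loop rather than typed out by hand. The model names are the ones listed as available in the last output, and "new" is the credential created earlier in this post; the `JUJU` variable and the function name are made up for illustration. It only prints the commands unless `JUJU=juju` is set.

```shell
# Dry run by default: commands are printed instead of executed.
JUJU="${JUJU:-echo juju}"
CLOUD="maas.devzero.co.za"

# switch_models CREDENTIAL MODEL... (hypothetical helper name)
switch_models() {
  cred="$1"
  shift
  # Point each remaining argument (a model name) at the given credential.
  for model in "$@"; do
    $JUJU set-credential -m "$model" "$CLOUD" "$cred"
  done
}

switch_models new awx-ansible controller default elastic-test ha-awx-ansible kuber-core
```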

#6

@dvnt,

Thank you for the write up!

Indeed, once you had run ‘set-credential’ on the ‘faulty’ model to switch it to the new credential, you could have re-run the ‘update-credential’ command for the old credential, and that operation would have succeeded. That would have saved you from running ‘set-credential’ individually on all the other models.
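
As a rough sketch of that shorter path, using the names from this thread (the `JUJU` variable and the function name are made up for illustration, and the commands are only printed unless `JUJU=juju` is set):

```shell
# Dry run by default: commands are printed instead of executed.
JUJU="${JUJU:-echo juju}"
CLOUD="maas.devzero.co.za"

# shorter_path (hypothetical helper name)
shorter_path() {
  # Move only the faulty model off the old credential.
  $JUJU set-credential -m openstack-devzero "$CLOUD" new
  # The old credential is now used only by models it is valid for,
  # so updating it on the controller should go through.
  $JUJU update-credential "$CLOUD" admin-maas.devzero.co.za
}

shorter_path
```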