What are your tips for running "Juju in production"?

Some leading questions, perhaps; hopefully this will be the start of a useful discussion.

  • How granular are your models? Do you restrict access per model?
  • When should you enable HA controllers?
  • How frequently do you make backups? What do you backup? Have you tested a restore?
  • Have you enabled any specific firewall rules?
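On the backup question above, a minimal sketch of the create-and-download cycle (the backup ID is a placeholder; check `juju help create-backup` for the exact flags in your version):

```shell
# Create a backup of the controller's state (run from a client that can
# reach the controller). The command prints a backup ID on success.
juju switch controller
juju create-backup

# Download a previously created backup for off-site storage; substitute
# the ID that `juju create-backup` reported.
juju download-backup <backup-id>
```

Testing a restore on a scratch environment, rather than only creating backups, is what actually answers the third bullet.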

You should enable HA controllers in any production environment. A single controller is fine for testing and experimenting, but in production, HA controllers are essential.
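For reference, turning on HA is a single command against an existing controller (a sketch; the unit count passed to `-n` should be an odd number):

```shell
# Promote the current controller to HA with three controller machines.
juju enable-ha -n 3

# Verify: the controller model should now list three machines.
juju status -m controller
```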


We are just discovering this at Scania at the moment. We have been up and running for some three months, but still need more experience.

We have three controller units in HA, behind two HAProxy instances.

An HAProxy charm with Let’s Encrypt support is sorely needed, since we have to set that up manually at the moment.

Really, it’s just strange that bundles for this purpose don’t already exist in the charm store.


Hey, you might be interested in the Let’s Encrypt proxy charm that I’m designing right now, then. It is something that I have a need for so I started work on it 4 days ago. I’m thinking I should be finished fairly soon. Maybe within a week, but no promises.

There is an existing SSL proxy charm, but I don’t know whether or not it is scalable. I thought I had read something that said it wasn’t scalable, but now I can’t find it.

The charm I’m working on will allow you to scale the proxy to any number of units while making sure that the certs are replicated to each unit and that only the leader is actually generating the certificates, so that you don’t exceed the Let’s Encrypt rate limit.

Then you just point your DNS at the proxy servers and it will generate required certs while routing the traffic to the Juju applications that are connected based on the incoming domain.
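To make the leader-only certificate generation concrete, here is a hedged sketch of the kind of ACME client call the leader unit might make (certbot is an assumption for illustration; the charm’s actual mechanism, email, and domain are hypothetical):

```shell
# Hypothetical sketch: on the leader unit only, obtain a certificate via
# the HTTP-01 challenge. Non-leader units would receive the resulting
# certs via replication rather than calling the ACME API themselves.
certbot certonly --standalone --non-interactive --agree-tos \
    -m ops@example.com -d example.com

# While testing, point at the Let's Encrypt staging endpoint to stay
# well clear of the production rate limits:
#   --server https://acme-staging-v02.api.letsencrypt.org/directory
```

Restricting issuance to the leader is exactly what keeps a many-unit proxy from tripping the per-domain rate limit.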


You should connect with @lasse and @martin-hilton as well.

I’ve used ssl-termination-proxy for nextcloud previously.


I’m curious why you run Juju controllers behind HAProxy. All of the Juju clients already know about the multi-controller nature and will automatically fail over to another controller if the one they are currently connected to dies. While connected, they maintain the list of active controllers, so a new one coming up should get added to the list of possible contacts.
The only failure mode I’m aware of is if you are killing controllers fast enough that you’ve rotated all of them away in the time a client has been disconnected. (E.g., you have 3 running today, a client disconnects, and by next week you have killed and started 3 new controllers; a connecting client won’t know any of the new controllers and so can’t discover the rest.)
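One mitigation for that failure mode, assuming the client can still reach at least one live controller, is to refresh its cached controller details before the old addresses go stale:

```shell
# Re-fetch current API addresses for all known controllers and update
# the client's local cache.
juju controllers --refresh
```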

@jameinel did some great work on creating a bundle setup to help with getting monitoring set up on your controllers, as well as a pre-made dashboard ready to go. I’ve updated the post around monitoring controllers with that setup, and there’s a bundle in the store that is almost plug-and-play, which I encourage folks to check out.


@rick_h I had actually forgotten that you’ve already compiled a great list of resources!


In order for Juju to be applicable to large production environments, a lot of work on RBAC and authentication is needed. There is a huge lack of commonly used auth backends (e.g. LDAP, OAuth, SAML) and of integration with cloud providers’ auth (e.g. MAAS, OpenStack). On top of that, we have very limited RBAC: it is impossible to have user groups, per-application access, per-command access (e.g. allow a user to change charm config but not run commands on units), and so on.
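For context, the access control Juju does offer today is per-user grants at model or controller scope, which is exactly the granularity ceiling being described (the usernames and model name below are hypothetical):

```shell
# Grant a user read-only access to one model...
juju grant alice read production-db

# ...or write access (can change the model, but not administer it).
juju grant bob write production-db

# Controller-scope grants: login, add-model, or superuser.
juju grant carol add-model
```

There is no level between these coarse grants and full admin, hence the request for per-application and per-command access.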

Another much-needed feature would be decomposition, or containerization, of the Juju controller services. Scalability is an issue, and being able to use an external (perhaps optimized) MongoDB, or to spawn a Juju controller on Kubernetes, would help.


Here are some useful threads:

How to put controllers behind HTTPS

Controllers exposed to the Internet should (at a minimum) be backed by TLS.

How to use an external identity provider

Look into Juju’s internals

Production users benefit from an understanding of how Juju gets its work done. Internally, Juju is a network of software agents (jujud processes) in a star topology; the central node is the controller.

To create a report for any given agent, juju ssh into the machine, then run juju_engine_report:

$ juju ssh <machine-id>
$ juju_engine_report 

Under Kubernetes, juju ssh is unavailable. Use kubectl exec to access the operator pod (which is where the relevant agent is executing). You will also need to load the introspection helpers into your session with source.

$ kubectl -n <model> exec -ti <application>-operator-<unit-number> bash
$ source /etc/profile.d/juju-introspection.sh
$ juju_engine_report 

The juju_engine_report provides valuable diagnostics. A useful periodic task is to run an engine report for each jujud process on each machine. Some tooling has been developed to help isolate problems and aid debugging.
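A sketch of that periodic task, assuming a client with jq available (the output filenames are arbitrary):

```shell
# Collect an engine report from every machine in the current model.
for m in $(juju machines --format=json | jq -r '.machines | keys[]'); do
    juju ssh "$m" juju_engine_report > "engine-report-machine-${m}.txt"
done
```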


I’d like to second @soumplis’s comment here: from our point of view, the RBAC and auth options are lacking. Regarding MAAS, it advertises the following features:

Authentication and Identity
Integrate with LDAP, Active Directory or SAML for central identity management and single-sign-on across multiple MAAS regions.

in addition to RBAC if you pay for support. However, there are no docs on what those features are or how they actually work. See some old posts on their Discourse (1, 2) asking for details, without much response. There is a blog post about multi-tenancy in MAAS that talks a little about the RBAC features, but that is really not enough to go on. It also mentions using Candid for LDAP.
Why no documentation? Put simply, I won’t buy unless I know what I am buying.

Speaking of Candid, what is the state of that project and what is its scope both for MAAS and Juju? I understand that it can be an external identity provider for Juju from some of the posts above, but where is the documentation?