The controller model should host an apt-cache

Bootstrapping and installing applications is very slow. That’s because every layer of every charm, and then every unit of every application, runs apt-get update and then apt-get upgrade.

If we installed an apt cache on the controller model and pointed units to it, we would see an immediate speed-up in installation times.

This is important because juju bootstrap is the “first impression” of Juju as a product. This process should be as fast as possible.


I definitely agree, although I do think that it should be fairly easy to disable / use an external cache. Alternately, it would be cool if, when using MAAS, the new Juju controller would identify MAAS’ apt-cache and configure clients to use that!

The main issue with this is that if you are “bootstrapping”, then the controller node is torn down and recreated each time, which means you don’t benefit from the cache, because you’re creating a new cache every time. It isn’t hard to do something like:

clouds:
  lxd:
    type: lxd
    auth-types: [certificate]
    endpoint: https://10.210.24.1:8443
    config:
      apt-http-proxy: http://192.168.0.50:3142
      apt-https-proxy: http://192.168.0.50:3142
      enable-os-upgrade: false

Here 192.168.0.50 is a machine running apt-cacher-ng for my entire network. That means every bootstrap and teardown, plus apt updates on the host machine, all go through the same proxy and benefit from the same cached packages. And while it is plausible that you’re deploying and redeploying more often than you are bootstrapping, it is far more common that you’d rather have the cache persist beyond the lifetime of the controller.
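For anyone wanting to replicate this setup, a minimal sketch of standing up apt-cacher-ng on a spare machine (3142 is apt-cacher-ng’s default port, matching the cloud config above; the cache host address is from my example):

```shell
# On the machine that will host the cache (192.168.0.50 above):
sudo apt-get update
sudo apt-get install -y apt-cacher-ng

# apt-cacher-ng listens on port 3142 by default; confirm it is up:
curl -s http://localhost:3142/acng-report.html >/dev/null && echo "cache is up"

# Non-Juju clients on the network can share the same cache by
# adding a proxy snippet to apt's configuration:
echo 'Acquire::http::Proxy "http://192.168.0.50:3142";' | \
  sudo tee /etc/apt/apt.conf.d/01proxy
```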

As a slight counter example, the Ubuntu OpenStack CI (UOSCI) is constantly redeploying models, and generally keeps a controller per Jenkins slave around for ~1 week.

But an external apt cache that persists beyond the lifetime of a controller still gives you a better result, without adding complexity to Juju for managing yet another thing.


The reason that I thought the controller model would be a good home for this is that that location would be most helpful to the novice user.

I’m not familiar enough with the specific internals to comment on the complexity argument.

My naive view is that an apt cache would live as a visible application in the controller model. We would then set the relevant environment variables to point to that application on any subsequent deployments.

It’s not a problem if you have already bought into Juju. If you are evaluating it, though, our current behaviour comes across as lethargic.

You could certainly “juju deploy apt-cacher -m controller” and then update model-config/model-defaults after the fact. (AFAIK, apt-cacher doesn’t exist yet as a charm, though it doesn’t seem like a particularly hard one to write.)
But it’s certainly something that I would be happy for people to plug in as a charm, rather than something we do out of the box.
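Concretely, that after-the-fact wiring might look something like the sketch below. As noted, an apt-cacher charm doesn’t exist yet, so the charm name and the cache unit’s address here are assumptions:

```shell
# Deploy a hypothetical apt-cacher charm into the controller model:
juju deploy apt-cacher -m controller

# Once the unit has an address (say 10.0.0.5), point future models
# at it via model-defaults, and any existing model via model-config:
juju model-defaults apt-http-proxy=http://10.0.0.5:3142 \
                    apt-https-proxy=http://10.0.0.5:3142
juju model-config -m default apt-http-proxy=http://10.0.0.5:3142
```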


The charm doesn’t exist. Part of the rationale for starting the thread is to gather an impression of whether it would be taken up.

My 2 cents -

The public clouds provide apt mirrors local to the availability zone. To verify this, deploy something to GCE or AWS and cat your apt sources.list.
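This is easy to check from any deployed unit; on AWS, for example, the default mirror hostname is region-scoped (the application name here is an assumption):

```shell
# Show the first mirror line in apt's sources on a deployed unit:
juju ssh myapp/0 'grep -m1 "^deb" /etc/apt/sources.list'
# On AWS this typically points at a region-local mirror such as
# http://us-east-1.ec2.archive.ubuntu.com/ubuntu/
```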

MAAS can be configured to run a local apt-cache which you can configure juju to use.
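As a sketch of what that configuration might look like (MAAS region controllers run a caching proxy, by default on port 8000; the controller address is a placeholder you would substitute):

```shell
# Point Juju models at the apt proxy provided by MAAS:
juju model-defaults apt-http-proxy=http://<maas-region-ip>:8000 \
                    apt-https-proxy=http://<maas-region-ip>:8000
```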

This leaves a few cloud types without native apt-cache options: OpenStack, LXD, and manual.

I agree with @timClicks’ reasoning, but I feel maintaining the apt cache on the controller, as a part of Juju, could be more of a pain than it’s worth. Some of my apt caches grow to 100+ GB and experience heavy IOPS and network load (depending). I would hate to see the resource usage of the apt cache interfere with the operational capability of the Juju controller in any way. For this reason alone, I don’t feel that running an apt cache on the controller is a good idea.

As far as speeding up the LXD bootstrap goes, we could possibly provide a pre-baked LXD Juju-controller image instead of downloading and installing everything at bootstrap time.

Just to drop in: I’m with @jamesbeedy in that it depends on where you’re doing things. I’m generally -1 on adding things that a user might be required to manage (disk-space issues due to a cache, network load, refresh of updates, ESM impacts, etc.) without knowing they are there. I’m with @jameinel that a charm that does this, with a tutorial on setting it up, is a great way to go; there’s precedent for this with things like monitoring. The fact that such a charm doesn’t exist speaks a little to the demand, but maybe something already exists that someone hasn’t tossed out into the community, and this thread is a nice call to open up something they’re playing with in private?

I’m curious how I would set this up, @jamesbeedy, to speed up installation of some Mellanox software. What do I need to configure to make this happen?