Juju caching CaaS images in MicroK8s

kos.tsakalozos · 25 April 2019 16:05

Hi juju people,

The juju install hook detects whether MicroK8s is present and pulls the images it needs into containerd so that they are cached.

@knkski got the following error while working on a Travis job:

$ sudo snap install juju --classic --channel edge
error: cannot perform the following tasks:
- Run install hook of "juju" snap if present (run hook "install": 
-----
/snap/bin/microk8s.kubectl
/snap/bin/microk8s.ctr
Since Juju 2 is being run for the first time, downloading latest cloud information.
Fetching latest public cloud list...
Your list of public clouds is up to date, see `juju clouds`.
Going to cache images: docker.io/jujusolutions/jujud-operator:2.6-rc1 and docker.io/jujusolutions/juju-db:4.0.
Pulling: docker.io/jujusolutions/jujud-operator:2.6-rc1.
ctr: failed to dial "/var/snap/microk8s/common/run/containerd.sock": context deadline exceeded

That job was snap installing microk8s and then juju. The error occurs because containerd in MicroK8s takes some time to start, and the juju install hook does not wait for MicroK8s to finish initializing. For your convenience we have introduced the microk8s.status --wait-ready command, which blocks until the cluster is up (use the --timeout flag so that you do not wait forever). However, microk8s.status is not available in all MicroK8s channels, so you need to check manually that MicroK8s is up, like so: https://github.com/ubuntu/microk8s/blob/master/microk8s-resources/wrappers/microk8s-status.wrapper#L55
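As a rough illustration of the waiting logic described above, here is a minimal sketch. The helper name wait_for_microk8s and the MICROK8S_STATUS and CONTAINERD_SOCK variables are hypothetical, not taken from the real hook. The fallback branch simply polls for the containerd socket from the error message; that is an assumption of mine and not necessarily what the linked wrapper checks.

```shell
#!/bin/sh
# Sketch only: wait for MicroK8s before talking to containerd.
# wait_for_microk8s, MICROK8S_STATUS and CONTAINERD_SOCK are hypothetical names.

MICROK8S_STATUS="${MICROK8S_STATUS:-microk8s.status}"
CONTAINERD_SOCK="${CONTAINERD_SOCK:-/var/snap/microk8s/common/run/containerd.sock}"

wait_for_microk8s() {
    timeout_secs="${1:-60}"

    # Preferred path: let MicroK8s block until the cluster reports ready.
    if command -v "$MICROK8S_STATUS" >/dev/null 2>&1; then
        "$MICROK8S_STATUS" --wait-ready --timeout "$timeout_secs" >/dev/null 2>&1
        return $?
    fi

    # Fallback (assumption): poll once per second until the containerd
    # socket exists, giving up after timeout_secs seconds.
    waited=0
    while [ "$waited" -lt "$timeout_secs" ]; do
        if [ -S "$CONTAINERD_SOCK" ]; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

The hook could then call something like `wait_for_microk8s 60` and skip the image pull when it returns non-zero. Note that a socket file existing does not guarantee containerd is already accepting connections, so the pull itself should still tolerate failure.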

Also, you should consider making the cache-populating operation optional, since there are a number of other things that may go wrong in
https://github.com/juju/juju/blob/develop/snap/hooks/install
For example, what happens if docker.io is down, you are behind a proxy, the cached image is unavailable, or MicroK8s is not configured or running properly? You would rather not fail the juju installation, right? Could you also revise this PR https://github.com/ubuntu/microk8s/pull/398 in this context?
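To make the "optional" idea concrete, here is a minimal sketch of a best-effort cache step. The cache_images helper and the CTR variable are hypothetical and not part of the actual install hook. It assumes the pull is done with `microk8s.ctr images pull`, which matches the log above but which I have not checked against the hook source.

```shell
#!/bin/sh
# Sketch only: cache images on a best-effort basis so that a failed pull
# never fails the snap install. cache_images and CTR are hypothetical names.

CTR="${CTR:-microk8s.ctr}"

cache_images() {
    # No MicroK8s, nothing to cache; this is not an error.
    if ! command -v "$CTR" >/dev/null 2>&1; then
        echo "MicroK8s not found, skipping image cache."
        return 0
    fi

    for image in "$@"; do
        echo "Pulling: $image."
        # Registry down, proxy issues, containerd not ready: warn and move on.
        if ! "$CTR" images pull "$image" >/dev/null 2>&1; then
            echo "Warning: could not cache $image, continuing without it." >&2
        fi
    done

    # Always succeed so the install hook itself does not fail.
    return 0
}
```

A hook would call it with the two images from the log, e.g. `cache_images docker.io/jujusolutions/jujud-operator:2.6-rc1 docker.io/jujusolutions/juju-db:4.0`. The design choice is that caching is an optimisation, so every failure path degrades to a warning.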

Finally, I am not sure if this plays any role in the pull operation, but I should mention that the images Kubernetes is pulling into containerd go under the k8s.io namespace whereas here we cache the images to the default namespace (see microk8s.ctr namespaces).
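If the namespace does turn out to matter, a pull into a specific containerd namespace could look roughly like the sketch below. The pull_into_namespace helper and the CTR variable are hypothetical; the sketch relies on ctr's global --namespace option, and whether the hook should actually target k8s.io is still the open question raised above.

```shell
#!/bin/sh
# Sketch only: pull an image into an explicit containerd namespace.
# pull_into_namespace and CTR are hypothetical names.

CTR="${CTR:-microk8s.ctr}"

pull_into_namespace() {
    namespace="$1"
    image="$2"
    # --namespace is a global ctr option, so it goes before the subcommand.
    "$CTR" --namespace "$namespace" images pull "$image"
}
```

For example, `pull_into_namespace k8s.io docker.io/jujusolutions/juju-db:4.0` would place the image alongside the ones Kubernetes pulls, and `microk8s.ctr namespaces ls` lists the namespaces that exist.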

cory_fu · 25 April 2019 16:07

@wallyworld @kelvin.liu

veebers · 25 April 2019 23:51

Hi @kos.tsakalozos, sorry about that. You’re right, the install hook should ignore any caching failures.

The edge snap config was out of date, so I’m updating it now; the next build should have the fix.

We have a bit of a weird setup with our snapcrafts, where they live in different places for different purposes (we have a release one, an edge one and the one in github.com/juju/juju, for instance).
We’re in the process of cleaning this up as well.