How to avoid the "waiting for container" message

merlijn-sebrechts · 17 April 2019 13:25

I’m trying to create a “proxy charm” in Kubernetes. The charm doesn’t create a container or anything, it’s just there to represent an external service so other charms can relate to it and receive info on how to connect to the external service.

The issue is that status updates of type active seem to be overwritten by the message “waiting for container”, even though the charm doesn’t try to create a container.

The reactive file:

#!/usr/bin/env python3
from charms.reactive import when, when_not
from charmhelpers.core.hookenv import (
    status_set,
    config,
)

from charmhelpers.core import hookenv

@when('config.changed.base-url')
def consumer_active():
    cfg = config()
    base_url = cfg.get("base-url")
    if base_url:
        status_set('active', '({})'.format(base_url))
    else:
        status_set('blocked', 'Please set the "base-url" config option.')

This results in the following status messages. I set the “base-url” config option around 17 Apr 2019 14:56:47+02:00. The blocked message is shown, but the active one isn’t.

Time                        Type       Status      Message
17 Apr 2019 14:56:20+02:00  juju-unit  allocating  
17 Apr 2019 14:56:20+02:00  workload   waiting     waiting for container
17 Apr 2019 14:56:45+02:00  juju-unit  executing   running start hook
17 Apr 2019 14:56:46+02:00  workload   blocked     Please set the "base-url" config option.
17 Apr 2019 14:56:46+02:00  juju-unit  executing   running leader-settings-changed hook
17 Apr 2019 14:56:47+02:00  juju-unit  executing   running config-changed hook
17 Apr 2019 14:56:48+02:00  juju-unit  idle        
17 Apr 2019 14:56:56+02:00  juju-unit  executing   running config-changed hook
17 Apr 2019 14:56:57+02:00  juju-unit  idle

I thought this might be something set by a lower layer, but this doesn’t seem to be the case.

layer.yaml

includes:
  - "layer:caas-base"

metadata.yaml

name: sse-endpoint
summary: Example SSE endpoint
maintainers:
  - Merlijn Sebrechts <merlijn.sebrechts@ugent.be>
description: |
  Example SSE endpoint k8s charm
tags:
  - database
series:
   - kubernetes

Is there any way to show the actual status messages instead of “waiting for container”? k8s charms are perfect for creating proxy charms because their footprint is very low, but the current behaviour makes this use-case more or less impossible.

rick_h · 17 April 2019 13:57

@merlijn-sebrechts this is really interesting as a platform for proxy charms. Can you file this as a bug so we can look into it?

Thanks!

merlijn-sebrechts · 17 April 2019 15:08

Done; thx!

wallyworld · 17 April 2019 21:56

When deploying any charm, whether for VM-based clouds or k8s, Juju starts out by setting workload status to “waiting for container/machine”. After that, whatever the charm sets as status will stick. The example seems to set “blocked”, which is correctly reflected in the status history. You say that “active” is overwritten by “waiting for container”, but I can’t see that’s the case from the status log (and Juju is written to never set “waiting for…” a second time). My guess would be that base_url is never set, so the charm never sets “active” status. Can you add an extra line of debugging to the consumer_active() method to ensure that “active” is actually being set?

merlijn-sebrechts · 18 April 2019 08:28

Yeah, I changed the code to use the blocked type and it worked. To make sure there is no confusion, I changed the code to this:

@when('config.changed.base-url')
def consumer_active():
    cfg = config()
    base_url = cfg.get("base-url")
    if base_url:
        # Using blocked state to send messages since active states seem to be overwritten
        # by the "waiting for container" message.
        # https://discourse.jujucharms.com/t/how-to-avoid-the-waiting-for-container-message/1369
        status_set('blocked', 'active ({})'.format(base_url))
        status_set('active', 'active ({})'.format(base_url))
        log("set status to 'active ({})".format(base_url))
    else:
        status_set('blocked', 'Please set the "base-url" config option.')

status-log

Time                        Type       Status      Message
18 Apr 2019 10:22:59+02:00  workload   waiting     waiting for container
18 Apr 2019 10:22:59+02:00  juju-unit  allocating  
18 Apr 2019 10:23:08+02:00  juju-unit  executing   running start hook
18 Apr 2019 10:23:09+02:00  workload   blocked     Please set the "base-url" config option.
18 Apr 2019 10:23:09+02:00  juju-unit  executing   running leader-elected hook
18 Apr 2019 10:23:10+02:00  juju-unit  executing   running config-changed hook
18 Apr 2019 10:23:11+02:00  juju-unit  idle        
18 Apr 2019 10:24:44+02:00  juju-unit  executing   running config-changed hook
18 Apr 2019 10:24:45+02:00  workload   blocked     active (sse.example.com)
18 Apr 2019 10:24:45+02:00  workload   waiting     waiting for container
18 Apr 2019 10:24:45+02:00  juju-unit  idle

juju debug-log

application-ep2: 10:24:44 INFO unit.ep2/0.juju-log Reactive main running for hook config-changed
application-ep2: 10:24:44 DEBUG unit.ep2/0.juju-log tracer>
application-ep2: 10:24:44 DEBUG unit.ep2/0.juju-log tracer: hooks phase, 0 handlers queued
application-ep2: 10:24:44 DEBUG unit.ep2/0.juju-log tracer>
tracer: ++   queue handler ../../application-ep2/charm/hooks/relations/sse-endpoint/provides.py:11:broken:sse-endpoint
application-ep2: 10:24:45 INFO unit.ep2/0.juju-log Invoking reactive handler: reactive/sse-endpoint.py:13:consumer_active
application-ep2: 10:24:45 INFO unit.ep2/0.juju-log set status to 'active (sse.example.com)
application-ep2: 10:24:45 INFO unit.ep2/0.juju-log Invoking reactive handler: ../../application-ep2/charm/hooks/relations/sse-endpoint/provides.py:11:broken:sse-endpoint
application-ep2: 10:24:45 INFO juju.worker.uniter.operation ran "config-changed" hook

Also note that I tested changing the status message again at a later stage and the effect was the same.

wallyworld · 18 April 2019 23:47

Looking into this further, there is a corner case here we need to cater for. The current implementation was done under the assumption that all k8s charms would create workload pod(s).

Right now there’s no way for the charm to introspect the workload status. The charm can explicitly report blocked when it is waiting for a relation to be created, or when there’s some other reason why it can’t proceed. Or it can report maintenance when it is busy, etc. Juju will reflect this status verbatim.

When a charm starts (or is upgraded), it is expected to generate a pod spec to send to Juju. Juju will use that spec to create the k8s service and deployment for the workload. But the charm cannot currently introspect the workload status, so the best it can do right now is report a status of active once it has sent the pod spec to Juju, i.e. the charm thinks it has done all it needs to do to start the workload.

Juju will display a status derived from what the charm reports and the state of the workload pod. If the charm reports active, Juju will first ensure that the workload pod is running before displaying active. If there’s an error, that will be displayed. If the pod is not yet running, the status reported by Juju will be waiting. And as mentioned previously, if the charm explicitly reports blocked or maintenance, that is reported directly.
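
The rules in the paragraph above can be summarised in a small sketch. This is only an illustration of the behaviour as described in this post, not Juju's actual implementation; the function name `displayed_status` and the `pod_state` values are hypothetical.

```python
from typing import Optional


def displayed_status(charm_status: str, pod_state: Optional[str]) -> str:
    """Return the status shown to the user (hypothetical helper).

    charm_status: what the charm reported via status-set.
    pod_state: "running", "error", or None when no workload pod exists yet
               (these values are assumptions made for this sketch).
    """
    # blocked and maintenance are reported verbatim.
    if charm_status in ("blocked", "maintenance"):
        return charm_status
    if charm_status == "active":
        if pod_state == "running":
            return "active"
        if pod_state == "error":
            return "error"
        # No running pod: the charm's "active" is masked by "waiting".
        return "waiting"
    # Other statuses are not covered by the description; passed through here.
    return charm_status


# A proxy charm never asks for a pod, so "active" never becomes visible.
print(displayed_status("active", None))   # waiting
print(displayed_status("blocked", None))  # blocked
```

This matches the status log earlier in the thread: the "blocked" message is shown, while "active" is replaced by "waiting for container".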

So the issue here is that there will never be a pod, as the charm never asks for one. Juju currently sees the lack of a pod as the “waiting for container” state.

One option would be to introspect the charm metadata, and if there are no OCI image resources defined at all, then the charm is known not to spin up a workload, and so the status reporting can be adjusted accordingly.
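
A minimal sketch of that metadata check, assuming the metadata has already been parsed into a dict and that workload images are declared as resources with `type: oci-image`. The helper name `charm_has_workload` and the `example-db` metadata are hypothetical and only illustrate the idea.

```python
def charm_has_workload(metadata: dict) -> bool:
    """Return True if the charm metadata declares any OCI image resource."""
    resources = metadata.get("resources") or {}
    return any(
        isinstance(res, dict) and res.get("type") == "oci-image"
        for res in resources.values()
    )


# The proxy charm from this thread declares no resources at all.
proxy_metadata = {"name": "sse-endpoint", "series": ["kubernetes"]}

# A hypothetical charm that does ship a workload image.
workload_metadata = {
    "name": "example-db",
    "series": ["kubernetes"],
    "resources": {"db-image": {"type": "oci-image"}},
}

print(charm_has_workload(proxy_metadata))     # False
print(charm_has_workload(workload_metadata))  # True
```

With a check like this, a charm without image resources could have its own status shown directly instead of being held at "waiting for container".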

We’ll look to get a fix done for rc1.

merlijn-sebrechts · 19 April 2019 08:39

This would solve my current use-case, but it makes the assumption that a charm with an oci image resource can only be active if there are running pods. A more flexible solution might be to only do the workload status reporting if the current pod spec is non-empty. This could enable a number of interesting use-cases, like a database charm that spins up a pod for every related client. No relations == no pods, but it’s still ready.

wallyworld · 19 April 2019 09:30

There’s still a window where the charm may have a pod spec to send to Juju, but has not yet done so, and the user runs status. In that case:

if the charm has not set any status yet, Juju could still display “waiting” (the default initial status).
if the charm has set active status, Juju would display “active”.

Then if the charm does send a pod spec, Juju would revert to displaying “waiting”.

I think that’s effectively what you’re suggesting. Just wanted to ensure this corner case was covered.
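
The cases listed above can be sketched as follows. This is an assumption about how the proposed rule might look, not Juju's code; `proposed_status`, `pod_spec_sent`, and `pod_running` are hypothetical names.

```python
from typing import Optional


def proposed_status(
    charm_status: Optional[str],
    pod_spec_sent: bool,
    pod_running: bool = False,
) -> str:
    """Status to display under the proposal discussed in this thread."""
    # The charm has not set any status yet: default initial status.
    if charm_status is None:
        return "waiting"
    # Anything other than active is shown as the charm reported it.
    if charm_status != "active":
        return charm_status
    # No pod spec sent, so there is no workload to wait for.
    if not pod_spec_sent:
        return "active"
    # A pod spec was sent: show active only once the pod is running.
    return "active" if pod_running else "waiting"


print(proposed_status(None, pod_spec_sent=False))      # waiting
print(proposed_status("active", pod_spec_sent=False))  # active
print(proposed_status("active", pod_spec_sent=True))   # waiting
```

The third call shows the corner case: a charm that reported active and then sends a pod spec goes back to "waiting" until the pod is up.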

merlijn-sebrechts · 19 April 2019 09:49

Yes, that makes sense, thanks!