Demo of Prometheus and Grafana Operators on k8s using Operator Framework

mmaglana · 10 March 2020 01:21

I finished the first iteration of the operators for Prometheus and Grafana on k8s and I’m submitting them here for the community to scrutinize: https://youtu.be/bf-YClFjANw

For those not familiar with the Operator Framework, it’s a new framework for writing Juju charms that emphasizes ease of development and takes advantage of (rather than replaces) the Pythonic way of developing software. The video linked above shows two charms built on top of that framework.

GitHub repos of charms used in the video:

stub · 10 March 2020 05:45

I am curious why http_interface.Client wraps the entire endpoint rather than a single relation. Some work I’m doing is very similar, except that my Client accepts the relation rather than the relation_name. It seemed cleaner to have a client that deals with a single relation and let the charm decide how to handle an endpoint with multiple relations, rather than a client that deals with an entire endpoint (0 or more relations) and needs a more complex API. I think the code as it stands will not support relating Grafana to multiple Prometheus servers, which may well be fine for your use cases, but it may be worth reconsidering if this is to become shared code. I can’t provide a counterexample yet, as I’m not as far advanced as you are, but hopefully this week.
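As a rough illustration of the two shapes being compared (this is not code from either charm, and it does not use the Operator Framework): Relation, RelationClient, and EndpointClient are hypothetical stand-ins, and the host/port keys are assumed purely for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Relation:
    """Hypothetical stand-in for one established relation."""
    id: int
    name: str  # endpoint name, e.g. "prometheus"
    data: Dict[str, str] = field(default_factory=dict)


class RelationClient:
    """Per-relation shape: the client wraps exactly one relation."""

    def __init__(self, relation: Relation):
        self.relation = relation

    @property
    def url(self) -> str:
        d = self.relation.data
        return f"http://{d['host']}:{d['port']}"


class EndpointClient:
    """Per-endpoint shape: the client wraps every relation on one endpoint."""

    def __init__(self, relation_name: str, relations: List[Relation]):
        self.relation_name = relation_name
        self._relations = [r for r in relations if r.name == relation_name]

    @property
    def urls(self) -> List[str]:
        # The endpoint client has to expose a collection (0 or more entries).
        return [RelationClient(r).url for r in self._relations]


relations = [
    Relation(1, "prometheus", {"host": "10.0.0.1", "port": "9090"}),
    Relation(2, "prometheus", {"host": "10.0.0.2", "port": "9090"}),
]

# Per-relation: the charm decides how to walk the endpoint.
per_relation_urls = [RelationClient(r).url for r in relations]

# Per-endpoint: the client walks the endpoint on the charm's behalf.
per_endpoint_urls = EndpointClient("prometheus", relations).urls
```

Both shapes can reach the same data. The difference is whether the loop over relations lives in the charm or in the client.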

mmaglana · 10 March 2020 06:16

Thanks for the input @stub! That’s a very good question, and my best answer is that I was (and still am) trying to wrap my head around the full concept of Juju interfaces. I’d love to see your counterexample once it’s ready (do post it here).

jameinel · 10 March 2020 06:28

I think that either here or in the Charming topic is appropriate. (It seems a bit redundant to have K8s + charming and Charming + K8s.)

It is nice to see the pieces in action. Hopefully some constructive feedback:

It would be nice to have a bit more text in your post than just a link to a video. I certainly understand that this is a start, but as more charms get written, we’ll want a bit of a text guide to help people understand what each one is and what it does.
It would be more normal to name the charm after the application it deploys rather than ‘charm-k8s-*’. So you would do:
juju deploy grafana
If you look at microk8s.kubectl -n lma get pods, we already have grafana-operator vs grafana. The extra ‘charm-k8s’ is just noise in the deploy.
The kubectl ... port-forward is also interesting, as I would have thought that juju expose grafana would be the way to interact with it. Certainly a goal would be to not have to know the exact ports that the charm uses, but to have things like juju expose do the work for you. Also, hopefully expose would mean that you don’t have to rerun the command when a new unit comes up.

It is great to see the rest of the infrastructure working, and config flowing smoothly between the charms as part of juju relate and seeing grafana reconfigured with prometheus. We just want to make that last bit of wiring up to the user as smooth as possible.

jameinel · 10 March 2020 06:41

It is a good question to ask. Charms can already define limit: 1 in their metadata.yaml for a given interface if they only support 0 or 1 relations on a given endpoint. (E.g., it generally doesn’t make a lot of sense to have multiple databases for an app, as it only has one central authority.)

Handling a single relation isn’t something that we have designed the Operator Framework around, as you don’t register for events on just one relation-id, but for all events on that relation endpoint. (E.g., if you do observe(charm.on[db].relation_changed) you will get events for any relation-id.) So while you can give a component the context of a single relation, you would need to do the delegation in other code (or have the event handler do if event.relation.id != self.relation.id: return).
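A minimal sketch of that guard, in plain Python rather than the Operator Framework itself. Endpoint, RelationEvent, and SingleRelationHandler are hypothetical stand-ins. The only assumption carried over from the framework is that every observer of an endpoint sees events for every relation-id.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Relation:
    id: int
    name: str


@dataclass
class RelationEvent:
    relation: Relation


class Endpoint:
    """Stand-in event source: observers receive events for ANY relation-id."""

    def __init__(self) -> None:
        self._observers: List[Callable[[RelationEvent], None]] = []

    def observe(self, handler: Callable[[RelationEvent], None]) -> None:
        self._observers.append(handler)

    def emit(self, event: RelationEvent) -> None:
        for handler in self._observers:
            handler(event)


class SingleRelationHandler:
    """Component that only cares about one relation on the endpoint."""

    def __init__(self, endpoint: Endpoint, relation: Relation) -> None:
        self.relation = relation
        self.seen: List[int] = []
        endpoint.observe(self.on_relation_changed)

    def on_relation_changed(self, event: RelationEvent) -> None:
        # Guard: ignore events that belong to a different relation-id.
        if event.relation.id != self.relation.id:
            return
        self.seen.append(event.relation.id)


db = Endpoint()
first = SingleRelationHandler(db, Relation(1, "db"))
second = SingleRelationHandler(db, Relation(2, "db"))

for relation_id in (1, 2, 1):
    db.emit(RelationEvent(Relation(relation_id, "db")))
```

Each handler is still invoked for every event. The guard only discards the ones it does not own, which is the delegation cost described above.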

I can see an advantage of it, as it does make writing the component easier. (In fact, a lot of components that are actually written will only support a single relation, and will just go into some sort of Blocked condition if they have >1 relation on a given endpoint.)

If we do want to drive further in that direction, we would certainly want to model it in the framework: being able to receive events on a relation-id rather than on a relation endpoint.
The main problem is that events like ‘joined’ signify that a new relation is being established, and unless you have something that listens to db_relation_joined, nothing ever has a chance to instantiate the new relation handler. Obviously you can do that in the charm code. Certainly a pattern could be that only your Charm ever receives events, and your components just take dicts/etc. The pattern that we’re currently driving is that a component handles all aspects of a given Relation: it registers and receives events on that relation, and then drives new events for the Charm. (E.g., a postgresql or mysql component has a database_changed event which is distilled from the relation_joined, relation_changed, and relation_departed events.)
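The component pattern can be sketched in plain Python as follows. DatabaseComponent and its methods are hypothetical names, not part of the Operator Framework, and the events are simplified to carry only a relation id and a data dict.

```python
from typing import Callable, Dict, List

Snapshot = Dict[int, Dict[str, str]]


class DatabaseComponent:
    """Hypothetical component that owns every relation on the 'db' endpoint.

    It receives the low-level relation events and distills them into a
    single higher-level database_changed notification for the charm.
    """

    def __init__(self) -> None:
        self.relations: Snapshot = {}
        self._listeners: List[Callable[[Snapshot], None]] = []

    def on_database_changed(self, listener: Callable[[Snapshot], None]) -> None:
        self._listeners.append(listener)

    # Low-level events, one method per hook.
    def relation_joined(self, relation_id: int) -> None:
        self.relations[relation_id] = {}
        self._emit()

    def relation_changed(self, relation_id: int, data: Dict[str, str]) -> None:
        self.relations.setdefault(relation_id, {}).update(data)
        self._emit()

    def relation_departed(self, relation_id: int) -> None:
        self.relations.pop(relation_id, None)
        self._emit()

    def _emit(self) -> None:
        # Hand listeners a copy so they cannot mutate the component's state.
        snapshot = {rid: dict(d) for rid, d in self.relations.items()}
        for listener in self._listeners:
            listener(snapshot)


history: List[Snapshot] = []
db = DatabaseComponent()
db.on_database_changed(history.append)  # the charm only sees this one event

db.relation_joined(7)
db.relation_changed(7, {"host": "10.0.0.5"})
db.relation_departed(7)
```

Because the component is created before any relation exists, it is already listening when the first joined event arrives, which sidesteps the instantiation problem.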

You could alternatively have one component which you provide that handles relation-joined and proxies the rest of the data for relation-changed et al to other classes.

mmaglana · 10 March 2020 08:47

Thanks for the helpful comments, @jameinel! I’ve converted some of your comments into issues for now:

zicklag · 10 March 2020 16:11

Does Juju enforce this? I can’t find it now, but I was pretty sure the docs stated that the limit wasn’t respected and that this was a low-priority bug. If it works, it would be very useful to my partner, who is currently developing a charm (not with the Operator Framework) and manually adding the logic to go into a blocked state when more than one app is related on a certain endpoint.

mmaglana · 11 March 2020 01:01

@stub I created an issue from your earlier comment in the meantime: https://github.com/relaxdiego/charm-k8s-grafana/issues/4

timClicks · 11 March 2020 02:33

Responding to the meta-question of where this thread should be hosted… I think #charming is probably a better home because we’re discussing the operator framework.

mmaglana · 11 March 2020 02:45

@timClicks thanks! Moved.

stub · 16 March 2020 03:10

Yes, last I heard limit: was not respected and was purely informational.

I’ve spiked with a class that implements the Relation (a single relation, not the endpoint dealing with multiple relations). It puts a lot of burden on the main charm if it wants to enforce constraints like ‘only 1’ and put things into a blocked state, and it is not easy once you start handling all the edge cases. I’m now thinking that a class handling the whole Endpoint is indeed the way to go. The charm can easily instantiate it in __init__, without needing to wait for a relation-joined event before it can know the relation id. The Endpoint implementation observes the relation events, emitting events as required. If the charm needs to enforce relation limits to improve the user experience, that is easier too. So I’m now heading in that direction. Pretty much what @jameinel stated.
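A minimal sketch of that direction, again in plain Python with hypothetical names (Endpoint, Charm, and the status strings are illustrative and are not the Operator Framework API). The endpoint object exists from __init__ onward, tracks every relation, and lets the charm enforce a limit itself, given that limit: in metadata.yaml is only informational.

```python
from typing import List, Optional, Set


class Endpoint:
    """Hypothetical endpoint handler: tracks all relations on one endpoint."""

    def __init__(self, name: str, limit: Optional[int] = None) -> None:
        self.name = name
        self.limit = limit
        self.relation_ids: Set[int] = set()

    def relation_joined(self, relation_id: int) -> None:
        self.relation_ids.add(relation_id)

    def relation_departed(self, relation_id: int) -> None:
        self.relation_ids.discard(relation_id)

    @property
    def over_limit(self) -> bool:
        return self.limit is not None and len(self.relation_ids) > self.limit


class Charm:
    def __init__(self) -> None:
        # Created up front: no need to wait for relation-joined to learn an id.
        self.prometheus = Endpoint("prometheus", limit=1)
        self.status = "active"

    def relation_joined(self, relation_id: int) -> None:
        self.prometheus.relation_joined(relation_id)
        self._update_status()

    def relation_departed(self, relation_id: int) -> None:
        self.prometheus.relation_departed(relation_id)
        self._update_status()

    def _update_status(self) -> None:
        if self.prometheus.over_limit:
            self.status = f"blocked: too many '{self.prometheus.name}' relations"
        else:
            self.status = "active"


charm = Charm()
statuses: List[str] = []

charm.relation_joined(1)
statuses.append(charm.status)
charm.relation_joined(2)      # second relation exceeds limit=1
statuses.append(charm.status)
charm.relation_departed(2)    # back within the limit
statuses.append(charm.status)
```

Since the endpoint sees every relation, recovering from the blocked state when the extra relation goes away is a one-line check rather than coordination between several per-relation objects.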