I believe there have been questions in the past about the best use of leader_set/leader_get, and the documentation (Implementing leadership and Leadership howtos) is also very nice to read. However, I’m looking to better understand how Juju hooks work and to think about scalability.
The end goal I want to achieve is:
1) principal_app:exporter-endpoint <-> remote_subordinate_app
2) physical_node:juju-info <-> remote_subordinate_app:juju-info (places the subordinate on all the physical nodes of an environment)
3) remote_subordinate_app:peer-endpoint <-> remote_subordinate_app:peer-endpoint (the peer relation between the subordinate units)
In step (3), the subordinate app shares peer information that each of the units will use to build the data shared with “principal_app” in step (1). “principal_app” builds all the shared config into a single configuration file (so the file needs to be rebuilt on every change).
The procedure above will:
1.a) trigger “peer-endpoint-relation-joined” every time new peers are added
1.b) trigger “peer-endpoint-relation-changed” because the new peer units will call relation_set to share their details
1.c) trigger “exporter-endpoint-relation-changed” for every subordinate unit. Each unit shares the information it has built from its relation with the other peer units.
2.a) trigger “peer-endpoint-relation-departed” every time peers are removed, which will also trigger “exporter-endpoint-relation-changed” because the subordinate units will update their shared data.
2.b) trigger “exporter-endpoint-relation-departed” for each removed peer
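The peer side of the sequence above can be sketched as plain functions. The relation_set/relation_get wrappers are stand-ins for charmhelpers’ hook tools, injected as parameters here only so the sketch stays self-contained:

```python
def peer_joined(local_details, relation_set):
    # 1.a/1.b: on peer-endpoint-relation-joined, each unit publishes its
    # own details, which triggers peer-endpoint-relation-changed on the
    # other peers.
    relation_set(relation_settings=local_details)


def peer_changed(related_units, relation_get):
    # 1.c: each unit rebuilds its view of all peers; the result would
    # then be republished on the exporter-endpoint relation to the
    # principal.
    return {unit: relation_get(unit=unit) for unit in related_units}
```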
It seems like a better approach would be for “principal_app” to receive a single “exporter-endpoint-relation-changed” instead of N of them, where N is the number of physical nodes. To achieve this, the leader of remote_subordinate_app would be the only one sending data to “principal_app”, while the peer subordinates would share their details via leader_set (which would trigger the “leader-settings-changed” hook).
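A minimal sketch of the leader-only publishing idea (illustrative names, with fakes injected for the hook tools). One caveat worth noting: in Juju only the leader unit may call leader_set, so in practice the non-leader peers would still have to hand their details to the leader, e.g. over the peer relation, before the leader can publish the aggregate:

```python
import json


def maybe_publish(is_leader, aggregate_peers, relation_set):
    # Only the subordinate leader writes to the exporter-endpoint
    # relation, so the principal sees one relation-changed per aggregate
    # update rather than one per subordinate unit.
    if not is_leader():
        return False
    relation_set(relation_settings={
        "peers": json.dumps(aggregate_peers, sort_keys=True),
    })
    return True
```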
A question I have is whether the “-relation-changed” hook identifies which of the units triggered it. Right now, when that hook fires, all the related units are parsed. It would be cheaper to run relation_get only against the unit that triggered it.
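For what it’s worth, Juju does export the remote unit that caused a relation hook to fire as the JUJU_REMOTE_UNIT environment variable (charmhelpers wraps this as hookenv.remote_unit()). A sketch of picking it up, so the hook can relation_get only that unit:

```python
import os


def triggering_unit():
    # Juju sets JUJU_REMOTE_UNIT in the environment of every
    # *-relation-joined/changed/departed hook invocation, naming the
    # remote unit whose data change fired the hook. Returns None when
    # run outside a relation hook.
    return os.environ.get("JUJU_REMOTE_UNIT")
```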
On the other hand, if “leader_set” is used, I think it would also be cheaper to keep a local copy of the data (via the unitdata db) to limit calls to “leader_get”. Otherwise, with say 200 nodes, the leader unit could become a bottleneck with 199 “leader_get” calls.
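A sketch of that caching idea: refresh the local copy only from the leader-settings-changed hook, and have every other hook read the store. The store and leader_get here are stand-ins (e.g. a dict, or charmhelpers’ unitdata.kv() and hookenv.leader_get in a real charm):

```python
import json


class LeaderCache:
    """Illustrative cache of leader data in a local key/value store,
    refreshed only when leader-settings-changed actually fires."""

    def __init__(self, leader_get, store):
        self._leader_get = leader_get  # stand-in for hookenv.leader_get
        self._store = store            # stand-in for unitdata.kv()

    def refresh(self):
        # Call this from leader-settings-changed: one leader_get per
        # change, instead of one per hook per unit.
        self._store["leader.cache"] = json.dumps(self._leader_get())

    def get(self):
        # All other hooks read the local copy, avoiding leader_get.
        raw = self._store.get("leader.cache")
        return json.loads(raw) if raw else {}
```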
What do you think about it? In case it helps, the use case above is prometheus using the http interface to receive data from the “prometheus-blackbox-exporter” service running on each compute-storage node of a cloud. The blackbox exporter uses the “icmp” module to probe that all its peers are reachable (and all its peers do the same, so any config change requires rebuilding the 1-N relation for pings). Note that each node may have a different networking setup, so all the available networks are shared with the peers so they can decide which ones can be used to test pings (some may configure 2-3 different ping probes, others may only be able to configure one; this is to say that the 1-N relation needs to be built by each peer subordinate unit).
FWIW, a disadvantage in Prometheus is that only a single prometheus.yml file can be configured for scrape_configs. A different use case is Nagios and the nrpe-charm subordinates. The current nagios-charm builds every config into a single charm.cfg file, but Nagios allows multiple config files, so each nrpe unit could create its own file (fewer disk writes, as charmhelpers’ host.write_file checks the content before it writes). In this Nagios case, it would very much help if the triggered -relation-changed identified the unit that triggered it.
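The per-unit-file idea can be sketched as a write-if-changed helper (the path scheme and function name are made up for illustration; the content check mirrors the idea mentioned above):

```python
import os


def write_unit_cfg(unit_name, content, conf_dir):
    # Hypothetical: one config file per nrpe unit, e.g. nrpe/0 ->
    # nrpe-0.cfg, skipping the disk write when the content is unchanged.
    path = os.path.join(conf_dir, unit_name.replace("/", "-") + ".cfg")
    if os.path.exists(path):
        with open(path) as f:
            if f.read() == content:
                return False  # unchanged, no write needed
    with open(path, "w") as f:
        f.write(content)
    return True
```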
Sorry for the long post, and let me know if there’s anything I may have left out that’s needed to understand the use cases. The fear is what happens when 200 subordinate peers exist, and which would be less time (and resource) consuming to process: 1) all subordinate changes at once, or 2) a single subordinate change. For (1), it seems leadership would be best. For (2), I’m not so sure.