WIP 19.04 proposal: model generations

rick_h · 2 October 2018 15:31

Users need to be able to roll changes out to applications in a safe, guided process that controls the flow so that not all units of an HA application are hit at once. This also allows some manual canary testing and provides control over the flow of changes out to the model.

We propose to allow operators to manage this using model generations.

What is a model generation?

A model generation can refer to the current generation (definition) of the model or a next generation that is being defined and rolled out with control.

Basic use case walk through

A typical use case would be to update a charm revision, update the config for the new charm revision, and to roll that out such that you can validate it behaves correctly before rolling that change to all units. In our example let’s update Keystone.

$ juju add-generation
target generation set to next
$ juju upgrade-charm keystone
Added charm "cs:keystone-285" to the model.
$ juju config keystone debug=true

With those changes now awaiting rollout to the model in the next generation, the user can push them out to units selectively.

$ juju set-generation keystone/0

Additional changes can be made to the next generation.

$ juju config ceph-mon loglevel=10

Note: this change will be visible to keystone/0 immediately, but not to the other units.

And these can be pushed out at the application level as well. The user may also specify multiple targets to the set-generation command.

$ juju set-generation ceph-mon keystone/1 keystone/2

Once the changes in the next generation are all pushed out, the generation is deemed to be live and the target generation becomes current.

$ juju set-generation keystone/3
all changes in next generation complete, target generation set to current
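The walkthrough above can be sketched as a small state model. This is only an illustration of the proposed semantics, not Juju's implementation: the `Model` class, its fields, and its method names are hypothetical, and only config changes are modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    """Toy model (hypothetical): applications map to their unit names."""
    units: dict[str, list[str]]
    # Config changes queued in the next generation, keyed by application.
    pending: dict[str, dict[str, str]] = field(default_factory=dict)
    # Units that have been advanced to the next generation.
    on_next: set[str] = field(default_factory=set)
    target: str = "current"

    def add_generation(self) -> None:
        self.target = "next"

    def config(self, app: str, **values: str) -> None:
        # Queue the change in the next generation instead of applying it live.
        self.pending.setdefault(app, {}).update(values)

    def set_generation(self, *targets: str) -> bool:
        """Advance units or whole applications; return True once rollout is complete."""
        for name in targets:
            # An application name expands to all of its units.
            self.on_next.update(self.units[name] if name in self.units else [name])
        done = all(
            unit in self.on_next
            for app in self.pending
            for unit in self.units[app]
        )
        if done:
            # Every unit of every changed application is on next: it becomes current.
            self.pending.clear()
            self.on_next.clear()
            self.target = "current"
        return done
```

With four keystone units and one ceph-mon unit, `set_generation` keeps returning `False` until the last outstanding unit is advanced, at which point the target flips back to `current`.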

Which model operations are tracked via generations?

Not all changes to the model are tracked in this way.

upgrade-charm
config
attach

These are the operations that can be pinned to a future generation. This might change over time, but anything outside of those operations is considered out of scope for this specification.

Targeting options for pushing to the next generation

The set-generation command takes one or more model targets which may consist of application or unit names. In this way you might update a few related applications, update a select unit of each, and validate your changes are solid.

$ juju add-generation
target generation set to next
$ juju upgrade-charm keystone
Added charm "cs:keystone-285" to the model, queued in next generation.
$ juju upgrade-charm percona-cluster
Added charm "cs:percona-cluster-270" to the model, queued in next generation.
$ juju set-generation keystone/0 percona-cluster/0

*manual validation of the updated charms here*

The user can then push out to the rest of the units by just using the application as the target.

$ juju set-generation keystone percona-cluster
all changes in next generation complete, target generation set to current

Making multiple changes to the next generation

If you run set-generation on a unit and later make another change that affects that unit, the change is sent out immediately. The idea is that once you run set-generation, that unit is tracking the next generation and receives live updates.

$ juju add-generation
target generation set to next
$ juju config ceph-mon loglevel=1
$ juju set-generation ceph-mon/0
$ juju config ceph-mon loglevel=10

At this point the ceph-mon/0 unit will receive a second config-changed event with the loglevel value of 10.
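The live-update behaviour can be sketched as follows. This is a toy model under assumed names (`NextGeneration`, `events`), not how Juju delivers hook events: a unit that already tracks next receives every later change, while its siblings receive nothing.

```python
from collections import defaultdict


class NextGeneration:
    """Hypothetical sketch of units tracking the next generation."""

    def __init__(self, units):
        self.units = units                # application -> list of unit names
        self.config = defaultdict(dict)   # application -> queued config
        self.tracking = set()             # units advanced to next
        self.events = defaultdict(list)   # unit -> (key, value) pairs delivered

    def set_config(self, app, key, value):
        self.config[app][key] = value
        # Units already tracking next see the change straight away.
        for unit in self.units[app]:
            if unit in self.tracking:
                self.events[unit].append((key, value))

    def set_generation(self, unit):
        app = unit.split("/")[0]
        self.tracking.add(unit)
        # On joining next, the unit receives everything queued so far.
        for key, value in self.config[app].items():
            self.events[unit].append((key, value))
```

Replaying the commands above against this sketch leaves ceph-mon/0 with two deliveries (loglevel 1, then loglevel 10) and any other ceph-mon unit with none.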

Making changes that occur immediately and are not related to future generations

Users may switch between the current and next generations.

$ juju switch-generation current
target generation set to current

And now changes go out live as they normally would without any need to push them selectively.

$ juju config ceph-mon loglevel=1

Commands can also be scripted to target generations specifically.

$ juju config --generation=current ceph-mon loglevel=1

Cancelling a next generation

If a next generation is not going to be rolled out you can cancel it using the command cancel-generation. This will abort anything in the next generation and return the active target to the current generation.

$ juju add-generation
target generation set to next
$ juju config ceph-mon loglevel=1
$ juju cancel-generation
changes dropped and target generation set to current

Note that you cannot cancel a generation with changes that are partially pushed out to only select units. You’ll need to set a value and push to the units involved so that there’s consistency across the units. You can cancel changes that have not been pushed out. Take the following example:

$ juju add-generation
target generation set to next
$ juju config keystone debug=true
$ juju config ceph-mon loglevel=1
$ juju set-generation keystone/0
$ juju cancel-generation
ERROR: cannot cancel generation, there are units behind a generation: keystone/1, keystone/2
$ juju set-generation keystone/1 keystone/2

At this point all of the Keystone units have the new config but the ceph-mon change has not been pushed to any units. Since they are consistent we can cancel the changes and return to the current generation.

$ juju cancel-generation
changes dropped and target generation set to current
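The consistency rule for cancelling can be sketched as a check over each changed application: cancelling is blocked only when some, but not all, of an application's units have been pushed. The function name and arguments below are hypothetical and only illustrate the rule as described.

```python
def units_blocking_cancel(units, changed_apps, on_next):
    """Return the units that are behind in any partially pushed application.

    units:        application name -> list of unit names
    changed_apps: applications with changes in the next generation
    on_next:      set of units already advanced to the next generation
    """
    behind = []
    for app in changed_apps:
        pushed = [u for u in units[app] if u in on_next]
        # Fully pushed or entirely untouched applications are consistent.
        if pushed and len(pushed) < len(units[app]):
            behind.extend(u for u in units[app] if u not in on_next)
    return behind
```

An empty result means the generation can be cancelled; in the example above the first attempt would report keystone/1 and keystone/2, and the second would report nothing.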

Tracking units not on the up to date generation

When a new generation is added and changes come in, status will indicate which units are behind a generation. These units need a set-generation command to get the latest details and become up to date.

$ juju add-generation
target generation set to next
$ juju config keystone debug=true
$ juju config percona-cluster tuning-level=safest
$ juju set-generation keystone/0 percona-cluster/0
$ juju status
Model    Controller  Cloud/Region     Version  SLA          Timestamp
default  jujudemo    google/us-east1  2.4.2    unsupported  13:09:59-04:00

Unit                Workload     Agent      Machine  Public address  Ports     Message
keystone/0*         blocked      executing  0        35.229.100.32   5000/tcp  Missing relations: database
keystone/1          blocked      executing  1        104.196.99.233  5000/tcp  [old] Missing relations: database
keystone/2          blocked      executing  2        35.227.24.239   5000/tcp  [old](start) Missing relations: database
percona-cluster/0*  active       idle       3        35.227.60.108   3306/tcp  Unit is ready
percona-cluster/1   maintenance  executing  4        35.237.241.58             [old] Waiting 30 seconds for operation ...
percona-cluster/2   maintenance  executing  5        35.231.140.154            [old] Waiting 60 seconds for operation ...
...

Note that the units on the previous generation are marked [old]. These are the targets that need to be updated before the transition to the next generation is complete. The indication appears in the unit Message field regardless of the kind of change made: a config change, charm upgrade, or resource change is indicated in the same way. The details of what is different can be investigated using the diff-generation command described below.
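The marking rule can be sketched as a small helper. The function and its arguments are hypothetical; it assumes a unit is marked only when its application has changes in the next generation and the unit has not yet been advanced.

```python
def status_message(unit, message, changed_apps, on_next):
    """Prefix the workload message with [old] for units behind a generation."""
    app = unit.split("/")[0]
    if app in changed_apps and unit not in on_next:
        return f"[old] {message}"
    return message
```

Applied to the status output above, keystone/0 keeps its plain message while keystone/1 gets the [old] prefix.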

Showing the difference between the current and next generations

Users can see what changes have been made to the next generation using the diff-generation command. It displays only the changes made since the generation was added. It does not indicate which changes have been pushed, only the changes themselves.

$ juju add-generation
target generation set to next
$ juju config keystone debug=true
$ juju config ceph-mon loglevel=1
$ juju diff-generation
applications: 
  keystone:
    config:
      debug: true
  ceph-mon:
    config:
      loglevel: 1

<note: this output format should look like an overlay or something that might be in common with diff-bundle work and such>
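As a rough illustration of the nested layout shown above, the following sketch renders a mapping of queued config changes. The function name and input shape are assumptions, and the final output format is still open per the note.

```python
def diff_generation(changes):
    """Render {application: {key: value}} in the nested layout shown above."""
    lines = ["applications:"]
    for app, config in changes.items():
        lines.append(f"  {app}:")
        lines.append("    config:")
        for key, value in config.items():
            lines.append(f"      {key}: {value}")
    return "\n".join(lines)


print(diff_generation({"keystone": {"debug": "true"}, "ceph-mon": {"loglevel": 1}}))
```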

jameinel · 4 October 2018 12:15

Note: I made a couple of small typo fixes (not awaiting => now awaiting)

juju push-generation keystone/0 does not read clearly to me. I believe this is “set the generation of keystone/0 to ‘next’” but I don’t feel like the command clearly indicates that.

With those changes not awaiting rollout to the model in the next generation the user can push it out to units selectively.

Did you intend for the manual validation to be a hidden comment? It seems that markdown defaults to hiding it from the HTML view.

rick_h:

**<note: I personally started with abort-generation but I'd prefer to avoid adding a new word to the Juju vocabulary when we already have *cancel* and the phrase doesn't feel quite comfortable>**

I’m personally happy with cancel-generation. I’m not sure that I’m happy with the idea that in order to ‘cancel’ a generation, you first have to push it to all units of a given application. That feels against the idea of testing something out and then wanting to cancel it.

Side note: I personally find commenting on a Google Doc a much easier way to collaborate on things like this. Being able to suggest changes without doing them inline, and being able to attach comments rather than having to reply and quote some of the text. I suppose I can adapt, though. I do like having it here for long term documentation purposes.

rick_h · 4 October 2018 12:54

Hmm, in my head it was “push the changes in next out to this unit”. I don’t want to use set as that’s used in the set-generation call for focusing on current vs next. Maybe update?

I almost like, except I don’t like how next isn’t a verb, juju next-generation keystone/0

No, this was just me getting messy with the formatting. Thanks for the catch.

I agree. I like having the formatting options here, especially for the code format, however the discussion bit isn’t as direct and easy. I’m trying it out and seeing how I can best handle it, but it has room for improvement.

rick_h · 9 October 2018 11:44

With cancel we need a way of showing the diff with things that are and are not applied yet…

who’s on next and who’s yet to be applied for things…but how does that output scale to cases with hundreds of units?

should set-generation get into helping show things left to be done/updated?

thumper · 11 October 2018 01:07

I agree with John a lot on the problems of commenting on the spec with discourse. I do feel that google docs is still a better medium for the early phase discussions around new work.

I have many comments that I’d like to make on this doc, but just now I’m going to limit myself to two.

Firstly, there is no way to back-out the changes. There has to be.

Relatedly, I find this quite awkward:

$ juju set-generation keystone/0 percona-cluster/0

perhaps we have instead:

$ juju set-generation next keystone/0 percona-cluster/0
# which allows for...
$ juju set-generation current keystone/0
# to roll back keystone/0 to the current generation

set-generation by itself feels incomplete.

michele-mancioppi · 2 August 2021 10:57

Generations, with the addition of “Who did it” and “Why”, would actually solve the provenance problem for monitoring in Juju: when a system is complex enough, the reason why settings like alert rules are in place is lost to the sands of time and fallible human memory. This is a known problem in the Bootstack team, so they started to specify alert rules in a separate Git repository, which is fetched by the prometheus charm, so that they could leverage git blame to explain why an alert rule exists in its current form by looking at the commit messages. If we recorded in a generation why changes are applied and by whom, we would solve the Bootstack use case without stepping out of the Juju model.

Now, in terms of solutionizing, this also means we need to add the “why” as an option provided to juju config and other commands that accept configurations (like juju deploy). The who, AFAICT, comes for free with the Juju user, although to be effective it will require more fine-grained user setups than what people may do with the “one admin user, shared by all administrators” approach, but that’s a different issue.

manadart · 19 August 2021 15:26

Since this proposal, some of the semantics changed so that generations are referred to as branches.

Run:

export JUJU_DEV_FEATURE_FLAGS=branches
juju help add-branch

The associated commands are listed under the “See also” section.

For config, this is completed and operational. The work to bring charm and resource versions within the scope of branches was pushed out on account of the switch to CharmHub.

I expect it to be on a future road-map.

michele-mancioppi · 23 August 2021 12:46

@manadart, to the best of my understanding, with respect to the use case I mentioned, we currently lack a way to document why a change is performed, e.g. by means of a message passed to juju commit.

Also, we should change the default branch name to main and implement it for container substrates.