Upgrade Step before API server is running

jameinel · 7 October 2019 07:47

Joe brought up an interesting problem as he is trying to put together the upgrade steps for the new space ids changes.

The particular issue is that the API server is failing to start, because it wants to populate the model cache, but that fails because we haven’t applied the changes to the database schema yet. There are hacks we can put in place (handle both the old and new format data in the model cache logic).

But really what we want is a way to run database schema changes before we need the API server running. I thought we had that functionality in the past.
Looking at the code, we have stuff like “Upgrade.Target” which can be set to DatabaseMaster which gives access to the database level “State” object.
However, UpgradeStepsWorker explicitly depends on APICallerName which establishes a connection.

Almost all of our actual upgrade steps that I see are Target(DatabaseMaster), and don’t actually do anything with the API connection. I thought we had this problem in the past and did have an answer for it. And the Target(DatabaseMaster) code in upgrades/upgrade.go PerformUpgrade does look like it is trying to pull of a StateContext vs an APIContext. But we connect to the API way before that. So while it looks like it might be lazy and not connect to the API until we have an upgrade step that needs it, it is instead done very early before any upgrade step is done.

Depedency Engine also states we can’t evaluate our dependencies outside of our Start function, so it isn’t that we could still depend on APICallerName and just not evaluate it until we know that we actually need it.

Thoughts @wallyworld @thumper?

jameinel · 7 October 2019 08:02

Another thought came up. While we might occasionally have upgrade steps that are run on non-controller machines/units, those are quite rare. Wouldn’t it be better to have a ControllerUpgradeWorker that doesn’t get started on things that aren’t controllers?

thumper · 8 October 2019 01:02

I’ve been thinking about this since @rick_h brought this up this morning.

I do think the simplest solution here is to split the upgrade steps. So yes, adding a specific controller upgrade worker for the purpose of updating the database. This would then only need to depend on the state and not the api server / api caller.

There is logic inside the upgrade runner just now for controller synchronisation. I think this would only need to be in the controller upgrader because anything that updates the DB needs the synchronisation, but if you are only dealing with “local” data or the API, then I don’t think the synchronisation is needed between the controllers.

I did think that an alternative approach may be that the all watcher just doesn’t emit addresses or spaces that don’t match the expected storage format on the knowledge that the upgrade step will rewrite them all, but this feels more fragile IMO.

I feel that the “break the existing upgrade-runner worker into two” is the easier, more robust, and more correct way forward.

wallyworld · 8 October 2019 01:10

We used to have non-controller upgrade steps which I think were done by a different worker in the unit agent. That was all removed after 1.25 when we went to Juju 2.0 I think. So reintroducing such an approach seems sensible.