Upgrade steps - keeping the DB in sync with code

thumper · 27 March 2019 01:42

Juju is not a trivial system. As we add functionality, and with it more documents and collections in the database, we need to ensure that existing systems, once upgraded, have the new documents and fields as if they had been created with the new code.

Many changes to existing fields can be made in such a way that, if the value is missing, the empty Go value is both valid and the default. In those cases we don’t actually need an upgrade step.
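As a minimal sketch of that idea, the example below decodes a document that was written before a new field existed. It uses encoding/json from the standard library purely as a stand-in for the BSON decoding done through mgo, and the exampleDoc type and its field names are hypothetical, not taken from the state package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// exampleDoc is a hypothetical document type, not one from the state package.
// ForceDestroyed plays the role of a field added in a later release.
type exampleDoc struct {
	ID             string `json:"_id"`
	Name           string `json:"name"`
	ForceDestroyed bool   `json:"force-destroyed,omitempty"`
}

// decodeExampleDoc decodes a stored document into an exampleDoc.
func decodeExampleDoc(raw string) (exampleDoc, error) {
	var doc exampleDoc
	err := json.Unmarshal([]byte(raw), &doc)
	return doc, err
}

func main() {
	// A document written by an older version has no "force-destroyed" key.
	old, err := decodeExampleDoc(`{"_id": "uuid:app", "name": "app"}`)
	if err != nil {
		panic(err)
	}
	// The missing field decodes to the zero value, false, which is also
	// the intended default, so no upgrade step is needed for it.
	fmt.Println(old.ForceDestroyed)
}
```

The design point is that the zero value has to mean "the default". If false, 0 or the empty string would be a wrong or ambiguous value for old entities, an explicit upgrade step is the safer choice.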

A common example of adding documents is adding additional settings or status types. These are added in the code to the creation ops and the remove ops of the entity they belong to. Most of the code then expects these documents to be there. Anything created with the current code, such as the entities in tests, will have the new documents and fields. However, models and other entities that existed in an earlier version don’t get these new documents unless we add explicit upgrade steps for them.

The upgrades package contains the definition of the various upgrade steps that need to be executed for any particular version. The database changes are generally kept in state/upgrades.go, along with state/upgrades_test.go.

It seems that we have lost a bunch of institutional knowledge around direct database access and upgrade steps.

runForAllModelStates

// runForAllModelStates will run runner function for every model passing a state
// for that model.
func runForAllModelStates(pool *StatePool, runner func(st *State) error) error {
   ...
}

Some upgrade steps are much easier to deal with when you have an actual State instance for the models that you are looking at. The runForAllModelStates is a helper function that allows just that. This is found at the top of the state/upgrades.go file.

A benefit of using this is that it allows you to think in terms of a single model, and to use the model scoped database functions like GetCollection.
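The shape of that helper can be sketched without the real state package. Everything below is a hypothetical stand-in: fakeState replaces *State, a plain slice replaces the StatePool, and runForAllFakeStates only mirrors the calling pattern, not the real implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeState is a hypothetical stand-in for *State; it only knows its model UUID.
type fakeState struct {
	modelUUID string
}

// errMissingUUID is returned by requireModelUUID for a state with no UUID.
var errMissingUUID = errors.New("missing model UUID")

// runForAllFakeStates calls runner once per model state and stops at the
// first error. It mirrors only the shape of runForAllModelStates.
func runForAllFakeStates(states []*fakeState, runner func(st *fakeState) error) error {
	for _, st := range states {
		if err := runner(st); err != nil {
			return fmt.Errorf("model %q: %w", st.modelUUID, err)
		}
	}
	return nil
}

// requireModelUUID is an example runner. It only has to reason about the
// single model it is handed.
func requireModelUUID(st *fakeState) error {
	if st.modelUUID == "" {
		return errMissingUUID
	}
	return nil
}

func main() {
	states := []*fakeState{{modelUUID: "model-a"}, {modelUUID: "model-b"}}
	if err := runForAllFakeStates(states, requireModelUUID); err != nil {
		panic(err)
	}
	fmt.Println("all models processed")
}
```

The runner never loops over models itself, which is what keeps the per-model logic small and easy to test.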

db.GetRawCollection vs db.GetCollection

In the state package we have an abstraction that sits on top of the mongo collections provided by mgo.

The GetCollection method returns a smart collection that is scoped to a particular model (unless the collection is global). This means that it automatically prefixes the model-uuid + ":" to the start of the IDs, and adds the model-uuid field to the document.

When using GetRawCollection for selects or inserts, or when using raw transactions in general, this checking and prefixing needs to be done manually.

Many of the functions used to generate IDs for the status and settings collections have “global” in their names, but this isn’t a good word. I think it dates from the time before multiple models were stored in the database. The IDs these functions generate are only part of the fully scoped “_id” used by mongo.

localID := machineGlobalModificationKey(machine.Id)
docID := ensureModelUUID(machine.ModelUUID, localID)

The ensureModelUUID function does pretty much what you’d think. It makes sure that the specified model UUID prefix exists on the specified ID string.
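A minimal sketch of the behaviour described, assuming the prefix is simply the model UUID followed by ":". The function name ensureModelUUIDSketch and the sample values are made up for illustration; check the state package for the real implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// ensureModelUUIDSketch is an illustrative stand-in for ensureModelUUID.
// It returns id unchanged if it already starts with "<modelUUID>:",
// otherwise it adds that prefix.
func ensureModelUUIDSketch(modelUUID, id string) string {
	prefix := modelUUID + ":"
	if strings.HasPrefix(id, prefix) {
		return id
	}
	return prefix + id
}

func main() {
	// Sample values only; a real model UUID is a full UUID string.
	const modelUUID = "model-uuid"
	localID := "local-id"

	docID := ensureModelUUIDSketch(modelUUID, localID)
	fmt.Println(docID)
	// Applying it again leaves the ID alone, so it is safe to call on
	// IDs that may or may not already be scoped.
	fmt.Println(ensureModelUUIDSketch(modelUUID, docID))
}
```

This is the prefixing you have to remember to do yourself when working with a collection from GetRawCollection.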

Idempotency

There is always the possibility of upgrade steps getting interrupted for whatever reason. A general rule we have is that the upgrade steps need to be able to be run multiple times without error.

What this means is that the upgrade step needs to take into account the possibility that some of the changes that it is adding may have been applied already.

There are now many good examples of this. These are born out of many earlier poor examples.
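To make the rule concrete, here is a minimal sketch of an idempotent step. It uses an in-memory map as a hypothetical stand-in for a mongo collection; fakeCollection, addMissingStatusDocs and the document fields are invented for illustration and are not the real upgrade code.

```go
package main

import "fmt"

// fakeCollection is a hypothetical in-memory stand-in for a mongo
// collection, keyed by the document "_id".
type fakeCollection map[string]map[string]interface{}

// addMissingStatusDocs inserts a default status document for every machine
// that does not have one yet, and returns how many documents it inserted.
// Existing documents are left untouched, so a second run is a no-op.
func addMissingStatusDocs(statuses fakeCollection, modelUUID string, machineIDs []string) int {
	inserted := 0
	for _, id := range machineIDs {
		docID := modelUUID + ":" + id
		if _, ok := statuses[docID]; ok {
			// Already applied, perhaps by an earlier interrupted run.
			continue
		}
		statuses[docID] = map[string]interface{}{
			"_id":        docID,
			"model-uuid": modelUUID,
			"status":     "unknown",
		}
		inserted++
	}
	return inserted
}

func main() {
	statuses := fakeCollection{}
	machines := []string{"0", "1"}

	fmt.Println(addMissingStatusDocs(statuses, "model-uuid", machines)) // first run inserts both
	fmt.Println(addMissingStatusDocs(statuses, "model-uuid", machines)) // second run inserts nothing
}
```

The important part is the existence check before each insert. A step that inserts unconditionally would fail, or duplicate data, when it is run again after a partial run.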

Beware factory functions in tests

The testing Factory is extremely useful for creating many entities for tests. Not so for the migration tests. The factory creates entities with the new code, so the database objects won’t be in the state they would be in on a system that is being upgraded. Often you need to manually create documents that exhibit the structure and values you need to test.

However, be careful that you are still creating database entities that closely match reality. This means:

The _id field should be prefixed with the model-uuid like normal
Remember that core.life values are different to state.Life values
Insert partial real docs, not just bson.M or bson.D as this will enforce correctness with types

jameinel · 28 March 2019 04:19

This seems like a good place to also mention commands like “juju dump-db”, which can give you the raw content of the database in YAML form for comparison. (e.g. bootstrap 2.4, deploy a bundle, upgrade to 2.5, dump-db. Then bootstrap 2.5, deploy a bundle, dump-db, and compare whether the final documents actually match where it matters.)
Maybe just this short message will at least let people know that they can dig further for more information.

thumper · 28 March 2019 22:22

For my tests it was even a bit simpler, avoiding the second bootstrap.

deploy older controller
deploy some stuff in the default model
juju export-bundle > bundle.yaml
upgrade controller
juju add-model new-version
juju deploy ./bundle.yaml
juju dump-db -m default > old.yaml
juju dump-db -m new-version > new.yaml
diff old.yaml new.yaml using meld or kdiff3

You need developer-mode as a value in the JUJU_DEV_FEATURE_FLAGS environment variable for the dump-db command to be available.