Racy Tests in ProvisionerSuite


#1

I came across this one a couple of times today in different places, and also tripped it once locally:
https://bugs.launchpad.net/juju/+bug/1790091

The CI race job is also has failures in the same suite, across the branches. For example:
http://10.125.0.203:8080/view/Unit%20tests/job/RunUnittests-race-amd64/690/testReport/junit/github/com_juju_juju_worker_provisioner/TestPackage/

A cursory look at the code shows that these tests are racy. We are starting a worker that we hope is done before we start making assertions on data that it writes.

There are multiple approaches in the code-base to syncing on worker events, but it feels to me like we add excessive test-only logic to accommodate these quirks, and that it chips at the edges instead of being declarative and forthright. Is it worth working towards a uniform approach here?

Perhaps some of the suites can be modified to use mocks instead of spinning up Mongos.

I have some notions on refining the approach, but I think what I have done for the upgrade-series worker is worth looking at - it is a stab at being declarative about how the the test will behave when creating the worker, passing “behaviours”.
https://github.com/juju/juju/blob/develop/worker/upgradeseries/worker_test.go

One of these behaviours is how many invocations of loop watchers will be triggered. There is a test-level synchronisation on the provision of the notifications, which is working well in the tests.

I am hoping to play around with some more ideas prior to the Brussels engineering sprint, but if anyone has opinions on that code, I’d welcome the discussion.


#2

I proposed a patch to the upgrade-series worker tests linked above.

It demonstrates further leaning on the “behaviour” idea to do compose groups of mock expectations into more than one scenario. It is a little contrived in this case, but the idea is there.

https://github.com/juju/juju/pull/9155