Is the Reactive Framework Making Juju Slow? ⏱ - My Experiences With Juju So Far

charming

#1

Hello everybody,

In this post I want to go over a bit of my experience with Juju and ask some questions.


Brilliant!, Brilliant!, Brilliant! :tada:

I found Juju recently and have since been doing some experimentation with it and have started on writing a basic charm. The first thing that I will say is that the idea is brilliant. I have experience with Chef, Docker, Packer, Terraform, Rancher, and Docker Swarm and I’ve told my partner before that I felt like there still needed to be a tool that facilitates communication between the applications that are being provisioned. Juju is that tool. Juju allows for a level of orchestration that practically allows you to create your own “orchestrator” that manages your applications in a way that Kubernetes or Swarm would never let you. It is a GAME CHANGER.

The Problem :neutral_face:

Anyway, as I started to test Juju a little bit, one of the first things that I noticed was that it seems to be somewhat unresponsive or slow when it comes to building out the application stacks. For instance, even with a simple app consisting of an app charm and a Postgresql Charm when I make a relation between that app and Postgres it can take a couple minutes before the app even responds to the new relation. Also, spinning up a Kubernetes cluster can take about an hour.

To clarify, I understand that the charms are doing a lot: they have to download, install, and configure the applications, but what seems to be taking a lot of time is not the installation or configuration of the apps, but the interaction between the apps over the relations. I’m not 100% sure this is the case yet, but I wanted to bring it up and see what other people’s experiences have been with it.

Initially, I just accepted the fact that things were going to be slower to spin up in Juju than they would be in something like Docker Swarm. It is a different environment and a different kind of way to have the apps interact. My problem came, though, when I started to try to write my own charm, and it started taking a long time to debug the charm because I would end up just waiting for the system to do something after I made a change, and I didn’t know what it was waiting for.

The charm I was trying to make was a super simple Docker charm that would run the CodiMD Docker container and connect to the Postgresql charm for the database. As I made deployments of my charm and looked at the juju debug-log I noticed that most of the time that the Postgresql charm spent doing stuff was actually spent printing huge stacks of logs like this:

tracer: starting handler dispatch, 166 flags set
tracer: set flag apt.installed.postgresql-10
tracer: set flag apt.installed.postgresql-client-10
tracer: set flag apt.installed.postgresql-client-common
...

This seems to be where Juju is spending a lot of its time. To me it looks like the reactive framework is slowing Juju down, maybe even slowing it down drastically. My suspicion is that the hooks are firing relatively quickly, but the reactive framework is limiting the speed that the layers can actually do their work with all of the flags that it is setting.
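To make the flags-and-handlers idea concrete, here is a toy model of flag-driven dispatch. It is only a sketch of the general pattern, not how charms.reactive is implemented: `ToyDispatcher`, `when`, `set_flag`, and the flag names are all made up for illustration. Handlers declare which flags they need, and the dispatcher keeps making passes until a pass sets no new flags.

```python
from typing import Callable, Iterable, List, Set, Tuple


class ToyDispatcher:
    """Toy model only; none of these names come from charms.reactive."""

    def __init__(self) -> None:
        self.flags: Set[str] = set()
        self.log: List[str] = []
        # Each entry: (flags required, flags that block the handler, handler)
        self.handlers: List[Tuple[Set[str], Set[str], Callable]] = []

    def when(self, required: Iterable[str], unless: Iterable[str] = ()):
        """Register a handler that runs when `required` is set and `unless` is not."""
        def register(fn: Callable) -> Callable:
            self.handlers.append((set(required), set(unless), fn))
            return fn
        return register

    def set_flag(self, flag: str) -> None:
        if flag not in self.flags:
            self.flags.add(flag)
            self.log.append(f"set flag {flag}")

    def dispatch(self, max_passes: int = 100) -> int:
        """Run matching handlers until a full pass changes no flags."""
        passes = 0
        for _ in range(max_passes):
            passes += 1
            before = set(self.flags)
            for required, unless, fn in self.handlers:
                if required <= self.flags and not (unless & self.flags):
                    fn(self)
            if self.flags == before:
                break
        return passes


d = ToyDispatcher()


@d.when([], unless=["app.installed"])
def install(disp: ToyDispatcher) -> None:
    disp.set_flag("app.installed")


@d.when(["app.installed", "db.connected"], unless=["app.configured"])
def configure(disp: ToyDispatcher) -> None:
    disp.set_flag("app.configured")


d.set_flag("db.connected")  # pretend a relation hook just fired
passes = d.dispatch()
```

Even in this toy version, the number of passes depends on the order handlers are registered in, so a handler whose flags are set late may not run until a later pass.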

I’m not sure if Python itself is the bottleneck, or if it is just an inefficient implementation of the Reactive framework, or if I’m missing the source of the delays entirely, but I’d love to get some feedback.

In order to test whether or not it is the reactive framework slowing things down, I’m going to do some testing with using just the normal hooks, skipping the reactive system, to see if there is a big difference in the responsiveness. I haven’t done any tests yet but I’ll post any further experience here once I do.

Idea! :bulb:

As long as the Juju hooks do run in a timely fashion without delays I was entertaining the idea of creating a Rust framework for writing charms that would serve essentially the same purpose as the existing Python reactive framework except it would be written in Rust and use Cargo for package management instead of the existing charm layers.

If the reactive framework is the performance bottleneck, then writing the framework in efficient Rust code should solve that. A lot of people might not want to write Rust, which is totally understandable, but Rust is by far my favorite language and I think it could work well for me and my team. Also, because the way that you interact with relations in Juju stays the same, charms written with a Rust reactive framework would be 100% compatible with charms written with plain hooks or with the Python reactive framework. You could, if there was value in it, even create Python bindings to the Rust reactive framework to get the extra performance but still be able to use the familiar language.

What I love about Juju is the fact that I’m free to design my own charm building framework if I need/want to.

Summary

These are just my thoughts after starting out with Juju and I would like to get feedback on whether or not I’m completely misinterpreting what Juju is doing or whether or not that is just a limitation of the system somehow. My Rust idea is mostly brainstorming and may or may not make sense yet, but I figured I’d throw it out there in case anybody had any thoughts.

I’m really loving Juju so far and I think that me and my team will be able to do some pretty awesome stuff with it. Can’t wait to see what we can accomplish.


#2

Absolutely love this enthusiasm @zicklag.

I think that you’re right that Reactive Charming is slow. Flags being set everywhere doesn’t help. There are also some other issues. Because they’re implemented as independent packages, every layer from every charm calls apt update and apt upgrade. This can make things painful if you’re not deploying to somewhere with a fast apt mirror.

RIIR? Not sure. Don’t get me wrong, I love Rust (otherwise I wouldn’t be publishing a book on it). The problem is that what’s really needed is a simple mechanism for charming. Saying “Welcome to Juju, please learn Rust to get started” won’t fly, I think. Yet, writing charms is too difficult currently. In a sense, the space for innovation here is wide open. If you can create something that makes it easy for people to write charms and that gains adoption, then give it a go.

I’ve been wondering whether charming could be more like web frameworks. There’s no single way to write a web app, perhaps there should be multiple ways to write charms? As long as the interfaces and endpoints are well documented - the internal charming details should be irrelevant.

Once upon a time long ago—and you’ll see some of our older docs reflect this—charms were written in “any language”. That caused lots of confusion because people didn’t really know what Juju was (is it just calling shell scripts?) and spurred a generation of charms that were poorly implemented and difficult to maintain.

So there’s a real tension between creating your own charming framework for yourself vs creating a framework for a whole community.

Some others have explored this area and/or have some thoughts:


#3

@zicklag thanks for the feedback. Possibly what you perceive as “slow” could just be lag due to the way you have implemented things, like a handler not being triggered correctly, so that it doesn’t run until the next hook invocation and, to the user, it appears as if things are moving “slow”. I have experienced this happening before and I could see it giving the impression that things are slow. I have been using Juju/reactive for a long time and haven’t ever heard of or seen anything “slow” in or around the reactive framework. I would be interested in knowing what you think is slow about it. There are no computationally intensive tasks that reactive touches, so I’m not sure the language would have anything to do with the speed of anything here. Many people appreciate the logs because they give us information about what is going on in the charm. Logs can be turned down on a per-model basis if the user wants to quiet them down.


#4

As @timClicks mentioned, Canonical is investing in a new Python framework for writing charms. It will become our “go to way of writing charms”.

Juju isn’t changing in its handling of the hook executions, so people are always free to write other frameworks, but there should be a simple guide that new people can follow that is the “blessed” way forward.

We should have something to show in more detail in the coming weeks. This framework is still very new, and we are ironing out some initial kinks with the initial charms that are being written in the new framework.

A key here is to have a method where you can produce a simple charm very simply, and gradually introduce complexity while having something that works. The intent is to avoid the very steep learning curve of the current reactive framework.


#5

I actually have added a layer on top of charms.reactive in order to simplify and clean the resulting code a bit by allowing maximum reusability, so I’ve been using reactive for some time. Basically, I have separated the logic parts into “actions” that are being called from the reactive parts. Therefore I can create lists of actions to execute for particular events which allows me to simplify the reactive parts. It’s mainly because I have logical paths that cross, so it became very hard to do using flags only (doable, but not that clean).
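A minimal sketch of what such an “actions” layer might look like, assuming a simple name-to-function registry. Everything here (`ACTIONS`, `action`, `EVENT_PLANS`, `run_event`, and the action names) is hypothetical and is not part of charms.reactive; the functions just return strings so the example runs anywhere.

```python
from typing import Callable, Dict, List

# Hypothetical registry; these names are illustrative, not a charms.reactive API.
ACTIONS: Dict[str, Callable[[], str]] = {}


def action(name: str):
    """Register a function as a named, reusable action."""
    def register(fn: Callable[[], str]) -> Callable[[], str]:
        ACTIONS[name] = fn
        return fn
    return register


@action("install-packages")
def install_packages() -> str:
    return "installed packages"


@action("write-config")
def write_config() -> str:
    return "wrote config"


@action("restart-service")
def restart_service() -> str:
    return "restarted service"


# Each event maps to an ordered list of action names. Paths that cross
# (both events need write-config) share a single implementation.
EVENT_PLANS: Dict[str, List[str]] = {
    "install": ["install-packages", "write-config"],
    "config-changed": ["write-config", "restart-service"],
}


def run_event(event: str) -> List[str]:
    """Run every action planned for an event, in order."""
    return [ACTIONS[name]() for name in EVENT_PLANS[event]]
```

In a real charm, the reactive handler for an event would then be a one-liner that calls something like `run_event("config-changed")`, which keeps the flag logic separate from the work being done.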

The upgrade-charm calls for apt update and apt upgrade are indeed sometimes very slow and I have been thinking about disabling those calls for my charms in favor of only doing one during the install hook. I actually created an apt-update and an apt-upgrade action that can be automated using external systems or run manually.

The only issue that I had with upgrade-charm is that it’s called whenever a file is attached. It kinda makes sense to execute apt update and apt upgrade when you actually upgrade a charm.

Although, each case is unique.


#6

I agree that that would not be a good general approach. Rust is definitely a barrier to some people and I wouldn’t want to make everybody use it.

With Juju hooks as it is you really don’t have to use any specific framework to write charms, which is absolutely to Juju’s credit. In a sense it already is similar to web frameworks: Juju provides an execution environment and an API ( or in this case CLI, in the hook environment ) that the framework uses to interact with the world, very similar to how a browser does. I think that the biggest difference is that there just isn’t as large a community for Juju yet so there aren’t a lot of mature options for it.
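As a rough sketch of what “the CLI in the hook environment” means in practice, here is a plain hook body written in Python that shells out to the `relation-get` and `status-set` hook tools and reads the `JUJU_REMOTE_UNIT` and `JUJU_RELATION_ID` environment variables. The `host` key, the status messages, and the function names are assumptions for illustration, and the command runner is injectable so the logic can be exercised outside of a real hook.

```python
import json
import os
import subprocess
from typing import Callable, List, Mapping, Optional

Runner = Callable[[List[str]], str]


def run_tool(argv: List[str]) -> str:
    """Run a Juju hook tool and return its stdout (only works inside a hook)."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def relation_get_argv(unit: str, relation_id: Optional[str] = None) -> List[str]:
    """Build a relation-get call that returns all of a unit's settings as JSON."""
    argv = ["relation-get", "--format=json"]
    if relation_id:
        argv += ["-r", relation_id]
    return argv + ["-", unit]


def remote_settings(run: Runner = run_tool,
                    env: Optional[Mapping[str, str]] = None) -> dict:
    env = os.environ if env is None else env
    unit = env["JUJU_REMOTE_UNIT"]
    out = run(relation_get_argv(unit, env.get("JUJU_RELATION_ID")))
    return json.loads(out) or {}


def db_relation_changed(run: Runner = run_tool,
                        env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Hypothetical hook body: wait until the remote unit publishes a host."""
    settings = remote_settings(run, env)
    if "host" not in settings:  # "host" is an assumed key for this sketch
        run(["status-set", "waiting", "waiting for database details"])
        return None
    run(["status-set", "active", "database connected"])
    return settings["host"]
```

Because the hook only talks to Juju through those commands, the same logic could be written in Bash, Rust, or anything else that can run a subprocess.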

Yes, I actually just finished walking through that old doc that just used Bash and the Juju hooks, but the thing about it was that it instantly made quite a bit more sense to me than the new reactive approach.

I completely see the value in the reactive model and I actually like it quite a bit conceptually. The problem I had with the reactive framework from a developer experience standpoint was that there seemed to be a little too much “magic” that I felt like I didn’t have a good explanation of. For example, I was a little confused where the pgsql argument to this function in hello-juju came from. I also found that while bash was supported in the reactive framework, there weren’t enough examples for me to figure out how to interact with the Postgresql interface ( I figured it out later after a lot of investigation ). Still, most of that was just a need for documentation.

One thing I do want to point out is that I think that writing Juju charms using the hooks alone is still a perfectly valid way to write charms. The docs made me feel like it was a sort of “second class” way to do it, but as a DevOps engineer who has years of experience writing Docker containers and automating things with shell scripts, writing hooks in bash was a very natural way to approach writing a Docker-powered charm ( I haven’t finished it yet, so maybe I’ll run into problems with it :man_shrugging: ).

The layer-based approach of the reactive framework is still a great model and it definitely handles the code-reuse problem better, but I don’t think we should play off the hook-based model as an “old” or “out-of-date” way to do it because it can actually be easier to understand and approach, I think, regardless of your background. That isn’t to say that I don’t think we should look for a better “default” way to write charms.

I see your point. At first I was thinking that me and my team would likely just write something that would work well for us, because everybody else does have the reactive framework to use, but I do think it would be beneficial to the community if we did try to design it with other people’s use-cases in mind as well. That way we could help grow the Juju community and get some valuable comparison with existing frameworks while hopefully learning more about patterns and how to write charms well.

We were thinking of maybe writing the framework itself in Rust but providing bindings for at least Python ( something I have some experience in ) and a CLI for use in Bash scripts. Then we could possibly provide bindings for JavaScript/WASM later. You could even throw Ruby in there if it was useful to enough people.

That way we could support Python, etc. while still being able to use Rust ourselves if we wanted to.

What seems slow is that it can spend around 15 seconds ( rough estimate, I might time it later to make sure ) logging the fact that it is setting 166 flags, at least once every time any hook for the Postgresql charm is run. It would seem to me that the setting of those flags should take a fraction of a second. I’m still working to understand what Juju is doing at different times, but as I wait for my apps to come up and I am watching the logs, it spends a lot of time just printing the fact that it is setting flags.

I have absolutely nothing against the debug logging or how verbose it is. That makes total sense and it is great that Juju will tell you that much about what is going on. What it seemed like was that the majority of the time my app was coming up was spent while those “tracer: set flag apt.installed.postgresql-10” logs were going and it seems like it shouldn’t be taking that long to set flags.

I’m doing my testing on AWS instances with 7GHz cores and 1GB of RAM so the CPU speed should not be a problem.

I don’t have any solid evidence yet, but I could probably get the Juju logs with timestamps in them to analyze. I could be misinterpreting where the time is spent and maybe I’m a little too impatient but I’m used to Docker Swarm where things can be changed quickly with very little downtime and I’m trying to get as close to that as I can in Juju.
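One way to turn that into evidence would be to measure the time between each “starting handler dispatch” line and the last “set flag” line that follows it. Below is a rough sketch of that, assuming each log line carries an `HH:MM:SS` timestamp somewhere before the message; the exact line layout and the sample lines (including their timestamps) are made up for illustration and are not real measurements.

```python
import re
from datetime import datetime
from typing import Iterable, List

# Assumption: every log line contains one HH:MM:SS timestamp.
TIME_RE = re.compile(r"\b(\d{2}:\d{2}:\d{2})\b")


def flag_burst_durations(lines: Iterable[str]) -> List[float]:
    """Seconds from each dispatch start to the last 'set flag' line after it.

    Ignores dates, so a burst that crosses midnight would be mis-measured.
    """
    durations: List[float] = []
    start = last = None
    for line in lines:
        match = TIME_RE.search(line)
        if match is None or "tracer:" not in line:
            continue
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        if "starting handler dispatch" in line:
            if start is not None:
                durations.append((last - start).total_seconds())
            start = last = stamp
        elif "set flag" in line and start is not None:
            last = stamp
    if start is not None:
        durations.append((last - start).total_seconds())
    return durations


# Made-up sample input, only to show the shape of the data.
PREFIX = "unit-postgresql-0: {} DEBUG unit.postgresql/0.juju-log tracer: "
SAMPLE = [
    PREFIX.format("14:03:05") + "starting handler dispatch, 166 flags set",
    PREFIX.format("14:03:09") + "set flag apt.installed.postgresql-10",
    PREFIX.format("14:03:20") + "set flag apt.installed.postgresql-client-10",
    PREFIX.format("14:04:00") + "starting handler dispatch, 166 flags set",
    PREFIX.format("14:04:02") + "set flag apt.installed.postgresql-client-common",
]

durations = flag_burst_durations(SAMPLE)
```

A long gap inside a burst would not prove that setting flags is slow, since other work could be happening between log lines, but it would at least show where the wall-clock time goes.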

I’m very glad to hear both of those. Having a blessed way forward is important to uniting the community, and for giving newcomers something they can be confident in learning initially. Preserving the ability to write your own frameworks is important, too, because it means that Juju is much less likely to be insufficient for people with special needs. For example, my team will most likely use Docker heavily to power our Juju charms, so it may prove valuable to us to create a Docker-specific charm-writing framework that allows us to create our own “Docker orchestrator” through Juju.

Sounds great. :+1:

That sounds like a pretty good idea. It kind of combines the more procedural approach with the reactive approach. At this point I’m very interested in different ways to model charms and I’ll probably try that out. :slight_smile:

Not that everybody should use Docker, but that is one case where Docker containers work very well. Each container image and each version ( tag ) of that image is isolated and the apt upgrades and updates are run during the container build phase so that when you need to run or upgrade the application all you have to do is pull the Docker image and it will have what you need in it.

I’m hoping that that will help the deployment speed a bit.


#7

Just picking up this thread (will respond to some of your other points when I get the time)… some “Juju driving Docker” experience has been written up:


#8

Another cool thing about that is that I usually will create a base layer that includes all the actions and 1 or more charm layers that will use that base layer and override some actions to change how it works. That way I have less maintenance to do for charms like MariaDB + Galera MariaDB or Redis + Redis w/ Sentinel. It allows me to quickly create variants and reuse most of the code. It also keeps the code a lot cleaner since only the Galera- or Sentinel-specific code lives in the charm layers. I also have a few “global” actions that I usually use everywhere, so I only have to change it once for it to apply to all my charms.
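One way to picture the base-layer-plus-overrides idea is a lookup chain where the variant’s actions are searched first and the base actions fill in the rest. This is only a guess at the shape of the pattern, using Python’s `collections.ChainMap`; the action names and return strings are hypothetical and stand in for real work.

```python
from collections import ChainMap
from typing import Callable, List, Mapping


def install_server() -> str:
    return "install mariadb-server"


def write_config() -> str:
    return "write standalone config"


def start_service() -> str:
    return "start mariadb"


# "Base layer": every action a MariaDB-style charm needs.
BASE_ACTIONS = {
    "install": install_server,
    "configure": write_config,
    "start": start_service,
}


def write_galera_config() -> str:
    return "write config with cluster settings"


# "Charm layer": only the Galera-specific override lives here.
# ChainMap looks in the first mapping, then falls back to the base.
GALERA_ACTIONS = ChainMap({"configure": write_galera_config}, BASE_ACTIONS)


def run(actions: Mapping[str, Callable[[], str]], names: List[str]) -> List[str]:
    """Run the named actions in order against whichever layer stack is given."""
    return [actions[name]() for name in names]


plain = run(BASE_ACTIONS, ["install", "configure", "start"])
galera = run(GALERA_ACTIONS, ["install", "configure", "start"])
```

A change to a shared action in the base mapping shows up in every variant that does not override it.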


#9

Thanks, I’ve read through that and started experimenting with it a bit.


I talked to my partner and we decided that, for us, it will make the most sense to create a charm framework that is specific to helping you write Docker powered charms. Docker solves for some common problems that you might have when making a charming framework, such as the apt update and apt upgrade scenarios. You skip the need for most charms to install applications on the host because the software is already prepared inside of the Docker image. It also makes it easier to co-locate apps on a server without having to worry about the software for each application conflicting with each other. We realized that pretty much every charm that we will make will most likely be powered by Docker and that there will be a large overlap between the requirements of each of those charms.

What that will mean for the community is that while we will not be providing a fully general purpose charming framework, we will still be providing a charming framework that anybody can use to easily create Docker powered charms.

Everything is in flux right now, but we are thinking that we will start out with a more hook-based approach to writing the charm code and focus on letting you write the charm code in bash, or any other executable format, just like Juju does for its hooks. Bash will likely be second-nature to anybody who writes Docker containers so it will be a natural fit for our target audience.

I will probably be creating some documentation to outline our design plans as we work through research and development. I’ll post the documentation link here once it is up.


#10

I just finished the first draft of the design documentation for our “Lucky” charm framework. The repository is on GitHub:

https://github.com/katharostech/lucky

The documentation is here:

https://katharostech.github.io/lucky


#11

Love the name. Well done getting a first iteration released so quickly!


#12

Something worth noting here that I just found in the docs:

I didn’t realize that, and I had been running both my app charm and PostgreSQL charm on the same host in most of my tests. That still doesn’t defeat the motivation behind making the Lucky charm framework, but it is good to realize that this can affect the speed at which your charms get updated.


#13

s/system/unit/ - I think the wording is incorrect in the docs.


#14

Have updated the docs.


#15

Ah, OK, that makes more sense to me. Cool.


#16

There are places where unit agents will acquire a machine lock. For example, only one charm can execute apt commands at a time.
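As a toy illustration of why a machine-wide lock makes co-located units wait on each other, here is a sketch using an ordinary `threading.Lock`. It is not how Juju implements its machine lock; the unit names, the `run_apt` function, and the sleep that stands in for an apt command are all made up.

```python
import threading
import time
from typing import List, Tuple


def run_units(units: List[str], seconds: float = 0.05) -> List[Tuple[str, str]]:
    """Simulate several unit agents on one machine sharing a single lock."""
    machine_lock = threading.Lock()
    events: List[Tuple[str, str]] = []

    def run_apt(unit: str) -> None:
        # Only one unit at a time gets past this point.
        with machine_lock:
            events.append((unit, "start"))
            time.sleep(seconds)  # stand-in for a slow apt command
            events.append((unit, "end"))

    threads = [threading.Thread(target=run_apt, args=(u,)) for u in units]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


events = run_units(["app/0", "postgresql/0"])
```

Each unit’s start and end always appear back to back, so the total time is roughly the sum of the individual apt runs rather than the longest one.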