Deploying vault fails with "service not running" on fresh deployments

I don’t know where to begin on this, but here I go.
I started using vault for secrets and certificates handling recently and have had no luck.
I’m deploying OpenStack and adding all relations for certificates and secrets to vault to handle.
For whatever reason, I can’t get vault to run.
I’d appreciate any assistance.

Really sorry to hear that you’re having difficulty. When you say “can’t get vault to run”, what are you seeing?

Could you please paste the output of the following commands to https://paste.ubuntu.com?

  • juju status --format=yaml
  • juju export-bundle

Thank you so much for your time.
Here is the output you requested: juju status --format=yaml = Ubuntu Pastebin

juju export-bundle = Ubuntu Pastebin

As for “can’t get vault to run”: when I run juju status vault/1, I get “Vault service not running”.

So I ssh to the unit.

ubuntu@7of0:~$ sudo systemctl restart vault
ubuntu@7of0:~$ sudo systemctl status vault
● vault.service - HashiCorp Vault
   Loaded: loaded (/etc/systemd/system/vault.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Tue 2020-02-25 16:42:22 UTC; 5s ago
  Process: 3729 ExecStart=/snap/bin/vault server -config /var/snap/vault/common/vault.hcl (code=exited, sta
 Main PID: 3729 (code=exited, status=1/FAILURE)

Feb 25 16:42:22 7of0 systemd[1]: vault.service: Service hold-off time over, scheduling restart.
Feb 25 16:42:22 7of0 systemd[1]: vault.service: Scheduled restart job, restart counter is at 5.
Feb 25 16:42:22 7of0 systemd[1]: Stopped HashiCorp Vault.
Feb 25 16:42:22 7of0 systemd[1]: vault.service: Start request repeated too quickly.
Feb 25 16:42:22 7of0 systemd[1]: vault.service: Failed with result 'exit-code'.
Feb 25 16:42:22 7of0 systemd[1]: Failed to start HashiCorp Vault.
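The status output above only shows systemd's restart bookkeeping ("Start request repeated too quickly"), not the reason vault exited. A minimal sketch of how the underlying error could be surfaced, assuming the unit is named vault as in the output above; the function name show_unit_failure is just an illustration:

```shell
# Surface the underlying startup error for a crash-looping systemd unit.
# show_unit_failure is a hypothetical helper name, not part of any charm.
show_unit_failure() {
    unit="${1:-vault}"
    # Clear the failed state left behind by the start rate limit.
    sudo systemctl reset-failed "$unit"
    # Print recent journal lines for the unit, which normally include the
    # daemon's own stderr as well as systemd's messages.
    sudo journalctl -u "$unit" --no-pager -n 50
}

# Usage on the affected unit:
#   show_unit_failure vault
```

Running the ExecStart command by hand in the foreground is another way to see the same error directly.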

I attempted to redeploy to a different node and hit the same issue: “Vault service not running”. Restarting the service on the node makes no difference.
FWIW, this deployment only had the SQL relation added.
I’m using this guide: https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-vault.html

Perhaps one of the OpenStack guys like @james-page could provide some input?

Thank you, but I’m not sure it’s an “openstack” issue. On a fresh deployment vault fails to start.
I’m doing some digging. I’m going to try some previous versions and see if the issue persists.

I think I may have solved my problem. I’ll test more and update.
I moved vault to another node; it seems that vault was not able to connect to MySQL when it was on the same node (same IP?).

Well, not sure what’s going on here. New error: it appears that vault is not allowed to connect to MySQL. This is a fresh deployment, with vault on its own node.

$ sudo /snap/bin/vault server -config /var/snap/vault/common/vault.hcl
Error initializing storage of type mysql: failed to check mysql schema exist: Error 1130: Host '10.41.0.5' is not allowed to connect to this MySQL server
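For context, error 1130 is MySQL's ER_HOST_NOT_PRIVILEGED: the server has no account entry that matches the address the client connected from. A small sketch for pulling that address out of the message, so it can be compared with the addresses on the vault unit; denied_host is a made-up helper name:

```shell
# Extract the client address MySQL rejected from an "Error 1130" message
# read on stdin. denied_host is a hypothetical helper for illustration.
denied_host() {
    sed -n "s/.*Host '\([^']*\)' is not allowed to connect.*/\1/p"
}

# Usage on the vault unit (compare the result with `hostname -I`):
#   sudo /snap/bin/vault server -config /var/snap/vault/common/vault.hcl 2>&1 | denied_host
```

If the address printed is not the one the relation was set up with (for example, on a unit with several network interfaces), that would point at how the grant was created rather than at vault itself.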

Just a thought… if you’ve done this manually, it’s possible that the relations weren’t updated. To “move” an application, you can remove the original unit and add another unit with a placement directive (add-unit --to)
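As a rough sketch of that remove-and-re-add approach, assuming the unit is vault/1 and the target is machine 5 (both placeholders for whatever juju status shows in your model):

```shell
# Replace a unit with a fresh one on a chosen machine, letting Juju
# re-run the relation hooks for the new unit.
# Arguments: unit to remove, application name, target machine ID.
replace_unit() {
    old_unit="$1"
    app="$2"
    machine="$3"
    juju remove-unit "$old_unit"
    juju add-unit "$app" --to "$machine"
}

# Usage (vault/1 and machine 5 are placeholders):
#   replace_unit vault/1 vault 5
```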

I should have clarified what I meant by “move”: I took everything down and redeployed, choosing different nodes.
The issue is with MySQL denying the connection, I’m almost certain. I don’t know why this is happening or how to fix it. Juju charms are supposed to take care of the MySQL permissions when you set up the relations, is that correct?
Using the following:
vault | Juju vault version 35
percona cluster | Juju percona-cluster version 284

or
vault | Juju vault version 32
percona cluster | Juju percona-cluster version 281

results in failure for me.
Looking for help with troubleshooting this.
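One thing that may be worth checking first is whether the database relation is actually established, since that is what triggers the charm to create the MySQL grant. A minimal sketch, assuming the endpoint is named shared-db on both charms as in the deployment guide linked earlier; show_db_relation is a hypothetical helper name:

```shell
# List the shared-db relation lines for an application, if any exist.
# The endpoint name "shared-db" is an assumption taken from the deploy guide.
show_db_relation() {
    app="${1:-vault}"
    juju status --relations "$app" | grep 'shared-db'
}

# Usage:
#   show_db_relation vault
# If nothing is printed, the relation may be missing; the guide adds it with
# something like:
#   juju add-relation vault:shared-db percona-cluster:shared-db
```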

I’m going to try this with the “mysql” charm and see what happens.

This is what I’m seeing on vault

ubuntu@6of0:~$ sudo systemctl vault status
Unknown operation vault.
ubuntu@6of0:~$ sudo systemctl status vault
● vault.service - HashiCorp Vault
   Loaded: loaded (/etc/systemd/system/vault.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sat 2020-02-29 15:13:37 UTC; 1h 18min ago
 Main PID: 46168 (code=exited, status=1/FAILURE)

Feb 29 15:13:37 6of0 systemd[1]: vault.service: Service hold-off time over, scheduling restart.
Feb 29 15:13:37 6of0 systemd[1]: vault.service: Scheduled restart job, restart counter is at 5.
Feb 29 15:13:37 6of0 systemd[1]: Stopped HashiCorp Vault.
Feb 29 15:13:37 6of0 systemd[1]: vault.service: Start request repeated too quickly.
Feb 29 15:13:37 6of0 systemd[1]: vault.service: Failed with result 'exit-code'.
Feb 29 15:13:37 6of0 systemd[1]: Failed to start HashiCorp Vault.
ubuntu@6of0:~$ sudo systemctl restart vault
ubuntu@6of0:~$ sudo systemctl status vault
● vault.service - HashiCorp Vault
   Loaded: loaded (/etc/systemd/system/vault.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sat 2020-02-29 16:32:21 UTC; 22s ago
  Process: 402843 ExecStart=/snap/bin/vault server -config /var/snap/vault/common/vault.hcl (code=exited, s
 Main PID: 402843 (code=exited, status=1/FAILURE)

Feb 29 16:32:21 6of0 systemd[1]: vault.service: Service hold-off time over, scheduling restart.
Feb 29 16:32:21 6of0 systemd[1]: vault.service: Scheduled restart job, restart counter is at 5.
Feb 29 16:32:21 6of0 systemd[1]: Stopped HashiCorp Vault.
Feb 29 16:32:21 6of0 systemd[1]: vault.service: Start request repeated too quickly.
Feb 29 16:32:21 6of0 systemd[1]: vault.service: Failed with result 'exit-code'.
Feb 29 16:32:21 6of0 systemd[1]: Failed to start HashiCorp Vault.

I ran a quick test.

mysql -u vault -p -h 10.41.0.71 vault
Enter password:
ERROR 1130 (HY000): Host '10.41.0.6' is not allowed to connect to this MySQL server
The MySQL server is denying the connection. I'm going to check the settings there.

This is all default deploy, nothing changed in the charm.

Yes, that is correct.

Pinging @ec0. James have you ever seen behaviour like this with Percona Cluster and Vault before?

I’d like to add my comments from another post I made here:

This thread is where my vault journey started.

@timClicks - I haven’t seen this specific behaviour before, which is fairly puzzling.

@nathan-flowers - it’s entirely possible you’ve hit a charm bug here. I’d be interested to know whether ‘SHOW GRANTS;’, run while logged in to MySQL, shows you anything interesting. I’ve seen MySQL handle access differently for IP-based and hostname-based authorisation, and I’m curious whether that’s a factor here, especially after your comment that it didn’t work when deployed to the same unit (which would likely require an explicit grant for localhost usage, which the charm may not be implementing).
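To make that check concrete, here is a sketch that generates the statements to run as an administrative user on the percona-cluster unit. The user name vault and the address 10.41.0.6 are taken from the failed connection test earlier in the thread; vault_grant_sql is a made-up helper name:

```shell
# Print SQL that lists which hosts the given user may connect from, then
# shows the grants for one specific user@host pair.
# vault_grant_sql is a hypothetical helper for illustration.
vault_grant_sql() {
    user="$1"
    host="$2"
    printf "SELECT user, host FROM mysql.user WHERE user = '%s';\n" "$user"
    printf "SHOW GRANTS FOR '%s'@'%s';\n" "$user" "$host"
}

# Usage on the percona-cluster unit (prompts for the root password):
#   vault_grant_sql vault 10.41.0.6 | mysql -u root -p
```

If the SELECT returns no row whose host matches the vault unit's address (or a wildcard covering it), that would be consistent with the 1130 error, and SHOW GRANTS for that pair should fail because no such account exists.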

Ultimately, it is the charm’s responsibility to handle all authorisation of related services though - so if that’s not happening, a bug for this against percona-cluster is very likely in our future.

Did you end up observing different behaviour with the MySQL charm, rather than the percona one?