OpenStack "ceph" network-binding


#1

The glance charm has a “ceph” network binding. This allows the glance service to be configured with an IP on the network space where ceph lives, so that glance can talk to the ceph-mon and subsequently the ceph-osd.

The nova-compute, cinder, and ceph-radosgw charms need this same network binding. Without it there is an inconsistency: glance can be configured to talk to ceph, but cinder, nova-compute, and radosgw cannot.

@openstack-charmers - SOS

This seems like such an obvious and incredibly important bit here… I’m wondering if there is a reason those services don’t have the ceph endpoint binding? Possibly I’m just confused and missing something? Maybe it’s in next somewhere?


#2

I’m thinking this could easily be worked around by making the internal network space the same as the ceph public space (I think this is how I have gotten around it before). Then everything, including the services that need to talk to ceph, would get an IP and a routing table entry in the space where ceph can be accessed, and that space could carry the internal API traffic too.

For nova:
I’ve considered it common knowledge (because no one tells you otherwise, and you simply have to understand that the two things need to talk to each other) that the machine(s) running nova-compute need an interface on the network that provides ceph access, configured outside the context of the openstack charms and juju, for rbd access to work.

I feel that the primary reasons we need the “ceph” binding on the other charms are a) consistency (so you know nova, cinder, radosgw have routability to the storage), and b) so ceph’s access and os-internal-api can be disjoint spaces.

Thoughts?


#3

Generally you don’t have to be in a space to be able to talk to a space. That is what routing tables are for.

However, I can see where setting up firewall rules, etc, is easier if you say “you’re allowed to talk to your local subnet only”.

I don’t think that will scale very well in the long run, because then all the services that talk to ceph need to end up collocated on the subnet where ceph is.

That said, you may still want a ceph binding if it makes it easier for the charm to configure which interface the packets are supposed to go out on. For a machine with 2 network cards, it may be that you want the data traffic kept separate from the API traffic. We’ve generally done that by configuring Ceph to be exposed in a particular space, and then other machines end up with routes to that space via a specific network device (via static route definitions in MAAS).
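
To make the routing point concrete, here is a rough Python sketch of how a host with two NICs ends up sending Ceph traffic out of a specific device: the most specific (longest-prefix) matching route wins. The subnets, the gateway address, and the device names `eth0`/`eth1` are all assumptions made up for illustration, not taken from a real deployment.

```python
import ipaddress

# Hypothetical routing table for a compute host with two NICs.
# Every subnet, gateway, and device name here is invented for illustration.
ROUTES = [
    # (destination, gateway or None when directly connected, device)
    ("0.0.0.0/0", "10.0.0.1", "eth0"),      # default route via the API network
    ("10.0.0.0/24", None, "eth0"),          # internal API subnet, on-link
    ("10.20.0.0/24", None, "eth1"),         # data subnet, on-link
    ("10.30.0.0/24", "10.20.0.1", "eth1"),  # static route to the Ceph subnet
]


def select_route(dest_ip, routes=ROUTES):
    """Return (network, gateway, device) for the longest matching prefix."""
    addr = ipaddress.ip_address(dest_ip)
    matches = [
        (ipaddress.ip_network(net), gw, dev)
        for net, gw, dev in routes
        if addr in ipaddress.ip_network(net)
    ]
    # The default route always matches, so the list is never empty here.
    return max(matches, key=lambda route: route[0].prefixlen)


net, gw, dev = select_route("10.30.0.15")
print(f"ceph-mon 10.30.0.15 -> via {gw} dev {dev}")
```

The host has no address in 10.30.0.0/24, yet it still reaches the (hypothetical) ceph-mon, and the traffic stays on the data NIC rather than the API NIC.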

We have talked about adding route information to spaces, such that as an Operator you can define “if you’re in space X wanting to talk to space Y, use this gateway”. But a lot of the fine details are still in the early design phase.


#4

Makes a lot of sense when you put it that way.

In another light, it makes sense not to have everything that talks to the big storage traversing a single gateway. This is where I was thinking switch level routing makes a lot of sense.

Switch level routing was totally my next step here. I’m not sure that gateway style routing is necessarily the best option though, per the above.

+1 Switch level routing lies far outside the context of juju. Modeling this with network bindings would bring things full circle in modeling the network connectivity of openstack components with juju spaces.

Switch level and/or gateway level routing may be other great alternatives which should be available to configure outside the context of juju.


#5

I do think the intent was that each subnet within a space would have its own gateway to the other space, so that you would effectively end up with something like switch level routing. (On rack A, use the top-of-rack switch to get to rack B’s Ceph data.)
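
As a very rough sketch of what per-subnet gateways between spaces might look like as data: the space names, subnets, gateway addresses, and the helper functions below are all hypothetical, and none of this is an existing Juju or MAAS feature. It only illustrates the idea that each source subnet gets its own gateway (say, its top-of-rack switch) towards every subnet of the destination space.

```python
import ipaddress

# Hypothetical layout; names and addresses are invented for illustration.
SPACES = {
    "internal-api": ["10.0.1.0/24", "10.0.2.0/24"],   # rack A, rack B
    "ceph-access": ["10.30.1.0/24", "10.30.2.0/24"],  # rack A, rack B
}

# (source subnet, destination space) -> gateway on the source subnet,
# for example that rack's top-of-rack switch.
GATEWAYS = {
    ("10.0.1.0/24", "ceph-access"): "10.0.1.1",
    ("10.0.2.0/24", "ceph-access"): "10.0.2.1",
}


def gateway_for(src_ip, dest_space):
    """Find the gateway a host at src_ip should use to reach dest_space."""
    src = ipaddress.ip_address(src_ip)
    for (subnet, space), gw in GATEWAYS.items():
        if space == dest_space and src in ipaddress.ip_network(subnet):
            return gw
    return None  # no gateway defined for this subnet/space pair


def routes_for(src_ip, dest_space):
    """List (destination subnet, gateway) routes the host would need."""
    gw = gateway_for(src_ip, dest_space)
    if gw is None:
        return []
    return [(subnet, gw) for subnet in SPACES[dest_space]]


# A rack A host reaches every ceph-access subnet through rack A's switch.
for subnet, gw in routes_for("10.0.1.20", "ceph-access"):
    print(f"{subnet} via {gw}")
```

Because the gateway is looked up per source subnet, traffic from each rack leaves through its own switch instead of funnelling through a single gateway.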


#6

I have implemented switch level routing for the things that need to talk to ceph such that ceph and thing X need not have interfaces on the same subnet. This was my missing link. Thanks @jameinel for your insight!