Description
Hey guys,
I see that the image created from this repo is running every node together in one single container. While it's helpful for users to test and try out Druid, it isn't particularly useful when it comes to deploying a production cluster.
I've been working on using Docker to deploy a truly distributed Druid cluster lately. I have something working, and I'd like to share it and contribute it back, though I'm wondering whether it would be valuable. Take a look at this fork to see what I've done so far. While the settings in that fork are specific to my team's research purposes, they can be generalized and made extensible. I'm still in an exploratory stage with Docker deployments, so I may have made stupid mistakes.
Because containers are lightweight, the Docker philosophy encourages running only one service per container and having containers talk to each other, rather than running all dependencies and services in one container. Therefore, I defined separate images for each type of Druid node (`druid-broker`, `druid-coordinator`, etc.) as well as for the dependencies (`druid-zookeeper`, `druid-mysql`, `druid-kafka`, etc.). I also packed the JVM and runtime configurations into a separate image (`druid-conf`).
I had a discussion earlier with @cheddar on deployment stuff. He made it clear that deployment in general should only have 3 steps:
- Download the artifact.
- Download the configurations.
- Run the artifact with the configurations.
I followed this guideline: to run a specific Druid node, say a broker, all you need to do is:
- Pull the `druid-broker` image.
- Pull the `druid-conf` image.
- Run `druid-conf` in a container, and then link it as a volume provider (using `--volumes-from`) for the `druid-broker` container.
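The three steps above can be sketched as a small shell script. This is only an illustration under stated assumptions: the container names `node-1-conf` and `broker` are borrowed from the example command in this post, the `run` helper and the `DRY_RUN` variable are hypothetical additions that print each command instead of executing it, and the image names are assumed to resolve in whatever registry you pull from.

```shell
#!/bin/sh
# Sketch of the three deployment steps for a broker node.
# Assumption: DRY_RUN defaults to 1, so commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

deploy_broker() {
  conf_name="$1"   # name for the configuration container (illustrative)

  # 1. Download the artifact.
  run docker pull druid-broker
  # 2. Download the configurations.
  run docker pull druid-conf
  # 3. Run the artifact with the configurations: start the conf container,
  #    then mount its volumes into the broker container.
  run docker run -d --name "$conf_name" druid-conf
  run docker run -d --volumes-from "$conf_name" --name broker druid-broker
}

deploy_broker node-1-conf
```

Setting `DRY_RUN=0` would execute the commands against whichever Docker daemon the shell is currently pointed at.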
Containers on different nodes can freely communicate with each other as long as they are on the same overlay network. I use `docker-machine` to manage and provision remote nodes, and `docker swarm` for container orchestration. Running a broker node, for example, is as simple as:
docker run -itd --volumes-from=node-1-conf --name=broker --network=p-my-net --env="constraint:node==p-node-5" druid-broker:latest
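
To show how the same command might generalize to other node types, here is a rough sketch that wraps it in a function. It assumes the shell is already pointed at the swarm (for example via `docker-machine env`), that the overlay network is created once with `docker network create`, and that each host has its own configuration container. The `start_node` function, the `DRY_RUN` switch, and the names `node-2-conf` and `p-node-6` are hypothetical and not taken from the fork.

```shell
#!/bin/sh
# Sketch: start Druid nodes on chosen swarm hosts over one overlay network.
# Assumption: DRY_RUN defaults to 1, so commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# start_node ROLE CONF_CONTAINER HOST NETWORK
# Mirrors the broker command above, with the varying parts as parameters.
start_node() {
  role="$1"; conf="$2"; host="$3"; net="$4"
  run docker run -itd \
    --volumes-from="$conf" \
    --name="$role" \
    --network="$net" \
    --env="constraint:node==$host" \
    "druid-$role:latest"
}

# Create the shared overlay network once (network name from the example above).
run docker network create --driver overlay p-my-net

# The broker from the example, plus a coordinator on a hypothetical second host.
start_node broker node-1-conf p-node-5 p-my-net
start_node coordinator node-2-conf p-node-6 p-my-net
```

Keeping the host constraint as a parameter makes it easy to spread node types across machines without changing the images themselves.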