Swarm is a container orchestration framework. It handles creating a network and bringing up containers across a series of hosts/nodes.
$ docker info | grep Swarm
# if it's not enabled
# => Swarm: inactive
# if it's enabled
# => Swarm: active
$ docker swarm init
# Swarm initialized: current node (7v2wlaljlfumj9keuw3opxjc7) is now a manager.
#
# To add a worker to this swarm, run the following command:
#
# docker swarm join --token SWMTKN-1-3xq68hwg60xfigzsnbx2qhel6jpirf6pzvqpmss2xx54aoh5pi-763xowura116eykki60453hfg 192.168.65.3:2377
#
# To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
$ docker service create alpine ping 8.8.8.8
# qg2ivwufif1nzzg5k54wbew7e
This outputs the ID of the newly created service.
$ docker service ls
# ID NAME MODE REPLICAS IMAGE PORTS
# qg2ivwufif1n sad_poincare replicated 1/1 alpine:latest
$ docker service ps sad_poincare
# ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
# v0pw5a8plw3t sad_poincare.1 alpine:latest linuxkit-025000000001 Running Running 5 minutes ago
You can also look at the containers directly. Notice how the container name includes the service name, the replica number, and the task ID.
$ docker container ls
# CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
# 3826f800d56e alpine:latest "ping 8.8.8.8" 5 minutes ago Up 5 minutes sad_poincare.1.v0pw5a8plw3torc0mygpfsuow
$ docker service update qg2ivwufif1n --replicas 3
# qg2ivwufif1n
# overall progress: 3 out of 3 tasks
# 1/3: running [==================================================>]
# 2/3: running [==================================================>]
# 3/3: running [==================================================>]
# verify: Service converged
$ docker service ls
# ID NAME MODE REPLICAS IMAGE PORTS
# qg2ivwufif1n sad_poincare replicated 3/3 alpine:latest
$ docker service ps sad_poincare
# ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
# v0pw5a8plw3t sad_poincare.1 alpine:latest linuxkit-025000000001 Running Running 13 minutes ago
# rswiwj2yupzp sad_poincare.2 alpine:latest linuxkit-025000000001 Running Running 3 minutes ago
# wo6yva0hs936 sad_poincare.3 alpine:latest linuxkit-025000000001 Running Running 3 minutes ago
Notice that after the update, the number of replicas is now 3.
If you remove one of the service containers manually, Swarm will bring it back up.
$ docker container ls
# CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
# 864b580b7f9f alpine:latest "ping 8.8.8.8" 11 minutes ago Up 11 minutes sad_poincare.3.wo6yva0hs936zz5496rh4dlrh
# 331014cab26d alpine:latest "ping 8.8.8.8" 11 minutes ago Up 11 minutes sad_poincare.2.rswiwj2yupzpg6fa98wyl2hhd
# 3826f800d56e alpine:latest "ping 8.8.8.8" 21 minutes ago Up 21 minutes sad_poincare.1.v0pw5a8plw3torc0mygpfsuow
$ docker container rm -f 864b580b7f9f
# 864b580b7f9f
$ docker service ps sad_poincare
# ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
# v0pw5a8plw3t sad_poincare.1 alpine:latest linuxkit-025000000001 Running Running 21 minutes ago
# rswiwj2yupzp sad_poincare.2 alpine:latest linuxkit-025000000001 Running Running 11 minutes ago
# b3mbljpdpoyp sad_poincare.3 alpine:latest linuxkit-025000000001 Running Running 1 second ago
# wo6yva0hs936 \_ sad_poincare.3 alpine:latest linuxkit-025000000001 Shutdown Failed 6 seconds ago "task: non-zero exit (137)"
Notice that the ps command keeps history for sad_poincare.3: the task that was shut down and the replacement that Swarm started.
$ docker service rm sad_poincare
This removes the service and the related containers.
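As a quick sanity check after the removal (assuming the same sad_poincare service from above), both the service and its containers should be gone:
$ docker service ls
# sad_poincare no longer appears in the list
$ docker container ls
# the sad_poincare.* containers are gone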
If you want to play with Swarm locally, you can use the docker-machine command to create nodes with VirtualBox.
$ docker-machine create node1
# Running pre-create checks...
# Creating machine...
# (node1) Copying /Users/Elliot/.docker/machine/cache/boot2docker.iso to /Users/Elliot/.docker/machine/machines/node1/boot2docker.iso...
# (node1) Creating VirtualBox VM...
# (node1) Creating SSH key...
# (node1) Starting the VM...
# (node1) Check network to re-create if needed...
# (node1) Waiting for an IP...
# Waiting for machine to be running, this may take a few minutes...
# Detecting operating system of created instance...
# Waiting for SSH to be available...
# Detecting the provisioner...
# Provisioning with boot2docker...
# Copying certs to the local machine directory...
# Copying certs to the remote machine...
# Setting Docker configuration on the remote daemon...
# Checking connection to Docker...
# Docker is up and running!
# To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env node1
Assuming you've created 3 nodes (node1, node2, and node3):
$ docker-machine ls
# NAME ACTIVE DRIVER STATE URL SWARM DOCKER ERRORS
# node1 - virtualbox Running tcp://192.168.99.103:2376 v18.09.1
# node2 - virtualbox Running tcp://192.168.99.104:2376 v18.09.1
# node3 - virtualbox Running tcp://192.168.99.105:2376 v18.09.1
# remove a node
$ docker-machine rm node1
# ssh into a node
$ docker-machine ssh node1
$ docker-machine env node1
# export DOCKER_TLS_VERIFY="1"
# export DOCKER_HOST="tcp://192.168.99.103:2376"
# export DOCKER_CERT_PATH="/Users/Elliot/.docker/machine/machines/node1"
# export DOCKER_MACHINE_NAME="node1"
# # Run this command to configure your shell:
# # eval $(docker-machine env node1)
You can also run the following command, which sets the above environment variables in the current shell so that docker commands point at the node1 node:
$ eval $(docker-machine env node1)
To unset this:
$ eval $(docker-machine env -u)
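If you're not sure which machine your shell is currently pointed at, docker-machine has an active subcommand for that:
$ docker-machine active
# prints the name of the active machine (e.g. node1), or an error if none is active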
Create the first swarm manager on node1:
$ docker-machine ssh node1
node1> $ docker swarm init
# get manager join token command for the other nodes
node1> $ docker swarm join-token manager
# To add a manager to this swarm, run the following command:
#
# docker swarm join --token SWMTKN-1-4mm6iio54sj78z3f87og30e3ygrzrijy9iqciyp7drsqolsghm-6hy6fefvtzr6f0dn8k9704zpu 192.168.99.106:2377
Add node2 and node3 as manager nodes in the swarm:
$ docker-machine ssh node2
node2> $ docker swarm join --token SWMTKN-1-4mm6iio54sj78z3f87og30e3ygrzrijy9iqciyp7drsqolsghm-6hy6fefvtzr6f0dn8k9704zpu 192.168.99.106:2377
node2> $ exit
$ docker-machine ssh node3
node3> $ docker swarm join --token SWMTKN-1-4mm6iio54sj78z3f87og30e3ygrzrijy9iqciyp7drsqolsghm-6hy6fefvtzr6f0dn8k9704zpu 192.168.99.106:2377
node3> $ exit
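At this point, from a shell on node1, docker node ls should list all three nodes (a sketch; the same command is shown with real output in the VPS section below):
node1> $ docker node ls
# node1, node2, and node3 should all show up as Ready managers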
Start a service on the swarm:
$ docker-machine ssh node1
node1> $ docker service create --replicas 3 alpine ping 8.8.8.8
# gracefully stop the VM
$ docker-machine stop node1
# start it back up
$ docker-machine start node1
# remove it entirely
$ docker-machine rm node1
If you're creating VPSs to host the swarm, you need to install Docker:
# ssh into server
$ curl -fsSL https://get.docker.com -o get-docker.sh && sh get-docker.sh
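Once the install script finishes, a quick sanity check (not part of the original setup, just a way to confirm the daemon is running):
$ docker version
# prints both the client and server versions if the daemon is up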
Log into node 1 and initialize the swarm:
# using the public IP of the server (should this be the private local IP instead?)
node1> $ docker swarm init --advertise-addr 157.230.152.10
# Swarm initialized: current node (mm3ph4phd9owg3wn8d0my9rj6) is now a manager.
#
# To add a worker to this swarm, run the following command:
#
# docker swarm join --token SWMTKN-1-4avreivdahazi6mqxgswtxdvlzr2cqhk23x0cricza1usbcxva-c1x9e5xjb7epv16wy5pnu7wkl 157.230.152.10:2377
#
# To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
Then log into the other servers/nodes and run the join command:
node2> $ docker swarm join --token SWMTKN-1-4avreivdahazi6mqxgswtxdvlzr2cqhk23x0cricza1usbcxva-c1x9e5xjb7epv16wy5pnu7wkl 157.230.152.10:2377
node3> $ docker swarm join --token SWMTKN-1-4avreivdahazi6mqxgswtxdvlzr2cqhk23x0cricza1usbcxva-c1x9e5xjb7epv16wy5pnu7wkl 157.230.152.10:2377
Then on node 1, you can list out the nodes in the swarm:
node1> $ docker node ls
# ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS ENGINE VERSION
# mm3ph4phd9owg3wn8d0my9rj6 * node1 Ready Active Leader 18.09.1
# 5mmqipd3k3qqstm7etqb6irxo node2 Ready Active 18.09.1
# gfh833f1lewdkzek0i06myxdm node3 Ready Active 18.09.1
Notice that node1 is the leader.
You can have multiple manager nodes (only one of them is the leader at any time). To promote node2 to a manager:
node1> $ docker node update --role manager node2
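After the promotion, node2 should show Reachable in the MANAGER STATUS column of docker node ls; docker node demote reverses it:
node1> $ docker node ls
# node2 now shows "Reachable" under MANAGER STATUS
node1> $ docker node demote node2
# demotes node2 back to a worker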
Now let's create a new service to run a process and replicate it 3 times:
node1> $ docker service create --replicas 3 alpine ping 8.8.8.8
# st83v3oy9agngbpgrwkfl2x96
# overall progress: 3 out of 3 tasks
# 1/3: running [==================================================>]
# 2/3: running [==================================================>]
# 3/3: running [==================================================>]
# verify: Service converged
node1> $ docker service ls
# ID NAME MODE REPLICAS IMAGE PORTS
# st83v3oy9agn blissful_davinci replicated 3/3 alpine:latest
node1> $ docker service ps blissful_davinci
# ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
# oytfv72hzo0z blissful_davinci.1 alpine:latest node3 Running Running 58 seconds ago
# wopizygovj0a blissful_davinci.2 alpine:latest node1 Running Running 57 seconds ago
# gfra5nvh7r4r blissful_davinci.3 alpine:latest node2 Running Running 57 seconds ago
Notice that it spread the service containers across the 3 available nodes.
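As an aside, docker service scale is shorthand for the --replicas style update shown earlier (using the generated service name from above):
node1> $ docker service scale blissful_davinci=5
# equivalent to: docker service update --replicas 5 blissful_davinci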
# Build the image
$ docker build -f Dockerfile.production -t elliotlarson/onehouse-website:production .
# Push it to the registry
$ docker push elliotlarson/onehouse-website:production
# Point Docker commands at the swarm node
$ eval $(docker-machine env ohweb)
# Deploy to the swarm
$ docker stack deploy --with-registry-auth -c docker-stack.yml ohweb
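A couple of stack-level commands that are handy right after a deploy (sketched against the ohweb stack name used above):
# list the services that belong to the stack
$ docker stack services ohweb
# list the stack's tasks, including failed ones
$ docker stack ps ohweb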
$ docker service ls
# ID NAME MODE REPLICAS IMAGE PORTS
# 9ffl0bmlyh9a rtest_database replicated 1/1 postgres:11.1-alpine
# jzpb3277447w rtest_db-creator replicated 0/1 elliotlarson/myapp_web:prod
# tm8telxthliq rtest_db-migrator replicated 0/1 elliotlarson/myapp_web:prod
# zk83eabezdv6 rtest_web replicated 1/1 elliotlarson/onehouse-website:production *:80->3000/tcp
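When a service shows 0/1 replicas, like db-creator and db-migrator above, docker service ps with --no-trunc shows the full error message for its tasks (for one-shot jobs like these, 0/1 may simply mean the task already exited):
$ docker service ps --no-trunc rtest_db-migrator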
$ docker service ps ohweb_web
# ID NAME IMAGE NODE DESIRED STATE CURRENT STATE ERROR PORTS
# ed2k9qzn6j3x rtest_web.1 elliotlarson/onehouse-website:production rtest Running Running 2 minutes ago
# axrycw0uadax \_ rtest_web.1 elliotlarson/onehouse-website:production rtest Shutdown Shutdown 2 minutes ago
# 7h7uoy7fjptd \_ rtest_web.1 elliotlarson/onehouse-website:production rtest Shutdown Shutdown 14 minutes ago
# oc1gb5iego9z \_ rtest_web.1 elliotlarson/onehouse-website:production rtest Shutdown Failed 14 minutes ago "task: non-zero exit (1)"
# ngyu5rmw9zdx \_ rtest_web.1 elliotlarson/onehouse-website:production rtest Shutdown Failed 14 minutes ago "task: non-zero exit (1)"
$ docker service logs ohweb_web
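The service logs command also supports following and tailing, which is handy while watching a deploy:
# follow the logs, starting from the last 100 lines
$ docker service logs --follow --tail 100 ohweb_web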
# Once happy, undo the env var setting
$ eval $(docker-machine env -u)
# Make sure the env vars are set so that Docker commands talk to the server
$ eval $(docker-machine env ohweb)
# See what services are running
$ docker service ls
# ID NAME MODE REPLICAS IMAGE PORTS
# iubo5lg3z5lx rtest_app replicated 1/1 elliotlarson/onehouse-website:production *:80->3000/tcp
# wolxt036jpxi rtest_database replicated 1/1 postgres:11.1-alpine
# 5csm97zz4ifh rtest_db-creator replicated 0/1 elliotlarson/onehouse-website:production
# idtgsg3apftb rtest_db-migrator replicated 0/1 elliotlarson/onehouse-website:production
# Check out the logs for a service
$ docker service logs rtest_app
# See what containers are running
$ docker container ps
# CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
# 7fb12e44f183 elliotlarson/onehouse-website:production "bin/rails s -b 0.0.…" 2 minutes ago Up 2 minutes rtest_app.1.pu4h2tflon9jj9kcjbsqj74c9
# 06fc7a7124e8 postgres:11.1-alpine "docker-entrypoint.s…" 19 minutes ago Up 19 minutes 5432/tcp rtest_database.1.yrv0bgfzqr57frwn3ogm9okht
# Log into the container
$ docker exec -it rtest_app.1.pu4h2tflon9jj9kcjbsqj74c9 sh
# Get a list of the containers
$ docker container ls
# CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
# 95c853b76ca2 elliotlarson/onehouse-website:production "bin/rails s -b 0.0.…" 5 minutes ago Up 5 minutes rtest_app.1.h3smda9yk95u0w8adkyrbv2cc
# 06fc7a7124e8 postgres:11.1-alpine "docker-entrypoint.s…" 42 minutes ago Up 42 minutes 5432/tcp rtest_database.1.yrv0bgfzqr57frwn3ogm9okht
# Get on the Rails console
$ docker exec -it rtest_app.1.h3smda9yk95u0w8adkyrbv2cc rails c
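Since the task suffix on the container name changes whenever the container is replaced, it can be easier to look the container up with a name filter than to copy the full name (a sketch assuming the rtest_app service above has a single replica):
# resolve the running app container by name filter, then open the Rails console
$ docker exec -it $(docker container ls -q --filter name=rtest_app) rails c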