After the fantastic DockerCon Europe and the recent releases of Docker 0.5.2 and Swarm 1.0.1, I finally have all the missing bits to automatically deploy a suite of Atlassian products to a Swarm cluster without supervision:
- Docker Network and Docker Swarm are now production ready.
- Docker Compose finally works in a multi-host configuration.
- Swarm can handle 1,000+ hosts and 50,000 containers, as demonstrated live on stage.
This is the dream I have, and one we probably all share as an industry: to describe our software components, describe how they are linked together, and let the infrastructure automatically arrange itself to match our needs. It’s here! It has been cooking for a while and, depending on your technology stack, maybe it is already there for you. Nonetheless, the Docker suite of tools has reached that moment for me. And it’s glorious.
Let me show you an example of the possibilities.
This is the end result I have in mind, where my setup does not mention any hard-coded IP address:
As a prerequisite I need an account on an IaaS provider. This time around I chose Digital Ocean, but any of the other Docker Machine drivers will do. I create an authenticated API_TOKEN (referenced as $DO_TOKEN in the commands below), which allows me to create nodes at will.
Install and run discovery server
The new multi-host capabilities of Compose and Swarm require a more complete discovery service than the basic Docker Hub Swarm tokens, so in this piece I will use Consul, a discovery server and key/value store from HashiCorp.
- First step, create the consul node using docker-machine:
docker-machine create -d digitalocean --digitalocean-access-token=$DO_TOKEN --digitalocean-region "ams2" consul
This specifies the “ams2” region, passes my token and names this machine “consul”.
Running pre-create checks...
Creating machine...
Waiting for machine to be running, this may take a few minutes...
Machine is running, waiting for SSH to be available...
Detecting operating system of created instance...
Provisioning created instance...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
To see how to connect Docker to this machine, run: docker-machine env discovery
- After the machine is ready, switch our docker environment to run commands on that instance by evaluating:
eval "$(docker-machine env consul)"
- Finally, run the Consul server in a simple, non-redundant configuration with:
docker run -d -p 8400:8400 -p 8500:8500 -p 8600:53/udp -h consul progrium/consul -server -bootstrap
- Test it by curling:
curl $(docker-machine ip consul):8500/v1/catalog/nodes
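Consul can take a few seconds to start answering after docker run returns, so a scripted setup may prefer to poll the endpoint rather than curl it once. Here is a minimal sketch, assuming curl is installed. The function names, the retry count and the sleep interval are my own choices for illustration and are not part of any tool.

```shell
# Build the catalog URL for a Consul server listening on port 8500.
consul_catalog_url() {
  printf 'http://%s:8500/v1/catalog/nodes\n' "$1"
}

# Poll the catalog endpoint until it answers or we run out of attempts.
# $1 = Consul IP, $2 = maximum attempts (defaults to 10, an arbitrary choice).
wait_for_consul() {
  url=$(consul_catalog_url "$1")
  attempts=${2:-10}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl return non-zero on HTTP errors, -sS keeps it quiet
    if curl -fsS "$url" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  return 1
}
```

It could be used as: wait_for_consul "$(docker-machine ip consul)" && echo "consul is up"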
Set up a 3-node Swarm cluster
Now we can create a cluster of 3 machines, with slightly different requirements.
- Let’s start with the Swarm master, which will control our entire cluster:
docker-machine create -d digitalocean --digitalocean-access-token=$DO_TOKEN --digitalocean-image "debian-8-x64" --digitalocean-region "ams3" --swarm --swarm-master --swarm-discovery=consul://$(docker-machine ip consul):8500 --engine-opt="cluster-store=consul://$(docker-machine ip consul):8500" --engine-opt="cluster-advertise=eth0:2376" cluster
Note that we chose a specific Debian 8.2 image, debian-8-x64. The default Ubuntu image that Docker Machine picks on Digital Ocean won’t work, because its older kernel does not support Docker overlay networks. We also pass cluster-store and cluster-advertise to the Docker engine on this new machine: cluster-store tells the engine where to keep the keys and values describing the infrastructure we are building, and cluster-advertise is the interface and port the engine advertises to the other nodes. The keys and values are stored on the consul instance we readied before.
- Next, create a machine with 2GB of RAM to run Bitbucket Server:
docker-machine create -d digitalocean --digitalocean-access-token=$DO_TOKEN --digitalocean-image "debian-8-x64" --digitalocean-region "ams3" --digitalocean-size "2gb" --swarm --swarm-discovery=consul://$(docker-machine ip consul):8500 --engine-label instance=java --engine-opt="cluster-store=consul://$(docker-machine ip consul):8500" --engine-opt="cluster-advertise=eth0:2376" node1
We require the machine to have 2GB of RAM and tag it with the label instance=java so that we can deploy our application based on labels.
- Third, create a machine to host the PostgreSQL database:
docker-machine create -d digitalocean --digitalocean-access-token=$DO_TOKEN --digitalocean-image "debian-8-x64" --digitalocean-region "ams3" --swarm --swarm-discovery=consul://$(docker-machine ip consul):8500 --engine-label instance=db --engine-opt="cluster-store=consul://$(docker-machine ip consul):8500" --engine-opt="cluster-advertise=eth0:2376" node2
We tag this machine with the label instance=db so that we can deploy our application based on labels.
- Check that the machines have been created (this is the output of docker-machine ls):
NAME      ACTIVE   DRIVER         STATE     URL                         SWARM
consul    -        digitalocean   Running   tcp://220.127.116.11:2376
cluster   *        digitalocean   Running   tcp://18.104.22.168:2376    cluster (master)
node1     -        digitalocean   Running   tcp://22.214.171.124:2376   cluster
node2     -        digitalocean   Running   tcp://126.96.36.199:2376    cluster
- Connect our local docker command to the entire Swarm:
eval $(docker-machine env --swarm cluster)
Containers: 15
Images: 12
Role: primary
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 3
 cluster: 188.8.131.52:2376
  └ Containers: 2
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 519.2 MiB
  └ Labels: executiondriver=native-0.2, kernelversion=3.16.0-4-amd64, operatingsystem=Debian GNU/Linux 8 (jessie), provider=digitalocean, storagedriver=aufs
 node1: 184.108.40.206:2376
  └ Containers: 10
  └ Reserved CPUs: 0 / 2
  └ Reserved Memory: 0 B / 2.061 GiB
  └ Labels: executiondriver=native-0.2, instance=java, kernelversion=3.16.0-4-amd64, operatingsystem=Debian GNU/Linux 8 (jessie), provider=digitalocean, storagedriver=aufs
 node2: 220.127.116.11:2376
  └ Containers: 3
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 519.2 MiB
  └ Labels: executiondriver=native-0.2, instance=db, kernelversion=3.16.0-4-amd64, operatingsystem=Debian GNU/Linux 8 (jessie), provider=digitalocean, storagedriver=aufs
CPUs: 4
Total Memory: 3.075 GiB
Name: c5e1ce85f79a
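The three docker-machine create commands above share most of their flags, so a small helper can keep them consistent. The sketch below only prints the command instead of running it, which makes it easy to review first. The function name swarm_node_cmd is hypothetical, and the region, image and eth0 interface are simply the values used in this post.

```shell
# Print (not run) the docker-machine create command for one Swarm node.
# $1 = machine name, $2 = Consul IP, remaining arguments = extra flags.
# $DO_TOKEN is left unexpanded in the output so the token is not echoed.
swarm_node_cmd() {
  name=$1
  consul_ip=$2
  shift 2
  kv="consul://${consul_ip}:8500"
  printf '%s' "docker-machine create -d digitalocean"
  # printf repeats the ' %s' format once per argument
  printf ' %s' \
    '--digitalocean-access-token=$DO_TOKEN' \
    '--digitalocean-image debian-8-x64' \
    '--digitalocean-region ams3' \
    '--swarm' \
    "--swarm-discovery=${kv}" \
    "--engine-opt=cluster-store=${kv}" \
    '--engine-opt=cluster-advertise=eth0:2376' \
    "$@" \
    "$name"
  printf '\n'
}
```

For example, swarm_node_cmd node2 "$(docker-machine ip consul)" --engine-label instance=db prints the command for the database node, and passing --swarm-master as an extra flag gives the master.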
Multi-host Docker Compose configuration
Next on the list is to write the multi-host configuration in a docker-compose.yml, which will take care of starting both our Java application and our database in the proper order. It will also create a transparent overlay network between the cluster nodes involved.
The interesting points of the setup are:
- We do not specify any IP addresses for the physical infrastructure.
- We allocate applications to nodes using label constraints.
- We create a data-only container with Bitbucket Server licensing information.
- We only use official images from the Docker Hub.
This is the complete docker-compose.yml:
bitbucket:
  image: atlassian/bitbucket-server
  ports:
    - "7990:7990"
    - "7999:7999"
  volumes_from:
    - license
  user: root
  privileged: true
  environment:
    - "constraint:instance==java"
db:
  image: postgres
  ports:
    - "5432:5432"
  environment:
    - "POSTGRES_PASSWORD=somepassword"
    - "constraint:instance==db"
license:
  build: .
The data-only license container is built from a Dockerfile written like this:
FROM alpine
RUN mkdir -p /var/atlassian/application-data/bitbucket/shared
COPY ./bitbucket.properties /var/atlassian/application-data/bitbucket/shared/bitbucket.properties
VOLUME /var/atlassian/application-data/bitbucket
CMD ["/bin/true"]
The only file it actually stores is a single bitbucket.properties file with these contents:
setup.displayName=Bitbucket Server
setup.baseUrl=http://localhost:7990
setup.license=<fill your license>
setup.sysadmin.username=admin
setup.sysadmin.password=admin
setup.sysadmin.displayName=<User Name>
setup.sysadmin.emailAddress=<Email Address>
jdbc.driver=org.postgresql.Driver
jdbc.url=jdbc:postgresql://orchestration_db_1:5432/postgres
jdbc.user=postgres
jdbc.password=somepassword
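The host in jdbc.url is a container name rather than an IP address. The Compose version used here names containers project_service_index, and "orchestration" is presumably the Compose project name (by default derived from the directory holding docker-compose.yml). A small sketch of how that URL is assembled, with hypothetical helper names:

```shell
# Compose (at the version used here) names containers <project>_<service>_<index>.
# $1 = project, $2 = service, $3 = index (defaults to 1).
container_name() {
  printf '%s_%s_%s\n' "$1" "$2" "${3:-1}"
}

# JDBC URL for the PostgreSQL service, assuming the default port and the
# default "postgres" database as in the properties file above.
jdbc_url() {
  printf 'jdbc:postgresql://%s:5432/postgres\n' "$(container_name "$1" "$2")"
}
```

If the project lives in a directory with a different name, jdbc_url shows what jdbc.url would have to become, for example jdbc_url myproject db.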
To start everything we can now invoke docker-compose, making sure we turn on the multi-host networking and specify we want to use an overlay network:
docker-compose --x-networking --x-network-driver=overlay up -d
The result is our application deployed to the cluster:
docker ps
CONTAINER ID   IMAGE                        COMMAND                  CREATED       STATUS       PORTS                                                         NAMES
0f6adc9a14bb   atlassian/bitbucket-server   "./bin/start-bitbucke"   2 hours ago   Up 2 hours   18.104.22.168:7990->7990/tcp, 22.214.171.124:7999->7999/tcp   node1/orchestration_bitbucket_1
0a305957925f   postgres                     "/docker-entrypoint.s"   2 hours ago   Up 2 hours   126.96.36.199:5432->5432/tcp                                  node2/orchestration_db_1
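When talking to the Swarm, docker ps prefixes every container name with the node it runs on, which makes placement easy to check in a script. This is a rough sketch with hypothetical helper names. It reads node/container names on standard input, so it can be tried without a cluster.

```shell
# Extract the node from a Swarm-style name such as node1/orchestration_db_1.
node_of() {
  printf '%s\n' "${1%%/*}"
}

# Read "node/container" names on stdin and fail if any container whose name
# contains $1 is running on a node other than $2.
assert_placement() {
  while IFS= read -r full; do
    case "$full" in
      */*"$1"*)
        [ "$(node_of "$full")" = "$2" ] || return 1
        ;;
    esac
  done
  return 0
}
```

Against the live cluster it could be fed like this: docker ps --format '{{.Names}}' | assert_placement bitbucket node1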
Note that the Java application “Bitbucket Server” was deployed to the instance with 2GB of RAM labelled java, as planned, and PostgreSQL onto node2, which was labelled db.
While creating the setup above I ran into a whole set of issues, partially due to the novelty of the tools and partially due to my hastiness.
- Proper orchestration only works with a fully fledged discovery service like Consul, not the default token you get when running the basic `docker run swarm create`.
- In the cluster-advertise flag I first used eth1, but the right interface depends on the specific machine and provider used. In the case of Digital Ocean the correct interface is eth0. I tracked that down by looking into /var/log/upstart/docker.log, where I found this bit:
Error starting daemon: discovery advertise parsing failed (no available advertise IP address in interface (eth1:2376))
To understand what happened I even went looking into the source code.
- At one point I got a very cryptic failure on vxlan interface creation, like the following:
ERROR: Cannot start container 774f639d4275af7f53dd8c8f3d65387d053c8000ab96ce3c6765b982428c3a2d: subnet sandbox join failed for "10.0.0.0/24": vxlan interface creation failed for subnet "10.0.0.0/24": failed in prefunc: failed to set namespace on link "vxlana389573": invalid argument
It turns out that to get full-blown multi-host support in Compose and Swarm you need a 3.15+ Linux kernel (as explained here), and the default Digital Ocean Ubuntu image had an older one:
Linux node3 3.13.0-68-generic #111-Ubuntu SMP Fri Nov 6 18:17:06 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
To make things work I had to add --digitalocean-image "debian-8-x64" when creating the machines.
- To find the proper Digital Ocean image I installed a neat tool called tugboat, which is a command line tool to provision DO images:
gem install tugboat
tugboat authorize
tugboat images | grep ubuntu
12.04.5 x64 (slug: ubuntu-12-04-x64, id: 10321756, distro: Ubuntu)
12.04.5 x32 (slug: ubuntu-12-04-x32, id: 10321777, distro: Ubuntu)
15.10 x64 (slug: ubuntu-15-10-x64, id: 14169855, distro: Ubuntu)
15.10 x32 (slug: ubuntu-15-10-x32, id: 14169868, distro: Ubuntu)
15.04 x64 (slug: ubuntu-15-04-x64, id: 14169884, distro: Ubuntu)
15.04 x32 (slug: ubuntu-15-04-x32, id: 14169999, distro: Ubuntu)
14.04.3 x64 (slug: ubuntu-14-04-x64, id: 14530089, distro: Ubuntu)
14.04.3 x32 (slug: ubuntu-14-04-x32, id: 14530129, distro: Ubuntu)
I tried all the Ubuntu images and they all failed, including 15.04, so I had to use a Debian 8.2 image, which had the proper kernel version and didn’t crash.
- Whenever I needed to restart the containers, something went wrong with the overlay network creation: the vxlan network gave me problems. To remove the vxlan configurations I used:
sudo umount /var/run/docker/netns/* && sudo rm /var/run/docker/netns/* && start docker
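The kernel problem described above can be caught before any container is started. A minimal sketch, using the 3.15 threshold mentioned in this post (adjust it if the documentation for your Docker version states a different minimum); the function name is hypothetical:

```shell
# Succeed if the kernel release in $1 (e.g. "3.16.0-4-amd64") is 3.15 or newer.
kernel_supports_overlay() {
  major=${1%%.*}            # text before the first dot
  rest=${1#*.}              # text after the first dot
  minor=${rest%%[!0-9]*}    # leading digits of what follows
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 15 ]; }
}
```

It can be run against each machine with something like: kernel_supports_overlay "$(docker-machine ssh node1 uname -r)" || echo "node1: kernel too old for overlay networks"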
The source of the above configurations can be found on Bitbucket.
For me this was the first magical step towards having an entire suite of Atlassian tools deployed and run automatically on a Docker Swarm. Stay tuned for the next chapter in the series. If you found this interesting and want more, follow me at @durdn or my awesome team at @atlassiandev.