
Containers Part 2: What is docker-compose?

Dmitry Melanchenko
May 23 - 8 min read

This is part 2 of a 3 part series on containers. Check out the background intro in case you missed it.

Up to this point, we mainly focused on working with a single container. In reality, however, an application doesn’t usually run in isolation: it needs supporting services like a database, a queue service, and so on. It’s fine to run all of these services by calling docker run for every container and to script those sequences in a shell script, but there’s already a tool for this task: docker-compose.

docker-compose is a python script that reads its configuration from a YAML file (docker-compose.yaml). The file describes how to configure and run a group of containers, which virtual networks they should be bound to, which volumes should be mounted, and more.

In addition to this, docker-compose supports a subset of commands from the original Docker CLI but applies them to the containers referenced by name in the YAML configuration.

Check the docs for more information about docker-compose.

And check the docs for more information about docker-compose.yaml.
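As a quick taste of the workflow, here is a sketch of a typical session run from a project folder containing a docker-compose.yaml (these are standard docker-compose subcommands, but the exact output depends on your project):

```shell
# Start every service described in docker-compose.yaml, detached
docker-compose up -d

# List only the containers that belong to this project
docker-compose ps

# Tail logs from all services at once
docker-compose logs -f

# Stop and remove the containers and networks (named volumes survive)
docker-compose down
```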

The typical structure of a docker-compose.yaml file is:

version: '2'
services:
  webapp:
    image: nginx

networks:
  myapp:

volumes:
  mydata:

The standard for the YAML file changes with every release, but the standard is versioned, and each new release of the tool remains backward compatible. The first field in the file is version; it defines which version of the standard should be used to parse the file.

How to define services

The next key node in the docker-compose.yaml file is services. This node defines all containers to run as part of the configuration. The structure of this section is very simple:

services:
  <service 1 name>:
    <map of container 1 properties>
  <service 2 name>:
    <map of container 2 properties>
  ...

Service names in this section are important because if you want to manipulate a single container using docker-compose, you refer to it by its service name in the command. For example, if you want to start a container named boo, you run docker-compose up boo.
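The same per-service addressing works across the CLI. A sketch, reusing the hypothetical service name boo:

```shell
docker-compose up -d boo      # start only boo (and its dependencies)
docker-compose logs -f boo    # follow the logs of boo alone
docker-compose restart boo    # restart just this one service
docker-compose stop boo       # stop it without touching the others
```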

See the docs for the full list of container properties supported by docker-compose.

How to define networks

The next top-level section in a docker-compose.yaml file is for network configuration. The section looks like this:

networks:
  <network-id>:
    <map of settings for the network>
  <another-network-id>:
    <map of settings for the other network>

There are several configuration parameters for networks, but normally the name is enough:

networks:
  some-name: {}
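When the defaults aren’t enough, the same section accepts more parameters. A sketch using the documented driver and ipam settings (the subnet value here is only an example):

```yaml
networks:
  some-name:
    driver: bridge
    ipam:
      config:
        - subnet: 172.28.0.0/16
```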

It’s important to understand how docker-compose names networks. Let’s assume the directory structure of our project is:

~/Workspace
|--> my_project
|--> docker-compose.yaml

And a docker-compose.yaml file has a section like:

networks:
  my-network:

In this example, the final name of the network (what Docker displays when you call docker network ls) is myproject_my-network: docker-compose derives the project name from the directory name, dropping special characters such as underscores, and prefixes it to the network name. This note is important because the network name defines the default domain name, and the default domain name is part of the FQDN returned on reverse DNS lookups for a container.

The name of a project is part of the network name, so if you clone a project from somewhere else, it’s important to clone it into a folder with the same name as the project. The author of the project might have code or configurations based on that assumption.
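If you can’t control the folder name, docker-compose also lets you pin the project name explicitly, either with the -p flag or the COMPOSE_PROJECT_NAME environment variable:

```shell
# Both commands force the project name, and therefore the
# network/volume prefix, to "myproject" regardless of the folder name
docker-compose -p myproject up -d
COMPOSE_PROJECT_NAME=myproject docker-compose up -d
```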

More interesting is what happens if the name of the network is “my.network”. The answer is straightforward: the domain name is <HOST>.my.network.

The FQDN returned by a reverse lookup for a service named “my-service” on a network named “my.network” is myproject_my-service_1.myproject_my.network. The 1 in the FQDN is a sequence number for an instance of the service; if the service is scaled up, the next instance has 2 in its FQDN.

Check out more information about network configuration.

Connect a service to a network

When one or more networks are defined as part of docker-compose.yaml, it’s important to connect services to them. Only services within the same network can see each other. If you want to separate services into logical segments, assign them to different networks. A service assigned to two networks is visible to services from both of them. Different networks can be used, for example, to simulate different Kerberos realms.
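As a sketch of such segmentation (all service and network names here are made up): db is reachable only from app, because proxy and db share no network:

```yaml
services:
  proxy:
    image: nginx
    networks:
      - frontend

  app:
    image: my-app:v1
    networks:
      - frontend
      - backend

  db:
    image: postgres
    networks:
      - backend

networks:
  frontend: {}
  backend: {}
```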

Another useful feature is that services can have aliases on a network – like CNAME DNS records.

To connect a service to a network you should add something like this into your docker-compose.yaml:

services:
  my-service:
    image: my-service-image:v1
    networks:
      - network1

networks:
  network1: {}

To connect the same service to two networks, simply use something like this:

services:
  my-service:
    image: my-service-image:v1
    networks:
      - network1
      - network2

networks:
  network1: {}
  network2: {}

To set an alias to a service on a network, use something like this:

services:
  my-service:
    image: my-service-image:v1
    networks:
      network1:
        aliases:
          - the-best-service
          - service.my.domain

networks:
  network1: {}

How to define volumes

Another important section of the docker-compose.yaml file is volume configuration. The top-level key in the section is volumes, and all environment-level volumes go under this section.

The section looks like this:

volumes:
  <volume-id>:
    <map of settings for the volume>
  <another-volume-id>:
    <map of settings for the other volume>

The official documentation describes several parameters to configure the volumes. However, most of the time all you need is the simplest version of the section:

volumes:
  zk-data: {}
  hbase-data: {}

Why would a cluster need environment-level volumes? There are two major reasons: sharing a volume among several containers, and persisting data across environment stops and starts.

Newly created volumes follow the same naming pattern described for networks. Volume names aren’t as critical as network names, but still pay attention to which folder you clone a project into on your computer.
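A sketch of the sharing case (the service, image, and volume names are made up): two services mount the same named volume, one of them read-only:

```yaml
services:
  writer:
    image: my-producer:v1
    volumes:
      - shared-data:/data

  reader:
    image: my-consumer:v1
    volumes:
      - shared-data:/data:ro

volumes:
  shared-data: {}
```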

Mount local files to a container

It’s not uncommon to need to inject some locally stored files into a container. It might be a TLS certificate, a kerberos keytab, a configuration file, a script, or something else. There are a few simple rules for doing that:

  • If the files are significant for the project, it’s better to keep them either in the context folder or in one of its subfolders. Doing so avoids creating extra dependencies your colleagues would have to reproduce if they decide to use the project.
  • Avoid mounting any files that could create a security threat: real ssh private keys, docker.sock, passwords, etc.
  • Try to mount files in read-only mode whenever possible.

Here’s an example of how to mount a file. Let’s assume we’re running zookeeper as part of our cluster and we’re using an image called jplock/zookeeper. As of now (commit f42ab44), the image expects a zoo.cfg to be placed here: /opt/zookeeper/conf/zoo.cfg. We keep our version of the file in a project subdirectory called zk:

version: "2"services:
zk:
image: jplock/zookeeper
volumes:
- ./zk/zoo.cfg:/opt/zookeeper/conf/zoo.cfg:ro

Mount volume to a container

Next we’ll talk about mounting environment-level volumes into a container. Let’s continue with the zookeeper example: we want to add a persistent volume to keep ZK data when the container restarts. To do this, we need to enhance our example a little bit. Let’s say our version of zoo.cfg leaves dataDir the same as in a zoo_sample.cfg (from a zookeeper package): dataDir=/tmp/zookeeper

version: "2"services:
zk:
image: jplock/zookeeper
volumes:
- ./zk/zoo.cfg:/opt/zookeeper/conf/zoo.cfg:ro
- zk-data:/tmp/zookeeper

volumes:
zk-data: {}

How to define dependencies between containers

Usually a local dev environment for an application a team works on is a multi-service setup. So in some cases, it can be useful to have a startup sequence for services: Apache Kafka depends on Apache Zookeeper, an HBase Region server depends on the HBase Master node, and so on. It’s very common for applications to exit with an error if some critical dependency is unavailable.

Our goal is to spin up the whole environment or any parts of it using one command without babysitting every individual container. To achieve this goal we need to define those dependencies and instruct docker-compose when to start the next service.

There’s a special parameter, depends_on, to set dependencies between services. Its value can be either a list or a map of conditions. Let’s look at an example of the list version:

services:
  zk:
    image: jplock/zookeeper

  kafka:
    image: wurstmeister/kafka
    environment:
      - KAFKA_ZOOKEEPER_CONNECT=zk:2181
    depends_on:
      - zk

The list version of the parameter tells docker-compose that when docker-compose up kafka is called, it really means docker-compose up zk kafka. It’s easy to start the services explicitly without any dependencies, and that works fine when there are only 2 services; as the number of services grows, it’s easier to use dependencies.

Versions 2.1 – 3.0 of the compose file format also support a “condition” version of the parameter. Here, the value is a map where keys are names of other services and values are condition: <state> entries telling what state those services are expected to reach before the current one starts.

services:
  zk:
    image: jplock/zookeeper

  kafka:
    image: wurstmeister/kafka
    environment:
      - KAFKA_ZOOKEEPER_CONNECT=zk:2181
    depends_on:
      zk:
        condition: service_healthy

There are two possible values for the condition: service_healthy and service_started.
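Note that service_healthy only works if the dependency defines a health check. A sketch of adding one to the zk service, using the documented healthcheck keys; the probe command assumes nc is available inside the image (ruok/imok is Zookeeper’s standard four-letter admin command):

```yaml
services:
  zk:
    image: jplock/zookeeper
    healthcheck:
      # succeeds once Zookeeper answers "imok" to "ruok";
      # nc availability inside this image is an assumption
      test: ["CMD-SHELL", "echo ruok | nc localhost 2181 | grep imok"]
      interval: 10s
      timeout: 5s
      retries: 5
```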

After version 3.0 of the compose file format, the condition version was deprecated, and the tool’s authors suggest either implementing retries in your application or using other tools to hold back an application’s startup while critical dependencies are unavailable. You can try tools like wait-for. Alternatively, if bash runs your entrypoint script, you can use something like this:

while ! (echo > /dev/tcp/zk/2181) >/dev/null 2>&1; do
  sleep 1
done
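For comparison, a hedged sketch of wiring the wait-for script into a service; it assumes the script has been copied into the image at /wait-for and that /start-kafka.sh is the image’s real entrypoint (both paths are made up):

```yaml
services:
  kafka:
    image: wurstmeister/kafka
    depends_on:
      - zk
    # wait-for blocks until zk:2181 accepts TCP connections,
    # then runs the command after "--"
    command: ["/wait-for", "zk:2181", "--", "/start-kafka.sh"]
```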

How to expose a port from a container

Normally an image lists all the ports a user should care about using the EXPOSE instruction. But what if you need an extra port exposed: to reach an admin port locked inside a container, to connect to a remote debug port, or for some other reason?

Docker-compose can expose additional ports from a container. There are 2 parameters for that: expose, which makes a port available to other containers, and ports, which maps a port to the host so the port can be accessed from the host.

The syntax for expose is simple:

services:
  zk:
    image: jplock/zookeeper
    environment:
      JMXPORT: 1234
    expose:
      - "1234"

This example enables and exposes the JMX port from the container.

Let’s say we want to debug the zookeeper:

services:
  zk:
    image: jplock/zookeeper
    environment:
      JVMFLAGS: "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005"
    ports:
      - "15005:5005"

Remote debugging is available on port 15005 on the host computer, and all traffic from that port is proxied to port 5005 inside of the container.

How to build a container from docker-compose

Docker-compose can also build images for services listed in the docker-compose.yaml file. To do so, add a build parameter to a service:

services:
  myapp:
    image: my-app:v1
    build: ./../my-app

If both the build and image parameters are set, then docker-compose build myapp builds a new image using the Dockerfile in the ../my-app folder and tags the newly built image as my-app:v1.

Be aware that the build command only does a docker build. If, for example, the build process copies an output jar file for your application into the container and you don’t use a multi-stage Dockerfile, then it’s your responsibility to call something like mvn package first to generate the jar file.
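build also accepts a map form when you need more control. context and dockerfile are documented keys; the dockerfile name and the args value below are made up for illustration:

```yaml
services:
  myapp:
    image: my-app:v1
    build:
      context: ./../my-app
      dockerfile: Dockerfile.dev
      args:
        APP_VERSION: "1.0"
```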

Get more details about customizing build parameters in the docs.


Continue reading part 3, “How to run existing applications in containers” and let me know your feedback on this part in the comments.
