
Containers Part 1: What is a Container?

Dmitry Melanchenko
May 23 - 9 min read

This is part 1 of a 3-part series on containers. Check out the background intro in case you missed it.

Without going into too much detail, let’s say that a container is an application and all the libraries it needs to run, packaged together with a single entry point. There are different implementations of containers, but we’re focusing on Docker containers in this document.

Runtime for Docker on Linux and Mac/Windows

Docker containerization is based on Linux Containers — an OS-level virtualization that allows us to limit what resources an application running in the container can access. A key word in this definition is Linux. Docker containerization works natively on Linux.

Docker containerization on Linux is a process or processes on the host machine, with some level of isolation from the resources of the host machine.

macOS needs to run a Linux kernel underneath in a virtual machine, and the host OS shares some resources with the VM.

Docker containerization on macOS is a process or processes inside of a virtual machine, and host resources are shared with the process by sharing them with the VM first.

Windows 10 Anniversary Edition supports Hyper-V containers and Docker uses this technology to run containers on Windows.

Where container images come from

When we talk about running containers, it’s important to understand the difference between images and containers.

An image is a copy of a multilayered file system, available via a standard API, with a manifest attached.

A container is a running tree of processes on a host machine. Processes in the tree share resources among themselves, like CPU and RAM quotas and a network stack. All of these resources are virtualized from the actual host machine. A container is always based on an image, even if the image has nothing in it.

A common practice these days is to put everything into an image once and run multiple containers using the same image. An image can be built and stored locally or it can be shared via registries. The default registry is hub.docker.com, but the system also supports custom registries.

Names of all images stored in a registry follow a standard pattern. Let’s take a look at an example: www.my-site.com/my/image:v1

The name can be broken into 3 components:

  1. Name of a registry — here, www.my-site.com/ is the web site where you can find the image. If you don’t give one, the registry will be hub.docker.com/.
  2. Path to the image in the registry — the example path is my/image.
  3. Tag — name of a tag attached to a specific version of the image. In the example, the tag is :v1. If you don’t provide a tag, it defaults to :latest.

A registry can keep multiple versions of the same image, and every version is identified by a SHA256 hash code. Hash codes are based on the content of an image, so they can’t be changed. Tags, on the other hand, can be reassigned from one version to another. For example, if a team agreed to run production containers based on an image with the tag :release, then part of the release process can be to move the tag from the previous release version to the new one. It’s important to note that the same version of an image can have several tags attached, and it can be found by any one of them.
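A release flow along those lines might look like the following sketch; the registry, image name, and tags reuse the hypothetical example above.

# Build and tag a new version of the image
docker build -t www.my-site.com/my/image:v2 .

# Move the release tag to the new version; the v2 tag stays attached too
docker tag www.my-site.com/my/image:v2 www.my-site.com/my/image:release

# Push both tags to the registry
docker push www.my-site.com/my/image:v2
docker push www.my-site.com/my/image:release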

Build an image using a Dockerfile

The first step in building a new image is to decide on a base image. Containers don’t run their own Linux kernel, but everything else is in the image: system libraries and utilities, runtimes, etc. Everyone is free to choose an existing base image, or even build a new one. Here are a few of the most popular images:

  • scratch — An empty base image that contains only the components you add.
  • busybox — A minimalist base image for statically compiled applications. None of the tools located at /usr/bin or /usr/sbin are available in this image. It doesn’t support package repos so everything should be copied into a new image at build time. It does support some basic shell scripts.
  • alpine — A minimalist version of Linux. It is based on busybox and supports package management.
  • ubuntu, debian, centos, … — These images include libraries and utilities from the popular flavors of Linux, including package managers like yum or apt.
  • openjdk — This base image includes a JDK or JRE in addition to basic system utilities and libraries.

After you’ve decided on a good base image, there are 2 options. First, you can create a container from the base image, copy everything into it, and then create an image from the container. Or, you can script the process using a Dockerfile. All instructions supported inside of a Dockerfile can be found online. Here’s an example of a simple file:

FROM scratch
COPY hello /
CMD ["/hello"]

This script tells Docker that the new image is based on a scratch image. Next, it says that a hello file should be copied from the local file system into the image. The last instruction runs the hello file when a new container based on the image starts.

To build an image from a Dockerfile in the current directory, simply run docker build ., or the newer version docker image build .. To give the image a specific name, add the -t <image name> flag. It’s also possible to tag or re-tag the image later using the docker tag <source tag or hash> <target tag> command.

Learn more about docker build.

How to add a file to an image

There are 2 commands that add files to an image:

  • COPY <src>... <dest> or COPY ["<src>",... "<dest>"] – this command copies file(s) from the local file system into the image. When Docker builds an image, it runs in some directory, normally the same directory where the Dockerfile is located. All files available to COPY should be either in that directory or in one of its subdirectories.
  • ADD <src>... <dest> or ADD ["<src>",... "<dest>"] – in addition to the same semantics as COPY, ADD can insert files not only from the local file system: it can also download files from remote URLs and unpack tar archives with gzip, bzip2 or xz compression into a folder. The unpack feature is available only when an archive is copied from the local file system, as shown in the sketch after this list.
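As a quick illustration, here is a hypothetical Dockerfile fragment using both instructions; all file names and the URL are made up.

# COPY only takes files from the build context
COPY app.jar /opt/app/app.jar

# ADD unpacks a local tar.gz archive into the target directory
ADD config.tar.gz /opt/app/config/

# ADD can also download from a remote URL (downloaded files are not unpacked)
ADD https://example.com/extra.txt /opt/app/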

How to run a command in the image

Often it is important to run a command while building an image. To do so, call RUN <command> or RUN ["executable", "param1", "param2"]. This runs the command in an intermediate container that is used to build the new image. For example, in this Dockerfile:

FROM openjdk:8-jre
RUN ln -s ${JAVA_HOME} /usr/java

An image based on openjdk:8-jre will be created, and the path defined by the env variable JAVA_HOME will be symlinked to /usr/java.

How to define container startup commands

When you build an executable image it’s important to define which command will be executed when a container based on the image starts. Docker provides 2 ways to define that:

  • ENTRYPOINT ["executable", "param1", "param2"] or ENTRYPOINT command param1 param2 – This instruction defines what should be executed at container startup.
  • CMD ["executable","param1","param2"], CMD ["param1","param2"] or CMD command param1 param2 – This instruction is mostly the same as ENTRYPOINT. The main difference is the second form, which supplies default parameters for ENTRYPOINT. When both ENTRYPOINT and CMD are defined, the final command looks like: entrypoint_executable entrypoint_params ... cmd_params ... (see the sketch after this list).
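As a minimal sketch of how the two combine (busybox is chosen here only because it ships with /bin/echo):

FROM busybox
# ENTRYPOINT fixes the executable and its first argument
ENTRYPOINT ["/bin/echo", "hello"]
# CMD supplies default parameters that can be overridden at run time
CMD ["world"]

A container built from this image prints hello world by default, while docker run <image> docker prints hello docker instead.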

To illustrate how startup commands are used in practice, let’s consider an example. Our container should execute a command with the following semantics:

app [command] [arguments]
Command:
help - prints this message
check-config - checks provided configuration
run - runs the application

And the image is built using a Dockerfile like this one:

FROM scratch
COPY app /
ENTRYPOINT ["/app"]

If we run a container created from the image as docker run --rm app-image help, this command executes the app with the help command. It’s important to note that everything after the image name in a docker run command is interpreted as CMD.

If a container from the same image is run as docker run -d app-image run, it runs the application and keeps the container alive while the app is running.

How to expose ports

Often, applications need to listen for requests over TCP or UDP sockets. Containers don’t accept inbound communication by default; all ports are locked inside of a container, even if an application binds to 0.0.0.0:<some port>. To enable inbound communication to your application, the port needs to be exposed. Dockerfiles support a directive EXPOSE <port> [<port>/<protocol>...] to do that.
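A minimal sketch, with a made-up port and image name: EXPOSE records the port in the Dockerfile, and the -p flag (covered below) publishes it on the host at run time.

# In the Dockerfile: record that the application listens on port 8080
EXPOSE 8080

# At run time: publish container port 8080 on host port 8080
docker run -p 8080:8080 my-image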

How to set default user

Containers carry multiple features to sandbox and isolate applications running in a container from a host machine, but it’s always possible to make mistakes and allow access from a container to critical resources on the host.

The default user in a container is root. If user namespaces aren’t used and a container runs with, say, / or /dev mounted from the host, anyone with remote code execution in the container can get full access to the host.

To minimize the chances of hitting this problem, Dockerfiles provide the USER <user>[:<group>] or USER <UID>[:<GID>] instruction. It indicates that all following instructions, and the container’s startup command, run under that user and group. Returning to our example Java container:

FROM openjdk:8-jre
RUN ln -s ${JAVA_HOME} /usr/java
USER nobody:nogroup
CMD /bin/bash

This new line instructs Docker to run the container’s startup command (CMD here) under nobody:nogroup.

How to set default directory

Let’s imagine that relative paths are used everywhere in an entrypoint script that comes with an application: the path to a config is ./config/my.conf, the command line to run the application is java -jar ./app.jar, and so on. To work around such assumptions, the container should somehow switch to the application’s home folder before executing the entrypoint script. To do that, Dockerfiles provide the instruction WORKDIR /path/to/workdir. It tells Docker that all instructions after it should be executed from that directory.
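Here is a minimal sketch of the Java example above, with made-up paths and file names:

FROM openjdk:8-jre
# Every instruction after this, and the container's startup command,
# runs from /opt/app, so relative paths like ./config/my.conf resolve
WORKDIR /opt/app
COPY app.jar ./app.jar
COPY config ./config
CMD ["java", "-jar", "./app.jar"]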

Other instructions

Dockerfiles support several other instructions, and the list changes often with new releases of Docker. Check out the complete list, with more details, warnings, and notes, in the Dockerfile reference.

How to run a single docker container

After an image is scripted and built using docker build, it’s time to run it. Let’s assume that the name of our image is image:v1.

To run the image all we need is to call docker run image:v1. This command creates a container from the image and executes it.

There are several useful flags for the docker run command. All of them should be placed between run and the name of the image; a combined example follows the list:

  • -it – Runs a container in interactive mode and creates a tty for the container. In other words, if you want to run bash in ubuntu, use docker run -it ubuntu /bin/bash
  • --rm – Removes a container after its main process exits. When you play with containers, running this or that just to check something, pieces of containers accumulate on the host machine and eat disk space. If you don’t need anything from the container when you’re done with it, run it with this flag.
  • -d – Runs a container in the background. If you run a non-interactive container and need your shell back, run it with this flag.
  • --name – Assigns a name to a container. This flag is useful when you want a single instance of a container to run. Imagine you run a container with -d at the end of the day and try to run it again the next day. Without this flag, Docker just starts another instance of the container; with this flag, it tells you that a container with the same name is already running.
  • -v – Mounts something from a local file system into the container. For example, imagine you’re trying different settings for an application running in a container. Instead of rebuilding the image after every change, you can change the config file locally and only restart the container to apply the latest changes, or maybe even send commands to the application to re-read the configuration.
  • -e – Sets an environment variable in the container.
  • -p – Exposes an additional port from the container. For example, a combination of -e and -p is useful when you want to enable remote debugging of a Java application running in a container.
  • --entrypoint – Redefines a container entrypoint. This flag can be useful if you want to try some changes or workarounds.
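Putting a few of these flags together, a hypothetical invocation (all names, ports, and paths made up) might look like:

# Run in the background with a fixed name, an env variable,
# a published port, and a config directory mounted from the host
docker run -d \
  --name my-app \
  -e APP_ENV=dev \
  -p 8080:8080 \
  -v $(pwd)/config:/opt/app/config \
  my/image:v1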

docker run supports many other flags.

The newer version of this command is docker container run. This command is standard now, while docker run is kept for backward compatibility only.

How to connect to a running container

When a container is up and running, you might need to check something in it. For example, say you want to look at what’s in a config file generated by an entrypoint script, or what processes are running in the container. Docker allows you to connect to a container using the docker exec or docker container exec command. For example, docker exec -it <container sha256 hash> /bin/bash runs bash in the container and gives you access. By default, the command runs as the default user (root, or the one defined by the last USER instruction in the Dockerfile). If you want to run it under a different user, simply add --user <uid> to the command.
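For instance (the container name is hypothetical):

# Open an interactive shell in a running container
docker exec -it my-app /bin/bash

# Run a one-off command as a different user
docker exec --user nobody my-app ps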

Check out more information about docker exec.

If your container is running in the background and you want to check the latest logs from the container then you can use docker logs <container id> or docker container logs <container id>. With a -f flag, the command prints updates as they come.

Check out more information about docker logs.


Continue reading part 2, “What is docker-compose?” and let me know your feedback on this part in the comments.
