I recently attended the PyStok #23 meeting. After
the event, my colleague and I came up with the idea of writing a short write-up about
Dockerfile best practices. This article is divided into several sections, each describing one piece of advice for building
Docker images properly. Enjoy!
Reduce number of layers
First of all, the fewer layers you create, the smaller your image tends to be. Of course, you need to balance that against readability. Ideally the image would have only one layer, but in practice that is rarely seen or even possible.
Wrong way. You do not care about the number of layers.
    FROM centos
    RUN yum install -y httpd
    RUN yum install -y curl
Good way. You keep in mind that fewer layers are better.
    FROM centos
    RUN yum install -y httpd curl
Let's assume a typical situation. You would like to deploy a fully operational
Tomcat-based application with
CentOS operating system libraries underneath. How will you do that?
Wrong way. You build one big
Dockerfile with a bunch of dependencies, all installed directly on top of the plain centos image.
    FROM centos
    RUN yum install -y java-1.7.0-openjdk tomcat
    COPY app.war /usr/local/tomcat/webapps/app.war
Good way. You use an official image as a base. First, you create
a fully operational
Java image on top of it. After that, you build a
Tomcat image on top of the Java image, and your last step is to create an
Application image by adding your built
application on top of the
Tomcat image.
    # Java image
    FROM registery.example.com/centos:7
    RUN yum install -y java-1.7.0-openjdk

    # Tomcat image
    FROM registery.example.com/centos7-java:1.7.0
    RUN yum install -y tomcat

    # Application image
    FROM registery.example.com/centos7-java170-tomcat:7
    COPY app.war /usr/local/tomcat/webapps/app.war
Well, a few words of explanation. You need to think about the future. Assume you will need to run a second application based on the same dependencies. Wouldn't it be cool to have a template for
Tomcat-based applications? What if your developer would like to run a simple JAR as an application? Wouldn't it be cool to have a
Java-focused image? Isn't it much more readable to give each image one purpose and then build a stack? Think about that.
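As a minimal sketch of that idea, a JAR-based application could reuse the Java image from the stack above without touching Tomcat at all. The artifact name app.jar and the /srv path are assumptions made up for illustration; the base image name is the one used in the examples above.

```dockerfile
# Hypothetical Dockerfile for a plain JAR application.
# It builds on the Java image from the stack, not on the Tomcat image.
FROM registery.example.com/centos7-java:1.7.0

# app.jar is an assumed artifact name; replace it with your own build output.
COPY app.jar /srv/app.jar

# Start the application with the JVM provided by the base image.
CMD ["java", "-jar", "/srv/app.jar"]
```

The Tomcat-based application and this one then share the same Java layer, so a Java update has to be made in only one place.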
Always add a maintainer and some metadata. There is no documentation for your
Dockerfile besides the
Dockerfile itself and its metadata. So do not be afraid of adding many labels, such as:
- project name,
- project description,
- release date,
and many others.
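The labels listed above could be written roughly as follows. This is only a sketch: every key and value here is a made-up placeholder, and the reverse-DNS style keys under com.example are an assumption about naming, not something this article prescribes. Note that current Docker documentation marks the MAINTAINER instruction as deprecated in favour of a label, so the maintainer is expressed as a label too.

```dockerfile
FROM centos:7

# All keys and values below are illustrative placeholders.
LABEL maintainer="Jane Doe <jane.doe@example.com>" \
      com.example.project.name="example-app" \
      com.example.project.description="Example web application served by httpd" \
      com.example.release.date="2017-03-01"
```

Labels set this way can be read back later with docker inspect, which makes them a cheap form of documentation that travels with the image.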
OK. One of the most famous security issues is keeping the
.git directory on an
HTTP server. To make a long story short: use
.dockerignore to avoid copying data that might cause problems or is not required
to run your application. **Wrong way.** You do not use
.dockerignore and copy everything that is in the application root tree.
**Good way.** You copy data with caution. An example of a well-defined
.dockerignore is attached below:
    .git
    .gitignore
    LICENSE
    VERSION
    README.md
    Changelog.md
    Makefile
    docker-compose.yml
    docs
Make commands readable
Remember, you are not the only one who will be using your
Dockerfile. So, instead of cramming everything into one endless line, I really encourage you to keep your code clean.
Wrong way. You create extremely long RUN commands.
    FROM ubuntu:xenial
    RUN apt-get install -y tar git curl nano wget dialog net-tools build-essential
Good way. You split your commands into multiline parts.
    FROM ubuntu:xenial
    RUN apt-get install -y tar \
        git \
        curl \
        nano \
        wget \
        dialog \
        net-tools \
        build-essential
Best way. You split your commands into multiline parts and sort them alphabetically.
    FROM ubuntu:xenial
    RUN apt-get install -y build-essential \
        curl \
        dialog \
        git \
        nano \
        net-tools \
        tar \
        wget
Well, it is highly recommended to:
- use absolute paths inside WORKDIR.
One defined working directory with absolute paths is much more readable
than a bunch of cd commands chained inside RUN instructions.
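A small sketch of this advice, assuming a hypothetical project with a build.sh script in its root: the working directory is declared once with an absolute path, and later instructions run inside it instead of repeating cd.

```dockerfile
FROM centos:7

# Declare the working directory once, using an absolute path.
WORKDIR /srv/app

# Copy the build context into the working directory.
COPY . /srv/app

# build.sh is a hypothetical script; it runs inside /srv/app,
# so no "cd /srv/app && ..." prefix is needed.
RUN ./build.sh
```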
Clean after yourself
If you make a mess, you need to clean up after yourself, so instead of
leaving it for later, do it as soon as possible. What is the
reason? The same as before: image size. **Wrong way.** You do not clean up after yourself.
    FROM centos:7
    RUN yum install -y httpd
Good way. You clean up after yourself at some point.
    FROM centos:7
    RUN yum install -y httpd
    ADD . /srv
    RUN yum clean all
Best way. You clean up after yourself as soon as you can, in the same RUN instruction that made the mess, so the package cache is never stored in a layer.
    FROM centos:7
    RUN yum install -y httpd && \
        yum clean all
    ADD . /srv
Keep your container clear
Basically, to run your application you do not need to have unit-testing
libraries installed. The same goes for integration and performance
testing packages. My tip: do not install build-essential
packages in the image that will run your application. **Wrong way.** You
install everything and your Docker images are larger than they should
be. **Good way.** You install only the packages necessary to run
your code. **Best way.** You keep your containers well-defined.
One purpose. One process. You keep your installation list to a minimum.
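One possible way to keep build tools out of the runtime image is a multi-stage build. This technique is not described in this article and needs Docker 17.05 or newer, so treat the following as a sketch under assumptions: the project is assumed to have a Makefile that produces a binary at /src/app, and both of those names are hypothetical.

```dockerfile
# Stage 1: the compiler and make live only in this stage.
FROM centos:7 AS build
RUN yum install -y gcc make
COPY . /src
# Assumes a Makefile in the project root that produces /src/app.
RUN make -C /src

# Stage 2: the image that actually runs the application.
# It starts from a clean base and receives only the built binary.
FROM centos:7
COPY --from=build /src/app /usr/local/bin/app
CMD ["/usr/local/bin/app"]
```

Only the last stage ends up in the final image, so gcc and make never add to its size.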
The only good way. Update your packages.
    FROM centos:7
    RUN yum -y update && \
        yum install -y httpd && \
        yum clean all
    ADD . /srv
Do not use the latest tag
The latest tag is like a SNAPSHOT version for
people familiar with Maven. The best
way is to specify your base image version precisely and use only that one. Wrong
way. You use the latest tag.
Good way. You specify a precise image version.
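A short sketch of the difference, reusing the centos:7 image and the httpd package from the other examples in this article:

```dockerfile
# Wrong way: the tag may point to a different image on the next build.
# FROM centos:latest

# Good way: a precise tag, so every build starts from the same release line.
FROM centos:7
RUN yum install -y httpd && \
    yum clean all
```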
Be aware of order
Remember: if a layer has changed, every subsequent layer will be rebuilt.
Wrong way. You do not care about the order of your commands.
Good way. You think about the build process. Let me present an example.
You have a
Django-based application. The
Python packages are listed in
requirements.txt. A wrong example can look like the following:
    FROM centos:7
    ...
    WORKDIR /srv
    ADD . /srv/
    RUN pip install -r /srv/requirements.txt
What is wrong with this code? Obviously, your code changes between
builds; if it did not, your artifacts would not change either.
So whenever your code has changed, the ADD layer changes too, and
every layer after it is rebuilt. As a result, your application requirements are reinstalled
on every build. Now imagine that your requirements.txt contains the numpy
package. The better alternative is the following code.
    FROM centos:7
    ...
    WORKDIR /srv
    ADD ./requirements.txt /srv/requirements.txt
    RUN pip install -r /srv/requirements.txt
    ADD . /srv/
As you can see, this
Dockerfile contains one additional line compared with the previous example, but the
pip install layer is now rebuilt only when requirements.txt itself changes.
HEALTHCHECK is your friend
Starting a process is not everything. We want to be sure the container is up and
running. I would highly recommend adding a
HEALTHCHECK instruction to all
your containers. It assures you that your
HTTP server is able to
handle new connections, or that your
Redis database is up and running. An example
HEALTHCHECK is attached:
    HEALTHCHECK --interval=5m \
                --timeout=3s \
                CMD curl -I -L -f https://localhost/ || exit 1
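For the Redis case mentioned above, a comparable check might look like the sketch below. The redis:7 base image and the interval, timeout and retry values are assumptions chosen for illustration; adjust them to whatever you actually run.

```dockerfile
# Assumed base image; substitute the Redis image and tag you use.
FROM redis:7

# redis-cli ping prints PONG when the server answers and exits with a
# non-zero status when it cannot connect, which marks the check as failed.
HEALTHCHECK --interval=30s \
            --timeout=3s \
            --retries=3 \
            CMD redis-cli ping || exit 1
```

The result shows up in the STATUS column of docker ps as healthy or unhealthy.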
That is not all
Have you noticed I gave you over 10 tips? :-) Of course, I have not mentioned a bunch of security cases, nor have I written about the complementarity of builds, the stability of your applications or storage considerations. I will write a bigger article on security in April this year. I hope this article was helpful for you. If you are interested in Docker image security, I will be happy to get your feedback. The considerations above are based on several references, which I really encourage you to read, as well as on my own practice and experience. They will definitely improve your understanding of Docker images.