Part 2 - Customizing containers with Dockerfiles
What is a Dockerfile?
A Dockerfile is a configuration file that contains all the information needed to set up a container. For example, if your application runs on nodejs, you probably need to install nodejs first.
How do I create a Dockerfile?
Let’s start with the first line of the file, a FROM instruction.
In Dockerfiles the first word of a line is always an instruction, followed by its arguments. Here we’re stating that we want to base our container on an existing image, in this case ubuntu.
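A minimal sketch of such a first line, assuming the ubuntu base image that shows up later in this part. The 22.04 tag is an illustrative choice, not something the text specifies.

```dockerfile
# Instruction first (FROM), then its argument (image name and tag).
# The 22.04 tag is an assumption made for this sketch.
FROM ubuntu:22.04
```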
You can base your container on any image. You can find a lot of them on Docker Hub. Notable ones:
- postgres - comes with postgres installed and configured
- node - nodejs and npm
- openjdk - openjdk for your java application
Building a docker image
To build your docker image use:
$ docker build -f $file -t $tag $context
- file - your Dockerfile. If you omit -f, docker build looks for a file named Dockerfile in the root of the context.
- tag - the name for your docker image. The convention is name:tag, for example jp2gmd:latest.
- context - the directory whose files are available to the build. A relative path such as ./file in a COPY command is resolved against the context.
$ docker build -t jp2gmd:latest .
Now you can run docker run --rm -it jp2gmd:latest bash. For now, it only starts a stock ubuntu container. Let’s see how we can add more stuff.
If we left our Dockerfile like this, we would end up with a pretty bare-bones ubuntu machine (no desktop environment though, just a shell). That’s pretty boring. Let’s run a command inside.
RUN echo "I like trains"
It’s important to notice that a Dockerfile contains instructions which create an environment, so this echo will run during the build process, not at runtime.
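To make the build-time versus runtime difference concrete, here is a small sketch. It assumes an ubuntu base with an illustrative tag, and it borrows the CMD instruction that is introduced further down.

```dockerfile
# The 22.04 tag is an assumption made for this sketch.
FROM ubuntu:22.04

# Executed once, while `docker build` creates the image.
# The text appears in the build log, not when a container starts.
RUN echo "I like trains"

# Executed every time a container is started from the image.
CMD ["echo", "I like trains, now at runtime"]
```

Building this image prints the first message once. Each docker run of the image prints only the second one.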
Let’s install cowsay as another example.
RUN apt update
RUN apt install -y cowsay fortune
RUN /usr/games/fortune | /usr/games/cowsay
Note: this way of doing things is wrong, but we’ll come back to it later.
Let’s say that we have prepared the “libraries” for our application. Now it’s time to copy the application itself into the image.
COPY ./my-app.py ./my-app.py
The syntax is COPY <from> <to>, where <from> is a path inside the build context and <to> is a path inside the image.
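A short sketch of both common cases, a single file and a whole directory. The /app destination and the src/ directory are hypothetical names chosen for illustration, and the base image tag is an assumption as well.

```dockerfile
# The 22.04 tag is an assumption made for this sketch.
FROM ubuntu:22.04

# Single file: ./my-app.py from the build context becomes /app/my-app.py in the image.
# Missing destination directories are created automatically.
COPY ./my-app.py /app/my-app.py

# Directory (hypothetical src/): its contents are copied into /app/src/.
COPY ./src/ /app/src/
```

Both source paths have to exist inside the context passed to docker build, otherwise the build fails at the COPY step.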
The last part is to set up the “command that should be run when the container starts”. Example:
CMD ["/usr/bin/python3", "./my-app.py"]
For obvious reasons only 1 CMD takes effect in a Dockerfile. If you list more than one, only the last one is used.
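As a quick illustration of that rule, the sketch below lists two CMD instructions on purpose. It assumes the python:3.8 image used in the Python example of this part, so that python3 is available on the PATH.

```dockerfile
FROM python:3.8
COPY ./my-app.py ./my-app.py

# Overridden by the CMD below, so this one never runs.
CMD ["echo", "this is ignored"]

# The last CMD in the file is the one the container starts with.
CMD ["python3", "./my-app.py"]
```

Arguments given after the image name on the command line replace the CMD for that one run, which is why docker run --rm -it jp2gmd:latest bash opens a shell instead.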
You may want to set up environment variables inside your container. The ENV instruction does that.
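A minimal sketch of setting environment variables with ENV. The variable names APP_ENV and APP_PORT are hypothetical, and the base image tag is an assumption.

```dockerfile
# The 22.04 tag is an assumption made for this sketch.
FROM ubuntu:22.04

# APP_ENV and APP_PORT are hypothetical names used only for illustration.
ENV APP_ENV=production
ENV APP_PORT=5000

# The variables are visible to later build steps...
RUN echo "building for $APP_ENV"

# ...and to the process that runs inside the container.
CMD ["sh", "-c", "echo running on port $APP_PORT"]
```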
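These values can also be provided dynamically at build time, using the ARG instruction together with the --build-arg flag of docker build. Be careful with API keys, though: build arguments can be read back from the image with docker history, so they are not a safe place for secrets.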
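Build-time values could look like the sketch below, which uses ARG plus the --build-arg flag. APP_VERSION is a hypothetical name, and the base image tag is an assumption.

```dockerfile
# The 22.04 tag is an assumption made for this sketch.
FROM ubuntu:22.04

# APP_VERSION is a hypothetical build argument; "dev" is its default value.
ARG APP_VERSION=dev

# Override the default when building:
#   docker build --build-arg APP_VERSION=1.2.3 -t jp2gmd:latest .
RUN echo "building version $APP_VERSION"

# ARG values do not exist at runtime. Copy the value into ENV
# if the running container should see it too.
ENV APP_VERSION=$APP_VERSION
```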
WORKDIR is a helper command that changes your working directory while building. The commands that follow it will obey it.
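The instruction for changing the working directory is WORKDIR. Here is a small sketch of how later commands pick it up. The /app path matches the Python example in this part, and the base image tag is an assumption.

```dockerfile
# The 22.04 tag is an assumption made for this sketch.
FROM ubuntu:22.04

# Creates /app if it does not exist and makes it the working directory.
WORKDIR /app

# Relative destination, so the file lands in /app/my-app.py.
COPY ./my-app.py ./my-app.py

# Runs inside /app, so the build log shows /app here.
RUN pwd
```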
Example Dockerfile for Python
FROM python:3.8
WORKDIR /app
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
COPY src/ src/
COPY static/ static/
COPY templates/ templates/
CMD ["uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "5000"]
Disclaimer about layer caching
Docker caches every layer (the state of the image being built after each command) to improve build times. Let’s consider 2 orders of the same commands: the left stack starts with a COPY of the source code followed by 2 RUN commands, while the right stack has the 2 RUN commands first and the COPY last.
During the first build, both do exactly the same. But then you make a small adjustment in your source code.
The left stack needs to start all over from the COPY command and then has to execute the 2 RUN commands after it. Compared to that, the right stack only needs to rerun 1 COPY command. I hope this demonstrates the issue quite clearly.
Lesson to be learned: put commands that will rarely change at the top of your Dockerfile and commands that will need to be re-run often at the end of the Dockerfile.
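The two orders can be sketched like this. The concrete commands are borrowed from the cowsay and my-app.py examples earlier in this part and are only an assumption about what the two stacks contained. The base image tag is an assumption too.

```dockerfile
# The 22.04 tag is an assumption made for this sketch.
FROM ubuntu:22.04

# Cache-unfriendly order (the "left" stack), shown as comments:
#   COPY ./my-app.py ./my-app.py
#   RUN apt update
#   RUN apt install -y cowsay fortune
# Any edit to my-app.py invalidates the COPY layer and everything after it.

# Cache-friendly order (the "right" stack):
RUN apt update
RUN apt install -y cowsay fortune
# The source code changes often, so it is copied last.
# After an edit, only this layer is rebuilt.
COPY ./my-app.py ./my-app.py
```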
Disclaimer about layer size
To save disk space it is encouraged to group some commands into single layers.
RUN apt update
RUN apt upgrade -y
produces 2 layers (each one adds to the image size), while:
RUN apt update && \
    apt upgrade -y
produces only 1 layer. This is also a neat trick for multiline commands.
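This is also the promised follow-up to the cowsay example, where every apt step had its own RUN. A hedged sketch of the grouped version is shown below. It swaps apt for apt-get, because apt warns that its command-line interface is not stable for scripts, and it adds a cleanup step. Both of those are choices made for this sketch rather than something the text prescribes. The base image tag is an assumption.

```dockerfile
# The 22.04 tag is an assumption made for this sketch.
FROM ubuntu:22.04

# Update, install and clean up inside a single RUN, so the package index
# downloaded by the update step never ends up in a stored layer.
RUN apt-get update && \
    apt-get install -y cowsay fortune && \
    rm -rf /var/lib/apt/lists/*
```

Deleting the package lists in a separate RUN would not shrink the image, because the earlier layer would still contain them.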