Dockerize the build plan v2.0

Back in 2017, I wrote a series of articles about using Helm for Continuous Deployment. One year later, I want to look back on some of the things I wrote and offer some alternative solutions. The most interesting changes concern the article CD with Helm part 2: Dockerize the build plan.

This is the approach I had used back then:

We have a sample application called blog-helm. It can greet you with "Hello world!" on port 3000. It is written in Node.js/Express.
npm is used for dependency management.
The Dockerfile of the application packages the production image (only production dependencies).
A separate Dockerfile named Dockerfile-ci is used to "dockerize the build plan". All dependencies (including dev dependencies) are installed in the image, creating a highly specialized Docker image for the app.

So the Dockerfile looked like this:

FROM node:alpine
EXPOSE 3000
RUN mkdir /app
WORKDIR /app
ADD . /app
RUN npm install --only=production
CMD ["node", "index.js"]

and the Dockerfile-ci looked like this:

FROM node:8.9-alpine
RUN mkdir -p /app
ADD package.json /app
ADD package-lock.json /app
WORKDIR /app
RUN npm install
ADD . /app

After the introduction of WebdriverIO testing, the Dockerfile-ci changed further into this:

# phantomJS does not work with alpine
FROM node:8-slim

# dependencies
RUN apt-get update && apt-get install -y \
    bzip2 \
    libfontconfig1 \
 && rm -rf /var/lib/apt/lists/*

# install phantom
RUN curl -L https://github.com/Medium/phantomjs/releases/download/v2.1.1/phantomjs-2.1.1-linux-x86_64.tar.bz2 | tar jx
RUN mv phantomjs-2.1.1-linux-x86_64/bin/phantomjs /usr/local/bin/ \
    && rm -rf phantomjs-2.1.1-linux-x86_64

RUN mkdir -p /app
ADD package.json /app
ADD package-lock.json /app
WORKDIR /app
RUN npm install
ADD . /app

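One big advantage of this approach is that the time-consuming npm install step is cached, thanks to Docker’s layer caching: as long as package.json and package-lock.json do not change, Docker reuses the layer that contains the installed dependencies.

It is not, however, the approach I’m currently using. The CI systems I’m currently using are: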

Travis for my open source/hobby projects
Bitbucket Pipelines at work

They both have great support for using Docker in your build pipeline transparently and they deal with caching of npm dependencies (or anything heavy for that matter) in a different way.

To explain what I mean by “using Docker transparently”, let’s take a trip down memory lane. Before Docker, your build steps would run directly on the build agent. So you would run npm install and npm test directly on the agent. This had several problems:

the necessary tooling needed to be installed on the build agent (which typically means depending on the system administration team, who need to sign off on any modifications to the shared company infrastructure)
multiple versions of the same tools might be required (e.g. one team on Node.js 8 and another on the bleeding edge)
multiple technologies might be required (e.g. one team on Node.js and another one on Java)

This was not fun for anyone. Docker brought a great solution to this problem. Build steps in modern build servers, like Bitbucket Pipelines, are in fact always running within a Docker container.

However, a great build server should also hide away the fact that we are using Docker. This makes the learning curve smoother and promotes separation of concerns. So it would be great if my build steps were still npm install and npm test, with a small note somewhere that these run within the Docker image node:8-slim (for example). The Docker image is thus a small, almost invisible implementation detail, a transparent glue if you like, between the build server and the build plan.
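
As a rough illustration, here is a minimal sketch of what such a build plan can look like in a bitbucket-pipelines.yml file. The image node:8-slim is simply the example mentioned above, and the sketch assumes that the project defines a test script in its package.json.

```yaml
# bitbucket-pipelines.yml (minimal sketch)
# The image line is the only mention of Docker in the whole build plan.
image: node:8-slim

pipelines:
  default:
    - step:
        script:
          # The same commands that used to run directly on the build agent.
          - npm install
          - npm test
```

The build steps read exactly as they would without Docker; swapping the Node.js version is a one-line change to the image.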

That leaves us with caching of the npm dependencies. Both Travis and Bitbucket Pipelines offer caching as a first-class citizen in their build definition language. This removes the last argument for my original, highly specialized, project-specific Dockerfile-ci image.
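
To make this more concrete, here are two hedged sketches of how a dependency cache can be declared, one per CI system. They assume a plain npm project whose dependencies end up in node_modules; the Node.js version and image are illustrative choices, not requirements.

```yaml
# bitbucket-pipelines.yml (sketch)
image: node:8-slim

pipelines:
  default:
    - step:
        caches:
          - node    # predefined cache that covers the node_modules directory
        script:
          - npm install
          - npm test
```

```yaml
# .travis.yml (sketch)
language: node_js
node_js:
  - "8"
cache:
  directories:
    - node_modules    # restored before the build and saved again afterwards
```

In both cases the cache is a declaration in the build definition, so there is no need to bake the dependencies into a project-specific image.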

In the approach I’m currently using, I always first try to find an existing Docker image on Docker Hub that meets my requirements without being too bloated. It’s like finding the right shoe. Examples:

For an Angular project, I needed an image that can run Karma tests and Protractor end-to-end tests with a headless Chrome browser. I found this one; it has worked fine for many months now.
For deployment projects that use Helm, I needed an image that supports kubectl and helm. I found this one, and it is fantastic.
For deployment projects that use the AWS CLI and the AWS Elastic Beanstalk CLI... I didn't find an image. I created one myself and published it to Docker Hub.

Some criteria for picking an existing image: is it well documented? Is it still maintained? Does it have its Dockerfile on GitHub somewhere? How popular is it (downloads and stars)? Does it have automatic builds?

So, to wrap it up:

In my current projects, there is no Dockerfile-ci.
My build plan consists of simple step definitions, whose only relationship with Docker is that they indicate which Docker image they will be executed in (it might even be that different steps use different Docker images).
Docker images are off-the-shelf images from Docker Hub.
If a suitable image does not exist, then I would have to create one, but I would do that in a separate repository, so that the lifecycle of that image is separated from the project that is using it.
Last, but most definitely not least, caching of npm/maven/whatever dependencies is handled by the CI server (so the CI server must support this!).
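
Putting these points together, a build plan along these lines might look like the following Bitbucket Pipelines sketch, in which each step names its own image. The image example/helm-kubectl is a hypothetical placeholder for whatever off-the-shelf image provides kubectl and helm, and the deploy step only prints the client versions as a stand-in for real deployment commands.

```yaml
# bitbucket-pipelines.yml (sketch with one image per step)
pipelines:
  default:
    - step:
        name: Test
        image: node:8-slim
        caches:
          - node    # npm dependencies are cached by the CI server
        script:
          - npm install
          - npm test
    - step:
        name: Deploy
        # Hypothetical image name; substitute an image that ships kubectl and helm.
        image: example/helm-kubectl
        script:
          # Placeholder commands; the real deployment steps would go here.
          - kubectl version --client
          - helm version --client
```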