CircleCI Docker Flow

April 20, 2016

At Affinity, we recently started using Kubernetes in production. Our deployment strategy, in turn, had to change significantly: rather than introduce incremental updates to existing VMs, we rebuild Docker images and push them to a repository, delegating to Kubernetes to fetch updates and run our services.

In doing so, we wanted to tightly couple our deployment strategy with our continuous integration (CI) system: once a build passes, a new image should be pushed and ready to deploy. Tests are run directly within a production-ready container to minimize environment inconsistencies. And because images are handled solely by CI, we largely abstract the dependencies and build process away from the rest of the development team.

Although it's straightforward to get started with CircleCI, our CI system of choice, and Docker, we ran into a few difficulties along the way, notably with image caching and push performance.

Getting Started

The CircleCI docs provide a good starting circle.yml file, which configures the build process:

machine:
  services:
    - docker

dependencies:
  override:
    - docker info
    - docker build -t circleci/elasticsearch .

test:
  override:
    - docker run -d -p 9200:9200 circleci/elasticsearch; sleep 10
    - curl --retry 10 --retry-delay 5 -v http://localhost:9200

deployment:
  hub:
    branch: master
    commands:
      - docker login -e $DOCKER_EMAIL -u $DOCKER_USER -p $DOCKER_PASS
      - docker push circleci/elasticsearch

Under the machine section, we specify Docker as a requirement. In the dependencies section, we build a Docker image, and during the test phase, we run commands to ensure the image is functional. Once testing has passed for the master branch, we push the image to Docker Hub, as per the deployment section.

Image Caching and Push Performance

When building an image, Docker starts with a base system and then applies changes incrementally, saving new layers until all packages, dependencies, and filesystem modifications are finalized. This process can take a significant amount of time, and if it's run from scratch during each CI build, it becomes a bottleneck.

To resolve this, Docker employs caching: when building an image, Docker uses layers that it has already built in the past if there are no changes. In this way, Docker only incrementally rebuilds what it needs to. For code changes with no dependency adjustments, the build process can speed up significantly, since all but a few steps are cached.

Caching also helps when an image is pushed to a remote repository, as layers that have already been pushed in the past don't need to be transferred again, thereby saving unnecessary uploads.

But because CircleCI runs each CI build in a new, isolated environment, the Docker cache isn't populated, and we have to rebuild our image from scratch. To resolve this, CircleCI recommends using docker save to save an image and its layers after one CI build. Then, during a subsequent CI build, you can use docker load to reload the image and effectively populate the cache in the dependencies step, like so:

dependencies:
  cache_directories:
    - "~/docker"

  override:
    - if [[ -e ~/docker/image.tar ]]; then docker load -i ~/docker/image.tar; fi
    - docker build -t circleci/elasticsearch .
    - mkdir -p ~/docker; docker save circleci/elasticsearch > ~/docker/image.tar

We tried this approach, but found a significant issue: docker load doesn't populate the cache used by docker push, as per this GitHub issue. While our Docker build times dropped significantly, pushing our image took upwards of 7 minutes, which was much too slow.

After digging into it further, we found that running docker pull on our own image prior to rebuilding it actually populated both caches, making our subsequent build and push faster. Since we use Amazon EC2 Container Registry (ECR) to host our images, our dependencies step is the following:

dependencies:
  override:
    # set region so we can use the aws command-line tool to log into ecr
    - aws configure set default.region us-west-2
    - eval $(aws ecr get-login)

    # pull image from ecr to cache docker layers for quicker rebuilds
    - docker pull example.ecr.us-west-2.amazonaws.com/example:latest

    # build new image
    - docker build -t example .

docker save and docker load combined took around 35 seconds, compared to 50 seconds for docker pull. While we lost about 15 seconds there, the new docker push time was a mere 30 seconds, down from upwards of 7 minutes, thereby saving us more than 6 minutes of build time.
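To make the trade-off concrete, here is a rough back-of-the-envelope calculation in bash. The 420-second push time is an assumption standing in for "upwards of 7 minutes"; the other numbers are the approximate timings quoted above.

```shell
#!/bin/bash
set -euo pipefail

# Approximate timings in seconds. OLD_PUSH is an assumed lower bound for
# "upwards of 7 minutes"; the rest are the rough figures quoted in the text.
OLD_CACHE=35   # docker save + docker load
OLD_PUSH=420   # docker push after docker load (assumed: 7 minutes)
NEW_CACHE=50   # docker pull
NEW_PUSH=30    # docker push after docker pull

old_total=$((OLD_CACHE + OLD_PUSH))
new_total=$((NEW_CACHE + NEW_PUSH))
saved=$((old_total - new_total))

echo "save/load flow: ${old_total}s"
echo "pull flow: ${new_total}s"
echo "saved: $((saved / 60))m $((saved % 60))s"
```

Under these assumptions the pull-based flow comes out a little over 6 minutes ahead, even though the cache-warming step itself is slower.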

Note that if you'd like to use ECR, you'll need to specify an appropriate access/secret key pair in CircleCI's "Project Settings" under "AWS Permissions." Also make sure you change example.ecr.us-west-2.amazonaws.com/example:latest to the correct URL to your repository.
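One caveat with the pull-based approach: on the very first build of a new repository there is no latest image to pull yet, and a failed docker pull would normally fail the step. Below is a minimal sketch of a more forgiving pull. The helper name warm_cache is hypothetical and not part of our actual setup, and the DOCKER variable is an assumption added only so the function can be exercised without a Docker daemon.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper, not from our circle.yml: try to pull an image to warm
# the layer cache, but carry on with an uncached build if the pull fails.
# DOCKER is an assumed override hook; it defaults to the real docker binary.
warm_cache() {
  local image=$1
  if "${DOCKER:-docker}" pull "$image"; then
    echo "warmed cache from $image"
  else
    echo "could not pull $image; building without cache" >&2
  fi
}
```

A script like this could be called with the repository URL in place of the plain docker pull line. A shorter alternative with the same intent is to append || true to the docker pull command, at the cost of hiding the reason for the failure.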

Separate Build/Push Scripts

While caching was the biggest help for our CI builds, there was another improvement worth mentioning.

We took the commands necessary to build and push our Docker image and factored them out into two separate scripts: bin/build.sh and bin/push.sh, both of which are called in circle.yml. This way, if we ever need to run the build ourselves, test a local image, or manually push an image in case something goes wrong, we can just run these scripts.

Our bin/push.sh always pushes two versions of the image: one tagged with the commit hash, and one tagged with the string latest. In this way, we can always fetch the most recent image using the latest tag, and we can revert to an old image simply by using the hash of the relevant commit.

Here's our bin/build.sh:

#!/bin/bash
set -euo pipefail
IFS=$'\n\t'

docker build -t example .

And our bin/push.sh:

#!/bin/bash
set -euo pipefail
IFS=$'\n\t'
REMOTE=example.ecr.us-west-2.amazonaws.com
NAME=example
HASH=$(git rev-parse HEAD)

eval $(aws ecr get-login)

# Push the same image twice, once with the commit hash as the tag, and once
# with 'latest' as the tag. 'latest' will always refer to the last image that
# was built, since the next time this script is run, it'll get overwritten.
# The commit hash, however, is a constant reference to this image.
#
# Note: -f forces docker tag to replace an existing tag. The flag was
# deprecated in Docker 1.10 and removed in 1.12; on newer versions, drop it.
docker tag -f $NAME $REMOTE/$NAME:$HASH
docker push $REMOTE/$NAME:$HASH
docker tag -f $NAME $REMOTE/$NAME:latest
docker push $REMOTE/$NAME:latest

docker logout https://$REMOTE

Make sure to change the image name example and the remote host example.ecr.us-west-2.amazonaws.com before you use these scripts.
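Because every push leaves behind an image tagged with its commit hash, rolling back amounts to pointing at a different tag. Here is a small sketch of how the two references are formed, using the same placeholder REMOTE and NAME as in bin/push.sh. The helper image_ref is hypothetical and only illustrates the naming scheme.

```shell
#!/bin/bash
set -euo pipefail

REMOTE=example.ecr.us-west-2.amazonaws.com
NAME=example

# Hypothetical helper: print the full image reference for a tag, which is
# either a commit hash or the string 'latest'.
image_ref() {
  local tag=$1
  printf '%s/%s:%s\n' "$REMOTE" "$NAME" "$tag"
}

image_ref latest
```

Calling image_ref with a commit hash taken from git log gives the reference for that specific build, which can then be passed to docker pull or used wherever the image is specified.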

And our final circle.yml for reference:

machine:
  services:
    - docker

dependencies:
  override:
    # set region so we can use the aws command-line tool to log into ecr
    - aws configure set default.region us-west-2
    - eval $(aws ecr get-login)

    # pull image from ecr to cache docker layers for quicker rebuilds
    - docker pull example.ecr.us-west-2.amazonaws.com/example:latest

    # build new image
    - bin/build.sh

test:
  override:
    # put your test command(s) here
    - ...

deployment:
  production:
    # only apply this deployment on the master branch
    branch: master

    commands:
      # push image to ecr
      - bin/push.sh

If you'd like to use our circle.yml in your own project, make sure you modify the repository URL to pull from, the command to run tests, and the two helper scripts as outlined above.