Use buildx for native Docker building?

I’ve been tracking the native Docker building issue (https://github.com/drone/drone/issues/2462) for a little while since it sounds like a great feature. Right now we have to push images in order to use them in subsequent steps, plus the native syntax is nicer than the plugin’s.

The issue currently states that the feature is blocked on moby integrating containerd, but I found that Docker is starting to expose BuildKit features in the Docker CLI via “buildx”; see https://github.com/docker/buildx.

I’m wondering if this would let Drone move to native image building. It seems like the ‘context’ feature would allow much better build isolation to prevent cache poisoning.

Thoughts?

I am interested in learning more. Can you provide some example commands that demonstrate how this might work?

I don’t have any specific examples, but from the linked issue it sounded like the main issue was that Docker had no way to push a tag to a remote repository without tagging the image locally first. This would cause issues if a pipeline tagged an image like node: that image would get picked up by other pipelines expecting the official node image.
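
For illustration, the poisoning could happen something like this (a hypothetical sketch, not from the issue):

# pipeline A builds and tags its image with the bare name "node"
docker build -t node .
# pipeline B later requests the "node" image and silently gets
# pipeline A's build instead of the official image
docker run node node --version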

With buildx, more of the functionality from BuildKit is exposed. This includes commands like docker buildx build with the --push argument, which looks like it allows Docker to build and push at the same time without tagging the image locally.
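
For example (the image name is a placeholder), a single command along these lines should build and push in one step, with no local tag left behind:

docker buildx build -t example/app:latest --push .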

Also, there’s a new docker buildx create CONTEXT command which seems to be aimed at exactly this CI use case. It seems to effectively give a namespace to all Docker commands so one pipeline’s context can’t affect another’s. The context could be named after the repo, or repo+branch for extra isolation. This would still allow layer caching on a per-repo basis, which would speed things up a lot.
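
If I’m reading the docs right, a per-repo setup might look roughly like this (the builder name is made up):

# create a builder instance named after the repo and make it the default
docker buildx create --name myorg-myrepo
docker buildx use myorg-myrepo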

I haven’t tested any of this though :slight_smile:

I think one challenge with contexts (as I understand them) is that they would need to be per-pipeline to support this use case and avoid cache poisoning. The downside to a per-pipeline cache is that it would always start with a cold image cache, which could slow down pipeline execution time (since it would re-download popular images on every execution). But admittedly I probably need to play around with this feature a bit more.

Another challenge is that we can only rely on these docker-specific features for the docker pipelines. I think it would be a challenge for kubernetes pipelines since kubernetes could use docker, containerd, crio, firecracker or other engines under the hood. But for now we can try to focus on the docker pipeline use case to get this working.

One short/medium-term workaround is we could do something like build an image with a random tag and use it in the subsequent steps that reference the user-defined image name. When it is time to publish, we could export the image and upload it using the user-defined tag. I think buildah might be able to export and upload, but if not, I am sure there is something out there. The export process would probably be quite slow, but probably better than nothing … It would at least give us something to baseline and iterate on.
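
A minimal sketch of that workaround, assuming skopeo for the upload step (all names and tags are placeholders):

# build under a random internal tag so other pipelines can't pick it up
docker build -t drone-tmp-4f2a1c .
# subsequent steps reference the random tag
docker run --rm drone-tmp-4f2a1c go test ./...
# at publish time, export the image and upload it under the user-defined tag
docker save drone-tmp-4f2a1c -o image.tar
skopeo copy docker-archive:image.tar docker://registry.example.com/myorg/app:v1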

Good points. Even if caches are scoped per pipeline, isn’t that better than what the current plugin does? I thought there’s currently no cache and every image is downloaded every time.

As for Kubernetes, it sounds like it already has access to these lower-level primitives, so it might be possible to have a different implementation that exposes the same proposed interface.

Even if caches are scoped per pipeline, isn’t that better than what the current plugin does? I thought there’s currently no cache and every image is downloaded every time.

In the below example, the build step uses the golang image, which is sourced from the host machine’s primary docker cache. If the image already exists in the cache it does not need to be downloaded again.

kind: pipeline
type: docker
name: default

steps:
- name: build
  image: golang
  commands:
  - go build

The challenge with using an ephemeral, per-pipeline context is that we would need to source all pipeline images from that context. This means the golang image (in the above example) would be sourced from the context; since the context is ephemeral, the image would not exist and would need to be downloaded every time the pipeline executes. So this would improve the use case of building an image in one step and using it in a subsequent step, but at the expense of no longer having access to the host machine’s primary docker cache for other images.

An important caveat is that I have only read the context documentation, and the above comments are based on my understanding of contexts, which could be inaccurate.

I see, so the only way to write to the cache currently is to specify the image in a step. Using the Docker plugin to build an image does not add those layers to the cache, though, so every build needs to run through the full Dockerfile. That’s what I was thinking about.

So I played with buildx for a little while, and the --push option seems to solve the problem in the original ticket. Buildx can push an image to a repository without first tagging the image locally. This means namespaced tags can be used locally, and non-namespaced tags can then be pushed to a remote repository without ever showing up locally, solving the cache poisoning issue.

Here’s an example.
Dockerfile:

FROM busybox
RUN echo "hello world"

Commands:

# docker -v
Docker version 19.03.12, build 48a66213fe
# export DOCKER_CLI_EXPERIMENTAL=enabled
# docker buildx create
wizardly_hertz
# docker buildx use wizardly_hertz
# docker buildx build -t blopker/dronetest --push .
[+] Building 4.6s (6/6) FINISHED
 => [internal] load .dockerignore                                                                                                                                 0.0s
 => => transferring context: 2B                                                                                                                                   0.0s
 => [internal] load build definition from Dockerfile                                                                                                              0.0s
 => => transferring dockerfile: 87B                                                                                                                               0.0s
 => [internal] load metadata for docker.io/library/busybox:latest                                                                                                 1.3s
 => [1/2] FROM docker.io/library/busybox@sha256:9ddee63a712cea977267342e8750ecbc60d3aab25f04ceacfa795e6fce341793                                                  0.0s
 => => resolve docker.io/library/busybox@sha256:9ddee63a712cea977267342e8750ecbc60d3aab25f04ceacfa795e6fce341793                                                  0.0s
 => CACHED [2/2] RUN echo "hello world"                                                                                                                           0.0s
 => exporting to image                                                                                                                                            3.4s
 => => exporting layers                                                                                                                                           0.0s
 => => exporting manifest sha256:9ce7dfad6a28040a36e1b0d2b03b6e03979642eebfc3db6fbd661e8647b15014                                                                 0.0s
 => => exporting config sha256:b166d61b401655375e40343e5d636d0ada1ee5e74079eb56d66b01b56399d295                                                                   0.0s
 => => pushing layers                                                                                                                                             2.5s
 => => pushing manifest for docker.io/blopker/dronetest:latest
# docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
moby/buildkit       buildx-stable-1     f2a88cb62c92        3 months ago        82.8MB

Notice that a) the build used cached layers since I had already built the image before, and b) the tag blopker/dronetest does not show up when docker images is run. It seems like, used this way, we wouldn’t even need separate contexts for each pipeline, just one global one.

@blopker thanks for providing the sample commands. I think the assumption here, as I understand it, is that the docker layers would be cached locally even though the image is not tagged. So presumably when the subsequent pipeline step tries to use the docker image, it will make a call to the registry to pull the image and will find that it has all layers locally, and will not need to re-download. Is that correct?

I agree that this would be an improvement over the current approach because it would give us a method to cache build layers locally and speed up builds, while also allowing us to more easily use images in subsequent steps without poisoning the cache.

I am not sure it fully solves the use case described in issue #2462: build the image in step one, use the image in step two for testing purposes, and push the image in step three only if the previous step passes. So basically only push the image after it has been used and tested. But again, I think your proposal is still an improvement over the status quo, even though it doesn’t fully solve this particular use case.

I think this deserves some further research.

I thought about pushing an already created image. It’s kind of hacky, but one solution could be to dynamically create a Dockerfile with just the namespaced tag and use that with the buildx command.

Example:

docker buildx build -t blopker/dronetest --push - <<EOF
FROM ${NAMESPACED_TAG}
EOF

I suspect this will add another layer, but would be the same image otherwise.
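
If the extra layer matters, one way to verify would be to compare the two remote manifests, e.g. with buildx’s imagetools (the namespaced tag here is hypothetical):

docker buildx imagetools inspect registry.example.com/tmp/namespaced-tag
docker buildx imagetools inspect blopker/dronetest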

Very interesting idea!

I was also considering using Google’s crane utility, which may provide some useful tools for publishing images and working with registries. A few ideas I kicked around:

  1. push the image using the temporary tag. Then use crane tag to tag the image in the remote registry with the proper tag. Finally use crane delete to delete the temporary tag.
  2. use docker save to export the result as a tarball and then use crane push to push the tarball directly to the remote registry. I previously tried using docker save and then docker load and docker push but this was slow. Directly piping the output of docker save into crane push could provide tolerable performance.
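
Rough sketches of both ideas (image names and tags are placeholders):

# idea 1: push the temporary tag, retag it remotely, then delete the temporary tag
docker push registry.example.com/myorg/app:tmp-4f2a1c
crane tag registry.example.com/myorg/app:tmp-4f2a1c v1.2.3
crane delete registry.example.com/myorg/app:tmp-4f2a1c

# idea 2: export the image to a tarball and push it directly with crane
docker save registry.example.com/myorg/app:tmp-4f2a1c -o image.tar
crane push image.tar registry.example.com/myorg/app:v1.2.3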

I think your idea of generating a random Dockerfile on the fly is pretty creative. I think all of our options are pretty hacky at this point, so it is a matter of choosing the least hacky option :slight_smile:

Crane does look useful. Regarding the temporary tag: since I work in a regulated environment, we tend to adopt an “append-only” permissions structure on most systems. We currently don’t allow tags to be deleted or changed, except for a few well-known tags like latest. Probably an edge case, but just wanted to note it. Although if this is the way you want to go, we’ll make it work.