I previously wrote about a mtelvers/package-tool which would generate Dockerfiles for each package in opam.
The tool also created a single 10MB Dockerfile containing all ~4000 package builds. Each build looked like this:
FROM debian:12 AS builder_package_name
RUN apt update && apt upgrade -y
# ... setup opam
RUN opam install dependency1.version >> build.log 2>&1 || echo 'FAILED' >> build.log
RUN opam install dependency2.version >> build.log 2>&1 || echo 'FAILED' >> build.log
RUN opam install package.version >> build.log 2>&1 || echo 'FAILED' >> build.log
Followed by a final aggregation step:
FROM debian:12 AS results
COPY --from=builder_package_1 ["/home/opam/build.log", "/results/package1"]
COPY --from=builder_package_2 ["/home/opam/build.log", "/results/package2"]
# ... ~4000 times
This is a spectacular failure. Docker’s RPC layer cannot handle the 10MB Dockerfile, throwing COMPRESSION_ERROR
messages.
I attempted to bypass Docker’s RPC limitations and go straight to BuildKit.
buildctl build \
--frontend dockerfile.v0 \
--local context=. \
--local dockerfile=. \
--output type=image,name=myimage:latest
The result was the same: compression errors. BuildKit’s RPC layer cannot handle the massive Dockerfile either.
Surely there is an elegant solution to build this with Docker? I generated a docker-bake.hcl
file defining all the targets:
group "all-packages" {
targets = [
"pkg-0install-2-18",
"pkg-abella-2-0-8",
// ... ~4000 packages
]
}
BuildKit starts fine, but collapses in a few seconds with errors like rpc error: code = NotFound desc = no such job
.
$ docker buildx bake results
=> [internal] load local bake definitions
=> => reading docker-bake.hcl 698.97kB / 698.97kB
=> [pkg-random-package internal] load build definition from random-package.dockerfile
=> => transferring dockerfile: 4.74kB
...
ERROR: target pkg-random-package: failed to receive status: rpc error: code = NotFound desc = no such job dwu7wqewt4vppoe4lhe3xx44f
Maybe BuildKit just needed some restraint? I tried various approaches:
export GOMAXPROCS=100
export BUILDKIT_STEP_LOG_MAX_SIZE=50000000
docker buildx bake results
I even created a custom BuildKit configuration, tried different drivers, and limited concurrent operations. However, it was still failing.
Building, at first one, then two, and then three packages at once worked well:
docker buildx bake pkg-0install-2-18 pkg-abella-2-0-8 pkg-absolute-0-3
# [+] Building 17.7s (100/100) FINISHED
This led me to add the --batch-size
parameter to create batches of packages rather than listing them on the command line. By trial and error, I found that 100 is about the upper bound.
package-tool --opam-repository ~/opam-repository --dockerfile --batch-size 100
for a in {0..33} ; do sudo docker buildx bake batch$a ; done
I have now hit the next limitation: there is a maximum number of layers.
ERROR: target pkg-async_rpc_websocket-v0-17-0: failed to solve: failed to prepare ofhokk68c4o0esql38hz1yrzb as n4ytj8qd0izkhvs0srfj9vyi3: max depth exceeded