Use docker cache mounts for apt, pip and cargo #11106
@@ -1,19 +1,32 @@
 FROM --platform=$BUILDPLATFORM ubuntu AS build
-ENV HOME="/root"
+ENV HOME="/root" \
+  # Place tool-specific caches in the buildkit tool cache.
+  CARGO_HOME=/buildkit-cache/cargo \
+  CARGO_ZIGBUILD_CACHE_DIR=/buildkit-cache/cargo-zigbuild \
+  PIP_CACHE_DIR=/buildkit-cache/pip \
+  RUSTUP_HOME=/buildkit-cache/rustup \
+  ZIG_GLOBAL_CACHE_DIR=/buildkit-cache/zig
 WORKDIR $HOME

-RUN apt update \
+RUN \
+  --mount=type=cache,target=/var/cache/apt,sharing=locked \
+  --mount=type=cache,target=/var/lib/apt,sharing=locked \
+  # remove the default docker-specific apt config that auto-deletes /var/apt/cache archives
+  rm -f /etc/apt/apt.conf.d/docker-clean && \
+  # and configure apt-get to keep downloaded archives in the cache
+  echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' >/etc/apt/apt.conf.d/keep-cache && \
+  apt update \
   && apt install -y --no-install-recommends \
   build-essential \
   curl \
   python3-venv \
-  cmake \
-  && apt clean \
-  && rm -rf /var/lib/apt/lists/*
+  cmake

 # Setup zig as cross compiling linker
 RUN python3 -m venv $HOME/.venv
-RUN .venv/bin/pip install cargo-zigbuild
+RUN \
+  --mount=type=cache,target=/buildkit-cache,id="tool-caches" \
+  .venv/bin/pip install cargo-zigbuild
 ENV PATH="$HOME/.venv/bin:$PATH"

 # Install rust
@@ -25,21 +38,32 @@ RUN case "$TARGETPLATFORM" in \
 esac

 # Update rustup whenever we bump the rust version
+ENV PATH="$CARGO_HOME/bin:$PATH"
 COPY rust-toolchain.toml rust-toolchain.toml
-RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --target $(cat rust_target.txt) --profile minimal --default-toolchain none
-ENV PATH="$HOME/.cargo/bin:$PATH"
-# Installs the correct toolchain version from rust-toolchain.toml and then the musl target
-RUN rustup target add $(cat rust_target.txt)
+RUN \
+  --mount=type=cache,target=/buildkit-cache,id="tool-caches" \
+  ( \
+    rustup self update \
+    || curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --target $(cat rust_target.txt) --profile minimal --default-toolchain none \
+  ) \
+  # Installs the correct toolchain version from rust-toolchain.toml and then the musl target
+  && rustup target add $(cat rust_target.txt)
Comment on lines +45 to +50

Since no toolchain is installed at this point, the […]

Also, due to the copied […]

Given that you're using the same […]

To do that, shift the earlier […]

Suggested change

NOTE: In my older PR I also set […] This is required if you run another […]

Minor improvement from my rejected PR was to fail early, such as with pipelines with […] You'd add this […]

FROM --platform=$BUILDPLATFORM ubuntu AS build
# Configure the shell to exit early if any command fails, or when referencing unset variables.
# Additionally `-x` outputs each command run, this is helpful for troubleshooting failures.
SHELL ["/bin/bash", "-eux", "-o", "pipefail", "-c"]

I had some build failures when building the image locally, for […]

 # Build
-COPY crates crates
-COPY ./Cargo.toml Cargo.toml
-COPY ./Cargo.lock Cargo.lock
-RUN case "${TARGETPLATFORM}" in \
+RUN \
+  # bind mounts to access Cargo config, lock, and sources, without having to
+  # copy them into the build layer and so bloat the docker build cache
+  --mount=type=bind,source=crates,target=crates \
+  --mount=type=bind,source=Cargo.toml,target=Cargo.toml \
+  --mount=type=bind,source=Cargo.lock,target=Cargo.lock \
+  # Cache mounts to speed up builds
+  --mount=type=cache,target=$HOME/target/ \
+  --mount=type=cache,target=/buildkit-cache,id="tool-caches" \
+  case "${TARGETPLATFORM}" in \
   "linux/arm64") export JEMALLOC_SYS_WITH_LG_PAGE=16;; \
   esac && \
Comment on lines +62 to 64

Adjusted […]

ARG TARGETPLATFORM
RUN \
  # Use bind mounts to access Cargo config, lock, and sources; without needing to
  # copy them into a build layer (avoids bloating the docker build layer cache):
  --mount=type=bind,source=crates,target=crates \
  --mount=type=bind,source=Cargo.toml,target=Cargo.toml \
  --mount=type=bind,source=Cargo.lock,target=Cargo.lock \
  # Add cache mounts to speed up builds:
  --mount=type=cache,target=${HOME}/target/ \
  --mount=type=cache,target=/buildkit-cache,id="tool-caches" \
  <<HEREDOC
  # Handle platform differences like mapping target arch to naming convention used by cargo targets:
  # https://en.wikipedia.org/wiki/X86-64#Industry_naming_conventions
  case "${TARGETPLATFORM}" in
    ( 'linux/amd64' )
      export CARGO_BUILD_TARGET='x86_64-unknown-linux-musl'
      ;;
    ( 'linux/arm64' )
      export CARGO_BUILD_TARGET='aarch64-unknown-linux-musl'
      export JEMALLOC_SYS_WITH_LG_PAGE=16
      ;;
    ( * )
      echo "ERROR: Unsupported target platform: '${TARGETPLATFORM}'"
      # `exit` rather than `return`: this script is not a function or sourced file
      exit 1
      ;;
  esac
  cargo zigbuild --release --bin uv --bin uvx --target "${CARGO_BUILD_TARGET}"
  cp "target/${CARGO_BUILD_TARGET}/release/uv" /uv
  cp "target/${CARGO_BUILD_TARGET}/release/uvx" /uvx
HEREDOC
-  cargo zigbuild --bin uv --bin uvx --target $(cat rust_target.txt) --release
-RUN cp target/$(cat rust_target.txt)/release/uv /uv \
+  cargo zigbuild --bin uv --bin uvx --target $(cat rust_target.txt) --release \
+  && cp target/$(cat rust_target.txt)/release/uv /uv \
   && cp target/$(cat rust_target.txt)/release/uvx /uvx
 # TODO(konsti): Optimize binary size, with a version that also works when cross compiling
 # RUN strip --strip-all /uv

This `RUN` does not play well with concurrent writers when that `tool-caches` cache mount is used, causing builds to fail: […]

While cargo might manage lock files to avoid this type of scenario, you need to be mindful of cache mount usage when it's not compatible with the default `sharing=shared` mount option.

To prevent this problem, use `sharing=locked` to block another build from writing to the same cache mount id, or run two separate build commands to build one platform at a time.

While on the topic of cache mounts: this is a non-issue for CI of a project where you only build a single `Dockerfile` that your project maintains.

However, on user systems, AFAIK if that `id` is used in another project's `Dockerfile`, it also shares that cache. Sometimes that's a non-issue, but be mindful of accidentally mixing/sharing with other projects that shouldn't share a cache mount, due to concerns like invalidating each other's storage, conflicting write access as seen here, or `sharing=locked` blocking a build of another project.
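As a rough illustration of the `sharing=locked` suggestion above, the build step could look something like the sketch below. The mount id `uv-tool-caches` is a hypothetical, project-specific name (not from the PR), used here only to show how an id can be made less likely to collide with other projects; the rest mirrors the mounts already in the diff.

```dockerfile
# Sketch only, assuming the same stage layout as the PR's Dockerfile.
# "uv-tool-caches" is a hypothetical id chosen to avoid sharing with other projects.
RUN \
  --mount=type=bind,source=crates,target=crates \
  --mount=type=bind,source=Cargo.toml,target=Cargo.toml \
  --mount=type=bind,source=Cargo.lock,target=Cargo.lock \
  --mount=type=cache,target=${HOME}/target/ \
  # sharing=locked makes a second build wait for the mount to be released,
  # instead of writing to it concurrently (the default is sharing=shared).
  --mount=type=cache,target=/buildkit-cache,id="uv-tool-caches",sharing=locked \
  cargo zigbuild --release --bin uv --bin uvx --target "$(cat rust_target.txt)"
```

If the id is renamed, it would need to be renamed in every `RUN` that is meant to share the cache, since a different id refers to a different cache. The trade-off is that a multi-platform build would run this step for one platform at a time.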
EDIT: As per feedback in the next comment, I'm really not sure that storing the toolchain in a cache mount is a good idea. Rather than apply this fix, it may be better to avoid the cache mount entirely (you'd then also be able to build the `build` stage and shell into it to troubleshoot the build if need be; actually maybe not, due to `CARGO_HOME`, if you need zigbuild).
I am not sure why the Rust toolchain is stored in a cache mount, while Zig and other toolchains are left in the image layers. Is it to pair an update of `rust-toolchain.toml` bumping the toolchain with triggering `rustup self update`?

The `COPY` for `rust-toolchain.toml` would invalidate the `RUN` layer, so it would be updated just the same, no?

- […] `Dockerfile` without common base layer sharing, but if those projects were configured with different toolchains they likewise accumulate in cache storage? (which is more prone to GC than an actively used layer) Cleaning up unused layers is probably preferable; cache should really be used for actual cache (I think it's possible for a cache mount to be cleared between `RUN` instructions, which is not ideal for a toolchain).
- […] `Dockerfile` is with the actual cargo build later on, so pulling from a CI cache blob or from the remote source (rustup, package manager, etc.) is not likely to be that much faster. Regardless, you're configuring persistence in CI via cache mounts; is that beneficial vs standard caching of image layers?

You will however benefit from the cache mount when building multiple targets separately (rather than multiple `cargo build` in the same `RUN`):

- […] `ARG TARGETPLATFORM` introducing a divergence in layer cache (1.3GB + 1.4GB to support without cache mount, but the actual diff is approx 200MB only).

That concern is easily fixed as per my suggestion for avoiding divergence at this point. Both targets added are 354MB combined. Total layer weight with minimal profile is 930MB (instead of 1.6GB), be that layer cache or a cache mount.
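For illustration only, keeping the toolchain in an image layer (no cache mount) and adding both musl targets before any `ARG TARGETPLATFORM` could be sketched roughly as follows. The `CARGO_HOME` and `RUSTUP_HOME` paths are assumptions rather than values from the PR; the target names are the ones used in the heredoc suggestion earlier in this review.

```dockerfile
# Sketch only: the CARGO_HOME / RUSTUP_HOME paths below are assumed, not from the PR.
ENV CARGO_HOME="/root/.cargo" \
    RUSTUP_HOME="/root/.rustup"
ENV PATH="${CARGO_HOME}/bin:${PATH}"
COPY rust-toolchain.toml rust-toolchain.toml
# No cache mount: rustup and the toolchain are stored in this image layer.
# Adding both targets in one step keeps the layer identical for every target
# platform, so the layer cache does not diverge on TARGETPLATFORM.
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs \
      | sh -s -- -y --profile minimal --default-toolchain none \
    && rustup target add aarch64-unknown-linux-musl x86_64-unknown-linux-musl
```

Since the `COPY` of `rust-toolchain.toml` precedes the `RUN`, a toolchain bump would invalidate this layer and trigger a fresh install, which is the behaviour described above.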
Breakdown:

Sizes (bolded is within a cache mount):

- `/buildkit-cache/rustup` (also adds 19MB to sibling dir `cargo/`):
  - `lib/rustlib/aarch64-unknown-linux-musl/lib` (135MB) / `lib/rustlib/x86_64-unknown-linux-musl/lib` (219MB)
  - `lib/rustlib/x86_64-unknown-linux-gnu/bin` (18MB) + `lib/rustlib/x86_64-unknown-linux-gnu/lib` (158MB)
  - `lib/libLLVM.so.19.1-rust-1.86.0-stable` (174MB) + `lib/librustc_driver-ea2439778c0a32ac.so` (141MB)
- `/buildkit-cache/pip/http-v2`
- `/var/cache/apt` (220MB) + `/var/lib/apt` (48MB)
- `/root/.venv/lib/python3.12/site-packages/ziglang`
- `/usr` (base package layer adds 436MB)

Image build time:

On a budget VPS (Fedora 42 at Vultr, 1 vCPU + 2GB RAM with 3GB more via zram swap):

- `apt` layer built within 37s
- `cargo-zigbuild` install: 12s
- `rustup` setup: 32s
- `cargo` release build (x86_64): 2 hours 25 minutes

The build took excessively long, presumably due to the single CPU and quite possibly the limited RAM; I didn't investigate that too extensively. Changing from `lto="fat"` to `lto="thin"` brought the build time down to 43 minutes, at the expense of the binary being 25% larger (40MB => 50MB).

You're getting much better results reported for the build, but the bulk of the time is down to the actual build. I'd avoid wasting CI cache storage on the Rust toolchain (causing evictions sooner than necessary for cache items that are actually helpful); saving a minute at best is worth less than using that cache to optimize the build time itself (which requires `sccache` IIRC to be decent, but is not without quirks).

That said, you can use the cache mounts in CI and not upload/restore them, to minimize the image layer cache, but presently there is very little benefit in caching image layers at all? You could instead just focus on the cache mount(s) for the `cargo` build itself.

The `cargo` target cache is 1GB alone when building this project, but as mentioned it's a bit of a hassle to actually leverage in CI.

After a build: […]
For reference, the cargo and zig caches are decent in size, but a good portion of the cargo one isn't relevant, and is the zigbuild cache mount even worthwhile?

As per my PR attempt, the bulk of the cargo cache mount is data that is quick to generate/compute at build time, and thus not worth persisting. I used two separate tmpfs cache mounts to filter those out: […]

This is only relevant if storage of the cache mount is a concern, which it may be for keeping CI limits tame; otherwise it is overkill :)