build vs. buildx cache evaluation behaves differently on AWS FSx for Lustre mount #838
Description
Behavior
In a directory mounted from an AWS FSx for Lustre volume, docker buildx build will invalidate the build cache when a file has been replaced, even by one with the same contents, while docker build will preserve the build cache.
Desired behavior
If the file content remains the same, the build cache should be used, as it is a) with docker build and b) with docker buildx build on a traditional filesystem. This would enable, for example, sharing the build cache across multiple identical clones of a repo in a CI environment that happen to use this kind of mount (sketched below).
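To make the CI motivation concrete, the intended workflow would look roughly like the following sketch; the repository URL and directory names here are placeholders, not an actual setup:

$ cd /mnt/fsx/fs1/ci
$ git clone https://example.com/org/repo.git job-1 && docker buildx build job-1   # first job populates the cache
$ git clone https://example.com/org/repo.git job-2 && docker buildx build job-2   # identical contents; should be a cache hit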
Environment
docker info:
Client:
Context: default
Debug Mode: false
Plugins:
app: Docker App (Docker Inc., v0.9.1-beta3)
buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)
Server:
Containers: 2
Running: 0
Paused: 0
Stopped: 2
Images: 9
Server Version: 20.10.5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 05f951a3781f4f2c1911b05e61c160e9c30eaa8e
runc version: 12644e614e25b05da6fd08a38ffa0cfe1903fdec
init version: de40ad0
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 5.4.0-1025-aws
Operating System: Ubuntu 18.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 30.36GiB
Name: ip-10-255-0-85
ID: XYYA:2K4L:PS6H:N62Y:T24V:WCR3:6FC2:7AUH:WMQ7:EIRJ:ZAYP:OOJU
Docker Root Dir: /var/lib/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No swap limit support
mount | grep lustre:
10.255.0.107@tcp:/5mgfhbmv on /mnt/fsx/fs1 type lustre (rw,noatime,flock,lazystatfs,_netdev)
Steps to reproduce
I omitted the steps to set up an EC2 instance with an FSx for Lustre volume, because I suspect the issue is somewhat broader than this specific setup. I'm happy to test out another environment (perhaps a simpler network-mounted volume?) to nail down the scope of this issue, but I'm hoping for advice on what would be most sensible to try.
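For example, an attempt at a simpler reproduction might follow the same steps below on an ordinary NFS mount; the server name and export path here are placeholders, and I have not tried this yet:

$ sudo mkdir -p /mnt/nfs && sudo mount -t nfs nfs-server:/export /mnt/nfs
$ mkdir -p /mnt/nfs/test/cache_test/src && cd /mnt/nfs/test/cache_test
$ # ...then continue with the prepare/build/replace steps below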
Prepare the build context:
$ docker buildx use default
$ mkdir -p /mnt/fsx/fs1/test/cache_test/src
$ cd /mnt/fsx/fs1/test/cache_test
$ echo 'FROM busybox:latest' > Dockerfile
$ echo 'COPY src /src' >> Dockerfile
$ echo 'RUN sleep 3' >> Dockerfile
$ touch src/cache_test_file
^ assuming an FSx mount at /mnt/fsx/fs1
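For reference, the resulting Dockerfile is:

FROM busybox:latest
COPY src /src
RUN sleep 3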
Demonstrate the build cache invalidation:
$ docker buildx build . # will build without cache
[+] Building 3.6s (8/8) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 358B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/busybox:latest 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 647B 0.0s
=> CACHED [1/3] FROM docker.io/library/busybox:latest 0.0s
=> [2/3] COPY src /src 0.1s
=> [3/3] RUN sleep 3 3.3s
=> exporting to image 0.1s
=> => exporting layers 0.1s
=> => writing image sha256:2676c4bf10cce5d5efce5bbc1f5ba26541e5b30c89366911e81483420e6dd5de 0.0s
$ docker buildx build . # will build with cache
[+] Building 0.1s (8/8) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 306B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/busybox:latest 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 643B 0.0s
=> [1/3] FROM docker.io/library/busybox:latest 0.0s
=> CACHED [2/3] COPY src /src 0.0s
=> CACHED [3/3] RUN sleep 3 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:2676c4bf10cce5d5efce5bbc1f5ba26541e5b30c89366911e81483420e6dd5de 0.0s
$ rm -rf src/ && mkdir src && touch src/cache_test_file && docker buildx build . # will build without cache
[+] Building 3.5s (8/8) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 306B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/busybox:latest 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 647B 0.0s
=> CACHED [1/3] FROM docker.io/library/busybox:latest 0.0s
=> [2/3] COPY src /src 0.1s
=> [3/3] RUN sleep 3 3.3s
=> exporting to image 0.1s
=> => exporting layers 0.1s
=> => writing image sha256:cd55e083e5fe225f2c8dc318535104a2c9564529e7b7f0a6df549eadb42b1b28 0.0s
Since the file contents in the final build are the same as in the previous builds, the cached layers should have been reused rather than rebuilt.
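To help narrow down what actually changes between builds, a check along the following lines could be run around the rm/recreate step; treating the inode and timestamps as the relevant metadata is only a guess on my part, not a confirmed cause:

$ sha256sum src/cache_test_file && stat -c 'inode=%i mtime=%Y ctime=%Z mode=%a' src/cache_test_file
$ rm -rf src/ && mkdir src && touch src/cache_test_file
$ sha256sum src/cache_test_file && stat -c 'inode=%i mtime=%Y ctime=%Z mode=%a' src/cache_test_file
# the content hash stays identical; only the inode and timestamps differ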
Discrepancy vs. docker build and vs. a traditional filesystem
In the context prepared above, docker build will not invalidate the build cache in the same way:
$ cd /mnt/fsx/fs1/test/cache_test
$ docker build . # will build without cache
$ rm -rf src/ && mkdir src && touch src/cache_test_file && docker build . # will build with cache
Similarly, in a directory not mounted from the FSx volume, docker build and docker buildx build will preserve the build cache:
$ cp -r ../cache_test $HOME
$ cd $HOME/cache_test
$ rm -rf src/ && mkdir src && touch src/cache_test_file && docker buildx build . # will build without cache
$ docker buildx build . # will build with cache
$ rm -rf src/ && mkdir src && touch src/cache_test_file && docker buildx build . # will build with cache
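Given that a copy of the context outside the mount behaves correctly, one possible (untested) workaround for CI would be to stage the context onto local disk before building, for example:

$ rsync -a /mnt/fsx/fs1/test/cache_test/ /tmp/cache_test/   # copy the context off the Lustre mount
$ docker buildx build /tmp/cache_test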