Description
Is it possible for desync to avoid modifying files whose content has not changed?
There are no blocks/chunks to update, yet the mtime is modified each time I run the untar command. (desync untar appears to be the equivalent of casync extract for a directory tree?)
UPDATE: See the follow-up comment. At a glance, I think desync could diff two index (caidx) files, one from before and one from after, to filter out files whose content digest (and perhaps other metadata attributes) is unchanged, since desync mtree / desync info can derive this information from a common store dir and the caidx files?
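To illustrate the idea of comparing two manifests, here is a minimal sketch rather than anything desync provides. It assumes both manifests were saved from `desync mtree` output in the format shown under Reproduction, that the path is the first whitespace-separated field, and that regular files carry a `sha512256digest=` keyword. The function name `changed_files` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print paths whose content digest differs between two
# mtree manifests (or that only exist in the new one).
# Assumption: path is field 1 and files carry sha512256digest=<hex>.
changed_files() {
    old_manifest=$1
    new_manifest=$2
    awk '
        # Return the digest value on the current line, or "" if there is none.
        function digest(   i) {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^sha512256digest=/)
                    return substr($i, index($i, "=") + 1)
            return ""
        }
        /^#/ || NF == 0 { next }                         # skip comments and blank lines
        FILENAME == ARGV[1] { old[$1] = digest(); next } # old manifest: remember digests
        {
            d = digest()
            # Entries without a digest (directories) are ignored.
            if (d != "" && old[$1] != d) print $1
        }
    ' "$old_manifest" "$new_manifest"
}
```

For example, after saving `desync mtree -i -s store-src old.caidx > old.mtree` and the same for the new index, `changed_files old.mtree new.mtree` would list only the files worth rewriting. Files removed in the new manifest are not reported by this sketch.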
Context
I am new to casync / desync. There are a lot of options, commands, and jargon to digest, so perhaps I've misunderstood something. I've looked over existing issues, and this may be a duplicate of #242 or just overlap with it.
In my scenario, I wanted to sync changes from the archive (src) to the target (dest) based on file content (I'm not concerned with file metadata changes at this point). My impression was that desync could detect exactly what needs to be updated from an index/store. The linked issue suggests this is a limitation of untar, which would need support for providing a seed:
Wrt the untar stage, the issue is that with seeds you can check if a chunk is present or not. But there aren't any chunks available if the target is a directory. A caidx is just a caibx of an archive (catar), and there's no concept of chunks of the target.
A chunk inside an archive can span over multiple files so you couldn't really say a file is changed or not until it's unpacked.
Changes to mtime or other metadata are problematic within Docker images. Similar to the linked issue, I'm interested in updating a filesystem root with only the subset of changes from the archive (typically much smaller than the existing destination target).
When all files are modified redundantly, the new Docker layer duplicates that file content in full, which is undesirable.
Reproduction
Related issue with casync: systemd/casync#264 (comment)
$ docker run --rm -it fedora:41
# Get desync:
$ curl -fsSL https://github.com/folbricht/desync/releases/download/v0.9.6/desync_0.9.6_linux_amd64.tar.gz \
| tar -xz --no-same-owner -C /usr/local/bin desync
# Prep basic content example:
$ cd /tmp && mkdir -p src && touch src/file
# Add content if it makes a difference (24 bytes):
$ echo 'this is a quick example' > src/file
# Avoid storing mtime:
# NOTE: `casync make` supports archiving filesystem directories to castr stores, but `desync make` does not? (`desync tar` instead?)
$ mkdir store-src
$ desync tar --no-time --store store-src --index src.caidx src
$ desync untar --store store-src --index src.caidx dest
# Alternatively (without index):
# desync tar --no-time src.catar src
# desync untar src.catar dest
$ ls -li dest
total 4
317278 -rw-r--r-- 1 root root 24 Sep 11 09:00 file
# Wait a minute and try again:
$ desync untar --index --store store-src src.caidx dest
$ ls -li dest
total 4
317278 -rw-r--r-- 1 root root 24 Sep 11 09:01 file
# Inspect archive:
$ desync mtree -i -s store-src src.caidx
# Alternatively (without index):
# desync mtree src.catar
#mtree v1.0
. type=dir mode=0755 uid=0 gid=0 time=0.000000000
file type=file mode=0644 uid=0 gid=0 size=24 time=0.000000000 sha512256digest=97b0fc819edb24745c11422b30476acf214a8459d888fb5dda857ee9bb195a5e
It does manage to avoid replacing the inode, unlike casync, which I think is an improvement. However, I'd rather it did not modify files unnecessarily.
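As a possible stopgap (a sketch under stated assumptions, not a desync feature), the archive could be extracted into a scratch directory and only the files whose bytes differ copied into the real target, so unchanged files keep their mtime. The function name `sync_changed` and the `staging` directory are hypothetical. The sketch ignores deletions, symlinks, ownership, and permissions, and assumes file names contain no newlines.

```shell
#!/bin/sh
# Hypothetical workaround: copy only files with different content from a
# staging directory into dest, leaving identical files (and their mtime) alone.
sync_changed() {
    staging=$1
    dest=$2
    # List regular files relative to the staging root.
    (cd "$staging" && find . -type f) | while IFS= read -r rel; do
        if [ -f "$dest/$rel" ] && cmp -s "$staging/$rel" "$dest/$rel"; then
            continue    # identical content: do not touch the existing file
        fi
        mkdir -p "$(dirname "$dest/$rel")"
        cp "$staging/$rel" "$dest/$rel"
        printf '%s\n' "$rel"    # report what was written
    done
}
```

With the commands from the reproduction, this would look like `desync untar --index --store store-src src.caidx staging` followed by `sync_changed staging dest`. It costs a full extraction plus a byte-wise comparison, so it only addresses the redundant-modification problem, not the extraction cost.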