Larhzu commented Dec 27, 2024

xz's default behavior is to delete the input file after successful compression or decompression (unless writing to standard output). If the system crashes soon after the deletion, it is possible that the newly written file has not yet hit the disk while the previous delete operation might have. In that case neither the original file nor the written file is available.

The --synchronous option makes xz call fsync() on the file and possibly the directory where the file was created. A similar option was added to GNU gzip 1.7 in 2016.
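The ordering described above (sync the new file, sync its directory, and only then delete the input) can be sketched roughly as follows for a POSIX system. This is a minimal illustration, not the code from this pull request: the helper names `sync_parent_dir` and `sync_then_unlink` are hypothetical, and error reporting is reduced to a return value.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not from xz): fsync() the directory containing `path`
 * so that the directory entry of a newly created file reaches the disk. */
static int sync_parent_dir(const char *path)
{
    assert(path != NULL);

    /* dirname() may modify its argument, so work on a private copy. */
    char *copy = strdup(path);
    if (copy == NULL)
        return -1;

    /* The result of dirname() may point into `copy`, so open the
     * directory before freeing the copy. */
    int dir_fd = open(dirname(copy), O_RDONLY);
    free(copy);
    if (dir_fd == -1)
        return -1;

    int ret = fsync(dir_fd);
    if (close(dir_fd) != 0)
        ret = -1;

    return ret;
}

/* Hypothetical helper (not from xz): make the output file `dest` durable
 * first, and remove the input file `src` only if both syncs succeeded. */
static int sync_then_unlink(int dest_fd, const char *dest, const char *src)
{
    if (fsync(dest_fd) != 0)
        return -1;

    if (sync_parent_dir(dest) != 0)
        return -1;

    return unlink(src);
}
```

If either sync fails, the input file is left in place, so at least one of the two files should survive a crash.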

Larhzu and others added 2 commits December 27, 2024 09:15
xz's default behavior is to delete the input file after successful
compression or decompression (unless writing to standard output).
If the system crashes soon after the deletion, it is possible that
the newly written file has not yet hit the disk while the previous
delete operation might have. In that case neither the original file
nor the written file is available.

The --synchronous option makes xz call fsync() on the file and possibly
the directory where the file was created. A similar option was added to
GNU gzip 1.7 in 2016. There are some differences in behavior:

  - When writing to standard output and processing multiple input files,
    xz calls fsync() after every file while gzip does so only after all
    files have been processed.

  - --synchronous has no effect on "xz --list". xz doesn't sync standard
    output in --list mode but gzip does.

Portability notes:

  - <libgen.h> and dirname() should be available on all POSIX systems,
    and aren't needed on non-POSIX systems.

  - fsync() is available on all POSIX systems. The directory syncing
    could be changed to fdatasync() although at least on ext4 it
    doesn't seem to make a performance difference in xz's usage.
    fdatasync() would need a build system check to support (old)
    special cases; for example, MINIX 3.3.0 doesn't have fdatasync()
    and Solaris 10 needs -lrt.

  - On native Windows, _commit() is used to replace fsync(). Directory
    syncing isn't done and shouldn't be needed. (In Cygwin, fsync() on
    directories is a no-op.) It is known that syncing will fail if
    writing to stdout and stdout isn't a regular file.

  - DJGPP has fsync() for files. ;-)

Co-authored-by: Sebastian Andrzej Siewior <[email protected]>
Link: https://bugs.debian.org/814089
Link: https://www.mail-archive.com/[email protected]/msg00282.html
Closes: #151
Closes: #159
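
The file-level part of the portability notes above could be wrapped along these lines. This is only a sketch under stated assumptions: the name `portable_fsync` is hypothetical and not taken from the xz sources, and directory syncing is left out because the notes say it isn't done on native Windows.

```c
#include <assert.h>

#ifdef _WIN32
#   include <io.h>      /* _commit() */
#else
#   include <unistd.h>  /* fsync() */
#endif

/* Hypothetical wrapper (not from xz): flush the data of `fd` to stable
 * storage using whichever call the platform provides.
 * Returns 0 on success and -1 on failure. */
static int portable_fsync(int fd)
{
    assert(fd >= 0);

#ifdef _WIN32
    /* Native Windows has no fsync(); _commit() is the replacement. */
    return _commit(fd);
#else
    return fsync(fd);
#endif
}
```

A caller should expect this to fail when the descriptor doesn't support syncing. The note above reports this for native Windows when stdout isn't a regular file, and on Linux fsync() similarly fails on a pipe.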
Larhzu closed this Jan 4, 2025
Larhzu deleted the synchronous branch January 5, 2025 20:28