Commit c491213

Merge pull request #300 from lexming/reprod-tarballs
update documentation on reproducible tarballs for EasyBuild 5.0
2 parents d56ce93 + 2a3531b commit c491213

1 file changed: +37 -10 lines changed
docs/easybuild-v5/enhancements.md

@@ -76,16 +76,43 @@ status codes are reported in the corresponding logs.
 ## Reproducible tarballs for sources created via `git_config` { : #reproducible-tarballs-git_config }
 
 EasyBuild can now generate reproducible tarballs of sources cloned from Git
-repositories. This means that those sources using the `git_config` option will
-now have consistent contents across different systems and across time, allowing
-to reliably validate them with checksums. EasyBuild follows the
-[archival guidelines from reproducible-builds.org](https://reproducible-builds.org/docs/archives/)
-to generate those reproducible tarballs.
-
-This new feature does not apply to sources cloned with `keep_git_dir` enabled.
-Including the `.git` folder in the sources is inherently time-dependent as it
-contains information about the clone action itself, which hinders the creation
-of a reproducible tarball.
+repositories. This means that easyconfigs with sources using the `git_config`
+option can now have consistent contents across different systems and across
+time, allowing them to be reliably validated with checksums.
+
+EasyBuild follows the [archival guidelines from reproducible-builds.org](https://reproducible-builds.org/docs/archives/)
+to generate reproducible tarballs. The new method to create archives in
+EasyBuild 5.0 is fully implemented in Python, which removes our dependency on
+external tools such as [GNU Tar](https://www.gnu.org/software/tar/) or file
+compressors for this task. However, extraction of archives continues to work
+by executing external commands on the host system.
+
+Reproducible tarballs have the following restrictions:
+
+- Sources cloned with `keep_git_dir` enabled cannot be archived in a
+reproducible manner. Including the `.git` folder in the sources is inherently
+time-dependent as it contains information about the clone action itself, which
+hinders the creation of a reproducible tarball. Hence, EasyBuild 5.0 will
+create the archive of sources with `keep_git_dir`, but their checksums cannot
+be validated across systems.
+
+- Reproducible archives are supported in uncompressed TAR format (`.tar`) or
+for tarballs compressed with [XZ compression](https://en.wikipedia.org/wiki/XZ_Utils)
+(`.tar.xz`). The widespread [GZip compression](https://en.wikipedia.org/wiki/Gzip)
+is not currently supported because its implementation injects metadata in the
+compressed archive that is time-dependent.
+
+- Systems running EasyBuild with Python < 3.9 will skip checksum validation for
+sources from Git repos. Due to changes in the low-level code of the `tarfile`
+module in the Python base distribution, tarballs generated before version 3.9
+result in archives with different contents than those generated in Python 3.9+.
+
+Easyconfigs found in the repository of EasyBuild that contain sources from Git
+repos without `keep_git_dir` have already been updated to use reproducible
+tarballs. Archives will be created in `.tar.xz` format and checksums will be
+validated on Python 3.9+. Therefore, beware that EasyBuild 5.0 might generate
+new archives for sources that were already cloned in your system due to these
+changes in format.
 
 ---
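For readers reviewing this change, the general technique described in the new text (a pure-Python archive with normalised metadata, compressed with XZ) can be sketched as follows. This is a minimal illustration under stated assumptions, not EasyBuild's actual implementation: the function name `make_reproducible_tar_xz`, the choice of `tarfile.GNU_FORMAT`, the fixed permission bits, and the XZ preset are all hypothetical choices made for this example.

```python
import io
import lzma
import tarfile
from pathlib import Path


def make_reproducible_tar_xz(src_dir: Path) -> bytes:
    """Return a .tar.xz archive of src_dir with normalised metadata.

    Hypothetical helper for illustration only; not EasyBuild's API.
    """

    def normalise(info: tarfile.TarInfo) -> tarfile.TarInfo:
        # Drop everything that depends on when or by whom the files were created.
        info.mtime = 0
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        # Keep only the executable/non-executable distinction (assumed policy).
        executable = info.isdir() or bool(info.mode & 0o100)
        info.mode = 0o755 if executable else 0o644
        return info

    raw = io.BytesIO()
    # An explicit tar format avoids depending on the interpreter's default.
    with tarfile.open(fileobj=raw, mode="w", format=tarfile.GNU_FORMAT) as tar:
        # Sorted traversal makes member order independent of the filesystem.
        for path in sorted(src_dir.rglob("*")):
            tar.add(
                path,
                arcname=path.relative_to(src_dir).as_posix(),
                recursive=False,
                filter=normalise,
            )
    # Compress in memory with the standard library, no external xz binary.
    return lzma.compress(raw.getvalue(), preset=6)
```

Calling the helper twice on the same tree, even after file timestamps have changed, should yield byte-identical archives on the same Python and liblzma versions, which is what makes a stored checksum meaningful.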

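The GZip restriction mentioned in the diff relates to the modification-time field stored in every gzip header. The sketch below, using only the Python standard library, shows that field changing the output for identical input, while XZ output stays stable. It illustrates the format property only and makes no claim about how EasyBuild handles compression internally; the helper name `gzip_bytes` is made up for this example.

```python
import gzip
import io
import lzma


def gzip_bytes(data: bytes, mtime: int) -> bytes:
    """Compress data with gzip, stamping the given mtime into the header."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


payload = b"same payload " * 100

# Same input, compressed at two different (simulated) points in time.
early = gzip_bytes(payload, mtime=1_000_000_000)
late = gzip_bytes(payload, mtime=1_700_000_000)

# Only header bytes 4..7 (the little-endian mtime field) differ.
gzip_outputs_differ = early != late
only_mtime_differs = early[:4] == late[:4] and early[8:] == late[8:]

# The XZ container has no timestamp field, so repeated runs match.
xz_outputs_match = lzma.compress(payload) == lzma.compress(payload)
```

When no `mtime` is passed, `gzip.GzipFile` stamps the current time, so two otherwise identical runs produce different checksums.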