Commit 0a6ec90

Thiago Goulart authored and facebook-github-bot committed
New "deterministic_output" argument to produce consistent zip files across all host platforms
Summary:

### Motivation

My team has a concrete need for buck to generate 100% matching zip files for the same set of inputs on all host platforms (macOS, Linux, Windows).

Current limitations:
1. File order can differ on file systems with different case sensitivity.
2. Windows can't write the correct POSIX mode (i.e. permissions) for any entry.

Although the entries themselves might fully match, those discrepancies result in different metadata, which results in a different zip file.

See D67149264 for an in-depth explanation of the use case that requires this level of determinism.

### Tentative solution #1

In D66386385, I made the asset generation rule executable only from Linux. Paired with buck cross builds, this made the outputs from macOS and Linux match, but it did not work on Windows [due to some lower level buck problem](https://fb.workplace.com/groups/930797200910874/posts/1548299102494011) (still unresolved).

### Tentative solution #2

In D66404381, I wrote my own Python script to create zip files. I got all the files and metadata to match everywhere, but I could not get around differences in the compression results. I decided not to pursue this because compression is important for file size.

### Tentative solution #3

In D67149264, I duplicated and tweaked buck's zip binary. It did work, but IanChilds rightfully pointed out that I'd be making maintenance of those libraries more difficult and that the team is even planning on deleting them at some point.

### Tentative solution #4 (this diff!)

IanChilds advised me to try to fix buck itself to produce consistent results, so this is me giving it a try. Because the root problem could not be fixed in a backwards compatible way (the file permissions, specifically; see inlined comment), I decided to use an argument to control whether the zip tool should strive to produce a deterministic output or not, at the expense of some loss of metadata.

The changes are simple and backwards compatible, but any feedback on the root problem, idea and execution is welcome.

Reviewed By: christolliday

Differential Revision: D67301945

fbshipit-source-id: c42ef7a52efd235b43509337913d905bcbaf3782
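To make the two limitations listed under Motivation concrete, here is a minimal Python sketch of how entry order and metadata can be normalized when writing a zip. It is not Buck's implementation: the function name `write_deterministic_zip`, the fixed timestamp, and the fixed mode are all assumptions chosen for illustration. It also uses stored (uncompressed) entries to sidestep the compression differences mentioned under tentative solution #2, so it does not address that part of the problem.

```python
import io
import zipfile
from typing import Mapping

# Fixed values chosen for this sketch; they are assumptions, not Buck's constants.
FIXED_TIMESTAMP = (1980, 1, 1, 0, 0, 0)  # earliest date the zip format can store
FIXED_MODE = 0o644  # one permission set applied to every entry


def write_deterministic_zip(files: Mapping[str, bytes]) -> bytes:
    """Build a zip whose bytes depend only on entry names and contents."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Sort by the raw name so the order does not depend on how the host
        # file system happens to list entries (e.g. case-insensitive listings).
        for name in sorted(files):
            info = zipfile.ZipInfo(filename=name, date_time=FIXED_TIMESTAMP)
            info.create_system = 3  # always record "Unix", whatever the host is
            info.external_attr = (0o100000 | FIXED_MODE) << 16  # regular file + mode
            # Stored entries avoid compressor-version differences in this sketch.
            info.compress_type = zipfile.ZIP_STORED
            archive.writestr(info, files[name])
    return buffer.getvalue()
```

With these choices, two calls that receive the same names and contents in a different order return identical bytes. The trade-off is the one the commit message describes: real permissions and timestamps are discarded in exchange for a reproducible archive.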
1 parent 6749340 commit 0a6ec90

2 files changed, +9 -0 lines changed

prelude/decls/core_rules.bzl

Lines changed: 5 additions & 0 deletions
@@ -1485,6 +1485,11 @@ zip_file = prelude_rule(
 
             The regexes must be defined using `java.util.regex.Pattern` syntax.
         """),
+        "deterministic_output": attrs.option(attrs.bool(), default = None, doc = """
+            If set to true, Buck ensures that all files in the generated zip and their associated metadata are
+            consistent across all platforms, resulting in an identical zip file everywhere. Note that this might
+            come at the expense of losing some otherwise relevant metadata, like file permissions and timestamps.
+        """),
         "on_duplicate_entry": attrs.enum(OnDuplicateEntry, default = "overwrite", doc = """
             Action performed when Buck detects that zip\\_file input contains multiple entries with the same
             name.

prelude/zip_file/zip_file.bzl

Lines changed: 4 additions & 0 deletions
@@ -26,6 +26,7 @@ def _zip_file_impl(ctx: AnalysisContext) -> list[Provider]:
 
     on_duplicate_entry = ctx.attrs.on_duplicate_entry
     entries_to_exclude = ctx.attrs.entries_to_exclude
+    deterministic_output = ctx.attrs.deterministic_output
     zip_srcs = ctx.attrs.zip_srcs
     srcs = ctx.attrs.srcs
 
@@ -59,6 +60,9 @@ def _zip_file_impl(ctx: AnalysisContext) -> list[Provider]:
         create_zip_cmd.append("--entries_to_exclude")
         create_zip_cmd.append(entries_to_exclude)
 
+    if deterministic_output:
+        create_zip_cmd.append("--deterministic_output")
+
     ctx.actions.run(cmd_args(create_zip_cmd), category = "zip")
 
     return [DefaultInfo(default_output = output)]
