-
Notifications
You must be signed in to change notification settings - Fork 5k
System.Formats.Tar: default to no ctime/atime. #115778
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
atime and ctime are not well supported on non-PAX formats. Even for PAX formats, tools do not include these timestamps by default because they are of limited use. This makes .NET to also default to not include these timestamps.
Tagging subscribers to this area: @dotnet/area-system-formats-tar |
@dotnet/area-system-formats-tar ptal. I still need to see what additional test cases make sense. Implementation wise this is ready for a review. |
/// The specified <paramref name="extendedAttributes"/> get appended to the default attributes, unless the specified enumeration overrides any of them. | ||
/// <para>The following entries are always found in the Extended Attributes dictionary of any PAX entry:</para> | ||
/// The specified <paramref name="extendedAttributes"/> are additional attributes to be used for the entry. | ||
/// <para>It may include PAX attributes like:</para> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated this list to the things I think a user may want to set.
For the other items mentioned, TarEntry
has properties which control what is effectively written.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We had previously already agreed to keep ATime and CTime for GNU when the user explicitly set them. #115211
This PR goes against that by removing them entirely for Gnu entries. Is that really what we want? I understand that they were problematic, but Carlos was trying to preserve the support for them for those that may have wanted it while avoiding use it by default.
We have a goal of being as compatible as possible - if a tar file did use these fields, preserve them. Don't write them for new Tar files, unless the user goes out of their way to set them explicitly. We felt that struck the balance between avoiding these while still being compatible. Carlos did a lot of manual testing by round tripping every Tar we produce to ensure that our current implementation will produce binary-identical tar files after read/write (without compression).
I didn't intend to change this behavior, and I don't think I updated tests to expect different behavior so I assumed it had not changed. I'll fix it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may have jumped to conclusion that you were changing behavior here, maybe that's not the case. The overriding goal should be preserve/round-trip and make it hard for user to fall in pit of failure (no automatic time attribute in GNU). So long as we move in that direction it's good.
I like the simplifications here.
A test we should do for this once we are settled on implementation is to do round-tripping for all the TAR's produced in our build. We did that before by downloading all the tar files from a unified build, decompressing, then passing through tarreader/writer to ensure no change.
src/libraries/System.Formats.Tar/src/System/Formats/Tar/TarHeader.Read.cs
Show resolved
Hide resolved
src/libraries/System.Formats.Tar/tests/TarEntry/PaxTarEntry.Conversion.Tests.cs
Show resolved
Hide resolved
I've addressed the feedback, and with the additional test cases I think this is now also good for test coverage. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking much better. Thank you for addressing the feedback.
Were you able to test round-tripping of real-world Tar files? Previously when Carlos did this work he used a small tool like https://github.com/dotnet/arcade/blob/dbc44140e227eb6d14a29943fb7ea0c7c3c9aca0/src/Microsoft.DotNet.SignTool/src/ZipData.cs#L476-L502 (simplified to just read/write) and ran it over every tar file in the combined product output.
If you wanted to do all that you could, but not sure we have a great way of fetching the combined output publicly. Maybe instead you could spot check a couple of the files from https://github.com/dotnet/dotnet/blob/main/docs/builds-table.md
Maybe a linux, linux-musl, and mac tar file, and the deb file (expanding it first to get to the tars). Run those through a rewrite to ensure there is no binary diff.
Thanks for the reviewing!
I'll do rond trip testing on some of these tar files. I should have time to do this on Monday. I'll share the results then. |
atime and ctime are not well supported on non-PAX formats.
Even for PAX formats, tools do not include these timestamps by default because they are of limited use.
This makes .NET to also default to not include these timestamps.
Fixes #115434.