Description
I wanted to iterate over the contents and metadata of some archives, and SharpCompress has largely worked great for that.
I was initially confused when ReaderFactory.Open(stream) didn't work on 7z. The official docs do cover this, but I missed it.
I then saw that the docs also recommend, for performance, using ExtractAllEntries for 7z and RAR. Although it seemed to be intended primarily for writing files out, it looked like a good way to go for performance, and it returns an IReader. It generally worked well, but I noticed that on some archives the memory usage was huge, and I eventually traced this to the dictionary size (a 350 MB dictionary may result in about 800 MB of memory usage). I also noticed that just accessing .Entries on the SevenZipArchive didn't have this penalty. I then realized my error: ExtractAllEntries calls LoadEntries rather than GetEntries, which causes the full dictionary load.
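A minimal sketch of the metadata-only path described above, listing entries through .Entries rather than ExtractAllEntries. The class and method names (SevenZipListing, ListSevenZipEntries) are hypothetical, and the comment about what gets loaded reflects the behaviour reported here rather than anything the code itself proves:

```csharp
using System;
using System.IO;
using SharpCompress.Archives.SevenZip;

internal static class SevenZipListing
{
    // Hypothetical helper: prints the key and size of every file entry in a 7z archive.
    public static void ListSevenZipEntries(string archivePath)
    {
        using var stream = File.OpenRead(archivePath);
        using var archive = SevenZipArchive.Open(stream);

        // Assumption (based on the observation above): enumerating Entries
        // does not decompress anything; that only happens once
        // OpenEntryStream() or WriteTo() is called on an entry.
        foreach (var entry in archive.Entries)
        {
            if (entry.IsDirectory)
            {
                continue; // directories carry no data
            }
            Console.WriteLine($"{entry.Key}: {entry.Size} bytes");
        }
    }
}
```

Calling entry.OpenEntryStream() inside the loop would then decompress only the entries that are actually needed.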
As I wanted to minimize the alternate code path for 7z, I eventually settled on a custom IReader implementation. It worked well for me, but there may be some problem beyond performance that explains why 7z is specifically excluded from IReader support.
internal class Our7ZReader(SevenZipArchive archive) : IReader {
    // Fetch the enumerator lazily so that constructing the reader does no work.
    public bool MoveToNextEntry() {
        if (!inited) {
            inited = true;
            enumerator = archive.Entries?.GetEnumerator();
        }
        // Null-safe: returns false after Dispose() instead of throwing.
        return enumerator?.MoveNext() ?? false;
    }
    // Does not compile outside SharpCompress: EntryStream's constructor is internal.
    public EntryStream OpenEntryStream() => new EntryStream(this, enumerator?.Current.OpenEntryStream());
    public void WriteEntryTo(Stream writableStream) => enumerator?.Current.WriteTo(writableStream);
    private bool inited = false;
    private IEnumerator<SevenZipArchiveEntry> enumerator;
    public ArchiveType ArchiveType => archive.Type;
    public IEntry Entry => enumerator?.Current;
    public bool Cancelled { get; private set; }
    // The events are required by IReader but are never raised by this implementation.
    public event EventHandler<ReaderExtractionEventArgs<IEntry>> EntryExtractionProgress;
    public event EventHandler<CompressedBytesReadEventArgs> CompressedBytesRead;
    public event EventHandler<FilePartExtractionBeginEventArgs> FilePartExtractionBegin;
    public void Cancel() => Cancelled = true;
    public void Dispose() {
        enumerator?.Dispose();
        enumerator = null;
    }
}

I then add detection code before my use of ReaderFactory, and only call ReaderFactory if my IReader reader has not been set:
} else if (archivePath.EndsWith(".7z", StringComparison.CurrentCultureIgnoreCase)) {
    if (SharpCompress.Archives.SevenZip.SevenZipArchive.IsSevenZipFile(stream)) {
        // Rewind after the signature check before opening the archive.
        stream.Seek(0, SeekOrigin.Begin);
        var archive = SharpCompress.Archives.SevenZip.SevenZipArchive.Open(stream);
        toDispose.Add(archive);
        var ourReader = new Our7ZReader(archive);
        reader = ourReader;
        toDispose.Add(ourReader);
    }
}

Again, there is probably a reason IReader was avoided for 7z. Also, OpenEntryStream above cannot be implemented as written, because EntryStream's constructor is internal. I looked at deriving from AbstractReader, which avoids that issue, but its constructor is also internal, so that was moot. I didn't need OpenEntryStream for my purposes, but in theory the code above would work apart from those access restrictions.
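For anyone hitting the same access restrictions, one possible workaround is to skip IReader for the 7z path and hide both paths behind a callback instead. This is only a sketch, not something the library provides: ArchiveWalker and ForEachEntry are hypothetical names, and it assumes a seekable stream, as the detection code above does.

```csharp
using System;
using System.IO;
using SharpCompress.Archives.SevenZip;
using SharpCompress.Common;
using SharpCompress.Readers;

internal static class ArchiveWalker
{
    // Hypothetical helper: calls visit(entry, openStream) for every file entry.
    // openStream must be invoked (and its result disposed) inside the callback,
    // because a forward-only reader cannot return to an earlier entry.
    public static void ForEachEntry(Stream stream, Action<IEntry, Func<Stream>> visit)
    {
        bool isSevenZip = SevenZipArchive.IsSevenZipFile(stream);
        stream.Seek(0, SeekOrigin.Begin); // rewind after the signature check

        if (isSevenZip)
        {
            // 7z path: enumerate Entries directly, no IReader involved.
            using var archive = SevenZipArchive.Open(stream);
            foreach (var entry in archive.Entries)
            {
                if (!entry.IsDirectory)
                {
                    visit(entry, () => entry.OpenEntryStream());
                }
            }
            return;
        }

        // Every other format: the regular forward-only reader.
        using var reader = ReaderFactory.Open(stream);
        while (reader.MoveToNextEntry())
        {
            if (!reader.Entry.IsDirectory)
            {
                visit(reader.Entry, () => reader.OpenEntryStream());
            }
        }
    }
}
```

A caller would pass a lambda such as `(entry, open) => { using var data = open(); /* read data */ }`. The trade-off is that code which expects an actual IReader still needs the custom implementation above.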