Fast listing of file names contained in Zip archive #240

Open
@MoreDelay

Is your feature request related to a problem? Please describe.
I ported a small bash utility to Rust that handles file conversion within zip archives. The bash version used 7z to list the files in each archive; if it found a file it could convert, it extracted the archive, converted the files, and compressed them again. After porting the script to Rust using this crate, I noticed that listing the file names takes a lot longer than it did with 7z. Most of my archives are already complete and only new archives need conversion, so this causes a relatively large performance drop when iterating through the old ones.

Describe the solution you'd like
I'm not very familiar with the zip file spec, but it seems there is a central directory at the end of the archive that contains all file names (and other metadata). Reading just that last portion of the file should be fast and not too difficult to implement. As far as I can see, file_names is the only function that gives access to the file names. I looked through the code a little, and I think it currently has to read through the whole archive and build a mapping from file names to their binary blobs, which is wasted computation most of the time in my case.
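
To illustrate the idea, here is a minimal std-only sketch of listing names from the end of the archive. It is not how this crate works internally; `list_zip_names` and the helper names are hypothetical. It assumes a single-disk archive without ZIP64 extensions and decodes names as lossy UTF-8 (real archives may use CP437).

```rust
use std::io::{self, Read, Seek, SeekFrom};

const EOCD_SIG: [u8; 4] = [0x50, 0x4b, 0x05, 0x06]; // end of central directory record
const CDH_SIG: [u8; 4] = [0x50, 0x4b, 0x01, 0x02]; // central directory file header
const EOCD_MIN: usize = 22; // EOCD size without the trailing archive comment
const CDH_FIXED: usize = 46; // fixed part of a central directory header

fn le16(b: &[u8], at: usize) -> usize {
    u16::from_le_bytes([b[at], b[at + 1]]) as usize
}

fn le32(b: &[u8], at: usize) -> u64 {
    u32::from_le_bytes([b[at], b[at + 1], b[at + 2], b[at + 3]]) as u64
}

fn invalid(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg)
}

/// Hypothetical helper: list entry names by reading only the tail of the archive.
/// Assumes no ZIP64 and no multi-disk archives.
pub fn list_zip_names<R: Read + Seek>(reader: &mut R) -> io::Result<Vec<String>> {
    // The EOCD sits in the last 22 + 65535 bytes (the comment is at most 65535 bytes).
    let len = reader.seek(SeekFrom::End(0))?;
    let tail_len = len.min((EOCD_MIN + u16::MAX as usize) as u64);
    reader.seek(SeekFrom::Start(len - tail_len))?;
    let mut tail = vec![0u8; tail_len as usize];
    reader.read_exact(&mut tail)?;
    if tail.len() < EOCD_MIN {
        return Err(invalid("file too short to be a zip archive"));
    }

    // Scan backwards for the signature; require the comment length to match
    // the remaining bytes so a signature inside the comment is not accepted.
    let eocd = (0..=tail.len() - EOCD_MIN)
        .rev()
        .find(|&i| {
            tail[i..].starts_with(&EOCD_SIG)
                && le16(&tail, i + 20) == tail.len() - i - EOCD_MIN
        })
        .ok_or_else(|| invalid("end of central directory record not found"))?;

    let entries = le16(&tail, eocd + 10); // total number of entries
    let cd_size = le32(&tail, eocd + 12); // central directory size in bytes
    let cd_offset = le32(&tail, eocd + 16); // central directory start offset
    if cd_offset + cd_size > len {
        return Err(invalid("central directory lies outside the file"));
    }

    // Read the whole central directory in one go.
    reader.seek(SeekFrom::Start(cd_offset))?;
    let mut cd = vec![0u8; cd_size as usize];
    reader.read_exact(&mut cd)?;

    let mut names = Vec::with_capacity(entries);
    let mut pos = 0;
    for _ in 0..entries {
        if pos + CDH_FIXED > cd.len() || !cd[pos..].starts_with(&CDH_SIG) {
            return Err(invalid("malformed central directory header"));
        }
        let name_len = le16(&cd, pos + 28);
        let extra_len = le16(&cd, pos + 30);
        let comment_len = le16(&cd, pos + 32);
        let name_end = pos + CDH_FIXED + name_len;
        let name = cd
            .get(pos + CDH_FIXED..name_end)
            .ok_or_else(|| invalid("truncated file name"))?;
        names.push(String::from_utf8_lossy(name).into_owned());
        // Skip the extra field and per-file comment to reach the next header.
        pos = name_end + extra_len + comment_len;
    }
    Ok(names)
}
```

With a `std::fs::File` as the reader, this performs two seeks and two reads no matter how large the compressed data is, which is roughly the behaviour I would hope for from a listing-only API.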

I tried iterating over the archive with 0..archive.len() and indexing each file to get its name, but this does not seem to make any difference in performance.

Describe alternatives you've considered
I haven't found another crate that provides just the listing functionality. As I said, 7z implements this, but it is written in C++.

Labels: enhancement (New feature or request)