Allow (de)serialization of gzip'ed files #5664

**Is your feature request related to a problem? Please describe.**
For my thesis, I want to collect some sets of data (think of huge lists of matrices, where each entry is a `FreeAssociativeAlgebraElem`). In many of the observed cases, the resulting mrdi file is several GB in size. However, running gzip (with default settings) reduces the size dramatically. In one particular example, the file size goes down from 3.2G to 65M, which is a factor of ~50. Running gzip in this case takes about 20 seconds, which is negligible compared to the time required to produce the data and move it around.
Furthermore, I regularly fill my disk quota on our compute servers with such uncompressed files.

**Describe the solution you'd like**
Some way to let Oscar.Serialization produce and read gzipped files, without having to manually handle uncompressed files.

**Describe alternatives you've considered**

1. Leave it to the user to attach a `CodecZlib.GzipCompressorStream` to the opened file, and call `save` with the resulting io object.
2. In addition to `save` and `load` have functions `save_compressed` and `load_compressed` that behave basically identically, but include the `CodecZlib.GzipCompressorStream` in-between layer when opening files.
3. Add a `GzipSerializer` that gets created as e.g. `GzipSerializer(JSONSerializer())` and, when called in `(de)serializer_open`, wraps the `io` object in a `CodecZlib.GzipCompressorStream`.
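
For illustration, option 1 could look roughly like the following sketch. It assumes CodecZlib is installed. `write_gzipped` and `read_gzipped` are hypothetical names, not Oscar functions, and they take the writer and reader as arguments so that the sketch runs without Oscar. With Oscar loaded, one would presumably pass `save` and `load`, assuming both accept an arbitrary `IO` object.

```julia
using CodecZlib  # provides GzipCompressorStream and GzipDecompressorStream

# Hypothetical helper (not part of Oscar): call `writer(io, obj)` on a stream
# that gzip-compresses everything written to it before it reaches `path`.
function write_gzipped(writer, path::AbstractString, obj)
    open(GzipCompressorStream, path, "w") do io
        writer(io, obj)
    end
    return path
end

# Hypothetical counterpart: call `reader(io)` on the decompressed contents.
function read_gzipped(reader, path::AbstractString)
    return open(GzipDecompressorStream, path) do io
        reader(io)
    end
end

# Stand-in round trip with plain text instead of an Oscar object.
path = joinpath(mktempdir(), "example.mrdi.gz")
write_gzipped(write, path, "hello")
roundtrip = read_gzipped(io -> read(io, String), path)

# Assumed usage with Oscar (not run here):
#   write_gzipped(save, "data.mrdi.gz", obj)
#   obj = read_gzipped(load, "data.mrdi.gz")
```

Options 2 and 3 would hide the same wrapping behind `save_compressed`/`load_compressed` or a `GzipSerializer`, so the main difference is where the stream gets created, not how.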

Orthogonal to the above options, one could leave it to the deserializer to detect whether a given file is compressed (either by the file name extension or by the magic bytes `1f 8b`) and, in this case, automatically decompress it.
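
A minimal sketch of such a detection, using only Julia's standard library. `is_gzipped` and `has_gz_extension` are hypothetical names chosen for illustration, not existing Oscar functions.

```julia
# gzip files start with the two magic bytes 0x1f 0x8b.
const GZIP_MAGIC = UInt8[0x1f, 0x8b]

# Hypothetical helper: peek at the first two bytes of the file at `path`.
# Files shorter than two bytes yield a shorter vector and compare unequal.
function is_gzipped(path::AbstractString)
    open(path, "r") do io
        return read(io, 2) == GZIP_MAGIC
    end
end

# Hypothetical alternative: decide by file name only.
has_gz_extension(path::AbstractString) = endswith(path, ".gz")

# Small demonstration with two temporary files.
dir = mktempdir()
fake_gz = joinpath(dir, "data.mrdi.gz")
plain = joinpath(dir, "data.mrdi")
write(fake_gz, UInt8[0x1f, 0x8b, 0x08])  # only a gzip-like header, not a valid archive
write(plain, "{}")                       # placeholder text content
```

Checking the magic bytes is more robust against renamed files, while the extension check avoids opening the file at all; a deserializer could also combine both.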


I am happy to implement this myself, but I wanted to collect some opinions on the different options before starting further work.

Pinging people who might have an opinion (@antonydellavecchia @benlorenz @fingolfin), but everybody else, please comment as well.

