
InferenceData.to_netcdf and Arviz.from_netcdf be able to take file objects or buffers #1237

Open
@k-sys

Description


Most libraries that read or write files also allow reading from and writing to a Python file object or file-like object (e.g. BytesIO). Currently, ArviZ requires writing these files to disk, which creates an awkward dance if one simply wants to upload to S3, for instance. This may not seem like a big deal, but the files can be large, you have to remember to delete them from disk when you're done with them, and so on.

Currently, to upload to S3, I have to do the following:

import tempfile

# `s3` is a boto3 S3 service resource, e.g. boto3.resource("s3")
with tempfile.NamedTemporaryFile() as fp:
    inference_data.to_netcdf(fp.name)
    fp.seek(0)
    s3.Bucket("bucket-name").upload_fileobj(
        fp, "my-s3-file-key"
    )

It would be much easier to do this (without touching disk):

with BytesIO() as buffer:
    inference_data.to_netcdf(buffer)
    buffer.seek(0)  # rewind so the upload reads from the start
    s3.Bucket("bucket-name").upload_fileobj(buffer, "my-s3-file-key")
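The buffer round trip requested here is the pattern many libraries already support. As a minimal illustration of that pattern (using NumPy purely as a stand-in, since ArviZ does not yet accept buffers), a library writes into a BytesIO, the caller rewinds it, and the same buffer can then be read back or handed to an uploader:

```python
import io

import numpy as np

# Write into an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
np.save(buffer, np.arange(5))

# Rewind before anything reads from the buffer (upload_fileobj,
# np.load, etc. all read from the current position).
buffer.seek(0)
restored = np.load(buffer)
```

The `seek(0)` step is the one that is easy to forget: writing leaves the position at the end of the stream, so a subsequent read would see zero bytes.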

Even better, I'd suggest using s3fs, which is what pandas does to support the s3:// protocol, so:

inference_data.to_netcdf("s3://bucket-name/my-s3-file-key")

However, this last suggestion is separate from the more important ability to write to or read from a buffer or file object.
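For the s3:// suggestion, the usual approach (and the one pandas takes) is to route URLs through fsspec, which s3fs plugs into: `fsspec.open` dispatches on the URL's protocol prefix to the matching filesystem. A minimal sketch, using fsspec's built-in `memory://` filesystem so it runs without S3 credentials (with s3fs installed, the same code would accept an `s3://` URL):

```python
import fsspec

# fsspec picks the filesystem implementation from the URL scheme;
# "memory://" here is a stand-in for "s3://bucket-name/my-s3-file-key".
with fsspec.open("memory://demo/data.nc", "wb") as f:
    f.write(b"netcdf-bytes-here")

with fsspec.open("memory://demo/data.nc", "rb") as f:
    data = f.read()
```

Notably, if `to_netcdf`/`from_netcdf` accepted file-like objects (the core request), this URL support would come almost for free, since `fsspec.open` yields exactly such an object.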
