Benchmarks #8

Open
@Jefffrey

Description

To make performance improvements measurable, add a set of benchmarks that can be run against the codebase.

This requires benchmark programs (see https://github.com/apache/arrow-rs/tree/master/parquet/benches for an example).
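As a rough starting point, a benchmark program could be as small as the sketch below. It uses only the standard library; a real suite would more likely use a dedicated harness such as the criterion crate. `time_per_iter` and `sum_column` are hypothetical names made up for illustration, and `sum_column` is only a stand-in for whatever read or decode path ends up being measured.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Runs `f` for `warmup` untimed iterations, then `iters` timed ones,
/// and returns the mean wall-clock time per timed iteration.
/// (Hypothetical helper, not part of any existing crate.)
pub fn time_per_iter<T>(warmup: u32, iters: u32, mut f: impl FnMut() -> T) -> Duration {
    assert!(iters > 0, "need at least one timed iteration");
    for _ in 0..warmup {
        // black_box keeps the optimizer from discarding the result.
        black_box(f());
    }
    let start = Instant::now();
    for _ in 0..iters {
        black_box(f());
    }
    start.elapsed() / iters
}

/// Placeholder workload standing in for the code under test
/// (an assumption for this sketch, e.g. decoding one column).
pub fn sum_column(values: &[i64]) -> i64 {
    values.iter().fold(0i64, |acc, &v| acc.wrapping_add(v))
}
```

Calling `time_per_iter(10, 100, || sum_column(&data))` would give a mean per-iteration time for that workload. This measures wall-clock time only and does no statistical analysis, which is one reason to prefer a proper harness for anything beyond a quick check.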

Large data files are also needed, ideally covering all supported data types.

Note that completely random data may not be sufficient for the data files: some encodings take advantage of patterns in the data (e.g. int v2 RLE), so keep this in mind if generating data for the benchmarks.
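If the data ends up being generated, something along these lines could produce integer columns with different shapes rather than only uniform noise. This is a sketch, not a proposal for the final generator: all function names are made up for illustration, the PRNG is a small xorshift64* so the example needs no external crates, and which encoder path each shape actually exercises is an assumption that would need checking against the writer used to produce the files.

```rust
/// Tiny deterministic PRNG (xorshift64*), so runs are reproducible
/// and the sketch has no dependencies.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // The xorshift state must never be zero.
        Self(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.0 = x;
        x.wrapping_mul(0x2545_F491_4F6C_DD1D)
    }
}

/// Runs of a repeated value, each `run_len` long (the last may be shorter).
pub fn repeated_runs(len: usize, run_len: usize, seed: u64) -> Vec<i64> {
    assert!(run_len > 0, "run_len must be positive");
    let mut rng = XorShift64::new(seed);
    let mut out = Vec::with_capacity(len);
    while out.len() < len {
        let value = (rng.next_u64() % 1000) as i64;
        let n = run_len.min(len - out.len());
        out.extend(std::iter::repeat(value).take(n));
    }
    out
}

/// Sequence with a constant step, e.g. row ids or regular timestamps.
pub fn fixed_delta(len: usize, start: i64, step: i64) -> Vec<i64> {
    (0..len as i64).map(|i| start + i * step).collect()
}

/// Random values confined to `0..max`, so they fit in few bits.
pub fn small_range(len: usize, max: u64, seed: u64) -> Vec<i64> {
    assert!(max > 0, "max must be positive");
    let mut rng = XorShift64::new(seed);
    (0..len).map(|_| (rng.next_u64() % max) as i64).collect()
}

/// Uniform random 64-bit values: the pattern-free baseline.
pub fn uniform_random(len: usize, seed: u64) -> Vec<i64> {
    let mut rng = XorShift64::new(seed);
    (0..len).map(|_| rng.next_u64() as i64).collect()
}
```

Benchmarking each shape separately, alongside the uniform baseline, should make it easier to see whether a change helps one encoding path while regressing another.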

Could also use something like TPC-H or TPC-DS data, or the NYC taxi dataset, for more variety in the data.
