Rust-based benchmarks & tests

# Requirements

* The benchmarks should be written entirely in Rust.
* The benchmarks should be portable and not rely on the presence of platform-defined dictionary files.
* The benchmarks should be runnable with specific parameters:

  1. Number of input lines
  2. Fraction of duplicates
  3. Distribution of input line length
  4. Character set (binary/text)

* The benchmarks should still be able to run against all the preexisting commands (`sort|uniq`).

# Design

A CLI application should be written that produces a set of random tokens according to the parameters specified on the CLI:

```sh
genbench --charset ascii/binary --delim CHAR --number NUM --duplicates PERCENTAGE --short LEN --long LEN
```
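As a rough sketch of how these flags could be read, the following parser uses only the standard library. The `GenParams` struct, the `parse_args` function, and all default values are hypothetical names and choices for illustration; the issue itself only fixes the flag names.

```rust
use std::str::FromStr;

/// Character set of the generated tokens (`--charset ascii/binary`).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Charset {
    Ascii,
    Binary,
}

/// Hypothetical in-memory form of the `genbench` flags.
#[derive(Debug, Clone, PartialEq)]
struct GenParams {
    charset: Charset,
    delim: u8,       // --delim CHAR (assumed to be a single byte)
    number: u64,     // --number NUM
    duplicates: f64, // --duplicates PERCENTAGE, 0.0..=100.0
    short: usize,    // --short LEN
    long: usize,     // --long LEN
}

/// Parses one flag value and reports which flag was malformed.
fn parse_num<T: FromStr>(flag: &str, value: &str) -> Result<T, String> {
    value
        .parse()
        .map_err(|_| format!("invalid value {value:?} for {flag}"))
}

/// Parses `--flag value` pairs (without the program name).
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<GenParams, String> {
    // The defaults below are assumptions, not part of the proposal.
    let mut p = GenParams {
        charset: Charset::Ascii,
        delim: b'\n',
        number: 1000,
        duplicates: 50.0,
        short: 8,
        long: 64,
    };
    let mut it = args.into_iter();
    while let Some(flag) = it.next() {
        let value = it
            .next()
            .ok_or_else(|| format!("missing value for {flag}"))?;
        match flag.as_str() {
            "--charset" => {
                p.charset = match value.as_str() {
                    "ascii" => Charset::Ascii,
                    "binary" => Charset::Binary,
                    _ => return Err(format!("unknown charset {value:?}")),
                }
            }
            "--delim" => {
                let bytes = value.as_bytes();
                if bytes.len() != 1 {
                    return Err(format!("--delim expects one byte, got {value:?}"));
                }
                p.delim = bytes[0];
            }
            "--number" => p.number = parse_num(&flag, &value)?,
            "--duplicates" => p.duplicates = parse_num(&flag, &value)?,
            "--short" => p.short = parse_num(&flag, &value)?,
            "--long" => p.long = parse_num(&flag, &value)?,
            _ => return Err(format!("unknown flag {flag}")),
        }
    }
    // Also rejects NaN, since NaN is not contained in any range.
    if !(0.0..=100.0).contains(&p.duplicates) {
        return Err("--duplicates must be between 0 and 100".to_string());
    }
    Ok(p)
}
```

A real binary would call `parse_args(std::env::args().skip(1))`; an argument-parsing crate would work equally well if an extra dependency is acceptable.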

The `--short` and `--long` parameters each specify the 90th percentile of string lengths, using a Gaussian distribution.
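One way to turn a 90th-percentile length into samples is sketched below. A Gaussian needs a mean as well as a percentile, and the issue does not say what the mean should be, so `mean` is an explicit (assumed) parameter here. The `SplitMix64` generator and `sample_len` are illustrative names; a small hand-written PRNG is used only so the sketch has no dependencies. The constant `Z_90` is the standard normal quantile for 0.9, so `sigma = (p90 - mean) / Z_90`.

```rust
use std::f64::consts::PI;

/// Standard normal quantile for the 90th percentile.
const Z_90: f64 = 1.2815515655446004;

/// Small deterministic PRNG (SplitMix64), used so the sketch needs no crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform float in (0, 1], so that `ln` is always finite.
    fn next_f64(&mut self) -> f64 {
        1.0 - (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }

    /// Standard normal sample via the Box-Muller transform.
    fn next_gaussian(&mut self) -> f64 {
        let u1 = self.next_f64();
        let u2 = self.next_f64();
        (-2.0 * u1.ln()).sqrt() * (2.0 * PI * u2).cos()
    }
}

/// Draws a line length from a Gaussian with the given mean whose
/// 90th percentile is `p90`. Lengths are rounded and clamped to at least 1.
fn sample_len(rng: &mut SplitMix64, mean: f64, p90: f64) -> usize {
    let sigma = (p90 - mean) / Z_90;
    let len = mean + sigma * rng.next_gaussian();
    len.round().max(1.0) as usize
}
```

With `mean = 60.0` and `p90 = 80.0`, roughly nine out of ten sampled lengths should come out at 80 or below. Seeding the generator explicitly keeps benchmark inputs reproducible between runs.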

For the actual benchmark we should write a benchmark executor that runs each of the implementations with a variety of parameters handed to `genbench`.
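A minimal sketch of the timing core of such an executor follows. It assumes a POSIX `sh` is available so that pipelines such as `sort | uniq` can be run unchanged; `run_timed` is a hypothetical helper name, and wall-clock time is only one of several things a real executor might measure.

```rust
use std::io::{self, Write};
use std::process::{Command, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Runs `pipeline` through `sh -c`, feeds it `input` on stdin, and returns
/// the captured stdout together with the elapsed wall-clock time.
fn run_timed(pipeline: &str, input: Vec<u8>) -> io::Result<(Vec<u8>, Duration)> {
    let start = Instant::now();
    let mut child = Command::new("sh")
        .arg("-c")
        .arg(pipeline)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;

    // Feed stdin from a separate thread; writing and reading on the same
    // thread could deadlock once the pipes fill up on large inputs.
    let mut stdin = child.stdin.take().expect("stdin was piped");
    let writer = thread::spawn(move || stdin.write_all(&input));

    // Reads stdout to the end and waits for the process to exit.
    let output = child.wait_with_output()?;
    let elapsed = start.elapsed();
    writer.join().expect("writer thread panicked")?;

    if !output.status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("`{pipeline}` exited with {}", output.status),
        ));
    }
    Ok((output.stdout, elapsed))
}
```

The executor would then loop over parameter sets, generate input once per set, and call `run_timed` for each implementation, for example `run_timed("sort | uniq", data.clone())`. Running each case several times and reporting the minimum or median tends to give steadier numbers than a single run.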

# Tests

We can reuse the same strategy for testing by generating test data with `genbench` and then comparing the output of the full huniq against that of a deliberately naive, unoptimized huniq implementation. We should specifically make sure that buffer growth is tested (supply some very long strings, >20 kB).
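The comparison could look roughly like the sketch below. `naive_uniq` is the slow reference; `hashed_uniq` is only a stand-in for the optimized implementation, since the real test would run the huniq binary or its library entry point. Both names are hypothetical, and the sketch assumes that the first occurrence of each line is kept in input order.

```rust
use std::collections::HashSet;

/// Deliberately naive reference: quadratic time, keeps the first
/// occurrence of every line in input order.
fn naive_uniq(lines: &[Vec<u8>]) -> Vec<Vec<u8>> {
    let mut out: Vec<Vec<u8>> = Vec::new();
    for line in lines {
        if !out.iter().any(|seen| seen == line) {
            out.push(line.clone());
        }
    }
    out
}

/// Stand-in for the optimized implementation under test (assumption:
/// the real test would invoke huniq here instead).
fn hashed_uniq(lines: &[Vec<u8>]) -> Vec<Vec<u8>> {
    let mut seen: HashSet<&[u8]> = HashSet::new();
    let mut out = Vec::new();
    for line in lines {
        // `insert` returns true only the first time a line is seen.
        if seen.insert(line.as_slice()) {
            out.push(line.clone());
        }
    }
    out
}
```

A test case would build an input that mixes short lines, duplicates, and at least one line longer than 20 kB (for example `vec![b'x'; 25_000]`), then assert that both functions return the same output.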
