Benchmark / program to test Spilling Sorts

### Is your feature request related to a problem or challenge?

- Part of https://github.com/apache/datafusion/issues/15271

There are many interesting ideas on how to improve DataFusion while spilling, for example https://github.com/apache/datafusion/issues/15271 from @2010YOUY01 and others.

What I think we really need next to make progress in this area is a benchmark / agreed upon way of measuring our progress so that we can improve and

### Describe the solution you'd like

I would like a documented command / set of commands that:
1. Is easy to run (and thus fast to test / iterate on)
2. Exercises the spilling feature at different levels of memory pressure
3. Spends most of its time sorting/spilling/merging (not generating output, for example)

### Describe alternatives you've considered

Idea 1: Use some `datafusion-cli` features / flags and document them

Idea 2: Add a new suite to bench.sh / `dfbench`: https://github.com/apache/datafusion/tree/main/benchmarks


As for what to do, I suggest something relatively simple, for example sorting the TPCH `lineitem` table with 200MB, 500MB, 1GB, 5GB and 10GB of memory.
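To make Idea 1 concrete, here is a rough sketch of a driver script, not a finished benchmark. It assumes `datafusion-cli` is on `PATH`, that its `--memory-limit` flag accepts values such as `200m` and `1g`, and that a TPCH `lineitem` Parquet file exists at the hypothetical path `lineitem.parquet`. The helper names (`sort_query`, `build_command`, `run_all`) and the choice of `l_orderkey` as the sort key are illustrative only. The query is wrapped in `EXPLAIN ANALYZE` so that the sort runs but the rows are not printed.

```python
import shutil
import subprocess
import time

# Memory limits suggested in this issue, written in the '<number><unit>'
# form assumed to be accepted by datafusion-cli's --memory-limit flag.
MEMORY_LIMITS = ["200m", "500m", "1g", "5g", "10g"]

# Hypothetical path: point this at a locally generated TPCH lineitem file.
LINEITEM_PATH = "lineitem.parquet"


def sort_query(path, sort_key="l_orderkey"):
    """Build a full-table sort over the Parquet file at `path`.

    EXPLAIN ANALYZE executes the plan and reports metrics instead of rows,
    so time is not spent formatting the sorted output.
    """
    return f"EXPLAIN ANALYZE SELECT * FROM '{path}' ORDER BY {sort_key}"


def build_command(memory_limit, path=LINEITEM_PATH):
    """Return the argv list for one datafusion-cli run at `memory_limit`."""
    return [
        "datafusion-cli",
        "--memory-limit", memory_limit,
        "-c", sort_query(path),
    ]


def run_all(limits=MEMORY_LIMITS, path=LINEITEM_PATH):
    """Run the sort once per memory limit and print wall-clock time."""
    for limit in limits:
        start = time.perf_counter()
        result = subprocess.run(
            build_command(limit, path), capture_output=True, text=True
        )
        elapsed = time.perf_counter() - start
        print(f"{limit:>5}: exit code {result.returncode}, {elapsed:.2f}s")


if __name__ == "__main__":
    if shutil.which("datafusion-cli") is None:
        print("datafusion-cli not found on PATH")
    else:
        run_all()
```

A run at a limit that is too small for the sort may simply fail with a resource error rather than spill, so the printed exit code and the captured output are worth checking alongside the timing.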

### Additional context

_No response_
