datafusion-cli fails to run ClickBench queries with 8GB of RAM #18473

@alamb

Describe the bug

@pmcgleenon reports on #17721 (comment):

The instance type c6a.xlarge only has 8GB RAM (the clickbench dataset is 15GB) and there are a number of issues with it, including

...
OOM errors happen during test execution, with results reported as null for several tests

To Reproduce

  1. Use docker to build datafusion:
cd .devcontainer
docker build -t datafusion-build .
cd ..
docker run -m 4G -v `pwd`:/datafusion  -it datafusion-build   /bin/bash

Now, inside the docker container:

# build datafusion-cli
cd /datafusion
cargo install   --profile=release-nonlto --path datafusion-cli
# Get benchmark data
cd /datafusion/benchmarks
./bench.sh data clickbench_partitioned
cd /datafusion/benchmarks/data
# make symlink to hits so queries can run without modification
ln -s hits_partitioned hits

# run the queries
for q in `ls ../queries/clickbench/queries/*.sql` ; do datafusion-cli -f $q ; done

This is the same loop that runs the queries, expanded to print each query's name before it runs:

for q in `ls ../queries/clickbench/queries/*.sql` ; do
  echo "Running $q..." ;
  datafusion-cli -f $q ;
done

You'll see queries get killed due to OOM like this:

Running ../queries/clickbench/queries/q18.sql...
DataFusion CLI v50.3.0
bash: line 1: 57537 Killed                  datafusion-cli -f $q

The queries that are killed are:

Running ../queries/clickbench/queries/q18.sql...
Running ../queries/clickbench/queries/q20.sql...
Running ../queries/clickbench/queries/q22.sql...
Running ../queries/clickbench/queries/q23.sql...
Running ../queries/clickbench/queries/q32.sql...
Running ../queries/clickbench/queries/q33.sql...
Running ../queries/clickbench/queries/q34.sql...
Running ../queries/clickbench/queries/q35.sql...
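
To collect a list like the one above without scanning the output by hand, the loop can check each exit status. This is only a rough sketch: it assumes the OOM killer terminates the process with SIGKILL, which the shell reports as exit status 137 (128 + 9). The helper names `classify_status` and `run_queries` are made up for illustration and are not part of datafusion or its benchmark scripts.

```shell
#!/usr/bin/env bash

# Classify a command's exit status.
# 137 = 128 + 9, i.e. the process was terminated by SIGKILL, which is
# the signal the kernel OOM killer sends (assumption: nothing else in
# the container is sending SIGKILL).
classify_status() {
  local status=$1
  if [ "$status" -eq 0 ]; then
    echo "ok"
  elif [ "$status" -eq 137 ]; then
    echo "killed"
  else
    echo "failed"
  fi
}

# Run every .sql file in a directory with the given command and print
# one "<file> <result>" line per query.
# Usage: run_queries <query_dir> <command> [args...]
run_queries() {
  local dir=$1
  shift
  local q status
  for q in "$dir"/*.sql; do
    # Skip the unexpanded pattern if the directory has no .sql files
    [ -e "$q" ] || continue
    "$@" "$q" >/dev/null 2>&1
    status=$?
    echo "$q $(classify_status "$status")"
  done
}
```

From `/datafusion/benchmarks/data`, something like `run_queries ../queries/clickbench/queries datafusion-cli -f | grep killed` should then print only the queries that were killed. Query output is discarded here, so this is only useful for finding which queries die, not for timing them.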

You can find these queries here: https://github.com/apache/datafusion/tree/main/benchmarks/queries/clickbench/queries

Expected behavior

The queries should not be killed

Additional context

This ticket tracks build OOM'ing:

Labels: bug (Something isn't working)
