xe-nvdk commented Oct 6, 2025

Hey everyone,

We’re the new folks in the neighborhood, sharing ClickBench results for Arc, our time-series warehouse that’s launching soon.
I’ve made sure everything follows the benchmark requirements, but happy to adjust if needed.

Appreciate your work on this project!
– Ignacio

CLAassistant commented Oct 6, 2025

CLA assistant check
All committers have signed the CLA.

rschu1ze self-assigned this Oct 6, 2025

Author

xe-nvdk left a comment

We are going to push a new update of this PR in a few minutes. Thank you for marking the issues.

@xe-nvdk

This comment was marked as resolved.

Author

xe-nvdk commented Oct 7, 2025

Just updated the files and made the repo public. Thanks.

arc/benchmark.sh Outdated

# Install Python and dependencies
echo "Installing dependencies..."
pip3 install fastapi uvicorn duckdb pyarrow requests gunicorn
Member

This requires running pip with --break-system-packages.

Would it be possible to create a Python venv? See e.g. chdb/benchmark.sh for an example.
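
A minimal sketch of the venv approach suggested here. It is not the actual contents of chdb/benchmark.sh; the helper name `setup_venv` and its directory argument are illustrative assumptions. Only `python3 -m venv`, the generated `activate` script, and `pip install` are standard tooling.

```shell
#!/bin/bash
# Hypothetical helper: create (if needed) and activate a virtual environment,
# then install any packages given as extra arguments into it.
setup_venv() {
    local venv_dir="$1"
    shift

    # Create the environment only when it does not exist yet.
    if [ ! -d "$venv_dir" ]; then
        python3 -m venv "$venv_dir" || return 1
    fi

    # Activate it so that python and pip resolve to the venv copies.
    # shellcheck disable=SC1091
    source "$venv_dir/bin/activate" || return 1

    # Install packages inside the venv; no --break-system-packages needed.
    if [ "$#" -gt 0 ]; then
        pip install "$@" || return 1
    fi
}
```

In benchmark.sh this might be called as `setup_venv ./venv fastapi uvicorn duckdb pyarrow requests gunicorn`. On a stock Ubuntu image the `python3-venv` package usually has to be installed first, for example with `sudo apt-get install -y python3-venv`.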

Author

Yep, we have that in start.sh in our repo; I'm adding it to this script.

arc/benchmark.sh Outdated

# Create API token for benchmark
python3 << EOF
from api.auth import AuthManager, Permission
Member

I got the following error here:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'Permission' from 'api.auth' (/data/ClickBench/arc/arc/api/auth.py)

I checked: there is indeed no Permission class in auth.py.

Author

xe-nvdk commented Oct 7, 2025

Uff, thank you for this. It's old code; in our repo we have this right. Let me update it here too.

## Prerequisites

- Ubuntu/Debian Linux (or compatible)
- Python 3.11+
Member

There should be no prerequisites - the benchmark runs automatically on an empty AWS machine with Ubuntu AMI.

Author

Thanks for the feedback. We’ll revisit the submission later this year. For now, we’re happy to have the benchmark numbers internally and will use them for our own reference. Once we release official binaries, we’ll try again to get included in ClickBench.

Member

It's not a problem, let's push this PR to ClickBench. The more systems included, the better.

Author

Hi @alexey-milovidov, we just updated the PR and were able to run benchmark.sh according to the ClickBench guidelines. Let me know if you have issues running it, but you shouldn't have any. Thank you.

@alexey-milovidov
Member

No success so far:

Running ClickBench queries via Arc HTTP API...
================================================
Checking if Arc is running at http://localhost:8000...
Arc is running. Using parquet file: /ClickBench/arc/hits.parquet
Running 43 queries via Arc HTTP API...
Query 1 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 2 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 3 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 4 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 5 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 6 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 7 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 8 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 9 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 10 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 11 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 12 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 13 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 14 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 15 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 16 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 17 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 18 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 19 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 20 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 21 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 22 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 23 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 24 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 25 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 26 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 27 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 28 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 29 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 30 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 31 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 32 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 33 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 34 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 35 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 36 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 37 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 38 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 39 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 40 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 41 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 42 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Query 43 failed: 401 - {"error":"Unauthorized","detail":"Invalid or missing API token"}
Benchmark complete!

@alexey-milovidov
Member

However, it did something before:

Creating API token...
Created API token: EdodzXfV99KRO-0XoWONWxxUs0AGH2HqjcNRq4c4rfg
Token created successfully
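
One possible reading of the two logs is that the token printed here never reached the query runner, though the thread does not state the root cause. Under that assumption, the sketch below captures the token from a log line of the form shown above so that a later step can reuse it. The function name `extract_token` is hypothetical, and how Arc expects the token to be sent with each request is not shown in this thread.

```shell
#!/bin/bash
# Hypothetical helper: read setup output on stdin and print the token from
# the first line that looks like "Created API token: <token>".
extract_token() {
    sed -n 's/^Created API token: //p' | head -n 1
}
```

A script could then run something like `ARC_TOKEN="$(some_setup_step | extract_token)"`, where `some_setup_step` and `ARC_TOKEN` are placeholder names, and fail early if the variable ends up empty.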

Author

xe-nvdk commented Oct 13, 2025

OK guys, benchmark.sh is fixed, and we added uncached and cached results for c6a.4xlarge with a 500 GB gp2 volume on AWS. Please validate this and let me know.

@alexey-milovidov
Member

Thanks! Now it runs successfully.

A few corrections are still needed, e.g.,

It should output the results in the following format:
- one or more lines Load time: 1234 with the time in seconds;
- a line Data size: 1234567890 with the data size in bytes; the data size should include indexes and transaction logs if applicable;
- 43 consecutive lines in the form of [1.234, 5.678, 9.012], for the runtimes of every query;
- the output may include other lines with the logs, that are not used for the report.
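
The format above could be emitted and sanity-checked with a few small shell helpers. This is only a sketch: the function names are made up for illustration, and the check assumes each query line holds exactly three comma-separated values, as in the example given.

```shell
#!/bin/bash
# Hypothetical helpers that print lines in the expected report format.
print_load_time()   { echo "Load time: $1"; }        # time in seconds
print_data_size()   { echo "Data size: $1"; }        # size in bytes
print_query_times() { echo "[$1, $2, $3]"; }         # three runtimes of one query

# Count lines in a log file that consist solely of a bracketed list of
# three comma-separated values, e.g. [1.234, 5.678, 9.012].
count_result_lines() {
    grep -cE '^\[[^],]+, [^],]+, [^],]+\]$' "$1"
}
```

After a run, `count_result_lines log.txt` should print 43 if every query produced its line; other log lines are ignored by the pattern.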

Author

xe-nvdk commented Oct 13, 2025

OK, thanks, this should be good now.
