@amoeba amoeba commented May 31, 2025

Hello! In Arrow 20, some notable improvements were made to our table join implementation in terms of speed and memory requirements so I wanted to update the benchmarks using the latest version.

I wasn't able to use the included scripts as-is to set up my EC2 instance, but I think I got close enough by referencing them. I also wasn't able to preview the results due to errors, but I did poke around the CSVs and I think we no longer OOM on some of the questions where we currently do. I'd love to see what the results look like now.

Let me know if I can add anything else here.

Collaborator

Tmonster commented Jun 4, 2025

Hi, thanks for the PR!
I'm looking into fixing some other things on the benchmark and will take a look at this today or Thursday.

Author

amoeba commented Jun 4, 2025

Thanks @Tmonster, much appreciated.

@Tmonster Tmonster merged commit d7b386e into duckdblabs:main Jun 10, 2025
@Tmonster
Collaborator

Thanks! Will publish later today as well.

Author

amoeba commented Jun 10, 2025

Awesome, thanks @Tmonster.
