Skip to content

feat: Add integration tests for tpchgen-cli#156

Merged
alamb merged 8 commits into
clflushopt:mainfrom
alamb:alamb/integration_test
Jul 21, 2025
Merged

feat: Add integration tests for tpchgen-cli#156
alamb merged 8 commits into
clflushopt:mainfrom
alamb:alamb/integration_test

Conversation

@alamb

@alamb alamb commented Jul 15, 2025

Copy link
Copy Markdown
Collaborator

Rationale

As we make changes to tpchgen-cli, we need some way to test that our changes are correct.
Let's add some tests

Changes

Add CLI based tests using assert_cmd

THis was partly written by GPT 4.0 / Github copilot

@alamb alamb changed the title Implement integration tests Integration tests for tpchgen-cli Jul 15, 2025
@alamb alamb marked this pull request as ready for review July 15, 2025 13:04
@alamb alamb changed the title Integration tests for tpchgen-cli feat: Add integration tests for tpchgen-cli Jul 15, 2025
@alamb alamb requested a review from clflushopt July 15, 2025 13:04
@alamb

alamb commented Jul 15, 2025

Copy link
Copy Markdown
Collaborator Author

FYI @@mrendi29 as well

Comment thread tpchgen-cli/tests/cli_integration.rs
@alamb

alamb commented Jul 15, 2025

Copy link
Copy Markdown
Collaborator Author

The new tests are run as part of CI. Here is an example:

@alamb alamb mentioned this pull request Jul 15, 2025
5 tasks
@mrendi29

Copy link
Copy Markdown

Thanks @alamb for taking a stab at this. @kevinjqliu i recall you had some test that you had not pushed yet, so question to you both, do we want to build off the integration tests suite within this branch/PR or should we split this into multiple PRs?

@clflushopt clflushopt left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alamb I am approving this PR to unblock and will let you resolve the comments from @mrendi29. As a follow up, do we want to have /data/ use two sources of truth duckdb + parquet and dbgen outputs ?

@alamb

alamb commented Jul 17, 2025

Copy link
Copy Markdown
Collaborator Author

Thanks @alamb for taking a stab at this. @kevinjqliu i recall you had some test that you had not pushed yet, so question to you both, do we want to build off the integration tests suite within this branch/PR or should we split this into multiple PRs?

I personally think we can build from this integration test and then add the "check the output parquet file format"

@alamb

alamb commented Jul 17, 2025

Copy link
Copy Markdown
Collaborator Author

Thanks @alamb I am approving this PR to unblock and will let you resolve the comments from @mrendi29. As a follow up, do we want to have /data/ use two sources of truth duckdb + parquet and dbgen outputs ?

Thank you @clflushopt 🙏

When you say "two sources of truth" I wonder what you mean? Do you mean some way to verify that the actual values written into the .parquet are are correct?

The tests in this PR do not do that (as you note)

@clflushopt

Copy link
Copy Markdown
Owner

Yes one for the tbl format via dbgen and one for the parquet format via duckdb we can generate samples at various scale factors and have them in the repo as fixtures.

@alamb

alamb commented Jul 21, 2025

Copy link
Copy Markdown
Collaborator Author

Yes one for the tbl format via dbgen and one for the parquet format via duckdb we can generate samples at various scale factors and have them in the repo as fixtures.

Filed #161 to track

@alamb alamb merged commit 914612f into clflushopt:main Jul 21, 2025
7 checks passed
@alamb

alamb commented Jul 21, 2025

Copy link
Copy Markdown
Collaborator Author

Thanks again @clflushopt

@alamb alamb deleted the alamb/integration_test branch July 21, 2025 11:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants