fix: Parquet format output not working in CLI for show commands #25997

Open
wants to merge 3 commits into
base: main

Conversation

Karribalu

BREAKING CHANGE:
The short option -o, previously used for order-by in the table-list command, has been replaced and is now used for the output option.

Closes #25941


  1. Made the -o or --output option mandatory when --format is set to parquet
  2. Updated the following commands with the same fix
    • show databases
    • show system table
    • show system table-list
    • show system summary
  • I've read the contributing section of the project README.
  • Signed CLA (if not already signed).
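
The rule in item 1 can be sketched as a small validation step. This is a minimal illustration, not the code in this PR: the `Format` and `OutputTarget` enums and the `resolve_output` function are hypothetical names, the non-Parquet variants are assumed, and the rationale in the comments (Parquet is a binary format) is an inference rather than something stated in the description.

```rust
use std::path::PathBuf;

/// Output formats for this sketch. Only `Parquet` comes from the PR text;
/// the other variants are assumptions for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Pretty,
    Json,
    Csv,
    Parquet,
}

/// Where the rendered result should be written.
#[derive(Debug, PartialEq, Eq)]
pub enum OutputTarget {
    Stdout,
    File(PathBuf),
}

/// Hypothetical helper: an output path is mandatory for Parquet
/// (a binary format) and optional for every other format.
pub fn resolve_output(format: Format, output: Option<PathBuf>) -> Result<OutputTarget, String> {
    match (format, output) {
        // Parquet without a path is rejected up front.
        (Format::Parquet, None) => {
            Err("an output file path is required when the format is parquet".to_string())
        }
        // Any format may be written to a file when a path is given.
        (_, Some(path)) => Ok(OutputTarget::File(path)),
        // Text formats fall back to stdout.
        (_, None) => Ok(OutputTarget::Stdout),
    }
}
```

Keeping the check in one function would let all four `show` commands listed above share the same behavior.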

Contributor

@hiltontj hiltontj left a comment

Hey @Karribalu - I have a couple of suggestions inline. I think the breaking change is okay given the prior discussion about it, and because it makes this command consistent with other CLIs that output parquet.

Contributor

It would be good to add a success test case that writes the parquet to a temp file, then reads it, and validates its contents.

Some helpful APIs that would enable that:

  • We use the tempfile crate for temporary files in tests
  • There are APIs for reading parquet files into Arrow RecordBatches in the parquet crate
  • There are helpers for visually asserting on the contents of those record batches in DataFusion, e.g., assert_batches_sorted_eq
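
As a lighter first step before the full read-back test described above, a test could check that the written file is framed like a Parquet file at all. The sketch below uses only the standard library and relies on the fact that an unencrypted Parquet file begins and ends with the 4-byte magic `PAR1`. The function name `looks_like_parquet` is hypothetical, and this check does not replace decoding the file with the parquet crate and asserting on its record batches.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Unencrypted Parquet files start and end with these four bytes.
const PARQUET_MAGIC: &[u8; 4] = b"PAR1";

/// Hypothetical helper: returns Ok(true) when the file starts and ends
/// with the Parquet magic. This is a framing check only; it does not
/// decode the footer or validate any row data.
pub fn looks_like_parquet(path: &Path) -> io::Result<bool> {
    let mut file = File::open(path)?;
    let len = file.metadata()?.len();

    // Header magic (4) + footer length (4) + footer magic (4) is the minimum.
    if len < 12 {
        return Ok(false);
    }

    // Read the first four bytes.
    let mut head = [0u8; 4];
    file.read_exact(&mut head)?;

    // Read the last four bytes.
    let mut tail = [0u8; 4];
    file.seek(SeekFrom::End(-4))?;
    file.read_exact(&mut tail)?;

    Ok(&head == PARQUET_MAGIC && &tail == PARQUET_MAGIC)
}
```

In a real test, the path would come from a temporary file as suggested above, and the content assertions would follow this check.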

Author

Hello @hiltontj,
Apologies for the late reply.
I have added test cases for validating the parquet files using the resources you provided.
Please let me know if you have any comments.

@hiltontj
Contributor

@Karribalu - sorry that we have not followed up on this PR until now.

I will have to request some more changes in order to get this merged:

  • It looks like you may need to merge main to get recent cargo-audit failures addressed.
  • The BREAKING CHANGE needs to be removed from this PR. That means that the short option -o needs to be taken out, and this command can only use the long --output option. We cannot introduce a breaking change at this point.
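
The non-breaking shape requested in the second point can be illustrated with a hand-rolled parser, under stated assumptions: `-o` keeps its existing order-by meaning and the output path is reachable only through the long `--output` flag. The struct, the function, and the long flag name `--order-by` are hypothetical; the actual CLI's argument parsing is not shown in this conversation.

```rust
/// Options relevant to this sketch; the field names are hypothetical.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct TableListArgs {
    pub order_by: Option<String>,
    pub output: Option<String>,
}

/// Hypothetical parser for flag/value pairs such as
/// `-o size --output tables.parquet`.
pub fn parse_table_list_args(args: &[&str]) -> Result<TableListArgs, String> {
    let mut parsed = TableListArgs::default();
    let mut iter = args.iter();

    while let Some(&flag) = iter.next() {
        // Pick the field this flag fills. `-o` stays bound to order-by,
        // so existing invocations keep working; output has no short form.
        let slot = match flag {
            "-o" | "--order-by" => &mut parsed.order_by,
            "--output" => &mut parsed.output,
            other => return Err(format!("unknown option: {other}")),
        };

        // Every flag in this sketch takes exactly one value.
        let value = iter
            .next()
            .ok_or_else(|| format!("missing value for {flag}"))?;
        *slot = Some(value.to_string());
    }

    Ok(parsed)
}
```

With this layout, `-o size --output tables.parquet` sets both fields, and nothing that previously parsed changes meaning.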

Successfully merging this pull request may close these issues.

Bug with 'influxdb3 show databases' when outputting to parquet