Conversation

ArnavBalyan
Member

@ArnavBalyan
Member Author

cc @wgtmac @shangxinli, could you please review? Thanks!

Member

@wgtmac wgtmac left a comment

Sorry for approving it by accident. I need to leave a comment to retract it.

private static void writeSalesData(String filename, MessageType schema) throws IOException {
  Path file = new Path(filename);

  try (ParquetWriter<Group> writer = ExampleParquetWriter.builder(file)
Member

@wgtmac wgtmac Aug 23, 2025

I am hesitant to use ExampleParquetWriter in examples, since it is not intended for production use. Adding an example module also adds maintenance burden, so I don't think this is a good idea, TBH.

Member Author

I see. We can remove the sub-module and keep this as a reference-only example. That should also resolve the documentation concerns raised in the issue, wdyt?

Member Author

Removed the dependency on ExampleParquetWriter and removed the pom to eliminate the maintenance overhead.

@ArnavBalyan
Member Author

Gentle reminder, cc @wgtmac @ggershinsky. Thanks!

@wgtmac
Member

wgtmac commented Aug 28, 2025

TBH, I don't think adding some random examples would really help users, because they are pretty similar to what's already in the unit tests. What I have in mind is something like https://arrow.apache.org/cookbook/ which takes a lot of effort to craft examples and keep them in sync. Today LLMs are smart enough to produce code like this (I believe this PR is doing exactly that, right?).

@ArnavBalyan
Member Author

> TBH, I don't think adding some random examples would really help users, because they are pretty similar to what's already in the unit tests. What I have in mind is something like https://arrow.apache.org/cookbook/ which takes a lot of effort to craft examples and keep them in sync. Today LLMs are smart enough to produce code like this (I believe this PR is doing exactly that, right?).

Thanks, a cookbook is a great idea. I would like to implement it for Parquet Java; let me add it in a separate change. I wrote the examples in this PR to help beginners understand basic usage, since I ran into issues myself a while back when onboarding to Parquet.
I think the change is harmless and can only help users with more guidance when onboarding to the project, wdyt? (The examples are structured in 3 stages, from basic to advanced usage, to address the issues raised in #2914.)
Thanks so much for the review!

@ArnavBalyan
Member Author

ArnavBalyan commented Aug 28, 2025

cc @wgtmac @Fokko @gszadovszky @shangxinli, I just wanted to get a sense of the community's thoughts on a cookbook as a follow-up to this PR. I think better documentation for Parquet will help users adopt the project faster and would be a good addition to the ecosystem. If you are open to this, I'd like to add it and maintain it in the future. Thanks for the suggestion, @wgtmac.

@ArnavBalyan
Member Author

I have created an issue to track this: #3284. It would be great if folks could review it and add suggestions or feedback. Thanks!

@ArnavBalyan ArnavBalyan requested a review from wgtmac August 28, 2025 13:26