Move Row to typehints/row.py, add compatibility alias in pvalue.py, update imports, and update CHANGES.md #37071
base: master
…pdate imports, and update CHANGES.md (Fixes apache#35095)
Summary of Changes
Hello @nikitagrover19, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request refactors the Python SDK by relocating the
Highlights
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment
Assigning reviewers:
R: @claudevdm for label python.
Note: If you would like to opt out of this review, comment
Available commands:
The PR bot will only process comments in the main thread (not review comments).
/gemini review
Code Review
This pull request refactors the Row class by moving it from pvalue.py to a new, more appropriate location in typehints/row.py. This is a good code organization improvement that enhances modularity. The changes maintain backward compatibility by providing an alias in the old location.
I've found a couple of issues in the moved Row class implementation that were likely pre-existing but are good to fix as part of this refactoring. Specifically, the __hash__ method is implemented in a way that will cause a TypeError at runtime, and the __eq__ method has an incorrect check for NamedTuple instances. I've also left a minor formatting suggestion for CHANGES.md.
Overall, the refactoring is well-executed. Addressing the identified issues will improve the robustness of the Row class.
    return 'Row(%s)' % ', '.join('%s=%r' % kv for kv in self.__dict__.items())

  def __hash__(self):
    return hash(self.__dict__.items())
The __hash__ method will raise a TypeError because dict.items() returns a view object, which is not hashable in Python 3. To fix this, you should convert the items to a hashable, ordered type like a tuple. Given that field order is important for Row objects, converting to a tuple is the correct approach.
-    return hash(self.__dict__.items())
+    return hash(tuple(self.__dict__.items()))
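To illustrate the point about hashing, here is a minimal sketch. `MiniRow` and `view_is_hashable` are hypothetical names introduced only for this example; this is not Beam's actual `Row` implementation, just a stand-in that stores fields in `__dict__` the same way.

```python
class MiniRow:
  """Hypothetical stand-in for Row; not Beam's implementation."""
  def __init__(self, **kwargs):
    self.__dict__.update(kwargs)

  def __eq__(self, other):
    # Order-sensitive comparison, so equal objects always hash equally.
    return (
        type(self) == type(other) and
        list(self.__dict__.items()) == list(other.__dict__.items()))

  def __hash__(self):
    # tuple() turns the unhashable dict_items view into a hashable,
    # ordered snapshot of (name, value) pairs.
    return hash(tuple(self.__dict__.items()))


def view_is_hashable(d):
  """Returns True if hash(d.items()) succeeds, False on TypeError."""
  try:
    hash(d.items())
    return True
  except TypeError:
    return False


row = MiniRow(a=1, b='x')
same = MiniRow(a=1, b='x')
```

One caveat with this approach: the tuple is only hashable if every field value is hashable, so a field holding a list or dict would still raise a TypeError.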
    elif type(other) == type(NamedTuple):
      other_dict = other._asdict()
The check type(other) == type(NamedTuple) is incorrect for identifying NamedTuple instances. typing.NamedTuple is a class factory, and instances of named tuples do not have NamedTuple as their type. A more robust way to check for namedtuple-like objects is to see if it's a tuple and has an _asdict method.
-    elif type(other) == type(NamedTuple):
-      other_dict = other._asdict()
+    elif isinstance(other, tuple) and hasattr(other, '_asdict'):
+      other_dict = other._asdict()
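A short sketch of why the original check never matches and how the suggested check behaves. `Point`, `Legacy`, and `is_namedtuple_instance` are hypothetical names used only for this illustration.

```python
import collections
from typing import NamedTuple


class Point(NamedTuple):
  x: int
  y: int


# A named tuple built with the older collections API.
Legacy = collections.namedtuple('Legacy', ['x', 'y'])


def is_namedtuple_instance(obj):
  # Duck-typed check: named tuples are tuples that expose _asdict().
  return isinstance(obj, tuple) and hasattr(obj, '_asdict')


p = Point(1, 2)

# The type of an instance is its own class, never type(NamedTuple),
# so the original comparison is False for real named tuple instances.
original_check = type(p) == type(NamedTuple)

# The suggested check accepts both flavors and rejects plain tuples.
checks = (
    is_namedtuple_instance(p),
    is_namedtuple_instance(Legacy(1, 2)),
    is_namedtuple_instance((1, 2)),
)
```

The duck-typed check would also accept any other tuple subclass that happens to define `_asdict`, which seems acceptable here since `_asdict()` is the only method the branch relies on.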
Move Row to typehints/row.py, add compatibility alias in pvalue.py, update imports, and update CHANGES.md
This change reorganizes the Python SDK by moving the Row class into a dedicated module under apache_beam.typehints.
The goal is to improve module structure, reduce coupling, and better align Beam’s schema utilities with the type-hinting subsystem.
Description
This PR moves the Row implementation from apache_beam.pvalue into a new file, apache_beam.typehints.row.
This is a more appropriate location for a schema-aware, type-hinted data container and helps simplify the structure of pvalue.py.
To maintain full backward compatibility, pvalue.py now exposes a compatibility alias so that existing user code importing apache_beam.Row continues to work without modification.
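The alias pattern can be sketched in a self-contained way as follows. The module names `demo_typehints_row` and `demo_pvalue` are hypothetical stand-ins created in memory for this example; they are not the actual Beam modules, and the real alias in pvalue.py may be written differently.

```python
import sys
import types

# Hypothetical in-memory modules standing in for the new and old locations.
new_home = types.ModuleType('demo_typehints_row')
old_home = types.ModuleType('demo_pvalue')


class Row:
  """Placeholder class; stands in for the relocated implementation."""
  def __init__(self, **kwargs):
    self.__dict__.update(kwargs)


# The class now lives in the new module...
Row.__module__ = new_home.__name__
new_home.Row = Row
# ...and the old module re-exports the very same object as an alias.
old_home.Row = new_home.Row

sys.modules[new_home.__name__] = new_home
sys.modules[old_home.__name__] = old_home

# Existing code importing from the old location keeps working.
from demo_pvalue import Row as OldRow
from demo_typehints_row import Row as NewRow
```

Because the alias binds the same class object rather than a copy, `isinstance` checks and equality behave identically regardless of which import path a caller uses.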
Additional Updates
This improves code organization and moves schema utilities closer to the rest of Beam’s type-hinting system, without breaking any public APIs.
Fixes #35095