Fn Runner Watermark issue #34484

Open · 13MaxG wants to merge 3 commits into master
Conversation

@13MaxG commented Mar 31, 2025

Proposing a solution to #26190.
It appears that the Flatten was not setting a watermark, which caused the following steps not to execute.

The issue produced errors on Beam versions 2.39 onwards, and it potentially produced unstable results before 2.39.

There are relevant TODOs already noted in the code:
# TODO(robertwb): Possibly fuse multi-input flattens into one of the stages.
# TODO(pabloem): Consider case where there are various producers
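For context, a minimal sketch of the pipeline shape affected here: a multi-input Flatten feeding downstream aggregation on the Python FnApiRunner. This is an illustrative shape only (names and values are made up, it is not the exact reproduction from #26190, and it assumes FnApiRunner is importable from apache_beam.runners.portability.fn_api_runner).

```python
# Hedged sketch, not the exact reproduction from #26190: a multi-input
# Flatten followed by a GroupByKey on the Python FnApiRunner. If the
# Flatten stage never advances its watermark, the steps after it can stall.
import apache_beam as beam
from apache_beam.runners.portability.fn_api_runner import FnApiRunner

with beam.Pipeline(runner=FnApiRunner()) as p:
    first = p | 'CreateA' >> beam.Create([('k', 1), ('k', 2)])
    second = p | 'CreateB' >> beam.Create([('k', 3)])

    (
        (first, second)
        | 'Flatten' >> beam.Flatten()
        | 'Group' >> beam.GroupByKey()
        | 'Print' >> beam.Map(print)
    )
```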

github-actions bot (Contributor) commented:

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment assign set of reviewers

github-actions bot added the build label on Apr 1, 2025
github-actions bot (Contributor) commented Apr 1, 2025:

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @claudevdm for label python.
R: @damccorm for label build.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@@ -388,7 +388,8 @@ def create_stages(
 translations.lift_combiners,
 translations.expand_sdf,
 translations.expand_gbk,
 translations.sink_flattens,
+translations.fix_flatten_coders,
Collaborator:

Why was this added?

Author:

fix_flatten_coders was added to fix the YAML unit tests.
It also mimics translations.standard_optimize_phases(), which is used by the Portable Runner.

Without it, YamlMappingTest::test_basic yields

apache_beam.testing.util.BeamAssertException: Failed assert: [
Row(label='11a', isogeny='a'), Row(label='37a', isogeny='a'), Row(label='389a', isogeny='a')] == [BeamSchema_ccf257cb_1966_410e_8157_00cd826e7392(label='11a', isogeny='a'), BeamSchema_ccf257cb_1966_410e_8157_00cd826e7392(label='37a', isogeny='a'), BeamSchema_ccf257cb_1966_410e_8157_00cd826e7392(label='389a', isogeny='a')], 
unexpected elements [BeamSchema_ccf257cb_1966_410e_8157_00cd826e7392(label='11a', isogeny='a'), BeamSchema_ccf257cb_1966_410e_8157_00cd826e7392(label='37a', isogeny='a'), BeamSchema_ccf257cb_1966_410e_8157_00cd826e7392(label='389a', isogeny='a')], 
missing elements [Row(label='11a', isogeny='a'), Row(label='37a', isogeny='a'), Row(label='389a', isogeny='a')
]
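Regarding the standard_optimize_phases() reference above, here is a minimal sketch for inspecting what that phase list contains locally (assuming only that the function is importable from the translations module; the exact phases returned vary across Beam versions, and some entries may not be plain functions):

```python
# Hedged sketch: print the optimizer phases used on the Portable Runner path,
# to compare against the list built in create_stages. Exact contents vary by
# Beam version, so fall back to repr() for entries without a __name__.
from apache_beam.runners.portability.fn_api_runner import translations

for phase in translations.standard_optimize_phases():
    print(getattr(phase, '__name__', repr(phase)))
```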

Collaborator:

It looks like the PreCommit YAML is still failing, can you please take a look?

Contributor:

I think this may not work because fix_flatten_coders assumes that the Flattens will eventually be dealt with by sink_flattens. That may be what is causing the failures.

@@ -388,7 +388,8 @@ def create_stages(
 translations.lift_combiners,
 translations.expand_sdf,
 translations.expand_gbk,
-translations.sink_flattens,
+translations.fix_flatten_coders,
+# translations.sink_flattens,
Collaborator:

To be clear, this PR isn't fixing the underlying issue, rather it is disabling the optimization?

Can we add a comment explaining why this is disabled, with a reference to the bug?

Also, I am not sure whether disabling this optimization is a worthwhile tradeoff for fixing this specific edge case; maybe @damccorm has thoughts?

Contributor:

I think it would be better to try to fix the underlying issue. This has the potential to introduce new encoding/decoding problems that it may be possible to avoid altogether.
