[Bug]: GBK with global window must fit all values for a key into memory #30184

Open
@mareksimunek

What happened?

The optimization of global window grouping in the Spark runner
https://issues.apache.org/jira/browse/BEAM-12646

8c3af01#diff-c13404655c9bf261fcbcc72feb949e0ffcf428802897d2f39097f34d7a3d995aL185

reintroduced an issue for jobs that use global windows. Grouping now uses Spark's groupByKey, which means all values for a single key need to fit in memory at once. This was already solved in:
https://issues.apache.org/jira/browse/BEAM-5392
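
To make the memory concern concrete, here is a minimal plain-Python sketch. It is not Beam or Spark code, and the function names are hypothetical. It contrasts grouping that collects every value of a key into a list with grouping that walks input already sorted by key and hands out values lazily. The `sorted()` call only stands in for a shuffle-side sort, which in a real runner is assumed to be able to spill to disk.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def group_materialized(pairs):
    """Collect every value of a key into a list before anything is emitted."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)  # the full value list for a key sits in memory
    return dict(groups)


def group_streaming(sorted_pairs):
    """Yield (key, lazy value iterator) from input already sorted by key."""
    for key, run in groupby(sorted_pairs, key=itemgetter(0)):
        # Values are produced one at a time; no per-key list is built here.
        yield key, (value for _, value in run)


def sum_per_key_streaming(sorted_pairs):
    """Consume each key's values fully before moving on to the next key."""
    return {key: sum(values) for key, values in group_streaming(sorted_pairs)}


pairs = [("a", 1), ("b", 10), ("a", 2), ("b", 20), ("a", 3)]
materialized = group_materialized(pairs)
# sorted() is a stand-in for a sort done during the shuffle (assumption).
streamed = sum_per_key_streaming(sorted(pairs, key=itemgetter(0)))
```

In the first function the memory needed for one key grows with the number of values for that key, which is the behavior described above for a hot key in the global window. In the second, a downstream consumer that processes values incrementally only needs the current element.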

Issue Priority

Priority: 3 (minor)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
