Description
What happened?
Similar issues have been reported and fixed in Beam (#29890, #25510) and in Flink (https://issues.apache.org/jira/browse/FLINK-28248).
We are using Apache Beam with the Python SDK to submit some batch jobs with the Flink REST API.
- beam: apache-beam==2.58.0
- flink: 1.18.1
Metaspace memory usage in Flink only ever grows: it increases after each job submission and does not go back down.

At some point, it is no longer possible to submit new jobs (the Flink API hangs).
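The growth can be tracked over time by polling the Flink REST API for the JVM metaspace metric before and after each submission. This is a minimal sketch, not a verified setup: it assumes the JobManager is the process that leaks, that it is reachable at `http://localhost:8081`, and that it exposes the metric `Status.JVM.Memory.Metaspace.Used` under `/jobmanager/metrics`. The function names are hypothetical.

```python
import json
import urllib.request

# Assumed metric name; adjust if the cluster reports it differently.
METRIC = "Status.JVM.Memory.Metaspace.Used"


def parse_metric(payload, metric=METRIC):
    """Return the metric value in bytes from a metrics JSON body, or None if absent.

    The body is assumed to be a JSON list of {"id": ..., "value": ...} objects.
    """
    for entry in json.loads(payload):
        if entry.get("id") == metric:
            return int(float(entry["value"]))
    return None


def fetch_metaspace_used(base_url="http://localhost:8081"):
    """Query the (assumed) JobManager metrics endpoint and return metaspace bytes used."""
    url = f"{base_url}/jobmanager/metrics?get={METRIC}"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_metric(resp.read().decode("utf-8"))
```

Calling `fetch_metaspace_used()` after every submission and logging the result would show whether the value rises by a roughly constant amount per job, which is what a class loader leak per submission would look like. If the leak is in a TaskManager instead, the same query would have to target that process.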
I am not used to debugging JVM memory leaks. We ran jmap -clstats 1 and saw many duplicate classes, such as:
org.apache.beam.vendor.grpc.v1p60p1.com.google.protobuf.DescriptorProtos$DescriptorProto
org.apache.beam.vendor.grpc.v1p60p1.com.google.protobuf.Descriptors$Descriptor
However, each of them is very small, so I am not sure they are the problem.
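The number of copies of each class may be more telling than their individual size, because the JVM loads a given class name at most once per class loader, so N copies imply N class loaders that are still loaded. A rough sketch for counting the copies, assuming the tool output has been saved as text with one loaded class per line and the fully qualified class name as a whitespace-separated token (the exact column layout differs between JDK versions, so the parsing is an assumption, and the function name is hypothetical):

```python
from collections import Counter


def count_duplicate_classes(lines, prefix="org.apache.beam."):
    """Count class names starting with `prefix` that appear on more than one line.

    Each line is assumed to describe one loaded class; the first token that
    starts with `prefix` is taken as the class name.
    """
    counts = Counter()
    for line in lines:
        for token in line.split():
            if token.startswith(prefix):
                counts[token] += 1
                break  # at most one class name per line
    # Keep only the classes that were seen more than once.
    return {name: n for name, n in counts.items() if n > 1}
```

Running this on dumps taken after successive submissions and comparing the counts would show whether the number of copies grows by one per job.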
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components
- Component: Python SDK
- Component: Java SDK
- Component: Go SDK
- Component: Typescript SDK
- Component: IO connector
- Component: Beam YAML
- Component: Beam examples
- Component: Beam playground
- Component: Beam katas
- Component: Website
- Component: Infrastructure
- Component: Spark Runner
- Component: Flink Runner
- Component: Samza Runner
- Component: Twister2 Runner
- Component: Hazelcast Jet Runner
- Component: Google Cloud Dataflow Runner