[Bug][Prism]: Colab cannot run PrismRunner multiple times #33623

Open
@liferoad

What happened?

Steps to reproduce the issue:

  1. Go to https://colab.research.google.com/
  2. Create a new notebook
  3. Install Beam:
     !pip3 install apache-beam[gcp]
  4. Ignore any warnings; there is no need to restart the session.
  5. Run this in a cell:
     import apache_beam as beam
     from apache_beam.runners.portability.prism_runner import PrismRunner
     with beam.Pipeline(runner=PrismRunner()) as p:
       N = 5
       p | "Create Elements" >> beam.Create(range(N)) | "Squares" >> beam.Map(lambda x: x**2) | "Print" >> beam.Map(print)
  6. The first run works, but running the cell a second time raises the error below:
     OSError: [Errno 26] Text file busy: '/root/.apache_beam/cache/prism/bin/apache_beam-v2.61.0-prism-linux-amd64'
  7. Check the running processes:

The process /root/.apache_beam/cache/prism/bin/apache_beam-v2.61.0-prism-linux-amd64 is still running after the first run. If you kill this process, the cell from step 5 runs fine again.
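As a possible stopgap until this is fixed, the manual "kill this process" step can be scripted from a notebook cell. The sketch below is only a workaround based on the observation above, not a fix in Beam itself. The helper names (parse_matching_pids, kill_lingering_prism) and the PRISM_MARKER constant are hypothetical; the marker is taken from the path in the error message, and the code assumes a Linux environment where `ps` is available.

```python
import os
import signal
import subprocess
from typing import List

# Assumption: the leftover binary lives under this cache path, as in the error message.
PRISM_MARKER = ".apache_beam/cache/prism/bin"


def parse_matching_pids(ps_output: str, marker: str) -> List[int]:
    """Return PIDs from `ps -eo pid,args` output whose command line contains marker."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # The header line ("PID COMMAND") is skipped because its first field is not numeric.
        if len(parts) == 2 and parts[0].isdigit() and marker in parts[1]:
            pids.append(int(parts[0]))
    return pids


def kill_lingering_prism(marker: str = PRISM_MARKER) -> List[int]:
    """Send SIGTERM to every process whose command line contains marker."""
    ps_output = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    ).stdout
    killed = []
    for pid in parse_matching_pids(ps_output, marker):
        if pid == os.getpid():
            continue  # never signal the notebook kernel itself
        try:
            os.kill(pid, signal.SIGTERM)
            killed.append(pid)
        except ProcessLookupError:
            pass  # the process exited between listing and signalling
    return killed
```

Calling kill_lingering_prism() in a cell before rerunning the pipeline should have the same effect as killing the process by hand; it returns the list of PIDs it signalled, which is empty when no Prism process is left over.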

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
