[Bug]: beam.io.WriteToBigQuery failed when given schema with space #25704

Open
@jwzh222

Description

What happened?

There is an issue in the Python SDK's beam.io.WriteToBigQuery():
when the schema string contains a space after the colon, e.g. schema="name: STRING", the write fails.
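
One plausible reading of this failure is that the schema string is split on `,` and `:` without trimming the pieces, so the type is sent to BigQuery as `' STRING'` rather than `'STRING'`. The sketch below only models that assumption in plain Python; `naive_parse` and `normalize_schema_string` are hypothetical helpers, not Beam functions, and Beam's actual parsing may differ.

```python
def naive_parse(schema):
    """Split 'name:TYPE,...' without trimming (hypothetical model of the failure)."""
    fields = []
    for field_and_type in schema.split(','):
        name, field_type = field_and_type.split(':')
        fields.append((name, field_type))
    return fields


def normalize_schema_string(schema):
    """Return the schema string with whitespace removed around each name and type."""
    parts = []
    for field_and_type in schema.split(','):
        name, field_type = field_and_type.split(':')
        parts.append(f"{name.strip()}:{field_type.strip()}")
    return ','.join(parts)


# The untrimmed split keeps the leading space in the type.
print(naive_parse("name: STRING"))  # [('name', ' STRING')]

# Normalizing first yields the form that is reported to work.
print(normalize_schema_string("name: STRING, age : INTEGER"))  # name:STRING,age:INTEGER
```

Until the parser trims whitespace itself, passing the normalized string as `schema` may be enough to avoid the error.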

error message:
"message": "Invalid value for type: STRING is not a valid value"

example code:

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def run(pipeline_args):
    pipeline_options = PipelineOptions(pipeline_args)

    table_ref = 'project_id:dataset_id.table_id'
    schema_with_space = "name: STRING"     # triggers the reported failure
    schema_without_space = "name:STRING"   # same schema without the space

    with beam.Pipeline(options=pipeline_options) as p:
        records = p | 'load records' >> beam.Create([{"name": "bob"}, {"name": "alice"}])
        records | 'write to bigquery' >> beam.io.WriteToBigQuery(
            table=table_ref,
            schema=schema_with_space,
            create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
        )


if __name__ == '__main__':
    pipeline_args = [
        '--runner',
        'DirectRunner',
        '--project',
        'YOUR_PROJECT_ID',
    ]

    run(pipeline_args)
```
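
A possible workaround is to avoid the string form entirely and pass the schema as a dictionary of the shape `{'fields': [...]}`, which `WriteToBigQuery` is documented to accept. The sketch below builds that dictionary from a comma-separated schema string while trimming whitespace; `schema_string_to_dict` is a hypothetical helper written for this issue, and defaulting every field to `NULLABLE` is an assumption.

```python
def schema_string_to_dict(schema, mode="NULLABLE"):
    """Convert 'name: TYPE, other:TYPE' into a {'fields': [...]} schema dict.

    Whitespace around names and types is stripped, so 'name: STRING'
    and 'name:STRING' produce the same result.
    """
    fields = []
    for field_and_type in schema.split(','):
        name, field_type = (part.strip() for part in field_and_type.split(':'))
        fields.append({"name": name, "type": field_type, "mode": mode})
    return {"fields": fields}


schema_dict = schema_string_to_dict("name: STRING")
print(schema_dict)
```

The resulting `schema_dict` could then be passed as `schema=schema_dict` in the pipeline above in place of `schema_with_space`.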

Issue Priority

Priority: 3 (minor)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
