BigQuery IO Source is not Exporting to GCS as written in documentation #19174

Open
@kennknowles

Description

I checked the Beam code and found that Dataflow queries BigQuery and retrieves the results using pagination [1]. As we understand it, this means there is no parallelism when reading a BigQuery table, which contradicts what the documentation says [2].
 
Is this a work in progress? I'm filing this as a bug because the documentation says the read goes through GCS, while the code actually uses NativeSourceReader, which yields data row by row as an iterator.
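To make the concern concrete, here is a minimal sketch in plain Python of the two read patterns being compared. It does not call Beam or BigQuery. Every name in it (`ROWS`, `fetch_page`, `read_paginated`, `export_shards`, `read_exported`) is hypothetical and only illustrates the idea: a paginated read needs the previous page's token before it can ask for the next page, so it is inherently sequential, while an export to files produces independent shards that can be read concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List, Optional, Tuple

# Hypothetical stand-in for a BigQuery table (not real API data).
ROWS = [{"id": i} for i in range(10)]


def fetch_page(token: int, page_size: int) -> Tuple[List[dict], Optional[int]]:
    """Hypothetical paginated call: returns one page and the next token (or None)."""
    page = ROWS[token:token + page_size]
    next_token = token + page_size
    return page, (next_token if next_token < len(ROWS) else None)


def read_paginated(page_size: int = 3) -> Iterator[dict]:
    """Yield rows one by one; each request depends on the previous token,
    so the pages cannot be fetched in parallel."""
    token: Optional[int] = 0
    while token is not None:
        page, token = fetch_page(token, page_size)
        yield from page


def export_shards(num_shards: int) -> List[List[dict]]:
    """Hypothetical stand-in for an export job that writes independent files."""
    return [ROWS[i::num_shards] for i in range(num_shards)]


def read_exported(num_shards: int = 3) -> List[dict]:
    """Read every shard concurrently; shards do not depend on one another."""
    shards = export_shards(num_shards)
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        parts = list(pool.map(list, shards))
    return [row for part in parts for row in part]
```

Both functions return the same rows. The difference is that only the sharded version gives a runner independent units of work to distribute, which is what the documentation's description of a GCS export implies.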
 
[1] 


[2] 
The main and side inputs are implemented differently. Reading a BigQuery table

Imported from Jira BEAM-5352. Original Jira may contain additional context.
Reported by: rendybjunior.
