
Exporting data to Google Cloud Storage in Parquet format available but undocumented #614

Open
@pegoenrico

Description


Hello all.
I'm trying to export queried data from a BigQuery table. Since the resulting table can be large (2.5 GB or more), I followed the "Larger datasets" suggestion in the bq_table_download() help and used bq_table_save() to save the data as multiple files in Google Cloud Storage.

When I tried to apply bq_table_save(), I discovered an undocumented option for the export format: destination_format = "PARQUET", in place of "NEWLINE_DELIMITED_JSON" or "CSV". With this parameter, bq_table_save() correctly saves the data as multiple Parquet files.

Can I use this option without problems? It seems to work very well for me: it is very performant, and using Parquet files saves me a lot of work checking data types.

The following code summarizes what I used to successfully export data to a Google Cloud Storage bucket:

library(bigrquery)

project_id <- "<project identifier>"
sql_dwn <- "SELECT * FROM <table from which to extract data>"

# Run the query; tb is a reference to the (temporary) destination table
tb <- bq_project_query(project_id, sql_dwn)

# Extract to GCS; the "*" in the URI lets BigQuery shard the export
# across multiple files
bq_table_save(
  tb,
  destination_uris = "destination_bucket/folder/filename_*.parquet",
  destination_format = "PARQUET"
)
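For what it's worth, the exported shards can then be read back with their column types intact, for example via the arrow package. This is just a sketch; the local folder path is hypothetical and assumes the shards have been copied down from the bucket:

library(arrow)

# Hypothetical path: point open_dataset() at the folder holding the
# exported Parquet shards
ds <- open_dataset("path/to/exported/folder", format = "parquet")

# Column types come from the Parquet metadata, so no manual checking
df <- as.data.frame(ds)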

Thank you in advance for your help.
