Releases: GoogleCloudPlatform/DataflowTemplates

Dataflow Templates 2022-09-13-00_RC00

20 Sep 02:33

Release Week of 2022-09-12

New Templates

  • Added a JDBC to BigQuery Flex template. It offers the same functionality as the existing classic template and also supports the BigQuery Storage Write API.

Improvements

  • [BigQueryToParquet Template] Now supports row restrictions of the BigQuery Storage Read API.
  • [DataStreamToSpanner Template] Better handling of incoming change events based on the schema mappings in the session file.
  • Prevent WindowedFilenamePolicy from changing bucket names. WindowedFilenamePolicy replaces date patterns in the output directory so that the output location changes dynamically with the window end time. This caused errors when the bucket name itself contained a date pattern. The bucket name is now always left unchanged.
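
The bucket protection described in the last item can be illustrated with a minimal sketch. This is not the template's actual implementation: the class BucketSafePathSketch, both method names, and the placeholder set (YYYY, MM, DD, HH, mm) are assumptions made for illustration. The idea is to split the output directory after the bucket name and resolve date patterns only in the object path.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of the templates. */
final class BucketSafePathSketch {

  private BucketSafePathSketch() {}

  /**
   * Naive replacement: substitutes the assumed placeholders YYYY, MM, DD, HH
   * and mm with fields of the window end time (UTC) anywhere in the text.
   */
  static String replaceDatePatterns(String text, Instant windowEnd) {
    ZonedDateTime t = windowEnd.atZone(ZoneOffset.UTC);
    return text
        .replace("YYYY", String.format(Locale.ROOT, "%04d", t.getYear()))
        .replace("MM", String.format(Locale.ROOT, "%02d", t.getMonthValue()))
        .replace("DD", String.format(Locale.ROOT, "%02d", t.getDayOfMonth()))
        .replace("HH", String.format(Locale.ROOT, "%02d", t.getHour()))
        .replace("mm", String.format(Locale.ROOT, "%02d", t.getMinute()));
  }

  /**
   * Bucket-safe variant: for gs:// paths, the "gs://bucket" prefix is kept
   * verbatim and only the object path after it is resolved.
   */
  static String resolveOutputDirectory(String outputDirectory, Instant windowEnd) {
    String scheme = "gs://";
    if (!outputDirectory.startsWith(scheme)) {
      // Not a Cloud Storage path: no bucket to protect.
      return replaceDatePatterns(outputDirectory, windowEnd);
    }
    int bucketEnd = outputDirectory.indexOf('/', scheme.length());
    if (bucketEnd < 0) {
      // The path is only a bucket name, so there is nothing to resolve.
      return outputDirectory;
    }
    String bucketPart = outputDirectory.substring(0, bucketEnd);
    String objectPath = outputDirectory.substring(bucketEnd);
    return bucketPart + replaceDatePatterns(objectPath, windowEnd);
  }
}
```

Under these assumptions, a bucket named summary-logs shows the problem: the naive replacement rewrites the "mm" inside "summary" with the minute of the window end, while resolveOutputDirectory leaves the bucket name intact and only resolves the path segments that follow it.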

Bug Fixes

  • Temporary workarounds for SpannerIO: LocalSpannerIO and LocalReadSpannerSchema classes now cast JSONB records as VARCHAR.

Minor changes

  • Removed the explicit Bigtable client version from the pom.xml files to use the transitive version from Beam. This is to keep the client library up-to-date and match the version expected by Beam.
  • Updated maven-dependency-plugin version.

Contributors

@pranavbhandari24
@oleg-semenov
@Deep1998
@shubhamswe
@bvolpato

Full Changelog: 2022-09-05-00_RC00...2022-09-13-00_RC00

Dataflow Templates 2022-09-19-00_RC00

19 Sep 13:21

Release Week of 2022-09-19

Improvements

  • Updated to Beam 2.41
  • [Pub/Sub] Added framework for integration tests
  • WindowedFilenamePolicy bucket protections
  • BigQueryToParquet template supports row restrictions
  • Updated maven dependency plugin

Contributors

@oleg-semenov
@pranavbhandari24
@bvolpato

Dataflow Templates 2022-09-05-00_RC00

06 Sep 15:02

Release Week of 2022-09-05

Improvements

[Pub/Sub Proto to BigQuery] Improve documentation to give correct include_imports flag
[DataStream To Spanner] Add transformations supported by HarbourBridge
[Spanner Change Streams to BigQuery] Remove unnecessary logging statements

Contributors

@zhoufek
@bvolpato
Deep Chowdhury

Dataflow Templates 2022-08-29-00_RC00

30 Aug 18:20

Release Week of 2022-08-29

Improvements

[Spanner Change Stream] Allow setting autoscaling parameters
[Pub/Sub to Cloud Storage] Allow configuration of windowDuration parameter

Bug Fixes

[Spanner Change Stream to BigQuery] Fix parameter name from spannerRpcAuthority to rpcAuthority
[Spanner] Use Spanner version 6.23.3 to fix null pointer exceptions
[CDC] Better error handling when merge info can't be fetched from BigQuery
[BigQuery] Change - characters to _ in BigQuery dataset names
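
The dataset name fix in the last item can be sketched as follows. BigQuery dataset names may contain only letters, digits, and underscores, so hyphens have to be replaced. The class and method names here are hypothetical, and the template's real cleanup logic may handle more cases than this single substitution.

```java
/** Hypothetical helper for illustration only; not part of the templates. */
final class DatasetNameSketch {

  private DatasetNameSketch() {}

  /** Replaces every '-' with '_' so the name is usable as a BigQuery dataset name. */
  static String sanitizeDatasetName(String name) {
    return name.replace('-', '_');
  }
}
```

For example, a source schema called my-source-db would map to the dataset name my_source_db.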

Contributors

@bvolpato
@marengaz
Internal contributors

Dataflow Templates 2022-08-15-00_RC02

16 Aug 14:15

Release Week of 2022-08-15

New Templates

N/A

Improvements

[ElasticsearchIO] ElasticsearchIO template improvements
[MongoDB] MongoDB batch template - Removed bucketing to avoid aggregation at read time
[Spanner] Created Spanner Resource Manager

Bug Fixes

[Spanner to BigQuery] Debug template options

Contributors

Mark Pevec
Venkatesh Shanbhag
ike-albert

Dataflow Templates 2022-08-08-00_RC01

10 Aug 18:50

Release Week of 2022-08-08

Improvements

[Dataplex Templates] Minor changes in the updateDataplexMetadata parameter description.

Bug Fixes

Fixed an issue with null error messages in BulkDecompressor (#433).

Contributors

@zhoufek
@an2x

Dataflow Templates 2022-08-01-00_RC04

05 Aug 14:16

Release Week of 2022-08-01 (second candidate)

Bug Fixes

[MongoDB] Fix classpath in spec files

Contributors

@an2x

Dataflow Templates 2022-08-01-00_RC02

03 Aug 14:01

Release Week of 2022-08-01

New Templates

  • MongoDB to BigQuery

Improvements

  • Expose numShards option in Pub/Sub to Text
  • Switched the MongoDB/BigQuery templates from static to parallel processing

Bug Fixes

  • WindowedFilenamePolicy no longer changes the bucket name

Contributors

@an2x
@jrmccluskey
@pranavbhandari24
@theshanbhag

Dataflow Templates 2022-07-26-00_RC00

26 Jul 20:26

Release Week of July 25th, 2022

New Templates

[MongoDB] New MongoDB templates
[PubSubToText] New flex template that can handle both subscription and topic

Improvements

[DataplexFileFormatConversion] Support updating metadata in Dataplex for the newly generated data
[DataplexJdbcIngestion] Support updating metadata in Dataplex for the newly generated data.
[DatastreamToBigQuery] Bug fix: the merge sort keys comparison is now wrapped in parentheses
[All] Upgrade Beam version to v2.39.0

Contributors

Venkatesh Shanbhag
@an2x
@oleg-semenov

Dataflow Templates 2022-07-18-00_RC00

24 Jul 21:13

Release Week of July 18, 2022

Improvements

[Spanner templates] Improved unit test coverage for the spanner, spanner.ddl, and spanner.common packages and for AvroRecordConverter.
[Spanner templates] Moved classes generated from proto to a separate package.
[DatastreamToBigQuery] Use consistent BigQuery naming cleanup to avoid invalid characters.

Bug Fixes

[Pub/Sub to Splunk template] Disable failing test for HttpEventPublisher
[DatastreamToBigQuery] Add MergeInfo equals to avoid serialization warnings.

Contributors

@dhercher
@pranavbhandari24
djagaluru