Releases: GoogleCloudPlatform/DataflowTemplates
Dataflow Templates 2022-09-13-00_RC00
Release Week of 2022-09-12
New Templates
- JDBC to BigQuery Flex template added. It has the same functionality as the existing classic template, and also supports the BigQuery Storage Write API.
Improvements
- [BigQueryToParquet Template] Now supports row restrictions of the BigQuery Storage Read API.
- [DataStreamToSpanner Template] Better handling of incoming change events based on the schema mappings in the session file.
- Prevent WindowedFilenamePolicy from changing bucket names. WindowedFilenamePolicy replaces date patterns in the output directory so that the output location changes dynamically with the window end time. This caused errors when the bucket name itself contained a date pattern. With this change, the bucket name is always left unchanged.
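
The bucket-name protection described above can be illustrated with a minimal sketch. It is not the template's Java implementation: the function name `resolve_output_directory` and the token table (`YYYY`, `MM`, `DD`, `HH`, `mm`) are assumptions made for illustration, and the substitution is deliberately naive.

```python
from datetime import datetime, timezone

# Assumed token table for illustration; the release note only says "date patterns".
_TOKENS = (("YYYY", "%Y"), ("MM", "%m"), ("DD", "%d"), ("HH", "%H"), ("mm", "%M"))


def resolve_output_directory(directory: str, window_end: datetime) -> str:
    """Replace date tokens in the object path only, never in the bucket name."""
    scheme = "gs://"
    if not directory.startswith(scheme):
        raise ValueError("expected a gs:// path")
    # Split "bucket/path/to/dir" into the bucket and everything after it.
    bucket, sep, path = directory[len(scheme):].partition("/")
    # Naive substitution: any occurrence of a token in the path is replaced.
    for token, fmt in _TOKENS:
        path = path.replace(token, window_end.strftime(fmt))
    return f"{scheme}{bucket}{sep}{path}"


window_end = datetime(2022, 9, 13, 10, 30, tzinfo=timezone.utc)
print(resolve_output_directory("gs://logs-YYYY-MM/out/YYYY/MM/DD/", window_end))
# prints gs://logs-YYYY-MM/out/2022/09/13/
```

Splitting off the bucket before substituting means a bucket such as `logs-YYYY-MM` is passed through untouched, while the directory below it still follows the window end time.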
Bug Fixes
- Temporary workarounds for SpannerIO: LocalSpannerIO and LocalReadSpannerSchema classes now cast JSONB records as VARCHAR.
Minor changes
- Removed the explicit Bigtable client version from the pom.xml files so that the transitive version from Beam is used. This keeps the client library up to date and matches the version Beam expects.
- Updated maven-dependency-plugin version.
Contributors
@pranavbhandari24
@oleg-semenov
@Deep1998
@shubhamswe
@bvolpato
Full Changelog: 2022-09-05-00_RC00...2022-09-13-00_RC00
Dataflow Templates 2022-09-19-00_RC00
Release Week of 2022-09-19
Improvements
- Updated to Beam 2.41
- [Pub/Sub] Added framework for integration tests
- WindowedFilenamePolicy bucket protections
- BigQueryToParquet template supports row restrictions
- Updated maven dependency plugin
Contributors
Dataflow Templates 2022-09-05-00_RC00
Release Week of 2022-09-05
Improvements
[Pub/Sub Proto to BigQuery] Improve documentation to give correct include_imports flag
[DataStream To Spanner] Add transformations supported by HarbourBridge
[Spanner Change Streams to BigQuery] Remove unnecessary logging statements
Contributors
Dataflow Templates 2022-08-29-00_RC00
Release Week of 2022-08-29
Improvements
[Spanner Change Stream] Allow setting autoscaling parameters
[Pub/Sub to Cloud Storage] Allow configuration of windowDuration parameter
Bug Fixes
[Spanner Change Stream to BigQuery] Fix parameter name from spannerRpcAuthority to rpcAuthority
[Spanner] Use Spanner version 6.23.3 to fix null pointer exceptions
[CDC] Better error handling when merge info can't be fetched from BigQuery
[BigQuery] Change - characters to _ in BigQuery dataset names
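
As a rough illustration of the dataset-name fix above, the sketch below assumes the change is a plain character substitution. The helper name `normalize_dataset_name` is hypothetical and does not come from the template code.

```python
def normalize_dataset_name(name: str) -> str:
    """Replace hyphens with underscores so the name can be used as a dataset name."""
    return name.replace("-", "_")


print(normalize_dataset_name("sales-eu-2022"))
# prints sales_eu_2022
```

Names that contain no hyphens pass through unchanged, so the helper is safe to apply to every incoming name.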
Contributors
Dataflow Templates 2022-08-15-00_RC02
Release Week of 2022-08-15
New Templates
N/A
Improvements
[ElasticsearchIO] ElasticsearchIO template improvements
[MongoDB] MongoDB batch template - Removed bucketing to avoid aggregation at read time
[Spanner] Created Spanner Resource Manager
Bug Fixes
[Spanner to BigQuery] Debug template options
Contributors
Mark Pevec
Venkatesh Shanbhag
ike-albert
Dataflow Templates 2022-08-08-00_RC01
Dataflow Templates 2022-08-01-00_RC04
Release Week of 2022-08-01 (second candidate)
Bug Fixes
[MongoDB] Fix classpath in spec files
Contributors
Dataflow Templates 2022-08-01-00_RC02
Release Week of 2022-08-01
New Templates
- MongoDB to BigQuery
Improvements
- Expose numShards option in Pub/Sub to Text
- Updated static to parallel processing for MongoDB/BigQuery templates
Bug Fixes
- WindowedFilenamePolicy no longer changes the bucket name
Contributors
Dataflow Templates 2022-07-26-00_RC00
Release Week of July 25th, 2022
New Templates
[MongoDB] New MongoDB templates
[PubSubToText] New flex template that can handle both subscription and topic
Improvements
[DataplexFileFormatConversion] Support updating metadata in Dataplex for the newly generated data
[DataplexJdbcIngestion] Support updating metadata in Dataplex for the newly generated data.
[DatastreamToBigQuery] Bug fix: wrap the merge sort key comparison in parentheses
[All] Upgrade Beam version to v2.39.0
Contributors
Venkatesh Shanbhag
@an2x
@oleg-semenov
Dataflow Templates 2022-07-18-00_RC00
Release Week of July 18, 2022
[Spanner templates] Improved unit test coverage for the spanner, spanner.ddl, and spanner.common packages and for AvroRecordConverter.
[Spanner templates] Moved generated classes from proto to a separate package.
[DatastreamToBigQuery] Use consistent BigQuery naming cleanup to avoid invalid characters.
Bug Fixes
[Pub/Sub to Splunk template] Disable failing test for HttpEventPublisher
[DatastreamToBigQuery] Add MergeInfo equals to avoid serialization warnings.
Contributors
@dhercher
@pranavbhandari24
djagaluru