Releases: GoogleCloudPlatform/DataflowTemplates
Dataflow Templates 2023-01-17-01_RC00
Release Week of 2023-01-17
Note: This release also includes changes from the release 2023-01-10-00_RC00, which was cancelled. If you're looking for a version that includes a bugfix from 2023-01-10-00_RC00, please use the latest version 2023-01-17-01_RC00 instead.
Improvements
- [Spanner Change Stream Templates] Support import/export for change streams in Cloud Spanner PostgreSQL-dialect databases.
- Added JDBC and Spanner sinks to StreamingDataGenerator template.
- Added a number of integration tests and resource managers.
Bug Fixes
- [Datastream Templates] Fixed a bug that sometimes caused duplicate records to be written to the target database.
Contributors
@Abacn
@nancyxu123
@bvolpato
@pranavbhandari24
@nirfi
@Polber
Dataflow Templates 2023-01-10-00_RC00
Release Week of 2023-01-10
Note: This release has been cancelled and hasn't been fully rolled out to production. Please use version 2023-01-17-01_RC00 or later instead, which includes these changes.
Improvements
- [Integration Tests] Create Elasticsearch Resource Manager and Create GCS to Elasticsearch integration test
- [Documentation] Improve documentation for Dataflow CSV import pipeline trailing delimiter.
- [Flex Templates] Add SecretManagerUtils in v2 common directory
Bug Fixes
- [Security] Bump postgresql dependency due to CVE-2022-21724 and CVE-2022-31197
- [Spanner Tests] Fix an invalid column default value in RandomDdlGenerator.
Contributors
Dataflow Templates 2023-01-03-00_RC00
Release Week of 2023-01-03
Improvements
- [TextIOToBigQuery] Integration test for TextIOToBigQuery flex template
- [All Templates] Enable available Nashorn engine ES6 support for Teleport templates
- [Syndeo] An integration test for the Kafka-to-BQ flow in Syndeo
- [StreamingDataGenerator] Add schema templates to StreamingDataGenerator dataflow template
- [Flex Templates] Use structured logging for uncaught exceptions, to log them with ERROR severity/level.
- [Flex Templates] Refactor v2 to remove ValueProviders
- [Templates in googlecloud-to-googlecloud] Improve dependency management
Bug Fixes
- [Security] Bump commons-beanutils dependency due to CVE-2019-10086
- [Security] Bump spring-expression due to CVE-2022-22950
- [Security] Bump commons-configuration2 dependency due to CVE-2022-33980
- [Integration Tests] Build plugin dependencies using a specific target folder to prevent ClassLoader issues
- [Spanner Templates] Unit test fixes
- [Templates Plugin] Improve plugins documentation + sanitize bucket name arguments
- [MongoDbToBigQuery] Integration test fixes
- [Datastore Templates] Do not use @default for Firestore Workers, as it overrides Datastore options
Contributors
@andreigurau
@bvolpato
@oleg-semenov
@pabloem
@Polber
@pranavbhandari24
@nancyxu123
@ryanmadden-google
Dataflow Templates 2022-12-13-00_RC01
Release Week of 2022-12-13
Note: This release is in the process of rolling out. It may not be in your region yet.
Improvements
- [BigQuery] Add Storage Write API support to several templates interacting with BigQuery.
- [DatastreamToSpanner] Support Postgres as a source
- [Integration Tests] Added several integration / end-to-end tests and resource managers
- [PubsubAvroToBigQuery] Upgrade PubsubAvroToBigQuery template to support Storage Write API.
- [Security] Bump dependencies as a response to CVEs
- [Templates Plugin] Several improvements to metadata annotations and templates plugin
Bug Fixes
- [All Templates] Add imperative version for Jackson/FasterXML to match Beam 2.43.0
- [Changestreams] Fix WriteDataChangeRecordsToAvroTest serialization issue
- [Flex Templates] Make classpath deterministic for Flex Template executions, by always making sure that Conscrypt is loaded first.
Contributors
Dataflow Templates 2022-12-05-00_RC00
Release Week of 2022-12-05
Improvements
- [All templates] Introduce metadata annotations.
- [DataStreamToBigQuery] Expose mergeConcurrency option and re-throw error on merge statement fail.
- Trigger Java PR workflow when any XML is changed + files are deleted.
- [Classic templates] Support JSONB arrays.
- [PubSubCdcToBigQuery] Support maxStreamingBatchSize parameter
- [DatastreamToSpanner] Add changes for the new HarbourBridge session file with tableID and columnID support.
- [All templates] Upgrade Beam version to 2.43.
- [Classic templates] Prepare plugin infra to test classic templates + Create BulkCompressionIT
- [Integration Tests] Do not make artifactBucket mandatory (only if bucketName not provided for ITs)
- [DataStreamToSpanner] Change default values for dlqRetryMinutes and dlqMaxRetryCount params.
- [Integration Tests] Avoid Joiner conflict, and improve plugin staging speed
- [Flex templates] Plain text logging for Flex Templates unit tests
- [Integration Tests] Improve plugin bucket parameter requirements
- [SpannerChangeStreamsTemplates] Simplify the code of setting experiments for spanner change streams to BigQuery and spanner change streams to GCS templates.
- [Integration Tests] Create MongoDB Resource Manager
- [Integration Tests] Create MongoDBToBigQueryIntegrationTest
- [Integration Tests] Add TestContainers framework
- [Integration Tests] Create PubsubAvroToBigQueryIT + prepare profile to run integration tests together
- [MongoDBToBigQuery] Create udf for MongoDB BigQuery Templates
- [Security] Update hadoop version affected by CVE-2022-25168
- Improve Templates Plugin instructions
- [Syndeo templates] Separating JSON build
Bug Fixes
- [Classic templates] WindowedFilenamePolicy's dayPattern defaults to dd instead of DD
- [JDBC templates] Do not log unencrypted values/keys to the console
- [DataStream templates] Rethrow exception from ExtractGcsFile so that Dataflow will retry the pardo
- [Integration Tests] Fix integration tests parameters passing
- [Flex templates] Fix log dependencies (log4j initialization error)
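The dayPattern fix above turns on the difference between two pattern letters. As a minimal sketch, assuming the pattern follows the usual Java date-pattern convention (in both java.time and Joda-Time, lowercase d is day of month and uppercase D is day of year), the example below formats one date both ways. The class and method names are hypothetical and are not taken from the templates; the release note does not say which formatting library WindowedFilenamePolicy uses, so java.time is used here only for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical demo class, not part of DataflowTemplates.
public class DayPatternDemo {

    // Formats a date with the given pattern using java.time (an assumption;
    // the template itself may rely on a different date library).
    static String format(LocalDate date, String pattern) {
        return date.format(DateTimeFormatter.ofPattern(pattern));
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2022, 2, 1);

        // "dd" is day of month: 1 February is day 01 of the month.
        System.out.println(format(date, "yyyy-MM-dd")); // 2022-02-01

        // "DD" is day of year: 1 February is day 32 of the year.
        System.out.println(format(date, "yyyy-MM-DD")); // 2022-02-32
    }
}
```

With a day-of-year default, file names built from a year/month/day layout would carry values such as 32 in the day position, which is the kind of surprise a dd default avoids.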
Contributors
@bvolpato
@oleg-semenov
@pabloem
@Polber
@pranavbhandari24
@theshanbhag
Dataflow Templates 2022-11-16-00_RC00
Release Week of 2022-11-16
New Templates
Text-to-BigQuery Flex template with support for BQ Storage Write API
Improvements
- [All templates] Upgrade Beam version to 2.42.
- [Batch Flex template with BQ Sinks] BQ Storage Write API options for Batch Flex Templates
- [Syndeo] Add Jib to syndeo-template module and bind it to the package phase for the Syndeo template
- [Syndeo] Handle schemas that are not provided by configuration
- [Flex templates] Enable structured logging for v2 templates
- [All templates] Enforce Conscrypt version 2.5.2 (matching Beam 2.42.0)
- [All templates] Update commons-text
- [BigTable templates] update bigtable-beam-import version
- [Spanner templates] Supported new value capture types (NEW_VALUES and NEW_ROW).
Bug Fixes
- pom file name fix in .github/workflows/prepare-java-cache.yml
- [Unit tests] Reducing verbosity of unit test logs
Contributors
Dataflow Templates 2022-11-01-00_RC00
Dataflow Templates 2022-10-25-00_RC00
Release Week of 2022-10-25
Note: This release is in the process of rolling out. It may not be in your region yet.
Improvements
[Project Structure] The structure of the project has changed to simplify contributing.
- Move classic templates to their own subdirectory called v1/, consistent with flex templates being housed in the v2/ subdirectory.
- Remove unified-templates.xml and rely solely on the root pom.xml for building the project.
[JdbcToBigQuery Template] Add optional data loading pipeline option to toggle whether data is truncated or appended into BigQuery.
Contributors
Pablo Estrada @pabloem
Suddhasatwa Bhaumik @suddhasatwabhaumik
Bruno Volpato @bvolpato
Dataflow Templates 2022-10-18-00_RC00
Release Week of 2022-10-18
Improvements
[BigTable Templates] Added Bigtable Resource Manager for integration testing
[BigQuery Templates] Added BigQuery Resource Manager for integration testing
[Splunk Template] Enable Splunk batching by default (10) on the Pub/Sub to Splunk template
Bug Fixes
[ElasticSearch Templates] Bug fix for the cases when maxBatchSizeBytes may be exceeded
Contributors
Bruno Volpato @bvolpato
Jeffrey Kinard @Polber
Mark Pevec @ggprod
olegsa @oleg-semenov
Dataflow Templates 2022-09-26-01_RC00
Release Week of 2022-09-26
Improvements
[Spanner Template] Removing CAST to string statement when reading Numeric columns from Spanner.
[All templates] Do not apply formatting/spotless on Apache Beam code