Description
Checklist
- I did not find a related open issue.
- I did not find a solution in the troubleshooting guide: https://cloud.google.com/config-connector/docs/troubleshooting
- If this issue is time-sensitive, I have submitted a corresponding issue with GCP support.
Bug Description
The `datasetRef` field of the `bigquerydatatransfer.cnrm.cloud.google.com/v1beta1` `BigQueryDataTransferConfig` resource is required in Config Connector. However, it is possible to create a `scheduled_query` data transfer configuration without a `destination_dataset_id` via the API, the GCP Console, and Terraform: https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/bigquery_data_transfer_config#destination_dataset_id-1
We currently have Terraform configurations that we cannot move to KCC because `datasetRef` is a required field, while some of our users' scripts do not set `destination_dataset_id` at all, mostly because their scheduled queries use multiple statements that create data / tables in multiple datasets.
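For illustration, a minimal Terraform sketch of such a configuration, which applies cleanly without `destination_dataset_id` (project IDs, resource name, and the `file()` path are placeholders):

```hcl
resource "google_bigquery_data_transfer_config" "tst_script" {
  display_name   = "tst_script"
  data_source_id = "scheduled_query"
  location       = "europe"
  schedule       = "every 2 hours"

  # destination_dataset_id is intentionally omitted: the multi-statement
  # query itself writes to myproject.mydataset1 and myproject.mydataset2.
  params = {
    query = file("tst_script.sql")
  }
}
```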
Additional Diagnostic Information
Leaving the destination dataset empty is possible via the API / GUI
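For example, the REST API accepts a `projects.locations.transferConfigs.create` request whose body carries no `destinationDatasetId` (illustrative sketch; project, location, and the abbreviated query are placeholders):

```
POST https://bigquerydatatransfer.googleapis.com/v1/projects/myproject/locations/europe/transferConfigs
{
  "displayName": "tst_script",
  "dataSourceId": "scheduled_query",
  "schedule": "every 2 hours",
  "params": {
    "query": "DECLARE x INT64 DEFAULT 0; LOOP ... END LOOP;"
  }
}
```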
Kubernetes Cluster Version
v1.30.8-gke.1051000
Config Connector Version
1.128.0
Config Connector Mode
namespaced mode (default)
Log Output
```
The BigQueryDataTransferConfig "tst-script" is invalid:
- spec.datasetRef: Required value
- : Invalid value: "null": some validation rules were not checked because the object was invalid; correct the existing errors to complete validation
```
Steps to reproduce the issue
Apply the YAML below to see that `datasetRef` is required in KCC.
In the BQ UI, write a query such as:

```sql
DECLARE x INT64 DEFAULT 0;
LOOP
  SET x = x + 1;
  IF x >= 10 THEN
    LEAVE;
  END IF;
  INSERT myproject.mydataset1.test (column1)
  VALUES(x);
  INSERT myproject.mydataset2.test (column1)
  VALUES(x);
END LOOP;
```

Then click Schedule -> Create new schedule. The "Destination for query results" setting is optional.
YAML snippets
```yaml
---
apiVersion: bigquerydatatransfer.cnrm.cloud.google.com/v1beta1
kind: BigQueryDataTransferConfig
metadata:
  annotations:
    cnrm.cloud.google.com/management-conflict-prevention-policy: none
  name: tst-script
  namespace: myproject
spec:
  dataSourceID: scheduled_query
  # datasetRef: # required in KCC, even though the script references 2 datasets; it's optional in the BQ GUI and Terraform
  #   external: ""
  displayName: tst_script
  location: europe
  params:
    query: |-
      DECLARE x INT64 DEFAULT 0;
      LOOP
        SET x = x + 1;
        IF x >= 10 THEN
          LEAVE;
        END IF;
        INSERT myproject.mydataset1.test (column1)
        VALUES(x);
        INSERT myproject.mydataset2.test (column1)
        VALUES(x);
      END LOOP;
  projectRef:
    name: myproject
    namespace: myproject
  schedule: every 2 hours
  serviceAccountRef:
    name: serviceaccount
    namespace: myproject
```