I'm using this script:
#!/bin/bash
# Parameters to replace:
# The GOOGLE_CLOUD_PROJECT is the project that contains your BigQuery dataset.
GOOGLE_CLOUD_PROJECT=psjh-eacri-data
INPUT_PATTERN=https://storage.googleapis.com/gcp-public-data--gnomad/release/2.1.1/vcf/exomes/gnomad.exomes.r2.1.1.sites.*.vcf.bgz
# INPUT_PATTERN=gs://gcp-public-data--gnomad/release/2.1.1/vcf/exomes/*.vcf.bgz
OUTPUT_TABLE=eacri-genomics:gnomad.gnomad_hg19_2_1_1
TEMP_LOCATION=gs://psjh-eacri/balter/gnomad_tmp/vcf/exomes/*.vcf.bgz/tmp
COMMAND="vcf_to_bq \
--input_pattern ${INPUT_PATTERN} \
--output_table ${OUTPUT_TABLE} \
--temp_location ${TEMP_LOCATION} \
--job_name vcf-to-bigquery \
--runner DataflowRunner \
--zones us-east1-b \
--network projects/phs-205720/global/networks/psjh-shared01 \
--subnet projects/phs-205720/regions/us-east1/subnetworks/subnet01"
docker run -v ~/.config:/root/.config \
gcr.io/cloud-lifesciences/gcp-variant-transforms \
--project "${GOOGLE_CLOUD_PROJECT}" \
--temp_location ${TEMP_LOCATION} \
"${COMMAND}"
And yet the error says that the network was not specified, and the network field is empty in the JSON output.
What change do I need to make to my script? Or is some other format needed to specify the network?
The script template doesn't include a network or subnet parameter at all.
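One way to narrow this down is to check whether `--network` is present in the command line the launcher echoes back. The sketch below is only a diagnostic under stated assumptions: `has_flag` is a hypothetical helper, and both command strings are abbreviated by hand to the networking flags, the first from the script above and the second from the echoed line in the log that follows.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the flag appears as a whole word in the command line.
has_flag() {
  local flag="$1" cmdline="$2"
  case " ${cmdline} " in
    *" ${flag} "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Command as written in the script (abbreviated to the networking flags).
scripted="vcf_to_bq --runner DataflowRunner --zones us-east1-b --network projects/phs-205720/global/networks/psjh-shared01 --subnet projects/phs-205720/regions/us-east1/subnetworks/subnet01"

# Command as echoed by the launcher in the log (abbreviated the same way).
echoed="vcf_to_bq --runner DataflowRunner --zones us-east1-b --subnet subnet03"

for cmd in "$scripted" "$echoed"; do
  if has_flag --network "$cmd"; then
    echo "network flag present"
  else
    echo "network flag missing"
  fi
done
```

If the echoed command lacks the flag while the script has it, the run that produced the log may have used a different version of the script than the one pasted here.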
(base) jupyter@balter-genomics:~$ ./script.sh
--project 'psjh-eacri-data' --temp_location 'gs://psjh-eacri/balter/gnomad_tmp/vcf/exomes/*.vcf.bgz/tmp' -- 'vcf_to_bq --input_pattern gs://gcp-public-data--gnomad/release/2.1.1/vcf/exomes/*.vcf.bgz --output_table eacri-genomics:gnomad.gnomad_hg19_2_1_1 --temp_location gs://psjh-eacri/balter/gnomad_tmp/vcf/exomes/*.vcf.bgz/tmp --job_name vcf-to-bigquery --runner DataflowRunner --zones us-east1-b --subnet subnet03'
Your active configuration is: [variant]
{
"pipeline": {
"actions": [
{
"commands": [
"-c",
"mkdir -p /mnt/google/.google/tmp"
],
"entrypoint": "bash",
"imageUri": "gcr.io/cloud-genomics-pipelines/io",
"mounts": [
{
"disk": "google",
"path": "/mnt/google"
}
]
},
{
"commands": [
"-c",
"/opt/gcp_variant_transforms/bin/vcf_to_bq --input_pattern gs://gcp-public-data--gnomad/release/2.1.1/vcf/exomes/*.vcf.bgz --output_table eacri-genomics:gnomad.gnomad_hg19_2_1_1 --temp_location gs://psjh-eacri/balter/gnomad_tmp/vcf/exomes/*.vcf.bgz/tmp --job_name vcf-to-bigquery --runner DataflowRunner --zones us-east1-b --subnet subnet03 --project psjh-eacri-data --region us-east1 --temp_location gs://psjh-eacri/balter/gnomad_tmp/vcf/exomes/*.vcf.bgz/tmp"
],
"entrypoint": "bash",
"imageUri": "gcr.io/cloud-lifesciences/gcp-variant-transforms",
"mounts": [
{
"disk": "google",
"path": "/mnt/google"
}
]
},
{
"alwaysRun": true,
"commands": [
"-c",
"gsutil -q cp /google/logs/output gs://psjh-eacri/balter/gnomad_tmp/vcf/exomes/*.vcf.bgz/tmp/runner_logs_20210510_230717.log"
],
"entrypoint": "bash",
"imageUri": "gcr.io/cloud-genomics-pipelines/io",
"mounts": [
{
"disk": "google",
"path": "/mnt/google"
}
]
}
],
"environment": {
"TMPDIR": "/mnt/google/.google/tmp"
},
"resources": {
"regions": [
"us-east1"
],
"virtualMachine": {
"disks": [
{
"name": "google",
"sizeGb": 10
}
],
"machineType": "g1-small",
"network": {},
"serviceAccount": {
"scopes": [
"https://www.googleapis.com/auth/cloud-platform",
"https://www.googleapis.com/auth/devstorage.read_write"
]
}
}
}
}
}
Pipeline running as "projects/447346450878/locations/us-central1/operations/13027962545459232820" (attempt: 1, preemptible: false)
Output will be written to "gs://psjh-eacri/balter/gnomad_tmp/vcf/exomes/*.vcf.bgz/tmp/runner_logs_20210510_230717.log"
23:07:26 Worker "google-pipelines-worker-ab367d994b1cd7881ebf66950fec6c17" assigned in "us-east1-b" on a "g1-small" machine
23:07:26 Execution failed: allocating: creating instance: inserting instance: Invalid value for field 'resource.networkInterfaces[0].network': ''. The referenced network resource cannot be found.
23:07:27 Worker released
"run": operation "projects/447346450878/locations/us-central1/operations/13027962545459232820" failed: executing pipeline: Execution failed: allocating: creating instance: inserting instance: Invalid value for field 'resource.networkInterfaces[0].network': ''. The referenced network resource cannot be found. (reason: INVALID_ARGUMENT)
(base) jupyter@balter-genomics:~$ ./script.sh
--project 'psjh-eacri-data' --temp_location 'gs://psjh-eacri/balter/gnomad_tmp/vcf/exomes/*.vcf.bgz/tmp' -- 'vcf_to_bq --input_pattern https://storage.googleapis.com/gcp-public-data--gnomad/release/2.1.1/vcf/exomes/gnomad.exomes.r2.1.1.sites.*.vcf.bgz --output_table eacri-genomics:gnomad.gnomad_hg19_2_1_1 --temp_location gs://psjh-eacri/balter/gnomad_tmp/vcf/exomes/*.vcf.bgz/tmp --job_name vcf-to-bigquery --runner DataflowRunner --zones us-east1-b --subnet subnet03'
Your active configuration is: [variant]
{
"pipeline": {
"actions": [
{
"commands": [
"-c",
"mkdir -p /mnt/google/.google/tmp"
],
"entrypoint": "bash",
"imageUri": "gcr.io/cloud-genomics-pipelines/io",
"mounts": [
{
"disk": "google",
"path": "/mnt/google"
}
]
},
{
"commands": [
"-c",
"/opt/gcp_variant_transforms/bin/vcf_to_bq --input_pattern https://storage.googleapis.com/gcp-public-data--gnomad/release/2.1.1/vcf/exomes/gnomad.exomes.r2.1.1.sites.*.vcf.bgz --output_table eacri-genomics:gnomad.gnomad_hg19_2_1_1 --temp_location gs://psjh-eacri/balter/gnomad_tmp/vcf/exomes/*.vcf.bgz/tmp --job_name vcf-to-bigquery --runner DataflowRunner --zones us-east1-b --subnet subnet03 --project psjh-eacri-data --region us-east1 --temp_location gs://psjh-eacri/balter/gnomad_tmp/vcf/exomes/*.vcf.bgz/tmp"
],
"entrypoint": "bash",
"imageUri": "gcr.io/cloud-lifesciences/gcp-variant-transforms",
"mounts": [
{
"disk": "google",
"path": "/mnt/google"
}
]
},
{
"alwaysRun": true,
"commands": [
"-c",
"gsutil -q cp /google/logs/output gs://psjh-eacri/balter/gnomad_tmp/vcf/exomes/*.vcf.bgz/tmp/runner_logs_20210511_000846.log"
],
"entrypoint": "bash",
"imageUri": "gcr.io/cloud-genomics-pipelines/io",
"mounts": [
{
"disk": "google",
"path": "/mnt/google"
}
]
}
],
"environment": {
"TMPDIR": "/mnt/google/.google/tmp"
},
"resources": {
"regions": [
"us-east1"
],
"virtualMachine": {
"disks": [
{
"name": "google",
"sizeGb": 10
}
],
"machineType": "g1-small",
"network": {},
"serviceAccount": {
"scopes": [
"https://www.googleapis.com/auth/cloud-platform",
"https://www.googleapis.com/auth/devstorage.read_write"
]
}
}
}
}
}
Pipeline running as "projects/447346450878/locations/us-central1/operations/3293803574088782620" (attempt: 1, preemptible: false)
Output will be written to "gs://psjh-eacri/balter/gnomad_tmp/vcf/exomes/*.vcf.bgz/tmp/runner_logs_20210511_000846.log"
00:08:56 Worker "google-pipelines-worker-e05c2864661a5ba9f1b29012de1ac56d" assigned in "us-east1-d" on a "g1-small" machine
00:08:56 Execution failed: allocating: creating instance: inserting instance: Invalid value for field 'resource.networkInterfaces[0].network': ''. The referenced network resource cannot be found.
00:08:57 Worker released
"run": operation "projects/447346450878/locations/us-central1/operations/3293803574088782620" failed: executing pipeline: Execution failed: allocating: creating instance: inserting instance: Invalid value for field 'resource.networkInterfaces[0].network': ''. The referenced network resource cannot be found. (reason: INVALID_ARGUMENT)