Migrate gsutil usage to gcloud storage #4321
Changes from 1 commit
102d0a2
a332732
ab13c61
36d5ecc
2bafa65
6ddc8a2
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -326,8 +326,7 @@ | |
| }, | ||
| "outputs": [], | ||
| "source": [ | ||
| "! gsutil mb -l $REGION gs://$BUCKET_NAME" | ||
| ] | ||
| "! gcloud storage buckets create --location=$REGION gs://$BUCKET_NAME" ] | ||
|
Contributor
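The `gcloud storage buckets create` command requires the bucket name to be specified as a URI (e.g., `gs://my-bucket`), so `$BUCKET_NAME` must include the `gs://` prefix; without it, the command will fail. Consider using `gs://$BUCKET_NAME` to ensure the bucket name is correctly formatted.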
Contributor
bhandarivijay@bhandarivijay:$BUCKET_NAME=my-bucket232 |
||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
|
|
@@ -346,8 +345,7 @@ | |
| }, | ||
| "outputs": [], | ||
| "source": [ | ||
| "! gsutil ls -al gs://$BUCKET_NAME" | ||
| ] | ||
| "! gcloud storage ls --all-versions --long gs://$BUCKET_NAME" ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
|
|
@@ -476,8 +474,7 @@ | |
| }, | ||
| "outputs": [], | ||
| "source": [ | ||
| "! gsutil cat $IMPORT_FILE | head -n 10" | ||
| ] | ||
| "! gcloud storage cat $IMPORT_FILE | head -n 10" ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
|
|
@@ -1132,8 +1129,7 @@ | |
| " data = {\"id\": 0, \"text_snippet\": {\"content\": test_item}}\n", | ||
| " f.write(json.dumps(data) + \"\\n\")\n", | ||
| "\n", | ||
| "! gsutil cat $gcs_input_uri" | ||
| ] | ||
| "! gcloud storage cat $gcs_input_uri" ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
|
|
@@ -1280,9 +1276,7 @@ | |
| "source": [ | ||
| "destination_uri = output_config[\"gcs_destination\"][\"output_uri_prefix\"][:-1]\n", | ||
| "\n", | ||
| "! gsutil ls $destination_uri/*\n", | ||
| "! gsutil cat $destination_uri/prediction*/*.jsonl" | ||
| ] | ||
| "! gcloud storage ls $destination_uri/*\n", "! gcloud storage cat $destination_uri/prediction*/*.jsonl" ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
|
|
@@ -1614,8 +1608,7 @@ | |
| "\n", | ||
| "\n", | ||
| "if delete_bucket and \"BUCKET_NAME\" in globals():\n", | ||
| " ! gsutil rm -r gs://$BUCKET_NAME" | ||
| ] | ||
| " ! gcloud storage rm --recursive gs://$BUCKET_NAME" ] | ||
| } | ||
| ], | ||
| "metadata": { | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -511,8 +511,7 @@ | |
| }, | ||
| "outputs": [], | ||
| "source": [ | ||
| "! gsutil mb -l $REGION $BUCKET_URI" | ||
| ] | ||
| "! gcloud storage buckets create --location $REGION $BUCKET_URI" ] | ||
|
Contributor
The … Consider adding the … |
||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
|
|
@@ -531,8 +530,7 @@ | |
| }, | ||
| "outputs": [], | ||
| "source": [ | ||
| "! gsutil ls -al $BUCKET_URI" | ||
| ] | ||
| "! gcloud storage ls --all-versions --long $BUCKET_URI" ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
|
|
@@ -1147,8 +1145,7 @@ | |
| "! rm -f custom.tar custom.tar.gz\n", | ||
| "! tar cvf custom.tar custom\n", | ||
| "! gzip custom.tar\n", | ||
| "! gsutil cp custom.tar.gz $BUCKET_URI/trainer_example.tar.gz" | ||
| ] | ||
| "! gcloud storage cp custom.tar.gz $BUCKET_URI/trainer_example.tar.gz" ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
|
|
@@ -1357,8 +1354,7 @@ | |
| "delete_bucket = False\n", | ||
| "\n", | ||
| "if delete_bucket or os.getenv(\"IS_TESTING\"):\n", | ||
| " ! gsutil rm -r $BUCKET_URI" | ||
| ] | ||
| " ! gcloud storage rm --recursive $BUCKET_URI" ] | ||
| } | ||
| ], | ||
| "metadata": { | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -484,8 +484,7 @@ | |
| }, | ||
| "outputs": [], | ||
| "source": [ | ||
| "! gsutil mb -l $REGION $BUCKET_URI" | ||
| ] | ||
| "! gcloud storage buckets create --location=$REGION $BUCKET_URI" ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
|
|
@@ -504,8 +503,7 @@ | |
| }, | ||
| "outputs": [], | ||
| "source": [ | ||
| "! gsutil ls -al $BUCKET_URI" | ||
| ] | ||
| "! gcloud storage ls --all-versions --long $BUCKET_URI" ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
|
|
@@ -717,8 +715,7 @@ | |
| "with open(\"instance.yaml\", \"w\") as f:\n", | ||
| " f.write(yaml)\n", | ||
| "\n", | ||
| "! gsutil cp instance.yaml {BUCKET_URI}/instance.yaml" | ||
| ] | ||
| "! gcloud storage cp instance.yaml {BUCKET_URI}/instance.yaml" ] | ||
| }, | ||
| { | ||
| "cell_type": "markdown", | ||
|
|
@@ -1274,30 +1271,25 @@ | |
| " + \"/evaluation_metrics\"\n", | ||
| " )\n", | ||
| " if tf.io.gfile.exists(EXECUTE_OUTPUT):\n", | ||
| " ! gsutil cat $EXECUTE_OUTPUT\n", | ||
| " return EXECUTE_OUTPUT\n", | ||
| " ! gcloud storage cat $EXECUTE_OUTPUT\n", " return EXECUTE_OUTPUT\n", | ||
|
Contributor
This command is using … Consider adding a newline character (… |
||
| " elif tf.io.gfile.exists(GCP_RESOURCES):\n", | ||
| " ! gsutil cat $GCP_RESOURCES\n", | ||
| " return GCP_RESOURCES\n", | ||
| " ! gcloud storage cat $GCP_RESOURCES\n", " return GCP_RESOURCES\n", | ||
| " elif tf.io.gfile.exists(EVAL_METRICS):\n", | ||
| " ! gsutil cat $EVAL_METRICS\n", | ||
| " return EVAL_METRICS\n", | ||
| " ! gcloud storage cat $EVAL_METRICS\n", " return EVAL_METRICS\n", | ||
| "\n", | ||
| " return None\n", | ||
| "\n", | ||
| "\n", | ||
| "print(\"get-vertex-model\")\n", | ||
| "artifacts = print_pipeline_output(pipeline, \"get-vertex-model\")\n", | ||
| "print(\"\\n\\n\")\n", | ||
| "output = !gsutil cat $artifacts\n", | ||
| "output = json.loads(output[0])\n", | ||
| "output = !gcloud storage cat $artifacts\n", "output = json.loads(output[0])\n", | ||
| "model_id = output[\"artifacts\"][\"model\"][\"artifacts\"][0][\"metadata\"][\"resourceName\"]\n", | ||
| "print(\"\\n\\n\")\n", | ||
| "print(\"endpoint-create\")\n", | ||
| "artifacts = print_pipeline_output(pipeline, \"endpoint-create\")\n", | ||
| "print(\"\\n\\n\")\n", | ||
| "output = !gsutil cat $artifacts\n", | ||
| "output = json.loads(output[0])\n", | ||
| "output = !gcloud storage cat $artifacts\n", "output = json.loads(output[0])\n", | ||
| "endpoint_id = output[\"artifacts\"][\"endpoint\"][\"artifacts\"][0][\"metadata\"][\n", | ||
| " \"resourceName\"\n", | ||
| "]\n", | ||
|
|
@@ -1404,8 +1396,7 @@ | |
| "delete_bucket = True\n", | ||
| "\n", | ||
| "if delete_bucket or os.getenv(\"IS_TESTING\"):\n", | ||
| " ! gsutil rm -rf {BUCKET_URI}" | ||
| ] | ||
| " ! gcloud storage rm --recursive --continue-on-error {BUCKET_URI}" ] | ||
| } | ||
| ], | ||
| "metadata": { | ||
|
|
||
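The substitutions in the diffs up to this point follow a regular pattern, which can be sketched as a small translation table. This is only an illustration of the mappings visible in this PR (`mb -l`, `ls -al`, `cat`, `cp`, `rm -r`, `rm -rf`); the names `SUBCOMMANDS`, `BOOLEAN_FLAGS`, and `translate` are hypothetical, and the helper is not a general gsutil migration tool.

```python
# Hypothetical helper: rewrites only the gsutil forms that appear in this PR.
SUBCOMMANDS = {
    "mb": ["buckets", "create"],
    "ls": ["ls"],
    "cat": ["cat"],
    "cp": ["cp"],
    "rm": ["rm"],
}

# Single-letter gsutil flags and the long gcloud storage flags used in the diff.
BOOLEAN_FLAGS = {
    ("ls", "a"): "--all-versions",
    ("ls", "l"): "--long",
    ("rm", "r"): "--recursive",
    ("rm", "f"): "--continue-on-error",
}


def translate(command: str) -> str:
    """Translate one gsutil command line into its gcloud storage form."""
    tokens = command.split()  # notebook variables such as $REGION stay untouched
    if len(tokens) < 2 or tokens[0] != "gsutil":
        raise ValueError(f"not a gsutil command: {command!r}")
    sub = tokens[1]
    if sub not in SUBCOMMANDS:
        raise ValueError(f"unsupported gsutil subcommand: {sub!r}")

    out = ["gcloud", "storage", *SUBCOMMANDS[sub]]
    args = tokens[2:]
    i = 0
    while i < len(args):
        arg = args[i]
        if sub == "mb" and arg == "-l":
            # `-l` takes a value: the bucket location.
            if i + 1 >= len(args):
                raise ValueError("-l requires a location")
            out.append(f"--location={args[i + 1]}")
            i += 2
            continue
        if arg.startswith("-") and not arg.startswith("--"):
            # Expand combined short flags such as -al or -rf.
            for ch in arg[1:]:
                flag = BOOLEAN_FLAGS.get((sub, ch))
                if flag is None:
                    raise ValueError(f"unsupported flag -{ch} for {sub}")
                out.append(flag)
        else:
            out.append(arg)
        i += 1
    return " ".join(out)


print(translate("gsutil ls -al $BUCKET_URI"))
```

Anything outside this table (for example `iam ch` or `ls -Lb`, which appear in later files) raises `ValueError` rather than guessing at an equivalent.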
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -152,11 +152,9 @@ | |
| "if BUCKET_URI is None or BUCKET_URI.strip() == \"\" or BUCKET_URI == \"gs://\":\n", | ||
| " BUCKET_URI = f\"gs://{PROJECT_ID}-tmp-{now}-{str(uuid.uuid4())[:4]}\"\n", | ||
| " BUCKET_NAME = \"/\".join(BUCKET_URI.split(\"/\")[:3])\n", | ||
| " ! gsutil mb -l {REGION} {BUCKET_URI}\n", | ||
| "else:\n", | ||
| " ! gcloud storage buckets create --location={REGION} {BUCKET_URI}\n", "else:\n", | ||
|
||
| " assert BUCKET_URI.startswith(\"gs://\"), \"BUCKET_URI must start with `gs://`.\"\n", | ||
| " shell_output = ! gsutil ls -Lb {BUCKET_NAME} | grep \"Location constraint:\" | sed \"s/Location constraint://\"\n", | ||
| " bucket_region = shell_output[0].strip().lower()\n", | ||
| " shell_output = ! gcloud storage ls --full --buckets {BUCKET_NAME} | grep \"Location constraint:\" | sed \"s/Location constraint://\"\n", " bucket_region = shell_output[0].strip().lower()\n", | ||
| " if bucket_region != REGION:\n", | ||
| " raise ValueError(\n", | ||
| " \"Bucket region %s is different from notebook region %s\"\n", | ||
|
|
@@ -180,8 +178,8 @@ | |
| "\n", | ||
| "\n", | ||
| "# Provision permissions to the SERVICE_ACCOUNT with the GCS bucket\n", | ||
| "! gsutil iam ch serviceAccount:{SERVICE_ACCOUNT}:roles/storage.admin $BUCKET_NAME\n", | ||
| "\n", | ||
| "# Note: Migrating scripts using gsutil iam ch is more complex than get or set. You need to replace the single iam ch command with a series of gcloud storage bucket add-iam-policy-binding and/or gcloud storage bucket remove-iam-policy-binding commands, or replicate the read-modify-write loop.\n", | ||
| "! gcloud storage buckets add-iam-policy-binding $BUCKET_NAME --member=serviceAccount:{SERVICE_ACCOUNT} --role=roles/storage.admin\n", "\n", | ||
| "! gcloud config set project $PROJECT_ID\n", | ||
| "! gcloud projects add-iam-policy-binding --no-user-output-enabled {PROJECT_ID} --member=serviceAccount:{SERVICE_ACCOUNT} --role=\"roles/storage.admin\"\n", | ||
| "! gcloud projects add-iam-policy-binding --no-user-output-enabled {PROJECT_ID} --member=serviceAccount:{SERVICE_ACCOUNT} --role=\"roles/aiplatform.user\"\n", | ||
|
|
@@ -229,8 +227,7 @@ | |
| "\n", | ||
| " ! mkdir -p ./gemma\n", | ||
| " ! curl -X GET \"{signed_url}\" | tar -xzvf - -C ./gemma/\n", | ||
| " ! gsutil -m cp -R ./gemma/* {MODEL_BUCKET}\n", | ||
| "\n", | ||
| " ! gcloud storage cp --recursive ./gemma/* {MODEL_BUCKET}\n", "\n", | ||
| " model_path_prefix = MODEL_BUCKET\n", | ||
| " HF_TOKEN = \"\"\n", | ||
| "else:\n", | ||
|
|
@@ -1007,8 +1004,7 @@ | |
| "\n", | ||
| "delete_bucket = False # @param {type:\"boolean\"}\n", | ||
| "if delete_bucket:\n", | ||
| " ! gsutil -m rm -r $BUCKET_NAME" | ||
| ] | ||
| " ! gcloud storage rm --recursive $BUCKET_NAME" ] | ||
| } | ||
| ], | ||
| "metadata": { | ||
|
|
||
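The `gsutil iam ch` replacement in the diff above splits the single `member:role` argument into separate `--member` and `--role` flags. A minimal sketch of that split, assuming the binding has the same `serviceAccount:EMAIL:roles/...` shape as in the notebook; the function name is hypothetical, and only the grant case is covered (the PR's note indicates that removals would need `remove-iam-policy-binding` instead).

```python
from typing import List


def iam_ch_to_add_binding(binding: str, bucket: str) -> List[str]:
    """Build the add-iam-policy-binding argument list for one gsutil iam ch binding.

    `binding` is expected to look like 'serviceAccount:EMAIL:roles/storage.admin'.
    The role is everything after the last colon; the member is everything before it.
    """
    member, sep, role = binding.rpartition(":")
    if not sep or not member or not role:
        raise ValueError(f"expected MEMBER:ROLE, got {binding!r}")
    return [
        "gcloud", "storage", "buckets", "add-iam-policy-binding", bucket,
        f"--member={member}",
        f"--role={role}",
    ]


# Example values are placeholders, not taken from the PR.
args = iam_ch_to_add_binding(
    "serviceAccount:sa@example-project.iam.gserviceaccount.com:roles/storage.admin",
    "gs://example-bucket",
)
print(" ".join(args))
```

Note that the command actually used in the diff is under the `buckets` group, as in this sketch.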
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -177,11 +177,10 @@ | |
| "if BUCKET_URI is None or BUCKET_URI.strip() == \"\" or BUCKET_URI == \"gs://\":\n", | ||
| " BUCKET_URI = f\"gs://{PROJECT_ID}-tmp-{now}-{str(uuid.uuid4())[:4]}\"\n", | ||
| " BUCKET_NAME = \"/\".join(BUCKET_URI.split(\"/\")[:3])\n", | ||
| " ! gsutil mb -l {REGION} {BUCKET_URI}\n", | ||
| "else:\n", | ||
| " ! gcloud storage buckets create --location={REGION} {BUCKET_URI}\n", "else:\n", | ||
|
||
| " assert BUCKET_URI.startswith(\"gs://\"), \"BUCKET_URI must start with `gs://`.\"\n", | ||
| " shell_output = ! gsutil ls -Lb {BUCKET_NAME} | grep \"Location constraint:\" | sed \"s/Location constraint://\"\n", | ||
| " bucket_region = shell_output[0].strip().lower()\n", | ||
| " # Note: The format of the full listing output is different. gcloud storage uses a title case for keys and will not display a field if its value is \"None\".\n", | ||
| " shell_output = ! gcloud storage ls --full --buckets {BUCKET_NAME} | grep \"Location constraint:\" | sed \"s/Location constraint://\"\n", " bucket_region = shell_output[0].strip().lower()\n", | ||
| " if bucket_region != REGION:\n", | ||
| " raise ValueError(\n", | ||
| " \"Bucket region %s is different from notebook region %s\"\n", | ||
|
|
@@ -202,8 +201,7 @@ | |
| "\n", | ||
| "\n", | ||
| "# Provision permissions to the SERVICE_ACCOUNT with the GCS bucket\n", | ||
| "! gsutil iam ch serviceAccount:{SERVICE_ACCOUNT}:roles/storage.admin $BUCKET_NAME\n", | ||
| "\n", | ||
| "! gcloud storage buckets add-iam-policy-binding $BUCKET_NAME --member=serviceAccount:{SERVICE_ACCOUNT} --role=roles/storage.admin\n", "\n", | ||
| "! gcloud config set project $PROJECT_ID\n", | ||
| "! gcloud projects add-iam-policy-binding --no-user-output-enabled {PROJECT_ID} --member=serviceAccount:{SERVICE_ACCOUNT} --role=\"roles/storage.admin\"\n", | ||
| "! gcloud projects add-iam-policy-binding --no-user-output-enabled {PROJECT_ID} --member=serviceAccount:{SERVICE_ACCOUNT} --role=\"roles/aiplatform.user\"\n", | ||
|
|
||
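The note in the diff above says that `gcloud storage` prints listing keys in title case, which suggests the retained `grep "Location constraint:"` filter may no longer match. One possible alternative, sketched here under assumptions, is to read the bucket location from structured output instead of grepping text. It assumes that `gcloud storage buckets describe BUCKET --format=json` returns an object with a `location` field; the function names are hypothetical and this is not what the PR does.

```python
import json
import subprocess


def parse_bucket_location(describe_json: str) -> str:
    """Extract the lowercased location from `buckets describe` JSON output.

    Assumes the JSON object has a top-level "location" key.
    """
    data = json.loads(describe_json)
    return str(data["location"]).strip().lower()


def bucket_location(bucket_uri: str) -> str:
    """Run gcloud and return the bucket's location (requires gcloud on PATH)."""
    result = subprocess.run(
        ["gcloud", "storage", "buckets", "describe", bucket_uri, "--format=json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_bucket_location(result.stdout)


def check_region(bucket_region: str, region: str) -> None:
    """Mirror the notebook's check: fail if the bucket is in a different region."""
    if bucket_region != region.lower():
        raise ValueError(
            "Bucket region %s is different from notebook region %s"
            % (bucket_region, region)
        )


# Offline example with a hand-written JSON payload (not real gcloud output).
sample = '{"name": "example-bucket", "location": "US-CENTRAL1"}'
check_region(parse_bucket_location(sample), "us-central1")
```

In the notebook, `bucket_location(BUCKET_NAME)` would replace the `grep`/`sed` pipeline, so the check no longer depends on how the human-readable listing is capitalised.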
The `gcloud storage buckets create` command requires the bucket name to be specified as a URI (e.g., `gs://my-bucket`), so `$BUCKET_NAME` must include the `gs://` prefix; without it, the command will fail. Consider using `gs://$BUCKET_NAME` to ensure the bucket name is correctly formatted.
gsutil mb -l $REGION $BUCKET_NAME
Creating gs://vijay-bucket11/...
gcloud storage buckets create --location=$REGION $BUCKET_NAME
Creating gs://vijay-bucket9999/...
The command is correct.
gcloud storage buckets create --location=$REGION gs://$BUCKET_NAME
ERROR: (gcloud.storage.buckets.create) "gcloud storage buckets create" only accepts bucket URLs.
Hence Gemini's suggestion is incorrect.
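Both outcomes in this thread come down to whether the variable already carries the `gs://` prefix: prepending it a second time produces something that is not a bucket URL. A small sketch of a guard that handles either case; `to_bucket_uri` is a hypothetical helper, not part of the notebooks or of gcloud.

```python
def to_bucket_uri(name: str) -> str:
    """Return a gs:// bucket URI, adding the prefix only when it is missing."""
    name = name.strip()
    if not name:
        raise ValueError("bucket name is empty")
    if name.startswith("gs://"):
        return name  # already a URI; do not prepend a second prefix
    return "gs://" + name


print(to_bucket_uri("my-bucket232"))
print(to_bucket_uri("gs://vijay-bucket9999"))
```

With a guard like this, the notebook cell could pass the normalised value to `gcloud storage buckets create` regardless of how `BUCKET_NAME` was set.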