
feat: Samples showing Gemini generating code and minor changes to get the code working #1095

Open

sundar-mudupalli-work wants to merge 16 commits into main from gemini-codegen

Conversation

@sundar-mudupalli-work (Collaborator)

Hi,

This pull request shows how to provide prompts to Gemini to generate code in Python and Java:

  1. Hive to BigQuery (Python)
  2. GCS to GCS (Python)
  3. JDBC to JDBC (Java)
  4. Delta Lake to Iceberg (Java)

The code generated by Gemini was committed as-is. Some minor changes were needed to get the code working; those changes were made and committed as well.

Create a PySpark script to transform data in GCS from parquet to avro and use the
  add_insertion_time_column function in @data_transformer.py to add an additional column
Code generated by Gemini for GCS to GCS.
Successfully ran the serverless Spark job using the following command:
gcloud dataproc batches submit pyspark transform_parquet_to_avro.py --batch="parquet-to-avro-$(date +%s)" \
--jars=file:///usr/lib/spark/connector/spark-avro.jar --py-files=./data_transformer.py \
--deps-bucket=gs://dataproc-templates-python-deps \
-- --input=gs://dataproc-templates_cloudbuild/gemini-codegen/transform_parquet_to_avro/input/parquet-table  \
--output=gs://dataproc-templates_cloudbuild/gemini-codegen/transform_parquet_to_avro/output/avro_table
Confirmed by using BigQuery, creating parquet and avro files as external tables.
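The generated transform_parquet_to_avro.py itself is not reproduced inline above, but the command line pins down its shape. Below is a minimal sketch of such a script, assuming data_transformer.py exposes add_insertion_time_column(df) as named in the prompt; the argument names --input and --output come from the command above, and everything else is illustrative, not the committed code:

```python
# Sketch of a GCS-to-GCS Parquet-to-Avro transform (illustrative, not the PR's code).
import argparse

from pyspark.sql import SparkSession
from data_transformer import add_insertion_time_column  # helper shipped via --py-files


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", required=True, help="GCS path of the Parquet table")
    parser.add_argument("--output", required=True, help="GCS path for the Avro output")
    args = parser.parse_args()

    spark = SparkSession.builder.appName("parquet-to-avro").getOrCreate()

    df = spark.read.parquet(args.input)
    df = add_insertion_time_column(df)  # adds the extra insertion-time column
    # The "avro" format needs the spark-avro connector, supplied above via
    # --jars=file:///usr/lib/spark/connector/spark-avro.jar on Serverless Spark.
    df.write.format("avro").mode("overwrite").save(args.output)

    spark.stop()


if __name__ == "__main__":
    main()
```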
… an insertion_time column
using the add_insertion_time_column function in @data_transformer.py. Save this table to BigQuery,
providing detailed instructions to run this script against a Dataproc cluster.
Save a summary of this session to hive_to_BQReadme.md
Gemini-generated Hive to BQ transformation script.
Comments: the spark-bigquery connector comes preinstalled on Dataproc clusters at image version 2.1 and higher, so no jars need to be provided. The following command worked:
gcloud dataproc jobs submit pyspark gs://dataproc-templates_cloudbuild/gemini-codegen/transform_hive_to_bq/src/transform_hive_to_bigquery.py \
--cluster=mixer-test2 --py-files=gs://dataproc-templates_cloudbuild/gemini-codegen/transform_hive_to_bq/src/data_transformer.py \
--properties=spark.hadoop.hive.metastore.uris=thrift://10.115.64.27:9083 \
-- --hive_database=test_db --hive_table=employees --bq_table=gemini_codegen.py_hive_to_bq \
--bq_temp_gcs_bucket=gs://dataproc-templates-python-deps
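The flags in that command imply the overall shape of the generated transform_hive_to_bigquery.py. A minimal sketch under the same assumptions as above (add_insertion_time_column comes from data_transformer.py; all other details are illustrative rather than the committed code):

```python
# Sketch of a Hive-to-BigQuery transform (illustrative, not the PR's code).
import argparse

from pyspark.sql import SparkSession
from data_transformer import add_insertion_time_column  # helper shipped via --py-files


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--hive_database", required=True)
    parser.add_argument("--hive_table", required=True)
    parser.add_argument("--bq_table", required=True, help="dataset.table in BigQuery")
    parser.add_argument("--bq_temp_gcs_bucket", required=True)
    args = parser.parse_args()

    # Hive support lets Spark resolve tables through the metastore configured
    # via spark.hadoop.hive.metastore.uris on the submit command.
    spark = (
        SparkSession.builder
        .appName("hive-to-bq")
        .enableHiveSupport()
        .getOrCreate()
    )

    df = spark.table(f"{args.hive_database}.{args.hive_table}")
    df = add_insertion_time_column(df)

    # The spark-bigquery connector stages data in GCS before loading;
    # temporaryGcsBucket expects a bucket name without the gs:// scheme.
    (
        df.write.format("bigquery")
        .option("table", args.bq_table)
        .option("temporaryGcsBucket", args.bq_temp_gcs_bucket.removeprefix("gs://"))
        .mode("overwrite")
        .save()
    )

    spark.stop()


if __name__ == "__main__":
    main()
```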
@sundar-mudupalli-work (Collaborator, Author)

/gcbrun
