feat: Samples showing Gemini generating code and minor changes to get the code working #1095
Open
sundar-mudupalli-work wants to merge 16 commits into main from
Conversation
Create a PySpark script to transform data in GCS from parquet to avro and use the add_insertion_time_column function in @data_transformer.py to add an additional column.

Code generated by Gemini for GCS to GCS. Successfully ran the serverless Spark job using the following command:

gcloud dataproc batches submit pyspark transform_parquet_to_avro.py --batch="parquet-to-avro-$(date +%s)" \
  --jars=file:///usr/lib/spark/connector/spark-avro.jar --py-files=./data_transformer.py \
  --deps-bucket=gs://dataproc-templates-python-deps \
  -- --input=gs://dataproc-templates_cloudbuild/gemini-codegen/transform_parquet_to_avro/input/parquet-table \
  --output=gs://dataproc-templates_cloudbuild/gemini-codegen/transform_parquet_to_avro/output/avro_table

Confirmed the output by creating external tables in BigQuery over the parquet and avro files.
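The generated script itself is not reproduced in this thread. The following is a minimal sketch of what transform_parquet_to_avro.py plausibly looks like, assuming add_insertion_time_column(df) simply appends a current-timestamp column; only the helper's name and the --input/--output arguments come from the command above, everything else is an assumption:

```python
import argparse

from pyspark.sql import SparkSession

# Supplied at submit time via --py-files=./data_transformer.py. Assumed to be
# roughly:
#   def add_insertion_time_column(df):
#       return df.withColumn("insertion_time", F.current_timestamp())
from data_transformer import add_insertion_time_column


def main() -> None:
    parser = argparse.ArgumentParser(description="Transform a Parquet table in GCS to Avro.")
    parser.add_argument("--input", required=True, help="GCS path of the Parquet input table")
    parser.add_argument("--output", required=True, help="GCS path for the Avro output")
    args = parser.parse_args()

    spark = SparkSession.builder.appName("parquet-to-avro").getOrCreate()

    # Read Parquet, append the insertion_time column, and write Avro.
    # The "avro" format is provided by the spark-avro jar passed with --jars.
    df = spark.read.parquet(args.input)
    df = add_insertion_time_column(df)
    df.write.format("avro").mode("overwrite").save(args.output)

    spark.stop()


if __name__ == "__main__":
    main()
```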
… an insertion_time column using the add_insertion_time_column function in @data_transformer.py. Save this table to BigQuery, providing detailed instructions to run this script against a dataproc cluster. Save a summary of this session to hive_to_BQReadme.md.

Gemini generated the Hive to BQ transformation script.

Comments: spark-bigquery comes preinstalled on Dataproc clusters version 2.1 and higher, so no jars need to be provided. The following command worked:

gcloud dataproc jobs submit pyspark gs://dataproc-templates_cloudbuild/gemini-codegen/transform_hive_to_bq/src/transform_hive_to_bigquery.py \
  --cluster=mixer-test2 --py-files=gs://dataproc-templates_cloudbuild/gemini-codegen/transform_hive_to_bq/src/data_transformer.py \
  --properties=spark.hadoop.hive.metastore.uris=thrift://10.115.64.27:9083 \
  -- --hive_database=test_db --hive_table=employees --bq_table=gemini_codegen.py_hive_to_bq \
  --bq_temp_gcs_bucket=gs://dataproc-templates-python-deps
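As with the first script, the committed transform_hive_to_bigquery.py is not shown in this thread. The sketch below illustrates the approach under the same assumption about add_insertion_time_column, reading the Hive table through the metastore URI passed with --properties and writing through the preinstalled spark-bigquery connector; the argument names come from the command above, the rest is assumed:

```python
import argparse

from pyspark.sql import SparkSession

from data_transformer import add_insertion_time_column  # supplied via --py-files


def main() -> None:
    parser = argparse.ArgumentParser(description="Copy a Hive table to BigQuery.")
    parser.add_argument("--hive_database", required=True)
    parser.add_argument("--hive_table", required=True)
    parser.add_argument("--bq_table", required=True, help="Target table as dataset.table")
    parser.add_argument("--bq_temp_gcs_bucket", required=True,
                        help="GCS bucket used by the BigQuery connector for staging")
    args = parser.parse_args()

    # Hive support picks up the metastore from spark.hadoop.hive.metastore.uris,
    # which the gcloud command sets via --properties.
    spark = (
        SparkSession.builder.appName("hive-to-bigquery")
        .enableHiveSupport()
        .getOrCreate()
    )

    df = spark.table(f"{args.hive_database}.{args.hive_table}")
    df = add_insertion_time_column(df)

    # The spark-bigquery connector expects a bare bucket name, so the gs://
    # prefix from the command line is stripped here (an assumption of this sketch).
    (
        df.write.format("bigquery")
        .option("table", args.bq_table)
        .option("temporaryGcsBucket", args.bq_temp_gcs_bucket.replace("gs://", ""))
        .mode("overwrite")
        .save()
    )

    spark.stop()


if __name__ == "__main__":
    main()
```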
…r to list classes - not fully working
/gcbrun
Hi,
This pull request shows how to provide prompts to Gemini to generate code in Python and Java.
The code generated by Gemini was committed as is. Some minor changes were needed to get the code working; those changes were made and committed as well.