Pinned

  1. cloud-dataproc Public

    Cloud Dataproc: Samples and Utils

    Jupyter Notebook 206 131

Repositories

Showing 10 of 20 repositories
  • spark-bigquery-connector Public

    BigQuery data source for Apache Spark: Read data from BigQuery into DataFrames, write DataFrames into BigQuery tables.

    Java 420 Apache-2.0 220 39 12 Updated Jan 27, 2026
  • dataproc-spark-connect-python
    Python 7 Apache-2.0 12 2 5 Updated Jan 27, 2026
  • hadoop-connectors Public

    Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.

    Java 290 Apache-2.0 260 49 51 Updated Jan 27, 2026
  • flink-bigquery-connector Public

    BigQuery connector for Apache Flink

    Java 37 Apache-2.0 27 10 3 Updated Jan 26, 2026
  • gcs-jupyter-plugin
    Python 0 Apache-2.0 3 0 6 Updated Jan 23, 2026
  • initialization-actions Public

    Scripts that run on all nodes of your cluster before the cluster starts, letting you customize your cluster

    Shell 599 Apache-2.0 515 75 45 Updated Jan 23, 2026
  • dataproc-jupyter-plugin
    TypeScript 9 Apache-2.0 16 7 21 Updated Jan 22, 2026
  • spark-spanner-connector Public

    Cloud Spanner Connector for Apache Spark

    Java 17 Apache-2.0 20 2 4 Updated Jan 21, 2026
  • cloud-dataproc Public

    Cloud Dataproc: Samples and Utils

    Jupyter Notebook 206 Apache-2.0 131 2 5 Updated Jan 8, 2026
  • dataproc-ml-python Public

    Library to simplify running distributed ML workloads with Apache Spark

    Python 7 Apache-2.0 2 0 0 Updated Dec 18, 2025