Add throughput bucket samples for Cosmos Spark connector #48733

Closed
xinlian12 wants to merge 44 commits into Azure:main from xinlian12:addSampleForThroughputBucketInSpark

@xinlian12 xinlian12 commented Apr 8, 2026

Member

Summary

Add Python and Scala sample notebooks demonstrating server-side throughput bucket configuration for the Cosmos Spark connector (azure-cosmos-spark_3).

Changes

  • Samples/Python/NYC-Taxi-Data/04_ThroughputBucket.ipynb — PySpark notebook
  • Samples/Scala/NYC-Taxi-Data/04_ThroughputBucket.scala — Scala Databricks notebook

Both samples are modeled after the existing 01_Batch samples but replace the SDK-side global throughput control with the simpler server-side throughputBucket configuration:

| Config key | Description |
| --- | --- |
| spark.cosmos.throughputControl.enabled | "true" |
| spark.cosmos.throughputControl.name | Group name |
| spark.cosmos.throughputControl.throughputBucket | Integer between 1 and 5 |
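
As a rough illustration of the three keys above, the following sketch builds the option map in plain Python and checks the documented 1 to 5 range before anything reaches Spark. The helper name `throughput_bucket_options` and the group name `NYCTaxiIngestion` are hypothetical and are not taken from the sample notebooks.

```python
from typing import Dict

# Config keys as listed in the table above.
TC_ENABLED = "spark.cosmos.throughputControl.enabled"
TC_NAME = "spark.cosmos.throughputControl.name"
TC_BUCKET = "spark.cosmos.throughputControl.throughputBucket"


def throughput_bucket_options(group_name: str, bucket: int) -> Dict[str, str]:
    """Return the three throughput-control options as string values.

    Hypothetical helper for illustration; the notebooks may set these
    keys inline instead.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(bucket, bool) or not isinstance(bucket, int):
        raise ValueError(f"throughputBucket must be an integer, got {bucket!r}")
    if not 1 <= bucket <= 5:
        raise ValueError(f"throughputBucket must be between 1 and 5, got {bucket}")
    if not group_name:
        raise ValueError("throughput control group name must be non-empty")
    return {
        TC_ENABLED: "true",
        TC_NAME: group_name,
        TC_BUCKET: str(bucket),
    }


# Hypothetical group name, for illustration only.
ingestion_options = throughput_bucket_options("NYCTaxiIngestion", 5)
```

In a notebook, a map like this would presumably be merged with the account, database, and container settings and passed to the connector through `.options(**cfg)` on a `cosmos.oltp` reader or writer.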

Key differences from 01_Batch

  • Removed the ThroughputControl metadata container creation (not needed for server-side buckets)
  • Removed separate throughput control account/catalog configuration
  • Replaced targetThroughputThreshold, globalControl.database, globalControl.container with throughputBucket
  • Ingestion uses bucket 5; delete uses bucket 1, to show different workloads assigned to different buckets (the bucket number, 1-5, does not itself indicate a priority level)
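
The key replacement described in the list above can be sketched as a small config transformation. This is an illustration only: the helper `to_bucket_config`, the database and container names, and the threshold value are placeholder assumptions, and the full key names assume the `spark.cosmos.throughputControl.` prefix used by the other keys in this PR.

```python
from typing import Dict

PREFIX = "spark.cosmos.throughputControl."

# SDK-side global throughput control options that the bucket config replaces.
REPLACED_KEYS = (
    PREFIX + "targetThroughputThreshold",
    PREFIX + "globalControl.database",
    PREFIX + "globalControl.container",
)


def to_bucket_config(cfg: Dict[str, str], bucket: int) -> Dict[str, str]:
    """Drop the global-control keys and set a server-side throughput bucket.

    Hypothetical helper; it returns a new dict and leaves `cfg` unchanged.
    """
    if not 1 <= bucket <= 5:
        raise ValueError(f"throughputBucket must be between 1 and 5, got {bucket}")
    out = {k: v for k, v in cfg.items() if k not in REPLACED_KEYS}
    out[PREFIX + "enabled"] = "true"
    out[PREFIX + "throughputBucket"] = str(bucket)
    return out


# Placeholder 01_Batch-style config; all values are illustrative assumptions.
batch_cfg = {
    "spark.cosmos.database": "SampleDatabase",
    "spark.cosmos.container": "GreenTaxiRecords",
    PREFIX + "enabled": "true",
    PREFIX + "name": "NYCGreenTaxiDataIngestion",
    PREFIX + "targetThroughputThreshold": "0.95",
    PREFIX + "globalControl.database": "SampleDatabase",
    PREFIX + "globalControl.container": "ThroughputControl",
}

# Bucket assignments follow the samples: ingestion on 5, delete on 1.
ingest_cfg = to_bucket_config(batch_cfg, 5)
delete_cfg = to_bucket_config(batch_cfg, 1)
```

Because the two resulting configs differ only in the bucket value, the same base settings can be reused for both the ingestion and the delete steps.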

Verification

These are Databricks notebook samples and do not have associated unit tests. The structure and configuration keys were verified against the Spark connector source code (CosmosConfig.scala, ThroughputControlHelper.scala).

Annie Liang and others added 30 commits March 19, 2026 13:16
Imports the shared review pipeline from xinlian12/sdk-auto-pr-review.
Triggers on PR open/push/ready_for_review. Posts inline review comments
with severity tags and AI disclaimer.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add sdkReviewAgent agentic workflow for automated PR review
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Recompile workflow with GH_TOKEN fix for mcp-scripts
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The throughput bucket number (1-5) does not indicate priority level.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@xinlian12 xinlian12 closed this Apr 8, 2026