Warm memory tier.
Request unit (RU) spikes typically come from cross-partition queries, hot partitions, or large item reads and writes. This playbook focuses on partition-key usage and on handling throttling.
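To make the hot-partition cause concrete, here is a minimal sketch that flags partition keys consuming a disproportionate share of RU. It assumes you already log a `(partition_key, ru_charge)` pair per request; that input format, the function name `find_hot_partitions`, and the 50% threshold are illustrative assumptions, not part of any SDK.

```python
from collections import defaultdict


def find_hot_partitions(samples, share_threshold=0.5):
    """Return {partition_key: share} for keys at or above the RU share threshold.

    `samples` is an iterable of (partition_key, ru_charge) pairs
    (hypothetical format, e.g. taken from per-request charge logs).
    """
    totals = defaultdict(float)
    for partition_key, ru_charge in samples:
        totals[partition_key] += ru_charge

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # no RU recorded, nothing to flag

    return {
        key: total / grand_total
        for key, total in totals.items()
        if total / grand_total >= share_threshold
    }


# Illustrative sample data only.
samples = [("user-1", 40.0), ("user-2", 5.0), ("user-1", 50.0), ("user-3", 5.0)]
hot = find_hot_partitions(samples)
```

A key that dominates the result is a candidate for caching or for a partition-key redesign; the right threshold depends on how many logical partitions the container has.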
- HTTP 429 (Too Many Requests) throttling responses
- Sustained RU usage at or above the provisioned limit
- Check RU metrics by container.
- Identify hot partitions.
- Review query patterns for cross-partition scans.
- Scale RU or enable autoscale.
- Add partition key filters.
- Cache frequently read data in the hot tier.
- Enforce partition-keyed queries.
- Batch writes and reduce fan-out.
- Verify all queries include partition key.
- Add retry/backoff for 429 responses.
- Cache hot reads in Redis to reduce RU.
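The "batch writes and reduce fan-out" item above can be sketched as grouping pending writes by partition key first, so that each batch targets a single logical partition. The helper name `group_by_partition_key`, the `user_id` key field, and the default batch size of 100 are assumptions for illustration; check the current service limits before relying on a specific size.

```python
from collections import defaultdict


def group_by_partition_key(items, key_field="user_id", max_batch_size=100):
    """Yield (partition_key, batch) pairs with at most max_batch_size items each.

    `key_field` names the item property used as the partition key
    (assumed to be "user_id" here).
    """
    groups = defaultdict(list)
    for item in items:
        groups[item[key_field]].append(item)

    # Split each partition's items into fixed-size chunks.
    for partition_key, group in groups.items():
        for start in range(0, len(group), max_batch_size):
            yield partition_key, group[start:start + max_batch_size]
```

Each yielded batch can then be submitted with whatever bulk or batch write call your SDK version provides, instead of issuing one request per item across many partitions.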
```python
async def read_profile(container, user_id: str):
    # Point read by id and partition key (both are user_id here),
    # which avoids a cross-partition query.
    return await container.read_item(item=user_id, partition_key=user_id)
```

```python
import asyncio

from azure.cosmos.exceptions import CosmosHttpResponseError


async def read_with_retry(fn, retries=3):
    # Retry only throttled (429) calls; re-raise other errors
    # and the last failed attempt.
    for attempt in range(retries):
        try:
            return await fn()
        except CosmosHttpResponseError as exc:
            if exc.status_code != 429 or attempt == retries - 1:
                raise
            # Exponential backoff: 0.5 s, 1 s, 2 s, ...
            await asyncio.sleep(0.5 * (2 ** attempt))
```

```mermaid
flowchart TD
    A[RU spike / 429s] --> B[Check partition key usage]
    B --> C{Cross-partition?}
    C -->|Yes| D[Fix query to include key]
    C -->|No| E[Check hot partitions]
    E --> F[Scale RU or autoscale]
    F --> G[Cache hot reads in Redis]
```
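The final step, caching hot reads, is commonly done with a cache-aside pattern. The sketch below is one possible shape, not this playbook's prescribed implementation: `TTLCache` is a hypothetical in-memory stand-in for Redis so the example runs on its own, and `cached_read` is an illustrative helper name.

```python
import asyncio
import time


class TTLCache:
    """In-memory stand-in for Redis (hypothetical; swap in a real client)."""

    def __init__(self, ttl_seconds=60.0):
        self._ttl = ttl_seconds
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (time.monotonic() + self._ttl, value)


async def cached_read(cache, key, load):
    """Cache-aside: return the cached value, else await load() and store it.

    A result of None is not cached, so missing items are reloaded each time.
    """
    value = cache.get(key)
    if value is None:
        value = await load()  # only a cache miss spends RU
        cache.set(key, value)
    return value
```

With the `read_profile` helper, a call could look like `await cached_read(cache, f"profile:{user_id}", lambda: read_profile(container, user_id))`. A short TTL bounds how stale a cached profile can get; choose it based on how quickly the underlying data changes.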
If throttling persists, engage the data platform owner.