
Usage Guide

Command-Line Interface

The application is a command-line-first tool: all key parameters are provided via arguments for security and flexibility.

Basic Usage

# Analyze a specific database
dotnet run -- --endpoint "https://your-cosmos-account.documents.azure.com:443/" --database "YourDatabase" --output "C:\Reports"

# Analyze all databases in the account
dotnet run -- --endpoint "https://your-cosmos-account.documents.azure.com:443/" --all-databases --output "C:\Reports"

Command-Line Options

dotnet run -- --help

Available Options:

  • --endpoint, -e : Cosmos DB endpoint URL (required)
  • --database, -d : Specific database name to analyze
  • --all-databases : Analyze all databases in the account
  • --output, -o : Output directory for reports
  • --auto-discover : Auto-discover Azure Monitor settings
  • --help : Show help information
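
The documented short aliases combine the same way; this command is equivalent to the first basic-usage example above:

# Same analysis using the short option aliases
dotnet run -- -e "https://your-cosmos-account.documents.azure.com:443/" -d "YourDatabase" -o "C:\Reports"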

Report Output Structure

The application creates a timestamped folder for each analysis run:

C:\Reports\
└── CosmosDB-Analysis_2025-08-21__14-30-45\
    ├── DatabaseA-Analysis.xlsx
    ├── DatabaseB-Analysis.xlsx
    ├── DatabaseC-Analysis.xlsx
    └── Migration-Assessment.docx

File Structure:

  • Timestamped Folder: CosmosDB-Analysis_YYYY-MM-dd__HH-mm-ss
  • Excel Files: {DatabaseName}-Analysis.xlsx (one per database)
  • Word Document: Migration-Assessment.docx (combined assessment)
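
Because the timestamp in the folder name sorts lexicographically in chronological order, the latest run is easy to locate from PowerShell; a minimal sketch, assuming the C:\Reports layout shown above:

# Open the most recent analysis run (folder names sort chronologically)
$latest = Get-ChildItem "C:\Reports" -Directory -Filter "CosmosDB-Analysis_*" |
  Sort-Object Name -Descending | Select-Object -First 1
Invoke-Item $latest.FullName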

Multi-Database Analysis

Analyze All Databases

dotnet run -- --endpoint "https://your-cosmos-account.documents.azure.com:443/" --all-databases --output "C:\MultiDBAssessment"

This will:

  • Connect to your Cosmos DB account
  • Discover all databases automatically
  • Analyze each database independently
  • Generate a separate Excel report for each database
  • Create one combined Word document, using proper heading styles for navigation and accessibility
  • Produce consolidated assessment results covering all databases

Interactive Output Directory Selection

# Application will prompt for output directory
dotnet run

If no output directory is specified via command line or configuration, the application will:

  1. Prompt you to enter a directory path
  2. Suggest a default timestamped directory
  3. Create the directory if it doesn't exist
  4. Validate write permissions

Advanced Usage Examples

Enterprise Multi-Database Assessment

# Full enterprise assessment across all databases (PowerShell)
dotnet run -- `
  --all-databases `
  --output "\\shared\reports\cosmos-migration-$(Get-Date -Format 'yyyyMMdd')" `
  --auto-discover

This command will:

  • Analyze all databases in the Cosmos DB account
  • Auto-discover Azure Monitor settings
  • Save reports to a shared network location with date stamp
  • Generate comprehensive cross-database analysis

Single Database Deep Analysis

# Detailed analysis of specific database
dotnet run -- \
  --database "production-ecommerce" \
  --output "C:\Reports\EcommerceAnalysis"

Quick Development Assessment

# Fast assessment for development environments
dotnet run -- \
  --database "dev-testing" \
  --output ".\dev-reports"

Automated Assessment Pipeline

# For CI/CD or scheduled assessments (bash; output path taken from an environment variable)
dotnet run -- \
  --all-databases \
  --output "$ASSESSMENT_OUTPUT_DIR" \
  --auto-discover

Understanding the Assessment Process

Phase 1: Connection and Discovery

πŸ” Connecting to Azure services...
   βœ… Cosmos DB account: myapp-cosmos
   βœ… Database: ecommerce
   βœ… Azure Monitor workspace: connected
   βœ… Found 5 containers for analysis

What happens: The application authenticates with Azure and discovers your Cosmos DB structure.

Phase 2: Container Analysis

📊 Analyzing Cosmos DB containers...
   📦 users (15,234 documents, 2.1 GB)
   📦 orders (89,567 documents, 5.8 GB)
   📦 products (2,456 documents, 0.3 GB)
   📦 inventory (45,789 documents, 1.2 GB)
   📦 sessions (234,567 documents, 8.9 GB)

What happens: Each container is analyzed for:

  • Document count and storage size
  • Schema detection and field analysis
  • Partition key effectiveness
  • Indexing policy review

Phase 3: Performance Metrics Collection

⚡ Collecting performance metrics (6 months)...
   📈 Request unit consumption patterns
   📈 Latency percentiles (P95, P99)
   📈 Throttling events and error rates
   📈 Regional distribution analysis

What happens: Historical performance data is collected from Azure Monitor to understand usage patterns.
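
To sanity-check the same signal manually, you can query the Log Analytics workspace that backs Azure Monitor. A sketch using the Azure CLI, assuming Cosmos DB diagnostic metrics flow into the standard AzureMetrics table and that $workspaceId holds your workspace ID:

# Daily RU consumption over the analysis window (PowerShell + Azure CLI)
az monitor log-analytics query `
  --workspace $workspaceId `
  --analytics-query "AzureMetrics | where ResourceProvider == 'MICROSOFT.DOCUMENTDB' and MetricName == 'TotalRequestUnits' | summarize TotalRU = sum(Total) by bin(TimeGenerated, 1d)"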

Phase 4: SQL Migration Assessment

🎯 Generating SQL migration assessment...
   πŸ—οΈ  Platform recommendation: Azure SQL Database
   πŸ—οΈ  Service tier: Business Critical Gen5 8 vCore
   πŸ—οΈ  Estimated monthly cost: $2,847.60
   πŸ—οΈ  Migration complexity: Medium (6.2/10)

What happens: AI-driven analysis recommends optimal Azure SQL configuration based on your workload.

Phase 5: Data Factory Migration Planning

🚛 Calculating Data Factory migration estimates...
   ⏱️  Estimated duration: 14.2 hours
   💰 Estimated cost: $47.83
   🔧 Recommended DIUs: 8
   🔄 Parallel degree: 6

What happens: Migration timeline and costs are calculated based on data volume and complexity.
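
The tool's model is internal, but the general shape of such an estimate is volume ÷ throughput for duration and hours × DIUs × rate for cost. A back-of-envelope PowerShell sketch, where every constant is an illustrative assumption rather than the tool's actual parameters or current Azure pricing:

# Back-of-envelope Data Factory copy estimate (all constants illustrative)
$dataGB        = 18.3   # total volume of the five containers in the Phase 2 example
$throughputMBs = 0.3    # assumed end-to-end copy throughput in MB/s (often RU-limited)
$dius          = 8      # Data Integration Units
$ratePerDiuHr  = 0.25   # assumed price per DIU-hour for the copy activity

$hours = [math]::Round(($dataGB * 1024) / $throughputMBs / 3600, 1)
$cost  = [math]::Round($hours * $dius * $ratePerDiuHr, 2)
Write-Host "Estimated duration: $hours hours, estimated cost: `$$cost"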

Phase 6: Report Generation

📄 Generating assessment reports...
   ✅ Excel report: Reports/CosmosDB_Assessment_20250820_143022.xlsx
   ✅ Word summary: Reports/CosmosDB_Assessment_20250820_143022.docx

What happens: Professional reports are generated for technical and executive audiences.

Interpreting Results

Excel Report Structure

The Excel report contains multiple worksheets:

1. Executive Summary

  • High-level metrics: Container count, data volume, cost estimates
  • Complexity assessment: Overall migration difficulty score
  • Key recommendations: Top 3-5 actionable items

2. Container Analysis

  • Per-container details: Document counts, storage, performance
  • Schema complexity: Field types, nesting levels, arrays
  • Partition key effectiveness: Hot partitions, distribution

3. Performance Metrics

  • Historical trends: 6-month RU consumption patterns
  • Latency analysis: P95/P99 response times
  • Throttling events: When and why throttling occurred

4. SQL Migration Plan

  • Platform recommendation: SQL Database vs Managed Instance vs VM
  • Sizing details: vCores, storage, service tier
  • Cost breakdown: Compute, storage, backup costs

5. Data Factory Estimates

  • Migration timeline: Detailed time estimates per container
  • Resource requirements: DIU recommendations
  • Cost projections: Detailed cost breakdown

Word Report Overview

The Word document provides:

  • Executive summary for stakeholders
  • Key findings and recommendations
  • Next steps for migration planning
  • Risk assessment and mitigation strategies

Understanding Complexity Scores

Score   Level       Description
1-3     Low         Simple schema, minimal transformations needed
4-6     Medium      Moderate complexity, some data transformation required
7-8     High        Complex schema, significant transformation work
9-10    Very High   Extensive restructuring and custom migration logic needed

Common Scenarios

Scenario 1: E-commerce Platform Migration

Context: Online store with user profiles, product catalog, and order history

Assessment Command:

dotnet run -- --database "ecommerce" --analysis-months 12

Expected Results:

  • Users container: Low complexity (simple profile data)
  • Products container: Medium complexity (nested categories, arrays)
  • Orders container: High complexity (embedded line items, complex relationships)

Recommendations:

  • Normalize nested order data into separate tables
  • Consider Azure SQL Database Business Critical tier
  • Implement staged migration approach

Scenario 2: IoT Data Migration

Context: Time-series data from IoT sensors

Assessment Command:

dotnet run -- --containers "sensor-data,device-metadata" --analysis-months 3

Expected Results:

  • Very high data volume with simple schema
  • High RU consumption for writes
  • Time-based partitioning requirements

Recommendations:

  • Consider Azure SQL Hyperscale for large datasets
  • Implement time-based table partitioning
  • Use columnstore indexes for analytics

Scenario 3: Multi-tenant SaaS Migration

Context: SaaS application with tenant-isolated data

Assessment Command:

dotnet run -- --enable-deep-analysis --analysis-months 6

Expected Results:

  • Complex tenant isolation patterns
  • Varied data volumes per tenant
  • Different access patterns by tenant size

Recommendations:

  • Design tenant isolation strategy for SQL
  • Consider elastic pools for smaller tenants
  • Implement row-level security

Automation and CI/CD

Automated Assessments

Create a PowerShell script for regular assessments:

# scheduled-assessment.ps1
param(
    [string]$Database,
    [string]$OutputPath,
    [int]$AnalysisMonths = 1
)

$timestamp = Get-Date -Format "yyyyMMdd"
$reportDir = "$OutputPath\$Database-$timestamp"

dotnet run -- `
  --database $Database `
  --analysis-months $AnalysisMonths `
  --output $reportDir `
  --format json

# Upload reports to Azure Storage
az storage blob upload-batch `
  --source $reportDir `
  --destination "assessments" `
  --account-name "myassessments"
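
The script might then be invoked like this (values illustrative):

.\scheduled-assessment.ps1 -Database "production-ecommerce" -OutputPath "C:\Reports" -AnalysisMonths 3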

Integration with Azure DevOps

# azure-pipelines.yml
trigger: none  # run on schedule only

schedules:
- cron: "0 2 * * 0"  # Weekly on Sunday at 2 AM
  displayName: Weekly Cosmos assessment
  branches:
    include:
    - main

jobs:
- job: CosmosAssessment
  pool:
    vmImage: 'ubuntu-latest'
  
  steps:
  - task: DotNetCoreCLI@2
    displayName: 'Run Cosmos Assessment'
    inputs:
      command: 'run'
      projects: 'CosmosToSqlAssessment.csproj'
      arguments: '--database production --output $(Build.ArtifactStagingDirectory)'
  
  - task: PublishBuildArtifacts@1
    displayName: 'Publish Assessment Reports'
    inputs:
      pathtoPublish: '$(Build.ArtifactStagingDirectory)'
      artifactName: 'cosmos-assessment'

Performance Tips

Optimizing Assessment Speed

  1. Limit document sampling:

    dotnet run -- --max-samples 1000
  2. Reduce analysis period:

    dotnet run -- --analysis-months 1
  3. Analyze specific containers:

    dotnet run -- --containers "critical-container"
  4. Disable deep analysis:

    dotnet run -- --no-deep-analysis
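
These options can be combined for the fastest possible run; for example, using the same flags documented above:

# Minimal-cost smoke assessment of a single database
dotnet run -- --database "dev-testing" --max-samples 1000 --analysis-months 1 --no-deep-analysis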

Handling Large Datasets

For containers with millions of documents:

# Use statistical sampling
dotnet run -- \
  --max-samples 10000 \
  --enable-statistical-sampling \
  --confidence-level 95

Memory Optimization

# For memory-constrained environments
dotnet run -- \
  --batch-size 100 \
  --memory-limit 2GB \
  --enable-streaming

Troubleshooting Assessment Issues

Common Problems

  1. Timeout errors: Increase timeout settings in configuration
  2. Memory issues: Reduce sample sizes and enable streaming
  3. Permission errors: Verify Azure RBAC assignments
  4. Network issues: Check firewall and VPN configurations

Diagnostic Commands

# Test individual components
dotnet run -- --test-cosmos-connection
dotnet run -- --test-monitor-connection
dotnet run -- --test-report-generation

# Enable detailed logging
dotnet run -- --log-level Debug --log-to-file

Best Practices

  1. Schedule regular assessments to track changes over time
  2. Use version control for configuration files
  3. Archive assessment reports for historical comparison
  4. Review recommendations with your SQL DBA team
  5. Test migration approaches in development environments
  6. Monitor performance after migration completion