User Story: Maintenance Window Management
Story Overview
As a DevOps Engineer, SRE, Release Manager, or Migration Specialist
I want to schedule and manage maintenance windows in Dynatrace Managed environments through natural language interactions with the MCP server
So that I can reduce alert fatigue during planned deployments, migrations, and infrastructure changes—while maintaining precise audit trails and coordinating activities across multiple Managed environments
🔍 Scope & Architecture
What This MCP Server Does
This is a Dynatrace Managed MCP Server. It:
- ✅ Connects to one or more Dynatrace Managed environments
- ✅ Creates, modifies, and deletes maintenance windows in Managed clusters only
- ✅ Returns maintenance window data and schedules to the NLM (Natural Language Model)
- ✅ Supports cross-Managed-environment operations (e.g., same window on multiple Managed clusters)
What This MCP Server Does NOT Do
- ❌ Connect to Dynatrace SaaS environments directly
- ❌ Communicate with SaaS MCP Server directly
- ❌ Create, modify, or query maintenance windows in SaaS environments
Multi-Environment Operations
Within Managed (supported):
- User can create the same maintenance window across multiple Managed clusters
- NLM calls this MCP once for each target Managed environment
- Example: "Create maintenance window on managed-prod-us, managed-prod-eu, and managed-staging"
Across Managed and SaaS (NLM-orchestrated):
When coordination between Managed and SaaS is needed:
- User makes request to NLM (e.g., "Create maintenance window during migration affecting both environments")
- NLM calls this Managed MCP for Managed environment maintenance windows
- NLM calls separate SaaS MCP for SaaS environment maintenance windows (if available)
- NLM coordinates timing and scope across environments
- NLM aggregates results and presents unified confirmation to user
This user story describes the complete end-to-end workflow to show business value, but this MCP Server only implements maintenance window management in Dynatrace Managed environments.
User Value Proposition
1. Rapid Response During Emergencies
- Instant ad-hoc maintenance windows - "Create a maintenance window NOW for the next 2 hours" when unplanned deployments happen
- Emergency deployment coverage - quickly suppress noise when critical hotfixes require immediate deployment
- Forgotten window recovery - retroactively create windows when admins forgot to schedule them before deployment
- Natural language speed - create windows in seconds vs. minutes through UI navigation
2. Reduce Alert Fatigue
- Deployment noise elimination - suppress expected availability and performance problems during releases
- Targeted suppression - scope windows precisely to affected services, hosts, or clusters using tags and management zones
- Recurring window patterns - set up weekly deployment windows once, never manually create them again
- Multi-environment consistency - apply the same schedule across dev/staging/production Managed clusters
3. Dynamic Adjustments
- Extend running windows - "The deployment is taking longer, extend the window by 30 minutes"
- Early closure - "Deployment finished early, close the maintenance window now"
- Scope refinement - adjust which entities are covered mid-window if needed
- Schedule modifications - change recurring patterns without recreating windows
4. Managed-to-SaaS Migration Support
- Wave-based scoping - create precise windows for each migration wave (app by app, host by host, cluster by cluster)
- Tag-based targeting - scope windows using entity selectors or tags like migration:wave1, environment:production, cluster:k8s-prod
- Management zone isolation - suppress alerts only for migrating entities, not entire environments
- Correlation with migration events - link maintenance windows to migration events for complete audit trail
5. Intelligent Suggestions
- NLM-driven recommendations - AI suggests maintenance windows based on detected deployment patterns
- Optimal window sizing - recommend duration based on historical deployment times
- Smart recurrence patterns - detect weekly deployment schedules and suggest recurring windows
- Conflict detection - warn about overlapping windows or critical monitoring gaps
6. Integration with Fleet Updates
- OneAgent/ActiveGate update coordination - suggest maintenance windows during fleet update operations
- Cluster upgrade protection - suppress expected problems during Dynatrace platform upgrades
- Phased rollout support - create staggered windows for host group updates
- Integration point for future fleet management story
Acceptance Criteria
Core Capabilities
1. Create Scheduled Maintenance Windows
Given a user plans a deployment or migration
When they describe the maintenance window requirements in natural language
Then the MCP server should:
- Parse the intent to extract:
- Start date/time (or "now" for immediate)
- Duration or end date/time
- Recurrence pattern (one-time, daily, weekly, monthly, custom)
- Scope (entities, tags, management zones, entity selector)
- Description/reason
- Use Dynatrace Settings 2.0 API with the builtin:alerting.maintenance-window schema to create the window
- Support all scope types:
- Management zone
- Tag-based filtering
- Entity selector (Dynatrace entity selector syntax)
- Combination of filters
- Return confirmation with window ID, scope summary, and schedule details
Example interaction:
User: "Create a maintenance window every Tuesday from 2 PM to 4 PM UTC
for all services tagged deployment:weekly in Production"
MCP: Creating recurring maintenance window...
Schedule:
✓ Recurrence: Weekly on Tuesdays
✓ Time: 14:00-16:00 UTC (2 hours)
✓ Effective from: 2026-02-06 (next Tuesday)
Scope:
✓ Management Zone: Production
✓ Tag filter: deployment:weekly
✓ Matched entities: 12 services
- Payment Service (SERVICE-ABC123)
- User Service (SERVICE-ABC124)
- Checkout Service (SERVICE-ABC125)
... (9 more)
Suppression:
✓ All problems suppressed during window
✓ Notifications paused
✓ Alerting profiles: All
Maintenance window created: mw-weekly-deploy-001
Next occurrence: 2026-02-06 14:00 UTC (in 2 days)
2. Create Ad-Hoc Maintenance Windows
Given an unplanned deployment or emergency change is happening
When a user requests an immediate maintenance window
Then the MCP server should:
- Support "now" or "immediately" as start time
- Default to 2-hour duration if not specified (prompt for confirmation)
- Allow quick scope specification (service name, host name, cluster name)
- Create window within seconds
- Confirm what entities are covered
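A minimal sketch of how "now" and the 2-hour default could be resolved into absolute timestamps. The helper name `resolveAdHocSchedule`, its input shape, and the `usedDefaultDuration` flag are assumptions made for illustration; they are not part of the server's defined interface.

```typescript
// Hypothetical helper: all names and shapes here are illustrative assumptions.
interface AdHocRequest {
  startTime: string;        // ISO timestamp, "now", or "immediately"
  durationMinutes?: number; // optional; falls back to the 2-hour default
}

interface ResolvedSchedule {
  start: Date;
  end: Date;
  usedDefaultDuration: boolean; // true => the caller should ask the user to confirm
}

const DEFAULT_DURATION_MINUTES = 120;

function resolveAdHocSchedule(req: AdHocRequest, now: Date = new Date()): ResolvedSchedule {
  const keyword = req.startTime.trim().toLowerCase();
  const isImmediate = keyword === "now" || keyword === "immediately";
  const start = isImmediate ? new Date(now.getTime()) : new Date(req.startTime);
  if (Number.isNaN(start.getTime())) {
    throw new Error(`Unrecognized start time: ${req.startTime}`);
  }
  const usedDefaultDuration = req.durationMinutes === undefined;
  const minutes = req.durationMinutes ?? DEFAULT_DURATION_MINUTES;
  if (minutes <= 0) {
    throw new Error("Duration must be positive");
  }
  return { start, end: new Date(start.getTime() + minutes * 60_000), usedDefaultDuration };
}

// A 90-minute window requested "now" at 14:42 UTC.
const adHoc = resolveAdHocSchedule(
  { startTime: "now", durationMinutes: 90 },
  new Date("2026-02-04T14:42:00Z"),
);
console.log(adHoc.end.toISOString()); // 2026-02-04T16:12:00.000Z
```

Passing `now` as a parameter keeps the helper deterministic, which makes the default-duration prompt easy to test.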
Example interaction:
User: "Emergency hotfix deployment on Payment Service starting NOW.
Create a maintenance window for the next 90 minutes."
MCP: Creating emergency maintenance window...
⚠️ IMMEDIATE WINDOW - Starting now
Schedule:
✓ Start: 2026-02-04 14:42 UTC (immediately)
✓ End: 2026-02-04 16:12 UTC (90 minutes)
✓ Type: One-time (ad-hoc)
Scope:
✓ Service: Payment Service (SERVICE-ABC123)
✓ Included process groups: 3
✓ Included hosts: 5
Reason: Emergency hotfix deployment
Maintenance window ACTIVE: mw-emergency-payment-001
Alert suppression in effect.
Reminder: Window ends at 16:12 UTC (set a timer?)
3. Create Migration-Wave Maintenance Windows
Given a Managed-to-SaaS migration is happening in waves
When a user describes the migration wave scope
Then the MCP server should:
- Support entity selectors for precise targeting
- Use tags to identify migration wave entities
- Optionally link to migration events for correlation
- Create appropriately sized windows for migration activities
- Support overlapping waves (different entities)
Example interaction:
User: "We're migrating the Payment application (wave 1) to SaaS this Saturday
from 8 AM to 2 PM UTC. Create a maintenance window for all entities
tagged migration:wave1 and app:payment"
MCP: Creating migration maintenance window...
Migration Wave: Wave 1 - Payment Application
Schedule:
✓ Date: Saturday 2026-02-08
✓ Time: 08:00-14:00 UTC (6 hours)
✓ Type: One-time (migration event)
Scope Analysis:
✓ Tag filters: migration:wave1 AND app:payment
✓ Matched entities: 23 total
- 4 services (Payment API, Payment Worker, Payment DB, Payment Cache)
- 12 process groups
- 7 hosts
Expected Impact:
⚠️ Availability problems suppressed (hosts leaving Managed)
⚠️ Connection loss alerts suppressed
✓ Other production services NOT affected
Maintenance window created: mw-migration-wave1-payment
Would you like me to create a migration event linked to this window?
User: "Yes, create the migration event."
MCP: ✓ Migration event created: evt-migration-wave1-payment
✓ Event linked to maintenance window
✓ Both Managed environment records updated
Migration tracking complete. Alert suppression ready for Saturday.
4. Create Recurring Maintenance Windows
Given deployments happen on a regular schedule
When a user describes the recurring pattern
Then the MCP server should:
- Support recurrence patterns:
- Daily (every day at X time)
- Weekly (specific days of week)
- Monthly (specific day of month or "first Monday", etc.)
- Custom cron-like patterns if supported by Dynatrace
- Prefer creating ONE recurring window vs. multiple one-time windows
- Allow specifying end date for recurrence (or infinite)
- Show preview of next 3-5 occurrences
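The occurrence preview for a weekly pattern could be computed along these lines. This is a sketch only: `nextOccurrences` is a hypothetical helper, it treats all times as UTC, and it ignores the recurrence end date.

```typescript
// Hypothetical helper; names are illustrative. Times are treated as UTC.
function nextOccurrences(
  daysOfWeek: number[], // 0 = Sunday ... 6 = Saturday
  startHourUtc: number, // e.g. 18 for 18:00 UTC
  from: Date,
  count: number,
): Date[] {
  if (daysOfWeek.length === 0) {
    throw new Error("At least one day of the week is required");
  }
  const result: Date[] = [];
  // Start at midnight UTC of the "from" day and walk forward one day at a time.
  const cursor = new Date(Date.UTC(from.getUTCFullYear(), from.getUTCMonth(), from.getUTCDate()));
  while (result.length < count) {
    if (daysOfWeek.includes(cursor.getUTCDay())) {
      const occurrence = new Date(cursor.getTime() + startHourUtc * 3_600_000);
      // Skip an occurrence on the first day if it has already started.
      if (occurrence.getTime() > from.getTime()) {
        result.push(occurrence);
      }
    }
    cursor.setUTCDate(cursor.getUTCDate() + 1);
  }
  return result;
}

// Tuesday (2) and Thursday (4) at 18:00 UTC, starting from Monday 2024-01-01.
const preview = nextOccurrences([2, 4], 18, new Date("2024-01-01T00:00:00Z"), 3);
console.log(preview.map((d) => d.toISOString()));
// [ '2024-01-02T18:00:00.000Z', '2024-01-04T18:00:00.000Z', '2024-01-09T18:00:00.000Z' ]
```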
Example interaction:
User: "Our deployment window is every Tuesday and Thursday from 6 PM to 8 PM
for the next 3 months. Scope: all services in Production management zone."
MCP: Creating recurring maintenance window...
Recurrence Pattern:
✓ Days: Tuesday, Thursday
✓ Time: 18:00-20:00 UTC (2 hours)
✓ Duration: 3 months (until 2026-05-04)
✓ Total occurrences: ~26 windows
Scope:
✓ Management Zone: Production
✓ All services (42 services currently)
Next 5 occurrences:
1. Tue 2026-02-06 18:00-20:00 UTC
2. Thu 2026-02-08 18:00-20:00 UTC
3. Tue 2026-02-13 18:00-20:00 UTC
4. Thu 2026-02-15 18:00-20:00 UTC
5. Tue 2026-02-20 18:00-20:00 UTC
Maintenance window created: mw-recurring-deploy-prod
Recurrence active until 2026-05-04.
5. List and Query Maintenance Windows
Given a user wants to see scheduled or active maintenance windows
When they query maintenance windows
Then the MCP server should:
- List windows filtered by:
- Status (upcoming, active, completed)
- Time range
- Scope (management zone, tags, entities)
- Environment
- Show window details:
- Schedule (start, end, recurrence)
- Scope (entities covered)
- Status (scheduled, active, ended)
- Creator and creation date
- Present in human-readable format
Example interaction:
User: "What maintenance windows are scheduled for this week?"
MCP: Querying maintenance windows for this week (2026-02-04 to 2026-02-10)...
Found 4 scheduled windows:
1. ACTIVE NOW - Emergency Payment Hotfix
ID: mw-emergency-payment-001
End: Today 16:12 UTC (in 1h 30min)
Scope: Payment Service (SERVICE-ABC123)
Status: 🔴 ACTIVE - Alerts suppressed
2. Weekly Deployment - Production
ID: mw-recurring-deploy-prod
Next: Tuesday 2026-02-06 18:00 UTC (in 2 days)
Recurrence: Every Tue/Thu 18:00-20:00
Scope: Production management zone (42 services)
Status: ⏰ SCHEDULED
3. Migration Wave 1 - Payment App
ID: mw-migration-wave1-payment
Start: Saturday 2026-02-08 08:00 UTC (in 4 days)
Duration: 6 hours (08:00-14:00)
Scope: migration:wave1 AND app:payment (23 entities)
Status: ⏰ SCHEDULED
4. Weekly Deployment - Production
ID: mw-recurring-deploy-prod (recurrence)
Next: Thursday 2026-02-08 18:00 UTC (in 4 days)
Scope: Production management zone (42 services)
Status: ⏰ SCHEDULED
Total: 1 active, 3 upcoming
6. Extend or Shorten Active Maintenance Windows
Given a maintenance window is currently active
When a user needs to adjust the duration
Then the MCP server should:
- Support extending end time ("add 30 minutes", "extend until 5 PM")
- Support early closure ("close now", "end this window")
- Update the window via Settings 2.0 API
- Add comment explaining the change
- Confirm new end time
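The extend and early-close rules above can be sketched as a pure function that computes the new end time before anything is written back through the Settings 2.0 API. The types `ActiveWindow` and `EndTimeUpdate` and the function `adjustEndTime` are assumptions for illustration.

```typescript
// Hypothetical types and names, for illustration only.
interface ActiveWindow {
  start: Date;
  end: Date;
  comments: string[];
}

type EndTimeUpdate =
  | { kind: "extendBy"; minutes: number }
  | { kind: "extendUntil"; end: Date }
  | { kind: "closeNow" };

function adjustEndTime(win: ActiveWindow, update: EndTimeUpdate, now: Date, comment: string): ActiveWindow {
  const nowMs = now.getTime();
  if (nowMs < win.start.getTime() || nowMs >= win.end.getTime()) {
    throw new Error("Window is not currently active");
  }
  let endMs: number;
  if (update.kind === "extendBy") {
    endMs = win.end.getTime() + update.minutes * 60_000;
  } else if (update.kind === "extendUntil") {
    endMs = update.end.getTime();
  } else {
    endMs = nowMs; // closeNow: the window ends immediately
  }
  // An extension must actually move the end time forward.
  if (update.kind !== "closeNow" && endMs <= win.end.getTime()) {
    throw new Error("An extension must end later than the current end time");
  }
  return { start: win.start, end: new Date(endMs), comments: [...win.comments, comment] };
}

const activeWindow: ActiveWindow = {
  start: new Date("2026-02-04T14:42:00Z"),
  end: new Date("2026-02-04T16:12:00Z"),
  comments: [],
};
const extendedWindow = adjustEndTime(
  activeWindow,
  { kind: "extendBy", minutes: 45 },
  new Date("2026-02-04T14:50:00Z"),
  "Window extended due to deployment delays",
);
console.log(extendedWindow.end.toISOString()); // 2026-02-04T16:57:00.000Z
```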
Example interaction:
User: "The deployment is taking longer than expected.
Extend the current maintenance window by 45 minutes."
MCP: Finding active maintenance window...
Active window: mw-emergency-payment-001
Current end time: 2026-02-04 16:12 UTC (in 1h 30min)
Extending by 45 minutes...
✓ New end time: 2026-02-04 16:57 UTC (in 2h 15min)
✓ Comment added: "Window extended due to deployment delays"
Maintenance window extended.
Alert suppression now active until 16:57 UTC.
---
User: "Deployment finished successfully. Close the maintenance window early."
MCP: Finding active maintenance window...
Active window: mw-emergency-payment-001
Originally scheduled until: 16:57 UTC (still 1h 45min remaining)
Closing window NOW...
✓ Window ended: 2026-02-04 15:12 UTC
✓ Duration: 30min (planned 2h 15min, ended 1h 45min early)
✓ Comment added: "Window closed early - deployment completed"
Maintenance window closed.
✓ Alert suppression ENDED
✓ Normal monitoring resumed
Reminder: Problems may take 1-2 minutes to reappear if still active.
7. Update Maintenance Window Scope
Given a user needs to adjust which entities are covered
When they modify the window scope
Then the MCP server should:
- Support adding or removing entities, tags, or management zones
- Update the Settings 2.0 object
- Show before/after entity count
- Validate scope changes (prevent empty scope)
- Apply immediately if window is active
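One way to apply add/remove scope changes while rejecting an empty result is sketched below. `WindowScope`, `mergeList`, and `applyScopeUpdate` are hypothetical names. The sketch works on the selectors themselves; the before/after entity counts would still need to be resolved against the environment.

```typescript
// Hypothetical types and names, for illustration only.
interface WindowScope {
  managementZones: string[];
  tags: string[];
  entities: string[];
}

// Union of base and added items, minus removed items, without duplicates.
function mergeList(base: string[], added: string[] = [], removed: string[] = []): string[] {
  const drop = new Set(removed);
  return Array.from(new Set([...base, ...added])).filter((item) => !drop.has(item));
}

function applyScopeUpdate(
  current: WindowScope,
  add: Partial<WindowScope> = {},
  remove: Partial<WindowScope> = {},
): WindowScope {
  const next: WindowScope = {
    managementZones: mergeList(current.managementZones, add.managementZones, remove.managementZones),
    tags: mergeList(current.tags, add.tags, remove.tags),
    entities: mergeList(current.entities, add.entities, remove.entities),
  };
  if (next.managementZones.length + next.tags.length + next.entities.length === 0) {
    throw new Error("Rejected: the update would leave the window with an empty scope");
  }
  return next;
}

const updatedScope = applyScopeUpdate(
  { managementZones: [], tags: [], entities: ["SERVICE-ABC123"] },
  { tags: ["cluster:analytics"] },
);
console.log(updatedScope.tags);     // [ 'cluster:analytics' ]
console.log(updatedScope.entities); // [ 'SERVICE-ABC123' ]
```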
Example interaction:
User: "Add all hosts tagged cluster:analytics to the current maintenance window.
We're doing unplanned maintenance on them too."
MCP: Finding active maintenance window...
Active window: mw-emergency-payment-001
Current scope: Payment Service (SERVICE-ABC123)
Current entities: 8 (1 service, 2 process groups, 5 hosts)
Adding scope: cluster:analytics
Analyzing impact...
Additional entities: 12 hosts in Analytics cluster
New total: 20 entities
Updating maintenance window scope...
✓ Tag filter added: cluster:analytics
✓ Scope updated (effective immediately)
✓ Comment added: "Scope expanded to include Analytics cluster hosts"
Maintenance window updated.
Alert suppression now covers 20 entities (was 8).
8. Delete Maintenance Windows
Given a user needs to cancel or remove a maintenance window
When they request deletion
Then the MCP server should:
- Support deleting upcoming windows (before they start)
- Prevent deletion of active windows (must close first, then delete)
- For recurring windows, offer to delete:
- Just the next occurrence
- All future occurrences
- The entire recurrence pattern
- Confirm deletion with scope and schedule summary
- Warn if window is linked to migration events
Example interaction:
User: "Cancel the maintenance window scheduled for Saturday.
The migration is delayed."
MCP: Finding maintenance windows scheduled for Saturday 2026-02-08...
Found: mw-migration-wave1-payment
Schedule: Saturday 2026-02-08 08:00-14:00 UTC
Scope: migration:wave1 AND app:payment (23 entities)
Status: SCHEDULED (in 4 days)
⚠️ Warning: This window is linked to migration event evt-migration-wave1-payment
Delete this maintenance window?
User: "Yes, delete it."
MCP: Deleting maintenance window...
✓ Window mw-migration-wave1-payment deleted
✓ No longer scheduled for Saturday
Note: Migration event evt-migration-wave1-payment still exists.
Should I also delete or update the migration event?
User: "Update the event to say migration postponed."
MCP: ✓ Migration event updated with status: postponed
✓ Comment added: "Migration postponed, new date TBD"
Maintenance window deleted, migration event updated.
9. Multi-Environment Maintenance Window Creation
Given the MCP server is connected to multiple Managed environments
When a user wants to create the same maintenance window in multiple environments
Then the MCP server should:
- Accept multiple environment IDs
- Create identical windows in each environment (same schedule, scope)
- Adjust entity selectors per environment if needed
- Report success/failure per environment
- Support bulk operations for consistency
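Per-environment reporting could look like the following sketch, which creates the window in each environment in turn and records success or failure without aborting the whole batch. `createInEnvironments` and the injected `createWindow` callback are assumptions for illustration; the callback stands in for whatever actually calls the Settings 2.0 API.

```typescript
// Hypothetical names; createWindow stands in for the real API call.
type EnvResult =
  | { environmentId: string; ok: true; windowId: string }
  | { environmentId: string; ok: false; error: string };

async function createInEnvironments<C>(
  environmentIds: string[],
  config: C,
  createWindow: (environmentId: string, config: C) => Promise<string>,
): Promise<EnvResult[]> {
  const results: EnvResult[] = [];
  // Sequential on purpose: one failing environment must not hide the others.
  for (const environmentId of environmentIds) {
    try {
      const windowId = await createWindow(environmentId, config);
      results.push({ environmentId, ok: true, windowId });
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      results.push({ environmentId, ok: false, error });
    }
  }
  return results;
}
```

A caller would pass something like `["managed-prod-us", "managed-prod-eu"]` and render one line per entry in the returned array.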
Example interaction:
User: "Create a maintenance window on Tuesday 6 PM to 8 PM for Production
management zone in BOTH managed-prod-us and managed-prod-eu clusters."
MCP: Creating maintenance window in multiple environments...
Target environments:
- managed-prod-us (Managed US cluster)
- managed-prod-eu (Managed EU cluster)
Window configuration:
✓ Schedule: Tuesday 2026-02-06 18:00-20:00 UTC
✓ Scope: Management Zone "Production"
Creating in managed-prod-us...
✓ Window created: mw-us-deploy-tue-001
✓ Scope: 42 services in Production
Creating in managed-prod-eu...
✓ Window created: mw-eu-deploy-tue-001
✓ Scope: 38 services in Production
Maintenance windows created in both clusters.
Same schedule, environment-specific entity counts.
Summary:
- managed-prod-us: 42 services covered
- managed-prod-eu: 38 services covered
- Both active: Tuesday 18:00-20:00 UTC
10. Intelligent Maintenance Window Suggestions
Given the MCP server detects deployment patterns
When a user asks for recommendations
Then the MCP server should:
- Analyze historical deployment events
- Detect recurring patterns (same day/time)
- Calculate average deployment duration
- Suggest appropriate window schedules
- Recommend scope based on past deployments
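The pattern analysis above can be approximated with simple counting. This sketch assumes deployment events have already been fetched and reduced to a start time and a duration; `Deployment`, `WindowSuggestion`, and `suggestWindow` are hypothetical names.

```typescript
// Hypothetical types and names, for illustration only.
interface Deployment {
  start: Date;
  durationMinutes: number;
}

interface WindowSuggestion {
  dayOfWeek: number;      // 0 = Sunday ... 6 = Saturday (UTC)
  share: number;          // fraction of deployments that fell on that day
  averageMinutes: number;
  maxMinutes: number;
}

function suggestWindow(deployments: Deployment[]): WindowSuggestion {
  if (deployments.length === 0) {
    throw new Error("No deployment history to analyze");
  }
  const countsByDay = new Array<number>(7).fill(0);
  let totalMinutes = 0;
  let maxMinutes = 0;
  for (const d of deployments) {
    countsByDay[d.start.getUTCDay()] += 1;
    totalMinutes += d.durationMinutes;
    maxMinutes = Math.max(maxMinutes, d.durationMinutes);
  }
  const topCount = Math.max(...countsByDay);
  return {
    dayOfWeek: countsByDay.indexOf(topCount),
    share: topCount / deployments.length,
    averageMinutes: totalMinutes / deployments.length,
    maxMinutes,
  };
}

// Three Tuesday deployments and one Wednesday deployment.
const suggestion = suggestWindow([
  { start: new Date("2024-01-02T17:00:00Z"), durationMinutes: 45 },
  { start: new Date("2024-01-09T17:30:00Z"), durationMinutes: 75 },
  { start: new Date("2024-01-10T18:00:00Z"), durationMinutes: 50 },
  { start: new Date("2024-01-16T17:15:00Z"), durationMinutes: 130 },
]);
console.log(suggestion); // { dayOfWeek: 2, share: 0.75, averageMinutes: 75, maxMinutes: 130 }
```

The `share` value is one candidate for the confidence level shown to the user; time-of-day clustering is left out of this sketch.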
Example interaction:
User: "We deploy Payment Service almost every week.
Can you suggest a recurring maintenance window?"
MCP: Analyzing deployment history for Payment Service...
Found 8 deployments in the last 60 days:
- 6 on Tuesday between 17:00-19:00 UTC
- 2 on Wednesday between 18:00-20:00 UTC
Deployment duration analysis:
- Average: 1h 15min
- Max: 2h 10min
- Min: 45min
💡 RECOMMENDATION:
Create recurring maintenance window:
✓ Day: Every Tuesday
✓ Time: 17:00-19:30 UTC (2.5 hours for safety margin)
✓ Scope: Payment Service (SERVICE-ABC123)
This covers 75% of your historical deployments (6 of 8).
Should I create this recurring window?
User: "Yes, create it."
MCP: Creating recurring maintenance window...
✓ Recurrence: Every Tuesday 17:00-19:30 UTC
✓ Scope: Payment Service
✓ First occurrence: Next Tuesday 2026-02-06
Maintenance window created: mw-recurring-payment-deploy
Tip: If deployments shift to a different day, let me know
and I'll adjust the recurrence pattern.
11. Fleet Update Coordination (Integration Point)
Given a fleet update (OneAgent, ActiveGate, cluster upgrade) is planned
When a user schedules the fleet update
Then the MCP server should:
- Detect that fleet updates often justify maintenance windows
- Suggest creating a maintenance window for the update duration
- Offer scope based on fleet update targets (all hosts, specific host groups, etc.)
- Link maintenance window to fleet update activities
Example interaction:
User: "Schedule OneAgent update for all Production hosts this Friday midnight."
MCP: OneAgent fleet update scheduled for Friday 2026-02-08 00:00 UTC.
💡 Suggestion: Create a maintenance window?
Fleet updates often cause:
- Brief agent disconnections
- Process group restarts
- Temporary monitoring gaps
Recommended maintenance window:
✓ Start: Friday 2026-02-08 00:00 UTC
✓ Duration: 3 hours (typical for ~50 hosts)
✓ Scope: All Production hosts
This will suppress expected availability/monitoring problems during the update.
Create this maintenance window?
User: "Yes."
MCP: Creating maintenance window for fleet update...
✓ Window: mw-fleet-oneagent-prod-001
✓ Schedule: Friday 00:00-03:00 UTC
✓ Scope: Management Zone "Production" (52 hosts)
✓ Linked to: OneAgent update job fleet-update-20260208
Maintenance window created and linked to fleet update.
Alert suppression will activate during update window.
Technical Implementation Requirements
MCP Tools to Implement
Tool 1: create-maintenance-window
Purpose: Create a new maintenance window (scheduled or ad-hoc)
Inputs (Zod schema):
{
environmentId: string | string[], // One or more Managed environments
name: string, // Display name
description?: string, // Optional description/reason
schedule: {
startTime: string, // ISO timestamp or "now"
endTime?: string, // ISO timestamp (or use duration)
duration?: number, // Minutes (alternative to endTime)
recurrence?: { // Optional: for recurring windows
pattern: "daily" | "weekly" | "monthly" | "custom",
daysOfWeek?: number[], // For weekly (0=Sunday, 1=Monday, etc.)
dayOfMonth?: number, // For monthly
endDate?: string, // When recurrence stops (optional)
cronExpression?: string // For custom patterns
}
},
scope: {
managementZones?: string[], // Management zone IDs or names
tags?: string[], // Tag filters (e.g., "deployment:weekly")
entitySelector?: string, // Entity selector expression
entities?: string[] // Explicit entity IDs
},
suppressAlertingProfiles?: string[], // Specific alerting profiles (or all)
linkedEventId?: string, // Optional: link to deployment/migration event
metadata?: { // Optional structured metadata
reason?: string, // "deployment", "migration", "maintenance", etc.
ticket?: string,
owner?: string
}
}
Implementation:
- Parse natural language to extract schedule and scope
- If startTime is "now" or relative, convert to absolute timestamp
- Construct Settings 2.0 payload for the builtin:alerting.maintenance-window schema
- For multiple environments, create window in each sequentially
- Validate scope (ensure entities exist)
- POST to /api/v2/settings/objects with appropriate payload
- Return created window ID(s) and confirmation
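A sketch of how the POST request might be assembled for a one-time window. The envelope (an array of objects with `schemaId`, `scope`, and `value`) and the `Api-Token` authorization scheme follow the Settings 2.0 API. The field names inside `value` are an assumption based on the maintenance window schema and should be verified against the schema definition of the target cluster. `buildCreateRequest` and the example base URL are hypothetical.

```typescript
// Hypothetical builder; it only assembles the request and performs no network I/O.
interface CreateWindowInput {
  name: string;
  description: string;
  startIso: string; // e.g. "2026-02-04T14:42:00"
  endIso: string;
}

interface HttpRequestSpec {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

function buildCreateRequest(baseUrl: string, apiToken: string, input: CreateWindowInput): HttpRequestSpec {
  const payload = [
    {
      schemaId: "builtin:alerting.maintenance-window",
      scope: "environment",
      // ASSUMPTION: the shape of `value` must be checked against the schema.
      value: {
        enabled: true,
        generalProperties: {
          name: input.name,
          description: input.description,
          maintenanceType: "PLANNED",
          suppression: "DETECT_PROBLEMS_DONT_ALERT",
        },
        schedule: {
          scheduleType: "ONCE",
          onceRecurrence: { startTime: input.startIso, endTime: input.endIso, timeZone: "UTC" },
        },
        filters: [],
      },
    },
  ];
  return {
    method: "POST",
    url: `${baseUrl.replace(/\/+$/, "")}/api/v2/settings/objects`,
    headers: { Authorization: `Api-Token ${apiToken}`, "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}

const createRequest = buildCreateRequest("https://managed.example.com/e/ENVIRONMENT_ID", "TOKEN_PLACEHOLDER", {
  name: "Emergency hotfix deployment",
  description: "Payment Service hotfix",
  startIso: "2026-02-04T14:42:00",
  endIso: "2026-02-04T16:12:00",
});
console.log(createRequest.url); // https://managed.example.com/e/ENVIRONMENT_ID/api/v2/settings/objects
```

Keeping request construction separate from the HTTP call makes it possible to preview the payload and ask for confirmation before anything is sent.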
Tool 2: list-maintenance-windows
Purpose: Query and list maintenance windows
Inputs:
{
environmentId: string,
filters?: {
status?: "scheduled" | "active" | "ended" | "all",
fromDate?: string, // Date range filter
toDate?: string,
managementZone?: string, // Filter by scope
tags?: string[],
includeRecurring?: boolean // Show all recurrence instances
},
includeDetails?: boolean // Full details vs. summary
}
Implementation:
- GET /api/v2/settings/objects?schemaIds=builtin:alerting.maintenance-window
- Filter results based on schedule and scope
- Determine status (active, scheduled, ended) based on current time
- Format for human-readable output
- For recurring windows, calculate next occurrences
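Deriving the status from the current time is a small comparison, sketched here. `WindowTimes`, `statusAt`, and `filterByStatus` are hypothetical names. For a recurring window, `start` and `end` would be those of the current or next occurrence.

```typescript
// Hypothetical types and names, for illustration only.
type WindowStatus = "scheduled" | "active" | "ended";

interface WindowTimes {
  id: string;
  start: Date;
  end: Date;
}

function statusAt(win: WindowTimes, now: Date): WindowStatus {
  if (now.getTime() < win.start.getTime()) return "scheduled";
  if (now.getTime() < win.end.getTime()) return "active";
  return "ended";
}

function filterByStatus(windows: WindowTimes[], status: WindowStatus | "all", now: Date): WindowTimes[] {
  return status === "all" ? windows : windows.filter((w) => statusAt(w, now) === status);
}

const emergencyWindow: WindowTimes = {
  id: "mw-emergency-payment-001",
  start: new Date("2026-02-04T14:42:00Z"),
  end: new Date("2026-02-04T16:12:00Z"),
};
console.log(statusAt(emergencyWindow, new Date("2026-02-04T15:00:00Z"))); // active
```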
Tool 3: update-maintenance-window
Purpose: Modify an existing maintenance window
Inputs:
{
environmentId: string,
windowId: string, // Settings object ID or friendly name
updates: {
extendBy?: number, // Minutes to extend (for active windows)
newEndTime?: string, // Set specific end time
closeNow?: boolean, // Close window immediately
scope?: { // Update scope
add?: { // Add entities/tags
managementZones?: string[],
tags?: string[],
entities?: string[]
},
remove?: { // Remove entities/tags
managementZones?: string[],
tags?: string[],
entities?: string[]
}
},
reschedule?: { // Change start/end times (for scheduled windows)
startTime?: string,
endTime?: string
}
},
comment?: string // Reason for update
}
Implementation:
- Fetch existing window from Settings 2.0
- Apply updates to the settings object
- Validate changes (e.g., can't move active window to the past)
- PUT /api/v2/settings/objects/{objectId} with updated payload
- Add comment/audit trail
- Return updated window summary
Tool 4: delete-maintenance-window
Purpose: Cancel or remove a maintenance window
Inputs:
{
environmentId: string,
windowId: string, // Settings object ID or friendly name
deleteType?: "this" | "future" | "all", // For recurring windows
force?: boolean // Allow deleting active window (closes first)
}
Implementation:
- Fetch window from Settings 2.0
- If active and force=false, return error (must close first)
- If active and force=true, close then delete
- For recurring windows, handle deleteType:
- "this": Cancel only next occurrence (if supported by schema)
- "future": Cancel all future occurrences
- "all": Delete entire recurrence pattern
- DELETE /api/v2/settings/objects/{objectId}
- Return confirmation
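The force and deleteType rules can be expressed as a planning step that decides which operations to run before any request is sent. `planDeletion` and the step names are hypothetical. Treating a missing deleteType as "all" is an assumption made for this sketch, as is the way "this" and "future" would map onto the schema.

```typescript
// Hypothetical types and names, for illustration only.
type DeleteType = "this" | "future" | "all";
type DeleteStep = "close-window" | "skip-next-occurrence" | "end-recurrence" | "delete-object";

interface DeleteRequest {
  active: boolean;
  recurring: boolean;
  deleteType?: DeleteType;
  force?: boolean;
}

function planDeletion(req: DeleteRequest): DeleteStep[] {
  if (req.active && !req.force) {
    throw new Error("Window is active: close it first or pass force=true");
  }
  const steps: DeleteStep[] = req.active ? ["close-window"] : [];
  // ASSUMPTION: non-recurring windows and unspecified deleteType mean "all".
  const deleteType: DeleteType = req.recurring ? (req.deleteType ?? "all") : "all";
  if (deleteType === "this") {
    steps.push("skip-next-occurrence");
  } else if (deleteType === "future") {
    steps.push("end-recurrence");
  } else {
    steps.push("delete-object");
  }
  return steps;
}

console.log(planDeletion({ active: true, recurring: false, force: true })); // [ 'close-window', 'delete-object' ]
```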
Tool 5: get-maintenance-window-details
Purpose: Get full details of a specific maintenance window
Inputs:
{
environmentId: string,
windowId: string // Settings object ID or friendly name
}
Implementation:
- GET /api/v2/settings/objects/{objectId}
- Parse and format for display:
- Schedule (start, end, recurrence)
- Scope (entities, tags, management zones)
- Status (active, scheduled, ended)
- Creator and timestamps
- Linked events or metadata
- Show matched entities count
- If recurring, show next 5 occurrences
Tool 6: suggest-maintenance-window
Purpose: Analyze deployment history and suggest optimal windows
Inputs:
{
environmentId: string,
entitySelector?: string, // Scope to analyze (service, host group, etc.)
analysisWindow?: string, // How far back to look (default: 60 days)
suggestRecurring?: boolean // Prefer recurring vs. one-time (default: true)
}
Implementation:
- Query Events API for historical deployment events
- Filter by entity selector if provided
- Analyze patterns:
- Day of week frequency
- Time of day clustering
- Duration statistics (avg, max, p95)
- Detect recurring patterns
- Generate recommendation with:
- Suggested schedule
- Recommended duration
- Scope
- Confidence level
- Return suggestion for user approval
Tool 7: link-window-to-event
Purpose: Associate a maintenance window with a deployment/migration event
Inputs:
{
environmentId: string,
windowId: string,
eventId: string, // Event correlation ID
linkType?: "deployment" | "migration" | "maintenance" | "fleet_update"
}
Implementation:
- Update maintenance window metadata with event reference
- Add comment to event linking to maintenance window
- Create bidirectional reference
- Return confirmation
MCP Resources
Resource: maintenance-window-guide
Purpose: Provide documentation on maintenance window features
Content:
- Overview of maintenance window types (scheduled, ad-hoc, recurring)
- Example natural language commands
- Scope targeting strategies (tags, management zones, entity selectors)
- Migration-specific window patterns
- Best practices for window sizing and timing
- Recurrence pattern examples
- Integration with fleet updates and CI/CD
Resource: maintenance-window-schema-reference
Purpose: Technical reference for Dynatrace maintenance window schema
Content:
- Settings 2.0 schema: builtin:alerting.maintenance-window
- Required and optional fields
- Scope syntax and examples
- Recurrence rule format
- Suppression behavior details
- Limitations and constraints
Example User Interactions
Scenario 1: Weekly Deployment Window
User: "We deploy to production every Tuesday at 6 PM for about 2 hours.
Set up a recurring maintenance window for all Production services."
MCP: Creating recurring maintenance window...
Deployment Schedule:
✓ Recurrence: Weekly on Tuesdays
✓ Time: 18:00-20:00 UTC (2 hours)
✓ Starting: Next Tuesday 2026-02-06
✓ Ending: Never (infinite recurrence)
Scope:
✓ Management Zone: Production
✓ All services: 42 services
✓ Includes: 67 process groups, 52 hosts
Next 3 occurrences:
1. Tuesday 2026-02-06 18:00-20:00 UTC
2. Tuesday 2026-02-13 18:00-20:00 UTC
3. Tuesday 2026-02-20 18:00-20:00 UTC
Maintenance window created: mw-recurring-prod-deploy
✓ Alerts will be suppressed during each window
✓ Problems on Production services won't trigger notifications
✓ You can modify or cancel this recurrence anytime
Setup complete!
Scenario 2: Emergency Hotfix with Forgotten Window
User: "Oh no, we deployed a hotfix 20 minutes ago and forgot to create
a maintenance window. We're getting flooded with alerts because
services are restarting. Help!"
MCP: Creating retroactive maintenance window...
⚠️ EMERGENCY WINDOW - Starting retroactively
Schedule:
✓ Start: 2026-02-04 14:20 UTC (20 minutes ago)
✓ End: 2026-02-04 16:20 UTC (1h 40min from now)
✓ Duration: 2 hours total
What scope should this cover?
User: "All services in Production management zone."
MCP: Creating maintenance window...
Scope:
✓ Management Zone: Production (42 services)
Maintenance window ACTIVE: mw-emergency-prod-hotfix
✓ Window backdated to 14:20 UTC
✓ Alert suppression NOW ACTIVE
✓ Window ends at 16:20 UTC (1h 40min remaining)
Note: Problems that already fired will remain, but:
- No NEW problems will trigger alerts
- No notifications will be sent
- UI will show maintenance window as active
Alert fatigue should reduce immediately!
Scenario 3: Migration Wave with Precise Scoping
User: "Saturday morning we're running Wave 2 of our migration to SaaS.
This includes all k8s workloads in the Analytics cluster,
tagged migration:wave2 and cluster:analytics.
Create a maintenance window from 6 AM to 12 PM."
MCP: Creating migration maintenance window...
Migration Details:
✓ Wave: Wave 2 - Analytics cluster
✓ Migration type: Managed → SaaS
Schedule:
✓ Date: Saturday 2026-02-08
✓ Time: 06:00-12:00 UTC (6 hours)
✓ Type: One-time (migration event)
Scope Analysis:
✓ Tag filters: migration:wave2 AND cluster:analytics
Matched entities (23 total):
- 6 services (Analytics API, Stream Processor, Query Engine, etc.)
- 15 process groups (k8s workloads)
- 2 hosts (k8s worker nodes: analytics-k8s-01, analytics-k8s-02)
Expected Suppression:
⚠️ HOST UNAVAILABLE - suppressed (hosts leaving Managed)
⚠️ ONEAGENT CONNECTION LOST - suppressed
⚠️ PROCESS GROUP UNAVAILABLE - suppressed
⚠️ SERVICE UNAVAILABLE - suppressed
✓ Other production workloads NOT affected
Maintenance window created: mw-migration-wave2-analytics
Would you like me to:
1. Create a migration event linked to this window?
2. Create a corresponding window in SaaS for arrival tracking?
User: "Yes to both."
Managed MCP: Creating migration artifacts in Managed environment...
✓ Maintenance window created: mw-migration-wave2-analytics
- Scope: migration:wave2 AND cluster:analytics (23 entities)
- Tags: migration:wave2, migration:to_saas
- Schedule: Saturday 6:00-12:00 UTC
✓ Migration event created: evt-migration-wave2-analytics
- Type: CUSTOM_INFO
- Properties:
* status=scheduled
* target=saas-tenant-abc123
* migration_wave=wave2
- Correlation ID: mig-wave2-analytics
✓ Event and window linked together
═══════════════════════════════════════════════════════
NEXT STEP: SaaS Environment (via NLM orchestration)
═══════════════════════════════════════════════════════
This Managed MCP has completed its tasks. To create a corresponding
maintenance window in SaaS, ask your NLM to coordinate with SaaS MCP:
Suggested NLM prompt:
"Create arrival tracking maintenance window in SaaS for migration
mig-wave2-analytics on Saturday 6 AM to 12 PM UTC.
Scope: Incoming Analytics Cluster entities."
Migration window ready for Saturday!
Scenario 4: Deployment Taking Longer - Extend Window
User: "The deployment is running longer than expected.
Extend the current window by 1 hour."
MCP: Finding active maintenance window...
Active window: mw-recurring-prod-deploy
Current schedule: 18:00-20:00 UTC (ends in 25 minutes)
Extending by 1 hour...
✓ New end time: 21:00 UTC (ends in 1h 25min)
✓ Comment added: "Extended due to deployment delays"
✓ Duration: 3 hours (was 2 hours)
Maintenance window extended.
Alert suppression now active until 21:00 UTC.
Reminder: This is a recurring window. Extension only applies to
TODAY'S occurrence. Next Tuesday's window remains 2 hours.
Scenario 5: Multi-Environment Consistency
User: "Create the same weekly deployment window in all 3 of our Managed clusters:
managed-us, managed-eu, and managed-apac.
Every Wednesday 10 PM local time, 2 hours, Production services."
MCP: Creating maintenance window across 3 Managed environments...
Configuration:
✓ Recurrence: Weekly on Wednesdays
✓ Duration: 2 hours
✓ Scope: Management Zone "Production"
Time zone handling:
⚠️ Note: You said "local time" but each cluster is in different timezone.
Options:
1. Use each cluster's local timezone (22:00 local)
2. Use same UTC time (22:00 UTC = different local times)
3. Specify exact times per cluster
Which approach?
User: "Option 1 - each cluster's local time 10 PM."
MCP: Creating environment-specific windows...
managed-us (UTC-5):
✓ Window: mw-us-wed-deploy
✓ Schedule: Thursdays 03:00-05:00 UTC (Wednesdays 22:00-00:00 local)
✓ Scope: 42 services in Production
managed-eu (UTC+1):
✓ Window: mw-eu-wed-deploy
✓ Schedule: Wednesdays 21:00-23:00 UTC (22:00-00:00 local)
✓ Scope: 38 services in Production
managed-apac (UTC+8):
✓ Window: mw-apac-wed-deploy
✓ Schedule: Wednesdays 14:00-16:00 UTC (22:00-00:00 local)
✓ Scope: 31 services in Production
All 3 maintenance windows created!
Each activates at 22:00 in its local timezone.
Summary:
- US: Wednesdays 10 PM EST (Thursdays 03:00 UTC)
- EU: Wednesdays 10 PM CET (21:00 UTC)
- APAC: Wednesdays 10 PM SGT (14:00 UTC)
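The local-to-UTC conversion in this scenario can be sketched with the fixed offsets quoted for each cluster. `localToUtc` is a hypothetical helper, and fixed whole-hour offsets ignore daylight saving time; a real implementation would more likely rely on IANA zone IDs. Note that for a cluster at UTC-5, 22:00 on Wednesday falls on Thursday in UTC.

```typescript
const DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"];

interface UtcSlot {
  day: string;
  hour: number;
}

// Convert a local weekday/hour to UTC, where local time = UTC + offsetHours.
function localToUtc(localDay: string, localHour: number, offsetHours: number): UtcSlot {
  const dayIndex = DAYS.indexOf(localDay);
  if (dayIndex < 0) {
    throw new Error(`Unknown day: ${localDay}`);
  }
  const rawHour = localHour - offsetHours;
  const dayShift = Math.floor(rawHour / 24);      // -1, 0, or +1 for whole-hour offsets
  const hour = ((rawHour % 24) + 24) % 24;
  const day = DAYS[(((dayIndex + dayShift) % 7) + 7) % 7];
  return { day, hour };
}

// Offsets as stated in the scenario (assumed fixed, no DST).
const clusters = [
  { name: "managed-us", offset: -5 },
  { name: "managed-eu", offset: 1 },
  { name: "managed-apac", offset: 8 },
];
for (const c of clusters) {
  const slot = localToUtc("Wednesday", 22, c.offset);
  console.log(`${c.name}: ${slot.day} ${slot.hour}:00 UTC`);
}
```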
Scenario 6: Fleet Update with Window Suggestion
User: "Schedule OneAgent update for all Production hosts this Sunday at midnight."
MCP: Scheduling OneAgent fleet update...
Fleet Update Configuration:
✓ Target: All Production hosts (52 hosts)
✓ Schedule: Sunday 2026-02-08 00:00 UTC
✓ Update version: OneAgent 1.285.0
✓ Rollout strategy: Host-by-host (safer, slower)
Estimated duration: 2-3 hours for 52 hosts
💡 RECOMMENDATION: Create maintenance window?
During OneAgent updates, you may see:
- Brief host connectivity issues
- Process group restarts
- Monitoring data gaps (1-2 min per host)
- False AVAILABILITY problems
Suggested maintenance window:
✓ Start: Sunday 00:00 UTC (same as fleet update)
✓ Duration: 4 hours (buffer for safety)
✓ Scope: All Production hosts (52 hosts)
This will suppress expected problems during the update.
Create this window?
User: "Yes, create it."
MCP: Creating coordinated maintenance window...
✓ Window: mw-fleet-oneagent-prod-20260208
✓ Schedule: Sunday 2026-02-08 00:00-04:00 UTC (4 hours)
✓ Scope: Production management zone (52 hosts)
✓ Linked to: Fleet update job fleet-update-oneagent-20260208
Setup complete:
✓ Fleet update scheduled
✓ Maintenance window scheduled
✓ Both linked for tracking
On Sunday at midnight:
1. Maintenance window activates (alerts suppressed)
2. OneAgent update begins
3. Rollout proceeds host-by-host
4. Window closes at 04:00 UTC (or earlier if you close manually)
You'll be notified when the fleet update completes.
Scenario 7: Deployment Finished Early - Close Window
User: "Great news, deployment finished in 45 minutes instead of 2 hours.
Close the maintenance window now."
MCP: Finding active maintenance window...
Active window: mw-recurring-prod-deploy
Started: 18:00 UTC (45 minutes ago)
Scheduled end: 20:00 UTC (1h 15min remaining)
Closing window early...
✓ Window closed: 2026-02-04 18:45 UTC
✓ Actual duration: 45 minutes (vs. 2 hours planned)
✓ Saved: 1h 15min of suppression time
✓ Comment added: "Window closed early - deployment completed ahead of schedule"
Maintenance window CLOSED.
✓ Alert suppression ENDED
✓ Normal monitoring RESUMED
✓ Problem detection active again
Note: Problems may take 1-2 minutes to re-evaluate and appear if issues exist.
Deployment took 62.5% less time than planned (45 minutes instead of 2 hours)! 🚀
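The figures in the closing summary follow from three timestamps. A minimal sketch, assuming ISO 8601 UTC inputs; `earlyCloseStats` and its field names are hypothetical:

```typescript
interface EarlyCloseStats {
  plannedMinutes: number;
  actualMinutes: number;
  savedMinutes: number;
  percentOfPlanned: number; // actual duration as a share of the planned duration
  percentShorter: number;   // how much shorter than planned the window was
}

function earlyCloseStats(startIso: string, scheduledEndIso: string, closedAtIso: string): EarlyCloseStats {
  const start = Date.parse(startIso);
  const plannedMinutes = (Date.parse(scheduledEndIso) - start) / 60000;
  const actualMinutes = (Date.parse(closedAtIso) - start) / 60000;
  if (plannedMinutes <= 0 || actualMinutes < 0 || actualMinutes > plannedMinutes) {
    throw new Error("Timestamps do not describe an early close");
  }
  const percentOfPlanned = (actualMinutes / plannedMinutes) * 100;
  return {
    plannedMinutes,
    actualMinutes,
    savedMinutes: plannedMinutes - actualMinutes,
    percentOfPlanned,
    percentShorter: 100 - percentOfPlanned,
  };
}

// Values from the scenario: started 18:00, scheduled end 20:00, closed 18:45 UTC.
const stats = earlyCloseStats(
  "2026-02-04T18:00:00Z",
  "2026-02-04T20:00:00Z",
  "2026-02-04T18:45:00Z",
);
console.log(stats);
```

With these inputs the window ran for 45 of 120 planned minutes, so 75 minutes of suppression were saved: 37.5% of the planned duration was used, which is 62.5% shorter than planned.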
Non-Functional Requirements
Performance
- Maintenance window creation should complete within 3 seconds
- Multi-environment operations should use parallel API calls
- Window queries should return within 2 seconds
- Scope validation (entity count) should complete within 5 seconds
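One way to meet the parallel-call requirement is to start one request per environment and collect a per-environment result, so that a failure on one cluster does not hide success on the others. This is a sketch under assumptions: `createInAllEnvironments`, `CreateFn`, and `EnvResult` are hypothetical names, and the actual API call is injected rather than shown.

```typescript
interface EnvResult {
  environment: string;
  ok: boolean;
  detail: string; // object ID on success, error message on failure
}

// Hypothetical signature: creates the window in one environment and resolves to its object ID.
type CreateFn = (environment: string) => Promise<string>;

function createInAllEnvironments(environments: string[], create: CreateFn): Promise<EnvResult[]> {
  // All calls are started before any is awaited, so they run concurrently.
  const tasks = environments.map((environment) =>
    create(environment)
      .then((objectId): EnvResult => ({ environment, ok: true, detail: objectId }))
      .catch((err: unknown): EnvResult => ({
        environment,
        ok: false,
        detail: err instanceof Error ? err.message : String(err),
      })),
  );
  return Promise.all(tasks);
}

// Stand-in for the real API call, used only to demonstrate the result shape.
const fakeCreate: CreateFn = (environment) =>
  environment === "managed-apac"
    ? Promise.reject(new Error("cluster unreachable"))
    : Promise.resolve(`object-id-for-${environment}`);

createInAllEnvironments(["managed-us", "managed-eu", "managed-apac"], fakeCreate).then((results) => {
  for (const r of results) {
    console.log(`${r.environment}: ${r.ok ? "created" : "failed"} (${r.detail})`);
  }
});
```

Catching per call keeps the order of results aligned with the order of environments, which makes the summary shown to the user straightforward to build.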
Security
- All API tokens remain server-side (not exposed to users)
- Validate user has appropriate Dynatrace permissions (`settings.read`, `settings.write`)
- Audit all maintenance window operations with user identity
- Prevent unauthorized scope expansion (e.g., a user cannot create a window for management zones they do not own)
Error Handling
- If entity selector matches zero entities, warn user (empty scope)
- If start time is in the past (except for retroactive emergency windows), confirm intent
- If overlapping windows exist for same scope, warn about redundancy
- Gracefully handle Settings 2.0 schema version differences between Managed versions
- If recurring pattern is invalid (e.g., "31st of every month"), suggest alternatives
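Several of these checks are simple enough to sketch as a pure function that returns warnings instead of throwing, leaving the decision to the user. All names here (`validateRequest`, `WindowRequest`, `ExistingWindow`) are hypothetical, and the overlap check assumes the caller has already filtered `existing` down to windows with the same scope.

```typescript
interface WindowRequest {
  start: string;              // ISO 8601
  end: string;                // ISO 8601
  matchedEntityCount: number; // result of a prior scope lookup
  dayOfMonth?: number;        // only set for monthly recurrence
}

interface ExistingWindow {
  name: string;
  start: string;
  end: string;
}

function validateRequest(req: WindowRequest, existing: ExistingWindow[], now: Date): string[] {
  const warnings: string[] = [];
  const start = Date.parse(req.start);
  const end = Date.parse(req.end);

  if (req.matchedEntityCount === 0) {
    warnings.push("Scope matches zero entities");
  }
  if (start < now.getTime()) {
    warnings.push("Start time is in the past; confirm this is intentional");
  }
  if (req.dayOfMonth !== undefined && req.dayOfMonth > 28) {
    warnings.push(`Day ${req.dayOfMonth} does not occur in every month; consider the last day of the month`);
  }
  for (const w of existing) {
    // Two intervals overlap when each starts before the other ends.
    if (start < Date.parse(w.end) && Date.parse(w.start) < end) {
      warnings.push(`Overlaps existing window ${w.name}`);
    }
  }
  return warnings;
}

const warnings = validateRequest(
  { start: "2026-02-04T18:00:00Z", end: "2026-02-04T20:00:00Z", matchedEntityCount: 0, dayOfMonth: 31 },
  [{ name: "mw-recurring-prod-deploy", start: "2026-02-04T19:00:00Z", end: "2026-02-04T21:00:00Z" }],
  new Date("2026-02-04T12:00:00Z"),
);
console.log(warnings);
```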
Observability
- Log all maintenance window lifecycle operations (create, update, extend, close, delete)
- Track metrics: windows created per day, average duration, extension frequency
- Monitor API latency and error rates
- Alert if window creation fails repeatedly
Usability
- Natural language flexibility: accept "now", "tomorrow", "next Tuesday", "in 2 hours", etc.
- Support timezone-aware scheduling (UTC, local, specific timezones)
- Preview scope before creating window (show entity count)
- Confirm high-impact operations (e.g., multi-environment, long duration, broad scope)
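In practice the NLM would do most of the phrase interpretation, but a deterministic fallback for the listed phrases is easy to sketch. `resolveWhen` is a hypothetical helper that works in UTC only; it treats "tomorrow" as the same time one day later and "next Tuesday" as the nearest Tuesday strictly after today, both of which are assumptions rather than stated requirements.

```typescript
const WEEKDAYS = ["sunday", "monday", "tuesday", "wednesday", "thursday", "friday", "saturday"];
const MINUTE_MS = 60000;
const HOUR_MS = 60 * MINUTE_MS;
const DAY_MS = 24 * HOUR_MS;

// Resolve a small set of relative phrases against a reference time (UTC).
function resolveWhen(phrase: string, now: Date): Date {
  const text = phrase.trim().toLowerCase();
  if (text === "now") {
    return new Date(now.getTime());
  }
  if (text === "tomorrow") {
    return new Date(now.getTime() + DAY_MS);
  }
  const rel = /^in (\d+) (minute|hour|day)s?$/.exec(text);
  if (rel) {
    const amount = Number(rel[1]);
    const unit = rel[2] === "minute" ? MINUTE_MS : rel[2] === "hour" ? HOUR_MS : DAY_MS;
    return new Date(now.getTime() + amount * unit);
  }
  const next = /^next (\w+)$/.exec(text);
  if (next) {
    const target = WEEKDAYS.indexOf(String(next[1]));
    if (target >= 0) {
      // Days until the target weekday; a same-day match means one week ahead.
      const delta = ((target - now.getUTCDay() + 7) % 7) || 7;
      return new Date(now.getTime() + delta * DAY_MS);
    }
  }
  throw new Error(`Unrecognized time phrase: ${phrase}`);
}

const reference = new Date("2026-02-04T18:00:00Z"); // a Wednesday
console.log(resolveWhen("in 2 hours", reference).toISOString());
console.log(resolveWhen("next Tuesday", reference).toISOString());
```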
Audit and Compliance
- Every maintenance window operation must include:
- User identity (who created/modified)
- Timestamp (when)
- Reason/description (why)
- Environment and scope (what/where)
- Support exporting window history for compliance review
- Preserve audit trail even after window is deleted
Out of Scope (Future Stories)
The following related features are not included in this story but are candidates for future work:
- Automatic Window Creation from CI/CD - Directly trigger window creation from Jenkins/GitLab/GitHub Actions
- Smart Window Duration Prediction - ML-based prediction of optimal window length based on deployment complexity
- Maintenance Window Templates - Pre-defined templates for common scenarios (weekly deploy, monthly maintenance, migration waves)
- Window Collision Detection - Prevent scheduling conflicting windows; suggest alternative times
- Notification Preferences - Custom notification behavior during windows (suppress all vs. critical only)
- Window Effectiveness Metrics - Track how many problems were suppressed, cost savings, etc.
- Approval Workflows - Require approval before creating production maintenance windows
- Window Handoff Automation - Automatically extend or adjust windows based on real-time deployment progress
Dependencies
Dynatrace APIs Required
- Settings 2.0 API (`/api/v2/settings/...`)
  - Schema: `builtin:alerting.maintenance-window`
  - GET `/api/v2/settings/schemas/builtin:alerting.maintenance-window` - Schema definition
  - POST `/api/v2/settings/objects` - Create window
  - PUT `/api/v2/settings/objects/{objectId}` - Update window
  - DELETE `/api/v2/settings/objects/{objectId}` - Delete window
  - GET `/api/v2/settings/objects?schemaIds=builtin:alerting.maintenance-window` - Query windows
- Entities API v2 (`/api/v2/entities`) - Validate entity selectors and count entities
- Events API v2 (`/api/v2/events/ingest`) - Link to deployment/migration events (optional)
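To make the create call concrete, the sketch below assembles (but does not send) a Settings 2.0 request. The Settings API accepts an array of objects, each with `schemaId`, `scope`, and `value`, and authenticates with an `Api-Token` header. The helper name `buildCreateWindowRequest`, the host, the environment ID, and the token are placeholders, and the `value` passed in is only a fragment; its real shape must follow the schema returned by the schema endpoint.

```typescript
interface SettingsRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build the HTTP request for creating one maintenance window settings object.
function buildCreateWindowRequest(baseUrl: string, apiToken: string, value: object): SettingsRequest {
  return {
    method: "POST",
    url: `${baseUrl.replace(/\/+$/, "")}/api/v2/settings/objects`,
    headers: {
      Authorization: `Api-Token ${apiToken}`,
      "Content-Type": "application/json",
    },
    // The endpoint takes an array, so several objects could be created in one call.
    body: JSON.stringify([
      { schemaId: "builtin:alerting.maintenance-window", scope: "environment", value },
    ]),
  };
}

// Placeholder Managed URL in the form https://{host}/e/{environment-id}.
const request = buildCreateWindowRequest(
  "https://managed.example.com/e/ENVIRONMENT-ID",
  "PLACEHOLDER-TOKEN",
  { enabled: true, generalProperties: { name: "mw-eu-wed-deploy", maintenanceType: "PLANNED" } },
);
console.log(request.method, request.url);
```

Returning a plain request description keeps the builder testable without network access; the caller can hand it to whatever HTTP client the server already uses.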
Existing MCP Server Capabilities
- Multi-environment connection management (already implemented)
- Authentication via API tokens (already implemented)
- NLM integration for natural language parsing
- Event ingestion (from previous user story) - for linking windows to events
- Problem lifecycle management (from previous user story) - for coordinating window closure with problem resolution
External Integrations (Future)
- CI/CD tools (Jenkins, GitLab CI, GitHub Actions) - trigger window creation
- Fleet management system (future story) - coordinate with OneAgent/ActiveGate updates
- Calendar systems (Google Calendar, Outlook) - sync maintenance windows to team calendars
Success Metrics
Efficiency Gains
- Target: Reduce maintenance window creation time by 80% compared to manual UI workflow
- Measure: Time from user request to active window (avg. <10 seconds via MCP vs. 2-3 minutes in UI)
Alert Fatigue Reduction
- Target: 70% reduction in unnecessary alerts during deployments
- Measure: Compare problem notification count during windowed vs. non-windowed deployments
Migration Support
- Target: 95% of migration waves use maintenance windows
- Measure: Track migration events with linked maintenance windows
User Adoption
- Target: 80% of users create at least one maintenance window via MCP within first month
- Measure: Track `create-maintenance-window` tool usage
Recurring Window Efficiency
- Target: 60% of maintenance windows are recurring (not one-time)
- Measure: Track recurring vs. one-time window creation ratio
Multi-Environment Consistency
- Target: 50% of windows are created across multiple Managed clusters simultaneously
- Measure: Track multi-environment window creation
Testing Strategy
Unit Tests
- Validate Settings 2.0 payload construction for maintenance windows
- Test natural language parsing for schedules (relative and absolute times)
- Verify entity selector construction from tags and management zones
- Test recurrence pattern generation (daily, weekly, monthly)
- Validate scope validation logic (entity count, empty scope detection)
Integration Tests
- E2E: Create scheduled window, verify in Dynatrace, let it activate, verify suppression
- E2E: Create ad-hoc window immediately, verify it's active within seconds
- E2E: Extend active window, verify new end time
- E2E: Close window early, verify suppression ends
- E2E: Create recurring window, verify next occurrences calculated correctly
- Cross-environment: Create same window in multiple Managed clusters
Contract Tests
- Validate against latest `builtin:alerting.maintenance-window` schema from Managed
- Test with different schema versions (ensure backward compatibility)
- Verify all required fields are included
- Test optional field combinations
User Acceptance Tests
- Scenario: DevOps engineer creates weekly deployment window
- Scenario: SRE creates emergency window during unplanned outage
- Scenario: Migration specialist creates wave-based migration windows
- Scenario: Platform engineer extends window mid-deployment
- Scenario: Team lead queries upcoming windows across environments
Load Tests
- Create 100+ windows across multiple environments
- Verify performance remains acceptable (< 5 seconds per operation)
- Test parallel window creation (multiple users simultaneously)
Risks and Mitigations
| Risk | Impact | Mitigation |
|---|---|---|
| Incorrect scope suppresses too many alerts | High - could hide critical production issues | Always preview entity count before creation; require confirmation for broad scope (>50 entities) |
| Maintenance window forgotten/not closed | Medium - extended suppression hides real problems | Send reminders when window approaches end time; auto-close option after deployment events |
| Timezone confusion causes wrong timing | High - window activates at wrong time | Always confirm timezone; show time in UTC AND local time; support explicit timezone specification |
| Recurring pattern miscalculation | Medium - windows at wrong times | Show preview of next 3-5 occurrences; allow user to verify before creation |
| Schema version differences across Managed clusters | Medium - window creation may fail on older clusters | Detect Managed version; adjust payload to compatible schema version; warn about unsupported features |
| Overlapping windows create confusion | Low - redundant suppression | Detect overlaps; warn user; suggest consolidating windows |
| Retroactive windows don't suppress existing problems | Low - expectations mismatch | Clearly communicate that retroactive windows prevent NEW alerts, don't clear existing problems |
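Two of the mitigations above, the occurrence preview and the broad-scope confirmation, can be sketched as small pure functions. The names are hypothetical, the weekday numbering (1 = Monday, 7 = Sunday) follows the schema appendix, and everything is computed in UTC for simplicity.

```typescript
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Return the next `count` start times of a weekly window, as ISO 8601 strings.
function nextWeeklyOccurrences(from: Date, isoDay: number, startHourUtc: number, count: number): string[] {
  if (isoDay < 1 || isoDay > 7) {
    throw new Error("isoDay must be between 1 (Monday) and 7 (Sunday)");
  }
  const jsDay = isoDay % 7; // JavaScript numbers Sunday as 0
  const first = new Date(
    Date.UTC(from.getUTCFullYear(), from.getUTCMonth(), from.getUTCDate(), startHourUtc),
  );
  const delta = (jsDay - first.getUTCDay() + 7) % 7;
  first.setUTCDate(first.getUTCDate() + delta);
  if (first.getTime() <= from.getTime()) {
    first.setUTCDate(first.getUTCDate() + 7); // today's slot has already passed
  }
  const occurrences: string[] = [];
  for (let i = 0; i < count; i++) {
    occurrences.push(new Date(first.getTime() + i * WEEK_MS).toISOString());
  }
  return occurrences;
}

// Broad-scope rule from the table: more than 50 entities requires confirmation.
function needsScopeConfirmation(entityCount: number): boolean {
  return entityCount > 50;
}

const preview = nextWeeklyOccurrences(new Date("2026-02-04T12:00:00Z"), 3, 21, 3);
console.log(preview, needsScopeConfirmation(52));
```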
Appendix: Maintenance Window Schema Reference
builtin:alerting.maintenance-window
Scope: environment
Key Properties:
```typescript
{
  enabled: boolean,                 // Active or disabled
  generalProperties: {
    name: string,                   // Display name
    description?: string,           // Optional description
    maintenanceType: "PLANNED" | "UNPLANNED",
    suppression: "DETECT_PROBLEMS_DONT_ALERT" | "DONT_DETECT_PROBLEMS"
  },
  schedule: {
    scheduleType: "ONCE" | "DAILY" | "WEEKLY" | "MONTHLY",
    start: string,                  // ISO 8601 timestamp
    end: string,                    // ISO 8601 timestamp
    zoneId: string,                 // Timezone (e.g., "UTC", "America/New_York")
    recurrence?: {
      daysOfWeek?: number[],        // For weekly (1=Monday, 7=Sunday)
      dayOfMonth?: number,          // For monthly
      durationMinutes?: number      // Window length
    }
  },
  scope: {
    matches: [{
      type: "MANAGEMENT_ZONE" | "ENTITY" | "ENTITY_NAME" | "TAG",
      managementZoneId?: string,
      entityIds?: string[],
      tagKey?: string,
      tagValue?: string
    }]
  }
}
```

Full schema available via: `GET /api/v2/settings/schemas/builtin:alerting.maintenance-window`
Suppression Options:
- `DETECT_PROBLEMS_DONT_ALERT` - Problems are detected but don't trigger notifications
- `DONT_DETECT_PROBLEMS` - Problems are not detected at all during the window
Supported Scope Types:
- Management Zone ID
- Explicit entity IDs
- Tag-based matching (key + value)
- Entity name pattern matching
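As an illustration of how the key properties fit together, the sketch below types a subset of them and builds a weekly window value. The field names are taken from the summary in this appendix and have not been checked against a live schema, so treat them as assumptions and validate against the schema endpoint. `weeklyWindow`, `ScopeMatch`, and the management zone ID are hypothetical.

```typescript
type ScopeMatch =
  | { type: "MANAGEMENT_ZONE"; managementZoneId: string }
  | { type: "ENTITY"; entityIds: string[] }
  | { type: "TAG"; tagKey: string; tagValue?: string };

interface MaintenanceWindowValue {
  enabled: boolean;
  generalProperties: {
    name: string;
    description?: string;
    maintenanceType: "PLANNED" | "UNPLANNED";
    suppression: "DETECT_PROBLEMS_DONT_ALERT" | "DONT_DETECT_PROBLEMS";
  };
  schedule: {
    scheduleType: "ONCE" | "DAILY" | "WEEKLY" | "MONTHLY";
    start: string;
    end: string;
    zoneId: string;
    recurrence?: { daysOfWeek?: number[]; dayOfMonth?: number; durationMinutes?: number };
  };
  scope: { matches: ScopeMatch[] };
}

// Build a planned weekly window; start/end describe the first occurrence in UTC.
function weeklyWindow(
  name: string,
  daysOfWeek: number[],
  start: string,
  end: string,
  matches: ScopeMatch[],
): MaintenanceWindowValue {
  for (const d of daysOfWeek) {
    if (!Number.isInteger(d) || d < 1 || d > 7) {
      throw new Error(`daysOfWeek must use 1 (Monday) to 7 (Sunday), got ${d}`);
    }
  }
  const durationMinutes = (Date.parse(end) - Date.parse(start)) / 60000;
  if (!(durationMinutes > 0)) {
    throw new Error("end must be after start");
  }
  return {
    enabled: true,
    generalProperties: {
      name,
      maintenanceType: "PLANNED",
      suppression: "DETECT_PROBLEMS_DONT_ALERT",
    },
    schedule: {
      scheduleType: "WEEKLY",
      start,
      end,
      zoneId: "UTC",
      recurrence: { daysOfWeek, durationMinutes },
    },
    scope: { matches },
  };
}

const euWindow = weeklyWindow(
  "mw-eu-wed-deploy",
  [3],
  "2026-02-04T21:00:00Z",
  "2026-02-04T23:00:00Z",
  [{ type: "MANAGEMENT_ZONE", managementZoneId: "HYPOTHETICAL-MZ-ID" }],
);
console.log(JSON.stringify(euWindow, null, 2));
```

The discriminated union for `ScopeMatch` is a design choice of this sketch: it lets the compiler reject, for example, a `TAG` match that carries `entityIds`.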