What feature would you like to be added?
A selective restart strategy for SparkApplication updates that avoids unnecessary full application restarts when only specific components (executors) or configurations (services/ingress) are modified.
Currently, any change to the SparkApplication spec triggers a complete application restart by setting the status to ApplicationStateInvalidating. This proposal introduces intelligent change detection and selective update mechanisms that respect Spark's Driver-Executor architecture constraints.
Why is this needed?
Current Pain Points
- Inefficient restarts: Changing `executor.priorityClassName` restarts the entire application, including the driver
- Production downtime: Long-running Spark Streaming applications lose all state during unnecessary restarts
- Resource waste: Healthy driver pods are terminated when only executor changes are needed
- Operational friction: Simple scaling or configuration updates require full application cycles
Business Impact
- Streaming applications: Critical for 24/7 data pipelines that cannot afford frequent restarts
- Cost optimization: Avoid terminating expensive, long-running computations
- DevOps efficiency: Enable faster configuration updates and scaling operations
- Resource utilization: Keep healthy components running during partial updates
Technical Context
From internal/controller/sparkapplication/event_filter.go:
```go
if !equality.Semantic.DeepEqual(oldApp.Spec, newApp.Spec) {
	// Force-set the application status to Invalidating which handles clean-up and application re-run.
	newApp.Status.AppState.State = v1beta2.ApplicationStateInvalidating
	// ... complete restart for ANY spec change
}
```

Describe the solution you would like
🚨 Critical Architectural Constraint
Spark Driver cannot be restarted independently because:
- Driver maintains application state (SparkContext, task scheduling, RDD lineage)
- Driver-Executor connections are stateful and cannot be reestablished
- Executors detect Driver disconnection and terminate automatically
- New Driver instance loses all knowledge of existing Executors
Proposed Change Detection Categories
1. Full Restart Required (Driver-Related Changes)
- Driver Configuration: `driver.priorityClassName`, `driver.cores`, `driver.memory`, `driver.javaOptions`
- Application Logic: `type`, `mainClass`, `mainApplicationFile`, `arguments`
- Global Configuration: `sparkConf`, `hadoopConf` (most keys)
- Core Infrastructure: `sparkVersion`, `mode`, `deps`, `volumes`
- Shared Resources: `image`, `imagePullPolicy` (if the change affects both driver and executors; a detection sketch follows this list)
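As a rough illustration of how this driver-scoped category could be detected: the helper name and field grouping below are proposals from this issue, not existing operator code, and hot-updatable fields such as `driver.serviceAnnotations` would need to be masked out before the comparison. Imports (`v1beta2`, `equality`) are the same as in the event filter snippet above.

```go
// Proposed helper, sketched for this issue: true when any field that can only
// take effect through a new driver has changed, forcing a full restart.
func (f *EventFilter) hasDriverChanges(oldApp, newApp *v1beta2.SparkApplication) bool {
	// NOTE: hot-updatable fields (e.g. serviceAnnotations, serviceLabels) must be
	// stripped from the driver specs before this comparison, otherwise they would
	// incorrectly escalate to a full restart.
	if !equality.Semantic.DeepEqual(oldApp.Spec.Driver, newApp.Spec.Driver) {
		return true // driver cores, memory, priorityClassName, javaOptions, ...
	}
	// Application logic and global configuration also require a new driver.
	return oldApp.Spec.Type != newApp.Spec.Type ||
		!equality.Semantic.DeepEqual(oldApp.Spec.MainClass, newApp.Spec.MainClass) ||
		!equality.Semantic.DeepEqual(oldApp.Spec.MainApplicationFile, newApp.Spec.MainApplicationFile) ||
		!equality.Semantic.DeepEqual(oldApp.Spec.Arguments, newApp.Spec.Arguments) ||
		!equality.Semantic.DeepEqual(oldApp.Spec.SparkConf, newApp.Spec.SparkConf) ||
		!equality.Semantic.DeepEqual(oldApp.Spec.HadoopConf, newApp.Spec.HadoopConf)
}
```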
2. Executor Dynamic Replacement (Spark 3.0+ Dynamic Allocation)
- Scaling: `executor.instances`
- Resource Updates: `executor.cores`, `executor.memory`, `executor.coreLimit`
- Pod Configuration: `executor.priorityClassName`, `executor.nodeSelector`
- Lifecycle: `executor.deleteOnTermination`, `executor.javaOptions`
- Dynamic Allocation Settings: `dynamicAllocation.*` (prerequisites sketched after this list)
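This path only works if dynamic allocation is already active in the running application, since a driver with static allocation cannot pick up new executor settings. A minimal sketch of the standard Spark properties involved follows; the values are illustrative, and in practice the CRD's `dynamicAllocation` block would be what sets them.

```go
// Standard Spark dynamic allocation properties; values are illustrative.
// On Kubernetes, shuffle tracking replaces the external shuffle service.
var dynamicAllocationConf = map[string]string{
	"spark.dynamicAllocation.enabled":                 "true",
	"spark.dynamicAllocation.shuffleTracking.enabled": "true",
	"spark.dynamicAllocation.minExecutors":            "1",
	"spark.dynamicAllocation.maxExecutors":            "10",
}
```

If these properties are not already in effect, an executor-only change would have to fall back to the full-restart path.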
3. Hot Updates (No Pod Restart Required)
- Service Configuration: `driver.serviceAnnotations`, `driver.serviceLabels`
- Ingress Configuration: `sparkUIOptions`, `driverIngressOptions`
- Monitoring: `monitoring.*` (if external)
- Metadata: `labels`, `annotations` (on the SparkApplication resource; a detection sketch follows this list)
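For the hot-update category, a minimal detection sketch: the helper name is a proposal from this issue, and the field set simply mirrors the list above as I read the current v1beta2 spec.

```go
// Proposed helper, sketched for this issue: true when only Service/Ingress-facing
// fields differ, i.e. nothing that is baked into a running pod spec.
func (f *EventFilter) hasServiceChanges(oldApp, newApp *v1beta2.SparkApplication) bool {
	return !equality.Semantic.DeepEqual(oldApp.Spec.Driver.ServiceAnnotations, newApp.Spec.Driver.ServiceAnnotations) ||
		!equality.Semantic.DeepEqual(oldApp.Spec.Driver.ServiceLabels, newApp.Spec.Driver.ServiceLabels) ||
		!equality.Semantic.DeepEqual(oldApp.Spec.SparkUIOptions, newApp.Spec.SparkUIOptions) ||
		!equality.Semantic.DeepEqual(oldApp.Spec.DriverIngressOptions, newApp.Spec.DriverIngressOptions)
}
```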
Implementation Approach
Enhanced Event Filter
```go
func (f *EventFilter) detectSpecChanges(oldApp, newApp *v1beta2.SparkApplication) UpdateScope {
	if f.hasDriverChanges(oldApp, newApp) {
		return UpdateScopeFull // Driver changes require full restart
	}
	if f.hasExecutorChanges(oldApp, newApp) {
		return UpdateScopeExecutorDynamic // Use Dynamic Allocation
	}
	if f.hasServiceChanges(oldApp, newApp) {
		return UpdateScopeHot // Update K8s resources only
	}
	return UpdateScopeNone
}
```

New Application States
```go
const (
	ApplicationStateExecutorScaling ApplicationStateType = "EXECUTOR_SCALING"
	ApplicationStateHotUpdating     ApplicationStateType = "HOT_UPDATING"
)
```

Example Use Cases
✅ Executor Scaling/Replacement
```yaml
spec:
  executor:
    instances: 5                        # 3→5 scale up
    priorityClassName: "high-priority"  # Requires replacement
```

Result: Dynamic Allocation gradually replaces executors; the driver keeps running.
✅ Service Hot Update
```yaml
spec:
  driver:
    serviceAnnotations:
      monitoring: "enabled"
  sparkUIOptions:
    servicePort: 8080
```

Result: Update the Kubernetes Service/Ingress directly; no pod restart.
⚠️ Driver Change (Full Restart)
```yaml
spec:
  driver:
    priorityClassName: "critical"  # Driver config change
```

Result: Complete application restart (current behavior).
Describe alternatives you have considered
Alternative 1: External Scaling Controller
Create a separate controller that handles Dynamic Allocation outside the SparkApplication CRD.
- Pros: Cleaner separation of concerns
- Cons: Additional complexity, fragmented user experience
Alternative 2: Annotation-Based Hints
Use annotations to hint which changes should trigger selective updates.
- Pros: User control over restart behavior
- Cons: Requires user understanding of Spark internals, error-prone
Alternative 3: Immutable SparkApplication with Separate Scaling Resource
Make SparkApplication immutable and create separate ExecutorScale resource.
- Pros: Clear API boundaries
- Cons: Breaking change, increased API surface
Alternative 4: Keep Current Behavior
Continue with full restarts for all changes.
- Pros: Simple, predictable
- Cons: Poor operational experience for production workloads
Additional context
Technical Implementation Details
Executor Dynamic Allocation Mechanism
```go
// ExecutorChanges is a proposed type summarizing the diff between the old and new executor specs.
func (r *Reconciler) updateExecutorConfiguration(ctx context.Context, app *v1beta2.SparkApplication, changes ExecutorChanges) error {
	// Scale down old executors
	if err := r.requestExecutorRemoval(ctx, app, changes.ExecutorsToRemove); err != nil {
		return err
	}
	// Scale up with new configuration
	if err := r.requestExecutorAddition(ctx, app, changes.NewExecutorConfig); err != nil {
		return err
	}
	return nil
}
```

Hot Update Mechanism
```go
func (r *Reconciler) hotUpdateServices(ctx context.Context, app *v1beta2.SparkApplication) error {
	// serviceKey is assumed to identify the driver service owned by this application (lookup omitted here).
	service := &corev1.Service{}
	if err := r.client.Get(ctx, serviceKey, service); err != nil {
		return err
	}
	service.Annotations = app.Spec.Driver.ServiceAnnotations
	return r.client.Update(ctx, service)
}
```

Limitations and Constraints
What CANNOT be done:
- Driver restart with executor preservation: Architecturally impossible
- Live executor pod configuration changes: Pod specs are immutable
- Cross-pod network reconfiguration: Requires full restart
What CAN be done:
- Executor replacement via Dynamic Allocation: Safe and supported by Spark
- Service/Ingress hot updates: No impact on running pods
- Metadata updates: Labels and annotations on owned K8s resources (a minimal sketch follows this list)
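For the metadata case, a minimal sketch of a hot update that never touches pods, assuming a controller-runtime client and a hypothetical helper name; a merge patch keeps the change to the owned object as small as possible.

```go
// Hypothetical sketch: propagate label changes to an already-created object
// (e.g. the driver Service) with a merge patch; running pods are never touched.
func (r *Reconciler) hotUpdateLabels(ctx context.Context, obj client.Object, labels map[string]string) error {
	patch := client.MergeFrom(obj.DeepCopyObject().(client.Object))
	if obj.GetLabels() == nil {
		obj.SetLabels(map[string]string{})
	}
	for k, v := range labels {
		obj.GetLabels()[k] = v
	}
	return r.client.Patch(ctx, obj, patch)
}
```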
Implementation Phases
- Phase 1: Change detection and categorization logic
- Phase 2: Dynamic Allocation integration for executor updates
- Phase 3: Hot update mechanism for service configurations
- Phase 4: Configuration options and rollback mechanisms
- Phase 5: Comprehensive testing and documentation
Backward Compatibility
- Default behavior remains unchanged (full restart)
- Selective restart enabled via annotation: `spark-operator.kubeflow.org/restart-strategy: "selective"` (opt-in check sketched below)
- All existing applications continue working without modification
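A sketch of the opt-in check, using the annotation key proposed above; the helper name is illustrative and nothing here exists in the operator today.

```go
// Annotation key proposed in this issue; not an existing operator annotation.
const restartStrategyAnnotation = "spark-operator.kubeflow.org/restart-strategy"

// selectiveRestartEnabled is a hypothetical helper: unless the user explicitly
// opts in, the event filter keeps today's full-restart behavior.
func selectiveRestartEnabled(app *v1beta2.SparkApplication) bool {
	return app.Annotations[restartStrategyAnnotation] == "selective"
}
```

Anything other than `"selective"`, including a missing annotation, keeps the current behavior, so existing applications are unaffected.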
Requirements
- Spark Version: 3.5.2 (for Dynamic Allocation)
- Kubernetes Version: 1.31
- Spark Operator: v2.2.0 (current main branch)
Related Work
- Kubernetes StatefulSet Rolling Updates
- Apache Spark Dynamic Allocation
- Similar selective restart patterns in Kafka Operator, Elasticsearch Operator
This feature would significantly improve the operational experience for production Spark workloads by providing fine-grained control over restart behavior while respecting Spark's architectural constraints.