CronJob-Scale-Down-Operator

A Kubernetes operator that automatically scales down Deployments and StatefulSets during specific time windows (e.g., at night or on weekends) to save resources and costs.

Features

  • 🕒 Cron-based Scheduling: Uses standard cron expressions with second precision
  • 🌍 Timezone Support: Configure schedules in any timezone
  • 📈 Flexible Scaling: Scale down and up on different schedules
  • 🎯 Multiple Resource Types: Supports Deployments and StatefulSets for scaling
  • 🧹 Resource Cleanup: Automatically delete test resources based on annotations
  • 🏷️ Cleanup-Only Mode: Pure cleanup functionality without scaling any target resources
  • 📊 Status Tracking: Monitor last execution times and current replica counts
  • 🌐 Web UI Dashboard: Built-in web interface to monitor all cron jobs and their status
  • 📈 Prometheus Metrics: Comprehensive metrics for scaling operations, cleanup activities, and system health
  • ⚡ Efficient: Only reconciles when needed, with smart requeue timing
  • 🛡️ Safe Testing: Dry-run mode for cleanup operations
  • 🔧 Graceful Error Handling: Continues operation even when target resources are missing

Quick Start

Prerequisites

  • Kubernetes cluster (v1.16+)
  • kubectl configured
  • Cluster admin permissions

Installation

Option 1: Using Helm (Recommended)

📦 Helm charts have been migrated to a dedicated repository for better management.

  1. Add the charts repository:

    helm repo add cronschedules https://cronschedules.github.io/charts
    helm repo update
  2. Install the operator:

    helm install cronjob-scale-down-operator cronschedules/cronjob-scale-down-operator
  3. Install with custom values (a values-file equivalent is sketched after these steps):

    helm install cronjob-scale-down-operator cronschedules/cronjob-scale-down-operator \
      --set image.tag=0.3.0 \
      --set webUI.enabled=true \
      --set resources.requests.memory=128Mi
  4. Verify installation:

    kubectl get pods -l app.kubernetes.io/name=cronjob-scale-down-operator
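
The --set overrides from step 3 can also be kept in a values file. A minimal sketch (the key names simply mirror the flags above; see the charts repository for the full values schema):

# my-values.yaml -- equivalent to the --set flags in step 3
image:
  tag: "0.3.0"
webUI:
  enabled: true
resources:
  requests:
    memory: 128Mi

helm install cronjob-scale-down-operator cronschedules/cronjob-scale-down-operator -f my-values.yaml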

📖 Chart Documentation: For detailed Helm chart documentation, values, and configuration options, visit the Charts Repository.

Option 2: Using Container Image

The operator is available as a pre-built container image from multiple registries:

# From Docker Hub:
docker pull cronschedules/cronjob-scale-down-operator:0.3.0

# From GitHub Container Registry:
docker pull ghcr.io/cronschedules/cronjob-scale-down-operator:0.3.0

Use these images in your custom deployments or with the provided Helm chart.

Option 3: Using kubectl

Install the CRDs, RBAC, and operator manifests directly from this repository:

kubectl apply -f config/crd/bases/
kubectl apply -f config/rbac/
kubectl apply -f config/manager/
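
After applying the raw manifests, the controller should come up in the cronjob-scale-down-operator-system namespace (the namespace assumed by the commands later in this README); verify with:

kubectl get pods -n cronjob-scale-down-operator-system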

Quick Test

  1. Create a test deployment:

    kubectl apply -f examples/test-deployment.yaml
  2. Apply a scaling schedule:

    kubectl apply -f examples/quick-test.yaml
  3. Monitor the scaling:

    kubectl get cronjobscaledown -w
    kubectl get deployment nginx-test -w
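
Since quick-test.yaml fires every minute (see the examples table below), replica changes should show up within a minute or two. The operator logs show each scaling decision:

kubectl logs -n cronjob-scale-down-operator-system deployment/cronjob-scale-down-operator-controller-manager -f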

📦 Charts Repository Migration

Important Notice: Helm charts have been migrated to a dedicated repository for better management and hosting.

Migration Details

  • Previous Location: Charts were located in the /charts directory of this repository
  • New Location: cronschedules/charts repository
  • Helm Repository URL: https://cronschedules.github.io/charts

Benefits of Migration

  • Centralized Management: All charts in one dedicated repository
  • Proper Hosting: GitHub Pages hosting for the Helm repository
  • Improved CI/CD: Better error handling and comprehensive testing
  • Security: Automated security scanning with Checkov
  • Standards: Follows Helm chart repository best practices
  • Automation: Fully automated releases and index updates

For Existing Users

If you were using the old chart location, please update your installation:

# Old way (deprecated)
# helm install cronjob-scale-down-operator ./charts/cronjob-scale-down-operator

# New way (recommended)
helm repo add cronschedules https://cronschedules.github.io/charts
helm repo update
helm install cronjob-scale-down-operator cronschedules/cronjob-scale-down-operator

Chart Documentation

For comprehensive chart documentation, configuration options, values, and advanced usage, see the cronschedules/charts repository.

Examples

The examples/ directory contains various use cases:

| Example | Description | Schedule |
| --- | --- | --- |
| quick-test.yaml | Immediate testing | Every minute |
| basic-daily-schedule.yaml | Production workload | 10 PM → 6 AM daily |
| weekend-shutdown.yaml | Weekend cost savings | Friday 6 PM → Monday 8 AM |
| development-testing.yaml | Dev environment | Every 30/45 seconds |
| multi-timezone.yaml | Global deployments | Multiple timezones |
| statefulset-example.yaml | Database scaling | StatefulSet support |
| resource-cleanup-example.yaml | Resource cleanup | Combined scaling + cleanup |
| cleanup-only-example.yaml | Cleanup only | Every 6 hours |
| orphan-cleanup-example.yaml | Orphan resource cleanup | Daily at 3 AM |

Cleanup-Only Mode

The operator supports a cleanup-only mode where it manages resource cleanup without scaling any target resources. This is perfect for environments where you need automated cleanup of test resources, temporary objects, or expired configurations.

When to Use Cleanup-Only Mode

  • CI/CD Pipelines: Automatically clean up test resources after builds
  • Development Environments: Remove temporary test objects on a schedule
  • Resource Management: Clean up expired ConfigMaps, Secrets, or test deployments
  • Cost Optimization: Remove unused resources to save cluster costs

Cleanup-Only Configuration

apiVersion: cronschedules.elbazi.co/v1
kind: CronJobScaleDown
metadata:
  name: cleanup-only-job
  namespace: default
spec:
  # No targetRef needed for cleanup-only mode
  cleanupSchedule: "0 0 */6 * * *"  # Every 6 hours
  cleanupConfig:
    annotationKey: "test.example.com/cleanup-after"
    resourceTypes:
      - "ConfigMap"
      - "Secret"
      - "Service"
      - "Deployment"
    namespaces:
      - "test"
      - "staging"
    labelSelector:
      environment: "test"
    dryRun: false
  timeZone: "UTC"
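
The cleanup-only manifest above can be exercised safely by first setting dryRun: true (see the safety notes under Orphan Resource Cleanup), then checking what the operator reports:

kubectl apply -f cleanup-only-job.yaml   # hypothetical filename for the manifest above
kubectl describe cronjobscaledown cleanup-only-job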

Combined Scaling + Cleanup

You can also combine scaling and cleanup in a single resource:

apiVersion: cronschedules.elbazi.co/v1
kind: CronJobScaleDown
metadata:
  name: combined-scaler-cleanup
  namespace: default
spec:
  # Scaling configuration
  targetRef:
    name: my-app
    namespace: default
    kind: Deployment
    apiVersion: apps/v1
  scaleDownSchedule: "0 0 22 * * *"  # Scale down at 10 PM
  scaleUpSchedule: "0 0 6 * * *"     # Scale up at 6 AM
  
  # Cleanup configuration
  cleanupSchedule: "0 0 2 * * *"     # Clean up at 2 AM
  cleanupConfig:
    annotationKey: "cleanup-after"
    resourceTypes: ["ConfigMap", "Secret"]
    dryRun: false
  timeZone: "UTC"

Configuration

CronJobScaleDown Spec

apiVersion: cronschedules.elbazi.co/v1
kind: CronJobScaleDown
metadata:
  name: my-scaler
  namespace: default
spec:
  # Target resource to scale
  targetRef:
    name: my-deployment
    namespace: default
    kind: Deployment  # or StatefulSet
    apiVersion: apps/v1
  
  # When to scale down (cron format with seconds)
  scaleDownSchedule: "0 0 22 * * *"  # 10 PM daily
  
  # When to scale up (optional)
  scaleUpSchedule: "0 0 6 * * *"     # 6 AM daily
  
  # Timezone for schedule interpretation
  timeZone: "UTC"  # or "America/New_York", "Europe/London", etc.

Resource Cleanup Configuration

The operator now supports automatic cleanup of test resources based on annotations:

apiVersion: cronschedules.elbazi.co/v1
kind: CronJobScaleDown
metadata:
  name: test-cleanup
  namespace: default
spec:
  # Optional: You can still use scaling features alongside cleanup
  targetRef:
    name: my-deployment
    namespace: default
    kind: Deployment
    apiVersion: apps/v1
  scaleDownSchedule: "0 0 22 * * *"
  scaleUpSchedule: "0 0 6 * * *"
  
  # Cleanup configuration
  cleanupSchedule: "0 0 2 * * *"  # Run cleanup at 2 AM daily
  cleanupConfig:
    # Annotation key that marks resources for cleanup
    annotationKey: "cleanup-after"
    
    # Resource types to clean up (includes RBAC resources and failed workloads)
    resourceTypes:
      - "Deployment"
      - "Service" 
      - "ConfigMap"
      - "Secret"
      - "StatefulSet"
      - "Pod"          # Useful for cleaning up evicted or failed pods
      - "Job"          # Useful for cleaning up failed jobs
      - "Role"
      - "RoleBinding"
      # Note: ClusterRole and ClusterRoleBinding are also supported (cluster-scoped)
    
    # Optional: Limit cleanup to specific namespaces
    namespaces:
      - "test"
      - "lab"
    
    # Optional: Additional label selector
    labelSelector:
      environment: "test"
    
    # NEW: Enable orphan resource cleanup (resources without cleanup annotation)
    cleanupOrphanResources: true
    # NEW: Maximum age for orphan resources before cleanup
    orphanResourceMaxAge: "168h"  # 7 days
    
    # Optional: Enable dry-run mode (default: false)
    dryRun: false
  
  timeZone: "UTC"

Cleanup Time Formats

Resources can be marked for cleanup using various time formats in the annotation value:

  • Duration: 24h, 7d, 30m (relative to resource creation time)
  • Absolute time: 2024-12-31T23:59:59Z (RFC3339 format)
  • Date: 2024-12-31 (cleanup at midnight on that date)
  • Immediate: Empty value "" (cleanup on next schedule run)

Example Resource with Cleanup Annotation

apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-deployment
  namespace: test
  labels:
    environment: "test"
  annotations:
    cleanup-after: "24h"  # Delete 24 hours after creation
spec:
  # ... deployment spec
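
The same annotation accepts the absolute formats listed above; for instance, a hypothetical ConfigMap pinned to a fixed expiry (names here are illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  name: temp-config                        # illustrative name
  namespace: test
  annotations:
    cleanup-after: "2024-12-31T23:59:59Z"  # delete at this RFC3339 time
data:
  note: "scratch data for a test run"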

Orphan Resource Cleanup

The operator now supports cleaning up "orphan" resources: resources that don't have the cleanup annotation but are old and potentially forgotten. This is useful for environments where resources are created during testing but never properly annotated for cleanup.

How it works:

  • Enable cleanupOrphanResources: true in the cleanup configuration
  • Set orphanResourceMaxAge to define how old resources must be before cleanup
  • Only resources matching the label selector (if specified) will be considered
  • Resources without the cleanup annotation that are older than the max age will be deleted

Example scenario:

cleanupConfig:
  annotationKey: "cleanup-after"
  resourceTypes: ["ConfigMap", "Role", "RoleBinding"]
  cleanupOrphanResources: true
  orphanResourceMaxAge: "168h"  # 7 days
  labelSelector:
    app.kubernetes.io/managed-by: "test"

This configuration will:

  1. Clean up resources WITH the cleanup-after annotation according to their specified cleanup time
  2. Clean up resources WITHOUT the annotation that are labeled with app.kubernetes.io/managed-by: test and are older than 7 days

Supported Resource Types:

  • Standard resources: Deployment, StatefulSet, Service, ConfigMap, Secret
  • Workload resources: Pod, Job (useful for cleaning up failed/evicted resources)
  • RBAC resources: Role, RoleBinding, ClusterRole, ClusterRoleBinding

Safety considerations:

  • Orphan cleanup is opt-in (disabled by default)
  • Always test with dryRun: true first (see the sketch after this list)
  • Use label selectors to limit scope
  • Set appropriate max age to avoid deleting important resources
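
A minimal way to follow that advice is to run the orphan-cleanup configuration above in dry-run mode first, then disable it once the logged candidates look right:

cleanupConfig:
  annotationKey: "cleanup-after"
  resourceTypes: ["ConfigMap", "Role", "RoleBinding"]
  cleanupOrphanResources: true
  orphanResourceMaxAge: "168h"  # 7 days
  labelSelector:
    app.kubernetes.io/managed-by: "test"
  dryRun: true                  # log what would be deleted without deleting anything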

Schedule Format

The operator supports 6-field cron expressions with second precision:

┌───────────── second (0 - 59)
│ ┌───────────── minute (0 - 59)
│ │ ┌───────────── hour (0 - 23)
│ │ │ ┌───────────── day of month (1 - 31)
│ │ │ │ ┌───────────── month (1 - 12)
│ │ │ │ │ ┌───────────── day of week (0 - 6) (0 = Sunday)
│ │ │ │ │ │
* * * * * *

Common Schedule Examples

| Schedule | Description |
| --- | --- |
| "0 0 22 * * *" | Every day at 10:00 PM |
| "0 0 6 * * 1-5" | Weekdays at 6:00 AM |
| "0 0 18 * * 5" | Every Friday at 6:00 PM |
| "0 0 0 * * 0" | Every Sunday at midnight |
| "*/30 * * * * *" | Every 30 seconds (testing) |

Supported Timezones

Use standard IANA timezone names:

  • UTC
  • America/New_York
  • Europe/London
  • Europe/Berlin
  • Asia/Tokyo
  • Australia/Sydney

Monitoring

Check CronJobScaleDown Status

kubectl get cronjobscaledown -o wide
kubectl describe cronjobscaledown my-scaler

View Operator Logs

kubectl logs -n cronjob-scale-down-operator-system deployment/cronjob-scale-down-operator-controller-manager

Monitor Target Resources

kubectl get deployment my-deployment -w
kubectl get statefulset my-statefulset -w

Prometheus Metrics

The operator exposes comprehensive Prometheus metrics for monitoring scaling operations, cleanup activities, and system health:

# Access metrics endpoint
kubectl port-forward -n cronjob-scale-down-operator-system deployment/cronjob-scale-down-operator-controller-manager 8443:8443
curl -k https://localhost:8443/metrics

Key Metrics Categories:

  • Scaling Operations: cronjob_scale_down_operations_total, cronjob_scale_up_operations_total
  • Cleanup Operations: cronjob_cleaned_resources_total, cronjob_cleanup_operations_total
  • Resource Status: cronjob_target_resource_replicas_current, cronjob_scaled_down_resources_current
  • System Health: cronjob_reconciliation_errors_total, cronjob_reconciliation_duration_seconds
  • Schedules: cronjob_schedule_next_execution_timestamp, cronjob_schedule_last_execution_timestamp

For detailed metrics documentation, PromQL queries, and Grafana dashboard examples, see METRICS.md.
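
As a quick illustration, here are a couple of PromQL queries over the counters above (any label names beyond the metric names are assumptions; METRICS.md has the authoritative list):

# Scale-down operations over the last hour
sum(increase(cronjob_scale_down_operations_total[1h]))

# Reconciliation error rate over the last 15 minutes
sum(rate(cronjob_reconciliation_errors_total[15m]))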

Web UI Dashboard

The operator includes a built-in web dashboard that provides real-time monitoring of all CronJobScaleDown resources and their target deployments/statefulsets.

Accessing the Web UI

By default, the web UI is available at http://localhost:8082 when running the operator locally. In a Kubernetes cluster, you can access it by:

  1. Port forwarding (for development/testing):

    kubectl port-forward -n cronjob-scale-down-operator-system deployment/cronjob-scale-down-operator-controller-manager 8082:8082

    Then visit http://localhost:8082

  2. Configure ingress (for production):

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: cronjob-scale-down-operator-ui
    spec:
      rules:
      - host: cronjob-ui.example.com
        http:
          paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: cronjob-scale-down-operator-ui
                port:
                  number: 8082

Web UI Features

  • 📊 Real-time Dashboard: Overview of all CronJobScaleDown resources
  • 📈 Status Monitoring: Current state of target deployments and statefulsets
  • 🕒 Schedule Information: View scale-up/down schedules and timezones
  • 📋 Replica Status: Visual indicators for ready vs desired replicas
  • 📅 Action History: Timestamps of last scale operations
  • 🔄 Auto-refresh: Updates every 30 seconds automatically
  • 📱 Responsive Design: Works on desktop, tablet, and mobile

Customizing Web UI Port

You can customize the web UI port using the --webui-addr flag:

./manager --webui-addr=:8080

For more details about the web UI, see the Web UI Documentation.

Development

Building from Source

# Clone the repository
git clone https://github.com/z4ck404/cronjob-scale-down-operator.git
cd cronjob-scale-down-operator

# Build and run locally
make run

# Build Docker image
make docker-build IMG=my-registry/cronjob-scale-down-operator:latest

# Deploy to cluster
make deploy IMG=my-registry/cronjob-scale-down-operator:latest

Running Tests

# Unit tests
make test

# End-to-end tests
make test-e2e

Troubleshooting

Common Issues

  1. Scaling not happening:

    • Check timezone configuration
    • Verify cron schedule syntax
    • Check operator logs for errors
  2. Permission errors:

    • Ensure RBAC is properly configured
    • Verify service account permissions
  3. Target resource not found:

    • Check namespace and resource name
    • Verify resource exists and is accessible

Debug Commands

# Check CRD installation
kubectl get crd cronjobscaledowns.cronschedules.elbazi.co

# Verify operator deployment
kubectl get deployment -n cronjob-scale-down-operator-system

# Check events
kubectl get events --sort-by=.metadata.creationTimestamp

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests
  5. Submit a pull request

License

Licensed under the Apache License, Version 2.0. See LICENSE for details.
