This guide covers disk space management for Neo4j backups in Kubernetes environments.
Neo4j backups can consume significant disk space, especially in production environments with:
- Large databases
- Frequent backup schedules
- Multiple backup types (FULL, DIFF, AUTO)
- Long retention policies
The backup sidecar container automatically manages disk space with configurable retention policies:
apiVersion: neo4j.neo4j.com/v1alpha1
kind: Neo4jEnterpriseCluster
metadata:
  name: production-cluster
spec:
  # ... other configuration ...
  podTemplate:
    spec:
      containers:
        - name: backup-sidecar
          env:
            - name: BACKUP_RETENTION_DAYS
              value: "14" # Keep backups for 14 days
            - name: BACKUP_RETENTION_COUNT
              value: "20" # Keep maximum 20 backups

Default retention settings:
- BACKUP_RETENTION_DAYS: 7 days
- BACKUP_RETENTION_COUNT: 10 backups
The sidecar automatically:
- Removes backups older than retention days
- Keeps only the most recent N backups
- Runs cleanup before and after each backup
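The two retention rules above can be sketched as a small shell function. This is an illustration only, not the sidecar's actual implementation: the function name `prune_backups` is hypothetical, and it assumes each backup is a direct child directory of the backup root and that GNU `find`, `sort`, and `xargs` are available.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not the sidecar's real code).
# Usage: prune_backups DIR DAYS COUNT
# Assumes every backup is a direct child directory of DIR (GNU tools).
prune_backups() {
  local dir="$1" days="$2" count="$3"

  # Rule 1: delete backup directories last modified more than DAYS days ago.
  # -mindepth 1 keeps DIR itself from ever being matched.
  find "$dir" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -exec rm -rf {} +

  # Rule 2: of what remains, keep only the COUNT newest directories.
  # Each line is "<mtime-epoch> <path>", sorted newest first.
  find "$dir" -mindepth 1 -maxdepth 1 -type d -printf '%T@ %p\n' \
    | sort -rn \
    | tail -n +"$((count + 1))" \
    | cut -d' ' -f2- \
    | xargs -r -d '\n' rm -rf
}
```

Calling `prune_backups /data/backups 7 10` would mirror the default settings listed above. Applying the age rule first and the count rule second means the count limit only ever acts on backups that are still within the retention window.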
For test environments or emergency cleanup:
# Run the cleanup script
./hack/cleanup-test-resources.sh
# What it does:
# - Removes completed jobs older than 1 hour
# - Deletes failed and evicted pods
# - Identifies orphaned PVCs
# - Shows disk usage by namespace
# - Cleans Docker system (for Kind clusters)

Check disk usage:
# Check PV usage
kubectl get pv -o custom-columns=NAME:.metadata.name,CAPACITY:.spec.capacity.storage,CLAIM:.spec.claimRef.name
# Check node resource allocation (includes ephemeral-storage requests, not actual disk usage)
kubectl describe nodes | grep -A8 "Allocated resources:"
# Check specific PVC usage
kubectl exec <neo4j-pod> -- df -h /data

Clean up old backups manually:
# Delete backups older than 7 days
kubectl exec <neo4j-pod> -c backup-sidecar -- \
  find /data/backups -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} \;
# Keep only 5 most recent backups
kubectl exec <neo4j-pod> -c backup-sidecar -- bash -c \
  'cd /data/backups && ls -t | tail -n +6 | xargs -r rm -rf'

Calculate required storage:
Required Storage = Database Size × Backup Compression Ratio × Number of Retained Backups × Safety Factor
Example:
- Database Size: 100GB
- Compression Ratio: 0.3 (70% compression)
- Retained Backups: 10
- Safety Factor: 1.5
- Required: 100GB × 0.3 × 10 × 1.5 = 450GB
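For quick capacity planning, the formula above can be wrapped in a small shell function. This is a minimal sketch: the name `required_storage_gb` and its argument order are assumptions made for illustration, and the result is rounded to the nearest whole gigabyte.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate backup storage in GB.
# Usage: required_storage_gb DB_SIZE_GB COMPRESSION_RATIO RETAINED_BACKUPS SAFETY_FACTOR
required_storage_gb() {
  local db_size_gb="$1" ratio="$2" retained="$3" safety="$4"
  # awk handles the floating-point arithmetic that plain bash lacks.
  awk -v s="$db_size_gb" -v r="$ratio" -v n="$retained" -v f="$safety" \
    'BEGIN { printf "%.0f\n", s * r * n * f }'
}

# The worked example above: 100GB × 0.3 × 10 × 1.5
required_storage_gb 100 0.3 10 1.5   # prints 450
```

The compression ratio varies with the data, so it is worth measuring it on a real backup before relying on the estimate.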
Optimize backup types:
# Daily full backups with short retention
apiVersion: neo4j.neo4j.com/v1alpha1
kind: Neo4jBackup
metadata:
  name: daily-full
spec:
  schedule: "0 2 * * *" # 2 AM daily
  options:
    backupType: FULL
    compress: true
  retention:
    maxAge: "3d"
    maxCount: 3
---
# Hourly differential backups
apiVersion: neo4j.neo4j.com/v1alpha1
kind: Neo4jBackup
metadata:
  name: hourly-diff
spec:
  schedule: "0 * * * *" # Every hour
  options:
    backupType: DIFF
    compress: true
  retention:
    maxAge: "1d"
    maxCount: 24

Set up alerts for disk usage:
# Prometheus alert example
groups:
  - name: neo4j-backups
    rules:
      - alert: BackupDiskSpaceHigh
        expr: |
          (1 - (node_filesystem_avail_bytes{mountpoint="/data"} /
          node_filesystem_size_bytes{mountpoint="/data"})) > 0.8
        for: 10m
        annotations:
          summary: "Backup disk usage above 80%"

For production, consider external storage:
apiVersion: neo4j.neo4j.com/v1alpha1
kind: Neo4jBackup
metadata:
  name: s3-backup
spec:
  storage:
    type: s3
    bucket: my-neo4j-backups
    path: production/cluster-1
  retention:
    maxAge: "30d" # S3 lifecycle policies handle cleanup

A dedicated PVC for backups:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backup-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd # Use appropriate storage class
  resources:
    requests:
      storage: 500Gi

Symptoms:
java.io.IOException: No space left on device
Quick fixes:
- Run cleanup script:
  ./hack/cleanup-test-resources.sh
- Delete old backups (the glob must be expanded by a shell inside the container):
  kubectl exec <pod> -c backup-sidecar -- sh -c 'rm -rf /data/backups/old-*'
- Increase PVC size (if storage class supports expansion)
To prevent recurrence:
- Set appropriate retention policies:
  env:
    - name: BACKUP_RETENTION_DAYS
      value: "3" # Shorter for test environments
    - name: BACKUP_RETENTION_COUNT
      value: "5" # Fewer backups for test
- Use compressed backups:
  options:
    compress: true # Reduces backup size by 60-80%
- Monitor disk usage proactively:
  # Add to monitoring scripts
  kubectl exec <pod> -- df -h /data | awk '$5+0 > 80 {print "WARNING: " $0}'
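For use in a monitoring script, the disk usage check can be turned into a function with a configurable threshold and a meaningful exit status. This is a sketch under assumptions: the name `check_usage` is hypothetical, and it expects `df`-style output on standard input with the usage percentage in the fifth column (as in `df -h` and `df -P`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read df output on stdin and warn about any
# filesystem whose usage exceeds THRESHOLD percent (default 80).
# Exit status: 1 if at least one warning was printed, 0 otherwise.
check_usage() {
  local threshold="${1:-80}"
  awk -v t="$threshold" '
    # Skip the header row; "$5 + 0" turns "90%" into the number 90.
    NR > 1 && $5 + 0 > t { print "WARNING: " $0; found = 1 }
    END { exit (found ? 1 : 0) }
  '
}
```

A possible invocation is `kubectl exec <pod> -- df -P /data | check_usage 80`. The `-P` flag keeps each filesystem on a single line, and the non-zero exit status lets a cron job or CI step fail when the threshold is crossed.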
Effective disk space management requires:
- Automatic cleanup via sidecar retention policies
- Regular monitoring of disk usage
- Appropriate backup strategies (FULL vs DIFF)
- External storage for production environments
- Proactive cleanup in test environments
The backup sidecar's built-in cleanup functionality handles most scenarios automatically, but manual intervention may be needed for test environments or exceptional situations.