[Improvement][Worker-monitoring] Add disk usage monitoring for data.basedir.path directory #17670

@dill21yu

Search before asking

  • I have searched the issues and found no similar feature requirement.

Description

Current Behavior:
DolphinScheduler's current disk monitoring mechanism (max-disk-usage-percentage-thresholds) monitors only the usage of an entire disk partition; it does not monitor the data.basedir.path directory specifically.

Problem Scenario:
The data.basedir.path directory (configurable in common.properties, e.g., /tmp/dolphinscheduler) stores task scripts and temporary files. When it resides on a separate disk partition, the current monitoring cannot detect its usage. If that partition fills up, tasks can no longer write their command files, and execution fails.
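
To illustrate the scenario, here is a minimal sketch of how the usage of the partition that actually hosts a directory could be measured with the standard java.nio.file API. The class name BasedirPartitionProbe and its methods are hypothetical and are not part of DolphinScheduler; the default path is only the example value mentioned above.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not DolphinScheduler code.
public class BasedirPartitionProbe {

    // Used fraction of a file store, in the range 0.0 to 1.0.
    static double usageRatio(long totalBytes, long usableBytes) {
        if (totalBytes <= 0) {
            return 0.0; // some virtual file systems report a size of zero
        }
        return (double) (totalBytes - usableBytes) / totalBytes;
    }

    // Resolves the file store (partition) that backs the given directory
    // and returns its used fraction. The directory must exist.
    static double usageRatioOf(Path dir) throws IOException {
        FileStore store = Files.getFileStore(dir);
        return usageRatio(store.getTotalSpace(), store.getUsableSpace());
    }

    public static void main(String[] args) throws IOException {
        // Pass the data.basedir.path value as the first argument;
        // falls back to the JVM temp directory if none is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        System.out.printf("%s is %.1f%% full%n", dir, usageRatioOf(dir) * 100);
    }
}
```

Because Files.getFileStore resolves the store from the path itself, a data.basedir.path mounted on its own partition would be measured separately from the root partition.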

Optimization Point:

  • Add independent disk usage monitoring for the data.basedir.path directory.
  • Provide configuration options to set disk usage thresholds for this directory.
  • Trigger overload protection when the threshold is exceeded, refusing to accept new tasks.
  • Expose disk usage metrics for this directory through Prometheus metrics for external monitoring and alerting.

worker:
  server-load-protection:
    enabled: true
    max-disk-usage-percentage-thresholds: 0.8 # Overall disk
    max-data-basedir-disk-usage-percentage-thresholds: 0.8 # data.basedir.path directory
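
As a rough sketch of how the two thresholds above might feed the overload decision, the worker could be treated as overloaded when either usage ratio exceeds its own limit. The class WorkerDiskLoadCheck and its method names are hypothetical, and the strict "greater than" comparison is an assumption; the actual server-load-protection logic in DolphinScheduler may differ.

```java
// Hypothetical illustration, not DolphinScheduler code.
public class WorkerDiskLoadCheck {

    // Assumption: a threshold counts as exceeded only when usage is strictly above it.
    static boolean exceeds(double usageRatio, double threshold) {
        return usageRatio > threshold;
    }

    // The worker refuses new tasks if either the overall disk or the
    // data.basedir.path partition is above its configured threshold.
    static boolean isOverloaded(double overallDiskUsage,
                                double dataBasedirUsage,
                                double maxDiskUsage,
                                double maxDataBasedirUsage) {
        return exceeds(overallDiskUsage, maxDiskUsage)
                || exceeds(dataBasedirUsage, maxDataBasedirUsage);
    }

    public static void main(String[] args) {
        // Overall disk is fine, but the data.basedir.path partition is at 93%.
        boolean overloaded = isOverloaded(0.40, 0.93, 0.8, 0.8);
        System.out.println("overloaded = " + overloaded);
    }
}
```

Keeping the two thresholds separate lets an operator tolerate a fuller root partition while being stricter about the partition that holds task scripts, or the reverse.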

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
