[Autoscaler] Support environment variable configuration for log rotation and deduplicate label deprecation warnings#63955

Open
daiping8 wants to merge 3 commits into ray-project:master from daiping8:autoscaler_log

@daiping8 daiping8 commented Jun 9, 2026

Contributor

Motivation

This PR addresses two operational issues in the KubeRay autoscaler:

  1. Log Rotation Flexibility: Operators need to customize log rotation settings based on their deployment environment's disk space and retention policies. Hardcoded values don't work well across different environments (development vs production, small vs large clusters).

  2. Log Noise Reduction: The repeated deprecation warning for rayStartParams.labels clutters logs, making it harder to identify genuine issues. Since KubeRay v1.5+ recommends using the top-level Labels field, the warning should inform users once without spamming.

Implementation Details

Files Modified

  • python/ray/autoscaler/_private/kuberay/run_autoscaler.py
  • python/ray/autoscaler/_private/kuberay/autoscaling_config.py

Changes

1. Environment Variable Configuration for Log Rotation

File: run_autoscaler.py

Added support for RAY_ROTATION_MAX_BYTES and RAY_ROTATION_BACKUP_COUNT environment variables:

# Before (hardcoded)
setup_component_logger(
    max_bytes=LOGGING_ROTATE_BYTES,
    backup_count=LOGGING_ROTATE_BACKUP_COUNT,
)

# After (environment variable override)
max_bytes = int(os.getenv("RAY_ROTATION_MAX_BYTES", LOGGING_ROTATE_BYTES))
backup_count = int(os.getenv("RAY_ROTATION_BACKUP_COUNT", LOGGING_ROTATE_BACKUP_COUNT))
setup_component_logger(
    max_bytes=max_bytes,
    backup_count=backup_count,
)
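The override above calls int() directly on the environment value, so a malformed setting such as "50MB" would raise ValueError at startup. A minimal sketch of more defensive parsing follows. The helper name env_int is hypothetical and not part of this PR, and treating negative values as invalid is an assumption made for illustration:

```python
import logging
import os

logger = logging.getLogger(__name__)


def env_int(name: str, default: int) -> int:
    """Read a non-negative integer from the environment.

    Hypothetical helper for illustration only; falls back to `default`
    when the variable is unset, empty, non-numeric, or negative.
    """
    raw = os.getenv(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw)
    except ValueError:
        logger.warning("Invalid value %r for %s; using default %d", raw, name, default)
        return default
    if value < 0:
        # Assumption: a negative size or backup count is never meaningful.
        logger.warning("Negative value %d for %s; using default %d", value, name, default)
        return default
    return value
```

With a helper like this, the two lookups would become env_int("RAY_ROTATION_MAX_BYTES", LOGGING_ROTATE_BYTES) and env_int("RAY_ROTATION_BACKUP_COUNT", LOGGING_ROTATE_BACKUP_COUNT), and a typo in the cluster spec would produce a warning instead of a crash.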

Usage Example:

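# In RayCluster CR (environment variables for the autoscaler container are set
# under autoscalerOptions.env; rayStartParams entries are passed as `ray start` flags)
spec:
  enableInTreeAutoscaling: true
  autoscalerOptions:
    env:
      - name: RAY_ROTATION_MAX_BYTES
        value: "52428800"  # 50 MiB
      - name: RAY_ROTATION_BACKUP_COUNT
        value: "10"        # Keep 10 backups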

2. Deduplicate Label Deprecation Warning

File: autoscaling_config.py

Added log_once() to ensure the warning is printed only once:

# Before (repeated on every iteration)
if labels_str:
    logger.warning(...)

# After (printed once)
if labels_str and log_once("raystartparams_labels_warning"):
    logger.warning(...)

The log_once() function from ray.util.debug records each key it has been called with and returns True only on the first call for a given key, so the message is logged only the first time the condition is met.
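The deduplication pattern can be sketched without importing Ray. The following is a simplified stand-in, not the ray.util.debug implementation: log_once_sketch, run_iterations, and the warning text are hypothetical names chosen for illustration.

```python
import logging

logger = logging.getLogger("log_once_sketch")

# Keys that have already triggered a log message in this process.
_seen_keys = set()


def log_once_sketch(key: str) -> bool:
    """Return True the first time `key` is seen, False on every later call."""
    if key in _seen_keys:
        return False
    _seen_keys.add(key)
    return True


def run_iterations(labels_str: str, iterations: int) -> int:
    """Simulate repeated autoscaler iterations and count emitted warnings."""
    emitted = 0
    for _ in range(iterations):
        # `and` short-circuits, so the key is only consumed when labels are set.
        if labels_str and log_once_sketch("raystartparams_labels_warning"):
            logger.warning("Ignoring labels: %s", labels_str)  # placeholder text
            emitted += 1
    return emitted
```

Because the check sits on the right-hand side of `and`, a cluster that never sets rayStartParams.labels never marks the key as seen, so the warning would still fire if labels appeared later in the same process.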

Breaking Changes

None. This is a backward-compatible enhancement:

  • The environment variables are optional; when unset, the existing constants are used
  • The warning is now logged once instead of on every iteration, which only reduces log noise

Verification

Test Log Rotation Configuration

  1. Deploy a RayCluster with custom environment variables:
    kubectl set env deployment/raycluster-kuberay-head \
      RAY_ROTATION_MAX_BYTES=10485760 \
      RAY_ROTATION_BACKUP_COUNT=5
  2. Generate log activity and verify that rotation occurs at 10 MB instead of the default
  3. Verify only 5 backup files are retained
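The expected file count can also be checked locally before deploying. The sketch below assumes the autoscaler's rotation behaves like Python's standard logging.handlers.RotatingFileHandler, which this PR does not state explicitly. The function name and the log file name are hypothetical.

```python
import logging
import logging.handlers
import os
import tempfile


def write_and_count_log_files(max_bytes: int, backup_count: int, n_lines: int = 2000) -> int:
    """Write n_lines to a rotating log and return how many log files remain."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        log_path = os.path.join(tmp_dir, "monitor.log")  # hypothetical file name
        handler = logging.handlers.RotatingFileHandler(
            log_path, maxBytes=max_bytes, backupCount=backup_count
        )
        logger = logging.getLogger("rotation_sketch")
        logger.setLevel(logging.INFO)
        logger.propagate = False  # keep the sketch's output out of the root logger
        logger.addHandler(handler)
        try:
            for i in range(n_lines):
                logger.info("line %d: %s", i, "x" * 50)
        finally:
            logger.removeHandler(handler)
            handler.close()
        # The active file plus at most `backup_count` rotated files.
        return len(os.listdir(tmp_dir))
```

Once enough data has been written to rotate more than backup_count times, the directory holds the active log plus backup_count rotated files, so a setting of 5 should leave six files in total.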

Test Label Warning Deduplication

  1. Deploy a RayCluster with deprecated rayStartParams.labels:
    workerGroupSpecs:
    - rayStartParams:
        labels: "{\"app\": \"myapp\"}"
  2. Wait for multiple autoscaler iterations (e.g., 1 minute)
  3. Check autoscaler logs:
    kubectl logs <head-pod> -c autoscaler | grep "Ignoring labels"
  4. Expected: Warning appears only once
  5. Previous behavior: Warning appeared every 5 seconds

Unit Tests

No new tests were added, because this is a configuration enhancement that does not change core autoscaler logic. Existing tests should continue to pass.

Related Issues

Close #63954

@daiping8 daiping8 requested a review from a team as a code owner June 9, 2026 13:03

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request limits the warning about ignored labels in KubeRay to log only once and introduces environment variables to configure log rotation parameters. A review comment suggests defensively parsing these environment variables to prevent potential startup crashes due to malformed values.

Comment thread python/ray/autoscaler/_private/kuberay/run_autoscaler.py Outdated

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 0511383.

Comment thread python/ray/autoscaler/_private/kuberay/autoscaling_config.py Outdated
…ion and deduplicate label deprecation warnings

Signed-off-by: daiping8 <dai.ping88@zte.com.cn>
Signed-off-by: daiping8 <dai.ping88@zte.com.cn>
@ray-gardener ray-gardener Bot added the core (Issues that should be addressed in Ray Core), observability (Issues related to the Ray Dashboard, Logging, Metrics, Tracing, and/or Profiling), and community-contribution (Contributed by the community) labels Jun 9, 2026
@rueian rueian self-assigned this Jun 10, 2026
@rueian rueian added the go (add ONLY when ready to merge, run all tests) label Jun 10, 2026
Signed-off-by: daiping8 <dai.ping88@zte.com.cn>
Development

Successfully merging this pull request may close these issues.

[Autoscaler] Support Environment Variable Configuration for Autoscaler Log Rotation and Reduce Label Warning Noise

2 participants