
Inconsistency in Volume Mounts Between oap-deployment.yaml and oap-job.yaml Causes Init Job to Fail#163

Merged
kezhenxu94 merged 2 commits into apache:master from NoodlesWang2024:patch-1
Jul 31, 2025

Conversation

@NoodlesWang2024
Contributor

@NoodlesWang2024 NoodlesWang2024 commented Jul 31, 2025

Bug Description
We've encountered an issue where the OAP server fails to start in init mode when custom Kubernetes metrics are enabled. The same configuration works perfectly for the server in no-init mode (the standard deployment).

The root cause appears to be an inconsistency in how the custom metrics files are mounted between the init Job and the standard Deployment.

Steps to Reproduce:

Provide a configuration like the following:

otel-rules:
      k8s:
        custom-k8s-deployment-rules.yaml:
           ***

Install the Helm chart.

Expected Behavior:
Both the init mode OAP server (Job) and the no-init mode OAP server (Deployment) start successfully.

Actual Behavior:
The standard OAP server pods (from oap-deployment.yaml) run successfully. However, the init-mode pod (from the Job template) fails with the following error, as it cannot find the mounted metric files:

org.apache.skywalking.oap.server.core.UnexpectedException: Some configuration files of enabled rules are not found, enabled rules: [k8s/*{.yaml,.yml}]

Analysis:
Upon inspecting the Helm templates, I found that the volume mount configurations in the OAP Job template differ from those in the oap-deployment.yaml.

I believe these configurations should be made consistent so that both the Job and the Deployment pods can run successfully.

Make this consistent with oap-deployment.yaml
@wu-sheng wu-sheng requested a review from kezhenxu94 July 31, 2025 09:28
@wu-sheng wu-sheng added this to the 4.8.0 milestone Jul 31, 2025
@wu-sheng wu-sheng added the bug Something isn't working label Jul 31, 2025
Member

@kezhenxu94 kezhenxu94 left a comment

Thanks. Could you please move the volumes and volumeMounts sections into a helper file and use it in both the init Job and the Deployment? This avoids future issues when adding new volumes.
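
For readers unfamiliar with the pattern, the suggestion above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the change that was merged: the helper names (`example.oap.volumes`, `example.oap.volumeMounts`), the volume name, the ConfigMap name, and the mount path are all hypothetical.

```yaml
{{/* templates/_helpers.tpl (sketch; every name and path below is hypothetical) */}}
{{- define "example.oap.volumeMounts" -}}
- name: example-otel-rules
  mountPath: /example/config/otel-rules/k8s
{{- end }}

{{- define "example.oap.volumes" -}}
- name: example-otel-rules
  configMap:
    name: example-otel-rules-configmap
{{- end }}

{{/* Pod spec fragment, written identically in the Deployment and the Job templates */}}
      containers:
        - name: oap
          volumeMounts:
            {{- include "example.oap.volumeMounts" . | nindent 12 }}
      volumes:
        {{- include "example.oap.volumes" . | nindent 8 }}
```

Because both templates would include the same named helpers, a volume added later only needs to be declared once, so the Job and the Deployment should not drift apart again.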

@NoodlesWang2024
Contributor Author

Makes sense, let me do it! ^~^

@NoodlesWang2024
Contributor Author

Done, please review again!

Member

@kezhenxu94 kezhenxu94 left a comment

Thanks

@kezhenxu94 kezhenxu94 merged commit e55e80e into apache:master Jul 31, 2025
9 checks passed