Add Cron VolcanoJob Concept #431
Merged
+429
−0
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,143 @@ | ||
| +++ | ||
| title = "Cron VolcanoJob" | ||
| date = 2025-11-19 | ||
| lastmod = 2025-11-19 | ||
|
|
||
| draft = false | ||
| toc = true | ||
| type = "docs" | ||
|
|
||
| linktitle = "Cron VolcanoJob" | ||
| [menu.docs] | ||
| parent = "concepts" | ||
| weight = 4 | ||
| +++ | ||
|
|
||
| ### Introduction | ||
| Cron VolcanoJob, also known as cronvcjob or cronvj, is a custom resource type in Volcano. It periodically creates and runs Volcano Jobs on a predefined schedule, similar to the native Kubernetes CronJob, enabling scheduled execution of batch computing tasks such as AI and big data workloads. | ||
| ### Example | ||
| ```yaml | ||
| apiVersion: batch.volcano.sh/v1alpha1 | ||
| kind: CronJob | ||
| metadata: | ||
|   name: volcano-cronjob-example | ||
| spec: | ||
|   schedule: "*/5 * * * *" | ||
|   concurrencyPolicy: Forbid | ||
|   startingDeadlineSeconds: 60 | ||
|   successfulJobsHistoryLimit: 5 | ||
|   failedJobsHistoryLimit: 3 | ||
|   jobTemplate: | ||
|     spec: | ||
|       schedulerName: volcano | ||
|       tasks: | ||
|         - replicas: 1 | ||
|           name: "task-1" | ||
|           template: | ||
|             spec: | ||
|               containers: | ||
|                 - name: busybox-container | ||
|                   image: busybox:latest | ||
|                   command: ["/bin/sh", "-c", "date; echo Hello from Volcano CronJob"] | ||
|               restartPolicy: OnFailure | ||
|       policies: | ||
|         - event: PodEvicted | ||
|           action: RestartJob | ||
|       minAvailable: 1 | ||
| ``` | ||
| View Cron VolcanoJob | ||
| ```shell | ||
| kubectl get cronvcjob | ||
| ``` | ||
| View created job instances | ||
| ```shell | ||
| kubectl get vcjob | ||
| ``` | ||
| ### Key Fields | ||
| * schedule | ||
|
|
||
| Required. The cron schedule string for Volcano Job execution. Uses standard cron format. | ||
|
|
||
| * timeZone | ||
|
|
||
| Optional. The time zone name for the schedule. Defaults to the local time zone of the kube-controller-manager. | ||
|
|
||
| * concurrencyPolicy | ||
|
|
||
| Optional. Specifies how to manage concurrent executions of jobs created by the Cron VolcanoJob. Must be one of the following: | ||
| * Allow (default): Allow concurrent runs | ||
| * Forbid: Skip the new run if the previous job hasn't completed | ||
| * Replace: Cancel the currently running job and start a new one | ||
|
|
||
| <!-- --> | ||
|
|
||
| * startingDeadlineSeconds | ||
|
|
||
| Optional. Deadline in seconds for starting the job if it misses its scheduled time. | ||
|
|
||
| * suspend | ||
|
|
||
| Optional. If set to true, all subsequent executions will be suspended. | ||
|
|
||
| * jobTemplate | ||
|
|
||
| Required. The template for creating Volcano Jobs. Contains the complete Volcano Job specification. | ||
|
|
||
| * successfulJobsHistoryLimit | ||
|
|
||
| Optional. Number of successful finished jobs to retain. Defaults to 3. | ||
|
|
||
| * failedJobsHistoryLimit | ||
|
|
||
| Optional. Number of failed finished jobs to retain. Defaults to 1. | ||
|
|
||
| <!-- --> | ||
|
|
||
| ### Usage | ||
| * Periodic Model Training | ||
|
|
||
| Automatically start distributed model training tasks daily during off-peak hours, utilizing cluster idle time for large-scale machine learning training. | ||
| ```yaml | ||
| apiVersion: batch.volcano.sh/v1alpha1 | ||
| kind: CronJob | ||
| metadata: | ||
|   name: daily-model-training | ||
| spec: | ||
|   schedule: "0 2 * * *" # Run daily at 2 AM | ||
|   concurrencyPolicy: Forbid | ||
|   jobTemplate: | ||
|     spec: | ||
|       minAvailable: 4 | ||
|       schedulerName: volcano | ||
|       tasks: | ||
|         - replicas: 1 | ||
|           name: ps | ||
|           template: | ||
|             # Parameter server configuration | ||
|         - replicas: 3 | ||
|           name: worker | ||
|           template: | ||
|             # Training worker configuration | ||
| ``` | ||
|
|
||
| * Scheduled Resource Cleanup | ||
|
|
||
| Clean up temporary data and log files every Sunday evening to free up cluster storage space. | ||
| ```yaml | ||
| apiVersion: batch.volcano.sh/v1alpha1 | ||
| kind: CronJob | ||
| metadata: | ||
|   name: weekly-cleanup | ||
| spec: | ||
|   schedule: "0 22 * * 0" # Run every Sunday at 10 PM | ||
|   timeZone: "Asia/Shanghai" | ||
|   jobTemplate: | ||
|     spec: | ||
|       minAvailable: 1 | ||
|       schedulerName: volcano | ||
|       tasks: | ||
|         - replicas: 1 | ||
|           name: cleanup | ||
|           template: | ||
|             # Cleanup task container configuration | ||
| ``` | ||
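The optional fields listed under Key Fields do not all appear in the examples above. A minimal sketch that combines them, assuming they behave as described there; the name `nightly-report`, the schedule, the image, and the command are hypothetical placeholders:

```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: CronJob
metadata:
  name: nightly-report            # hypothetical name
spec:
  # Five cron fields: minute hour day-of-month month day-of-week
  schedule: "30 1 * * 1-5"        # 01:30, Monday through Friday
  timeZone: "Asia/Shanghai"       # interpret the schedule in this time zone
  concurrencyPolicy: Replace      # cancel a still-running job and start a new one
  startingDeadlineSeconds: 300    # a missed run may still start within 5 minutes
  suspend: false                  # set to true to pause subsequent executions
  successfulJobsHistoryLimit: 3   # same as the documented default
  failedJobsHistoryLimit: 1       # same as the documented default
  jobTemplate:
    spec:
      schedulerName: volcano
      minAvailable: 1
      tasks:
        - replicas: 1
          name: report
          template:
            spec:
              containers:
                - name: report
                  image: busybox:latest   # placeholder image
                  command: ["/bin/sh", "-c", "date; echo generating report"]
              restartPolicy: OnFailure
```

Setting `suspend: true` on an existing resource should stop new runs from being created while leaving the schedule and history limits in place.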
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,143 @@ | ||
| +++ | ||
| title = "Cron VolcanoJob" | ||
| date = 2025-11-19 | ||
| lastmod = 2025-11-19 | ||
|
|
||
| draft = false | ||
| toc = true | ||
| type = "docs" | ||
|
|
||
| linktitle = "Cron VolcanoJob" | ||
| [menu.docs] | ||
| parent = "concepts" | ||
| weight = 4 | ||
| +++ | ||
|
|
||
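| ### Definition | ||
| Cron VolcanoJob, abbreviated as cronvcjob or cronvj, is a custom resource type defined by Volcano. It periodically creates and runs Volcano Jobs on a predefined schedule, similar to the native Kubernetes CronJob, enabling scheduled execution of batch computing tasks such as AI and big data workloads. | ||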
|
|
||
| ### Example | ||
| ```yaml | ||
| apiVersion: batch.volcano.sh/v1alpha1 | ||
| kind: CronJob | ||
| metadata: | ||
|   name: volcano-cronjob-example | ||
| spec: | ||
|   schedule: "*/5 * * * *" | ||
|   concurrencyPolicy: Forbid | ||
|   startingDeadlineSeconds: 60 | ||
|   successfulJobsHistoryLimit: 5 | ||
|   failedJobsHistoryLimit: 3 | ||
|   jobTemplate: | ||
|     spec: | ||
|       schedulerName: volcano | ||
|       tasks: | ||
|         - replicas: 1 | ||
|           name: "task-1" | ||
|           template: | ||
|             spec: | ||
|               containers: | ||
|                 - name: busybox-container | ||
|                   image: busybox:latest | ||
|                   command: ["/bin/sh", "-c", "date; echo Hello from Volcano CronJob"] | ||
|               restartPolicy: OnFailure | ||
|       policies: | ||
|         - event: PodEvicted | ||
|           action: RestartJob | ||
|       minAvailable: 1 | ||
| ``` | ||
| View Cron VolcanoJob | ||
| ```shell | ||
| kubectl get cronvcjob | ||
| ``` | ||
| View created job instances | ||
| ```shell | ||
| kubectl get vcjob | ||
| ``` | ||
| ### Key Fields | ||
| * schedule | ||
|
|
||
| Required. The cron schedule string for Volcano Job execution. Uses standard cron format. | ||
|
|
||
| * timeZone | ||
|
|
||
| Optional. The time zone name for the schedule. Defaults to the local time zone of the kube-controller-manager. | ||
|
|
||
| * concurrencyPolicy | ||
|
|
||
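| Optional. Specifies how to manage concurrent executions of jobs created by the Cron VolcanoJob. Must be one of the following: | ||
| * Allow (default): Allow concurrent runs | ||
| * Forbid: Forbid concurrent runs and skip the new run | ||
| * Replace: Cancel the currently running job and replace it with a new one | ||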
|
|
||
| <!-- --> | ||
|
|
||
| * startingDeadlineSeconds | ||
|
|
||
| Optional. Deadline in seconds for starting the job if it misses its scheduled time. | ||
|
|
||
| * suspend | ||
|
|
||
| Optional. If set to true, all subsequent executions will be suspended. | ||
|
|
||
| * jobTemplate | ||
|
|
||
| Required. The template for creating Volcano Jobs. Contains the complete Volcano Job specification. | ||
|
|
||
| * successfulJobsHistoryLimit | ||
|
|
||
| Optional. Number of successfully finished jobs to retain. Defaults to 3. | ||
|
|
||
| * failedJobsHistoryLimit | ||
|
|
||
| Optional. Number of failed finished jobs to retain. Defaults to 1. | ||
|
|
||
| <!-- --> | ||
|
|
||
| ### Use Cases | ||
| * Periodic Model Training | ||
|
|
||
| Automatically start distributed model training tasks in the early morning every day, using the cluster's idle hours for large-scale machine learning training. | ||
| ```yaml | ||
| apiVersion: batch.volcano.sh/v1alpha1 | ||
| kind: CronJob | ||
| metadata: | ||
|   name: daily-model-training | ||
| spec: | ||
|   schedule: "0 2 * * *" # Run daily at 2 AM | ||
|   concurrencyPolicy: Forbid | ||
|   jobTemplate: | ||
|     spec: | ||
|       minAvailable: 4 | ||
|       schedulerName: volcano | ||
|       tasks: | ||
|         - replicas: 1 | ||
|           name: ps | ||
|           template: | ||
|             # Parameter server configuration | ||
|         - replicas: 3 | ||
|           name: worker | ||
|           template: | ||
|             # Training worker configuration | ||
| ``` | ||
|
|
||
| * Scheduled Resource Cleanup | ||
|
|
||
| Clean up temporary data and log files every Sunday evening to free up cluster storage space. | ||
| ```yaml | ||
| apiVersion: batch.volcano.sh/v1alpha1 | ||
| kind: CronJob | ||
| metadata: | ||
|   name: weekly-cleanup | ||
| spec: | ||
|   schedule: "0 22 * * 0" # Run every Sunday at 22:00 | ||
|   timeZone: "Asia/Shanghai" | ||
|   jobTemplate: | ||
|     spec: | ||
|       minAvailable: 1 | ||
|       schedulerName: volcano | ||
|       tasks: | ||
|         - replicas: 1 | ||
|           name: cleanup | ||
|           template: | ||
|             # Cleanup task container configuration | ||
| ``` | ||
To avoid confusion with the native Kubernetes `CronJob`, it would be helpful to add a note clarifying that this is a Volcano-specific custom resource. For example, you could add: 'Note: While the resource `kind` is `CronJob`, it is a custom resource defined by Volcano under the `batch.volcano.sh/v1alpha1` API group, distinct from the native Kubernetes `CronJob`. It can be managed using `kubectl` with its short names `cronvcjob` or `cronvj`.'
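
The distinction this note draws is visible in the `apiVersion` line and in the shape of `jobTemplate`. A minimal side-by-side sketch; the names `native-example` and `volcano-example` and the busybox container are hypothetical placeholders:

```yaml
# Native Kubernetes CronJob: core batch API group, jobTemplate holds a Job spec.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: native-example            # hypothetical name
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: hello
              image: busybox:latest
              command: ["/bin/sh", "-c", "date"]
          restartPolicy: OnFailure
---
# Volcano CronJob: same kind, Volcano API group, jobTemplate holds a Volcano Job spec.
apiVersion: batch.volcano.sh/v1alpha1
kind: CronJob
metadata:
  name: volcano-example           # hypothetical name
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      schedulerName: volcano
      minAvailable: 1
      tasks:
        - replicas: 1
          name: hello
          template:
            spec:
              containers:
                - name: hello
                  image: busybox:latest
                  command: ["/bin/sh", "-c", "date"]
              restartPolicy: OnFailure
```

Because both resources share the kind name, the short names `cronvcjob` and `cronvj` are the least ambiguous way to address the Volcano resource with `kubectl get`.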