diff --git a/content/en/docs/architecture.md b/content/en/docs/architecture.md index 0294d27e..edbaf424 100644 --- a/content/en/docs/architecture.md +++ b/content/en/docs/architecture.md @@ -16,28 +16,379 @@ linktitle = "Architecture" weight = 2 +++ -## Overall Architecture +## Overview +{{
}} -{{
}} +Volcano is naturally compatible with Kubernetes, following its design philosophy and style while extending Kubernetes' native capabilities to provide comprehensive support for high-performance workloads such as machine learning, big data applications, scientific computing, and special effects rendering. Its architectural design fully considers scalability, high performance, and ease of use, built upon years of experience running various high-performance workloads at scale and incorporating best practices from the open-source community. +{{
}} -Volcano is designed for high-performance workloads running on Kubernetes. It follows the design and mechanisms of Kubernetes. +The core architecture consists of four main components: The Scheduler, as the system core, implements advanced features such as Gang scheduling and heterogeneous device scheduling through pluggable Actions and Plugins, providing fine-grained resource allocation for batch jobs; the ControllerManager is responsible for managing CRD resource lifecycles, containing three controllers—Queue, PodGroup, and VCJob—that respectively manage queue resources, Pod groups, and Volcano Jobs; the Admission component validates CRD API resources to ensure job configurations meet system requirements; while Vcctl serves as a command-line client tool, providing a friendly interface for managing and monitoring resources and jobs. +Volcano's layered architectural design enables it to seamlessly connect with mainstream computing frameworks such as Spark, TensorFlow, PyTorch, and Flink, while providing unified scheduling capabilities. Its modular design allows users to extend functionality according to their needs by adding custom scheduling strategies and resource management capabilities. Through this architectural design, Volcano achieves efficient resource utilization, precise job scheduling, and reliable system operation, providing solid infrastructure support for high-performance computing and large-scale batch processing. -{{
}} +## Component Introduction +Volcano consists of the following components: -Volcano consists of **scheduler** / **controllermanager** / **admission** / **vcctl**: +- **Scheduler**: Schedules jobs and matches the most suitable nodes through a series of actions and plugins. Unlike the Kubernetes native scheduler, it supports various job-specific scheduling algorithms. +- **Controller**: Manages the lifecycle of CRD resources, composed of multiple controllers. +- **Admission**: Responsible for validating CRD API resources. +- **Vcctl**: Volcano's command-line client tool. +- **Agent**: A component running on nodes responsible for resource monitoring and oversubscription management. It improves cluster resource utilization by identifying idle resources and allowing reasonable overcommitment. +- **Network-qos**: Manages network bandwidth allocation between online and offline workloads, ensuring network quality for online services while maximizing cluster bandwidth utilization. -##### Scheduler -Volcano Scheduler schedules jobs to the most suitable node based on actions and plug-ins. Volcano supplements Kubernetes to support multiple scheduling algorithms for jobs. +### Scheduler -##### ControllerManager (CM) -Volcano CMs manage the lifecycle of Custom Resource Definitions (CRDs). You can use the **Queue CM**, **PodGroup CM**, and **VCJob CM**. +#### Introduction -##### Admission -Volcano Admission is responsible for the CRD API validation. +The Volcano scheduler is a highly configurable and extensible Kubernetes scheduler designed specifically for handling complex workloads and special scheduling requirements. It provides advanced scheduling capabilities beyond the default Kubernetes scheduler, making it particularly suitable for high-performance computing, machine learning, and big data workloads. -##### vcctl -Volcano vcctl is the command line client for Volcano. 
+The scheduler operates by processing Pods whose `.spec.schedulerName` matches the configured scheduler name (default is "volcano"). It works in scheduling cycles, evaluating unscheduled Pods and finding the optimal node placement according to various scheduling policies and plugins. + +The Volcano scheduler supports custom plugin extensions, allows defining scheduling strategies through configuration files, and provides rich metrics and health check functionalities to ensure the reliability and observability of the scheduling system. + +#### Parameters + +##### Kubernetes Parameters + +| Parameter Name | Description | Default | Example | +| ---------------- | ---------------------------------------- | ------- | ----------------------------------------- | +| `master` | Kubernetes API server address | - | `--master=https://kubernetes.default.svc` | +| `kubeconfig` | Path to kubeconfig file | - | `--kubeconfig=/etc/kubernetes/admin.conf` | +| `kube-api-qps` | QPS limit for communication with K8s API | 2000.0 | `--kube-api-qps=1000` | +| `kube-api-burst` | Burst limit for K8s API communication | 2000 | `--kube-api-burst=1000` | + +##### TLS Certificate Parameters + +| Parameter Name | Description | Default | Example | +| ---------------------- | -------------------------------------------- | ------- | --------------------------------------------- | +| `ca-cert-file` | x509 certificate file for HTTPS | - | `--ca-cert-file=/etc/volcano/ca.crt` | +| `tls-cert-file` | Default x509 certificate file for HTTPS | - | `--tls-cert-file=/etc/volcano/tls.crt` | +| `tls-private-key-file` | x509 private key file matching tls-cert-file | - | `--tls-private-key-file=/etc/volcano/tls.key` | + +##### Scheduler Configuration Parameters + +| Parameter Name | Description | Default | Example | +| ----------------- | --------------------------------------------------- | --------- | ---------------------------------------------- | +| `scheduler-name` | .spec.SchedulerName for Pods handled by 
Volcano | "volcano" | `--scheduler-name=volcano-scheduler` | +| `scheduler-conf` | Absolute path to scheduler configuration file | - | `--scheduler-conf=/etc/volcano/scheduler.conf` | +| `schedule-period` | Time interval between scheduling cycles | 1s | `--schedule-period=2s` | +| `default-queue` | Default queue name for jobs | "default" | `--default-queue=system` | +| `priority-class` | Enable PriorityClass for pod group level preemption | true | `--priority-class=false` | + +##### Node Selection and Scoring Parameters + +| Parameter Name | Description | Default | Example | +| ---------------------------------- | ------------------------------------------------------------ | ------- | ------------------------------------------------------------ | +| `minimum-feasible-nodes` | Minimum number of feasible nodes to find and score | 100 | `--minimum-feasible-nodes=50` | +| `minimum-percentage-nodes-to-find` | Minimum percentage of nodes to find and score | 5 | `--minimum-percentage-nodes-to-find=10` | +| `percentage-nodes-to-find` | Percentage of nodes to score in each cycle; if <=0, adaptive percentage based on cluster size | 0 | `--percentage-nodes-to-find=20` | +| `node-selector` | Volcano only processes nodes with specified labels | - | `--node-selector=volcano.sh/role:train --node-selector=volcano.sh/role:serving` | +| `node-worker-threads` | Number of threads for synchronizing node operations | 20 | `--node-worker-threads=30` | + +##### Plugin and Storage Parameters + +| Parameter Name | Description | Default | Example | +| ---------------------- | ------------------------------------------------------------ | ------- | ------------------------------------------------------------ | +| `plugins-dir` | Directory of custom plugins to load (but not activate) | "" | `--plugins-dir=/etc/volcano/plugins` | +| `csi-storage` | Enable tracking available storage capacity provided by CSI drivers | false | `--csi-storage=true` | +| `ignored-provisioners` | List of storage 
provisioners to ignore during pod PVC request and preemption calculations | - | `--ignored-provisioners=rancher.io/local-path --ignored-provisioners=hostpath.csi.k8s.io` | + +##### Monitoring and Health Check Parameters + +| Parameter Name | Description | Default | Example | +| ----------------- | -------------------------------------- | -------- | -------------------------- | +| `enable-healthz` | Enable health check | false | `--enable-healthz=true` | +| `healthz-address` | Listen address for health check server | ":11251" | `--healthz-address=:11252` | +| `enable-metrics` | Enable metrics functionality | false | `--enable-metrics=true` | +| `listen-address` | Listen address for HTTP requests | ":8080" | `--listen-address=:8081` | + +##### Cache and Debug Parameters + +| Parameter Name | Description | Default | Example | +| ---------------- | ----------------------------------------------- | ------- | ----------------------------------- | +| `cache-dumper` | Enable cache dumper | true | `--cache-dumper=false` | +| `cache-dump-dir` | Target directory for dumping cache info to JSON | "/tmp" | `--cache-dump-dir=/var/log/volcano` | +| `version` | Display version and exit | false | `--version` | + +##### Leader Election Parameters + +| Parameter Name | Description | Default | Example | +| ----------------------- | ------------------------------------------------------------ | ---------------- | ------------------------------------- | +| `lock-object-namespace` | Namespace of lock object (deprecated, use --leader-elect-resource-namespace) | "volcano-system" | `--lock-object-namespace=kube-system` | + +### Controller + +#### Introduction + +Volcano Controller is a core component of Volcano, responsible for managing and coordinating batch jobs and resources in Kubernetes clusters. 
It adopts a multi-controller architecture, including job controller, queue controller, PodGroup controller, garbage collection controller, and other specialized modules that collectively handle different types of resource objects and business logic. + +Compared to traditional Kubernetes controllers, Volcano Controller provides richer batch job management capabilities, supporting job lifecycle management, resource queue maintenance, PodGroup coordination, and other advanced features, making it particularly suitable for high-performance computing and machine learning scenarios. Its framework-based design achieves good extensibility, allowing users to enable or disable specific controllers according to their needs. + +Volcano Controller deeply integrates with Kubernetes, using leader election mechanisms to ensure high availability, while providing health check and metric collection functions for system monitoring. Its flexible configuration options allow users to adjust parameters such as the number of worker threads, QPS limits, and resource management strategies. 
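The `controllers` parameter listed under Controller Management Parameters accepts `*`, `+[controller-name]`, and `-[controller-name]` entries. As a rough illustration of how such a selector list could be resolved, here is a minimal Python sketch. The function names are hypothetical, and the rule that a controller not named explicitly is enabled only when `*` is present is an assumption modeled on the common Kubernetes controller-manager convention; this document does not confirm it.

```python
# Controller names as listed in this document's "Controller List".
KNOWN_CONTROLLERS = [
    "gc-controller",
    "job-controller",
    "jobflow-controller",
    "jobtemplate-controller",
    "pg-controller",
    "queue-controller",
]


def is_controller_enabled(name, selectors):
    """Return True if `name` is enabled by the selector list (hypothetical helper)."""
    has_star = False
    for sel in selectors:
        if sel == "-" + name:
            return False  # explicitly disabled
        if sel in (name, "+" + name):
            return True  # explicitly enabled
        if sel == "*":
            has_star = True
    # Assumption: controllers that are not named fall back to the wildcard.
    return has_star


def enabled_controllers(flag_value):
    """Resolve a comma-separated `--controllers` value into controller names."""
    selectors = [s.strip() for s in flag_value.split(",") if s.strip()]
    return [c for c in KNOWN_CONTROLLERS if is_controller_enabled(c, selectors)]


if __name__ == "__main__":
    print(enabled_controllers("*,-queue-controller"))
    print(enabled_controllers("+job-controller,-queue-controller"))
```

Under these assumptions, `*,-queue-controller` keeps every controller except the queue controller, while a list without `*` enables only the controllers it names.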
+ +#### Parameters + +##### Kubernetes Parameters + +| Parameter Name | Description | Default | Example | +| ---------------- | ---------------------------------------- | ------- | ----------------------------------------- | +| `master` | Kubernetes API server address | - | `--master=https://kubernetes.default.svc` | +| `kubeconfig` | Path to kubeconfig file | - | `--kubeconfig=/etc/kubernetes/admin.conf` | +| `kube-api-qps` | QPS limit for communication with K8s API | 50.0 | `--kube-api-qps=100` | +| `kube-api-burst` | Burst limit for K8s API communication | 100 | `--kube-api-burst=200` | + +##### TLS Certificate Parameters + +| Parameter Name | Description | Default | Example | +| ---------------------- | -------------------------------------------- | ------- | --------------------------------------------- | +| `ca-cert-file` | x509 certificate file for HTTPS | - | `--ca-cert-file=/etc/volcano/ca.crt` | +| `tls-cert-file` | Default x509 certificate file for HTTPS | - | `--tls-cert-file=/etc/volcano/tls.crt` | +| `tls-private-key-file` | x509 private key file matching tls-cert-file | - | `--tls-private-key-file=/etc/volcano/tls.key` | + +##### Scheduler Configuration Parameters + +| Parameter Name | Description | Default | Example | +| --------------------------- | ------------------------------------------------------------ | --------- | ------------------------------------ | +| `scheduler-name` | .spec.SchedulerName for Pods handled by Volcano | "volcano" | `--scheduler-name=volcano-scheduler` | +| `worker-threads` | Number of threads for concurrent job operations | 3 | `--worker-threads=5` | +| `max-requeue-num` | Maximum number of requeues for jobs, queues, or commands | 15 | `--max-requeue-num=20` | +| `inherit-owner-annotations` | Whether to inherit owner's annotations when creating PodGroup | true | `--inherit-owner-annotations=false` | + +##### Dedicated Thread Parameters + +| Parameter Name | Description | Default | Example | +| 
----------------------------- | -------------------------------------------- | ------- | ---------------------------------- | +| `worker-threads-for-podgroup` | Number of threads for PodGroup operations | 5 | `--worker-threads-for-podgroup=10` | +| `worker-threads-for-queue` | Number of threads for queue operations | 5 | `--worker-threads-for-queue=10` | +| `worker-threads-for-gc` | Number of threads for job garbage collection | 1 | `--worker-threads-for-gc=2` | + +##### Health Check and Monitoring Parameters + +| Parameter Name | Description | Default | Example | +| ----------------- | -------------------------------------- | -------- | -------------------------- | +| `healthz-address` | Listen address for health check server | ":11251" | `--healthz-address=:11252` | +| `enable-healthz` | Enable health check | false | `--enable-healthz=true` | +| `enable-metrics` | Enable metrics functionality | false | `--enable-metrics=true` | +| `listen-address` | Listen address for HTTP requests | ":8081" | `--listen-address=:8082` | + +##### Leader Election Parameters + +| Parameter Name | Description | Default | Example | +| --------------------------------- | ------------------------------------- | ----------------------- | -------------------------------------------------- | +| `lock-object-namespace` | Namespace of lock object (deprecated) | "volcano-system" | `--lock-object-namespace=kube-system` | +| `leader-elect` | Enable leader election | - | `--leader-elect=true` | +| `leader-elect-resource-name` | Leader election resource name | "vc-controller-manager" | - | +| `leader-elect-resource-namespace` | Leader election resource namespace | - | `--leader-elect-resource-namespace=volcano-system` | + +##### Controller Management Parameters + +| Parameter Name | Description | Default | Example | +| -------------- | ----------------------------- | --------------------- | ------------------------------------------------- | +| `controllers` | Specify controllers to enable | "*" 
(all controllers) | `--controllers=+job-controller,-queue-controller` | +| `version` | Display version and exit | false | `--version` | + +##### Controller List + +Volcano supports the following controllers, which can be enabled or disabled via the `controllers` parameter: + +- `gc-controller`: Garbage collection controller +- `job-controller`: Job controller +- `jobflow-controller`: Job flow controller +- `jobtemplate-controller`: Job template controller +- `pg-controller`: PodGroup controller +- `queue-controller`: Queue controller + +Use `*` to enable all controllers, `+[controller-name]` to explicitly enable a specific controller, and `-[controller-name]` to explicitly disable a specific controller. + +#### gc-controller + +The Garbage Collector is a specialized controller responsible for cleaning up completed Job resources. It works by monitoring Job creation and update events, identifying those that have completed (Completed, Failed, or Terminated) and have a TTL set (via .spec.ttlSecondsAfterFinished). + +The controller maintains a work queue, adding Jobs that need cleanup and processing them at appropriate times. It calculates the time elapsed since a Job's completion, and when that time reaches or exceeds the specified TTL, it deletes these Jobs through the API server. For Jobs that haven't yet expired, the controller requeues them with a timer set to process them again when their TTL expires. + +This mechanism ensures that completed Jobs don't permanently occupy cluster resources, while providing a flexible resource reclamation strategy that allows users to retain Job history records for a specified period before automatic cleanup. + +#### job-controller + +The Job Controller is a core controller responsible for managing and coordinating the lifecycle of Job resources. It works by monitoring changes to Job, Pod, PodGroup, and other resources, handling related events, and maintaining Job states. 
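The TTL handling described in the gc-controller section above can be sketched as a small decision function. This is a simplified Python illustration, not the controller's actual code: the function name `gc_decision` and its return values are hypothetical, and only the finished phases and the "delete when elapsed time reaches or exceeds the TTL, otherwise requeue" rule come from the text.

```python
from datetime import datetime, timedelta, timezone

# Phases the gc-controller description treats as "completed".
FINISHED_PHASES = {"Completed", "Failed", "Terminated"}


def gc_decision(phase, finished_at, ttl_seconds, now):
    """Decide what a TTL-based garbage collector would do with one Job.

    Returns ("ignore", None), ("delete", 0.0), or ("requeue", seconds_left).
    """
    # Jobs that have not finished, or that have no TTL set, are never cleaned up.
    if phase not in FINISHED_PHASES or ttl_seconds is None:
        return ("ignore", None)
    expires_at = finished_at + timedelta(seconds=ttl_seconds)
    remaining = (expires_at - now).total_seconds()
    if remaining <= 0:
        # Elapsed time has reached or exceeded the TTL: delete the Job.
        return ("delete", 0.0)
    # Not expired yet: process the Job again when the TTL runs out.
    return ("requeue", remaining)


if __name__ == "__main__":
    finished = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    now = finished + timedelta(seconds=30)
    print(gc_decision("Completed", finished, 60, now))
    print(gc_decision("Failed", finished, 10, now))
    print(gc_decision("Running", None, 60, now))
```

Returning the remaining seconds mirrors the described behavior of requeuing unexpired Jobs with a timer instead of polling them continuously.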
+ +The controller processes Jobs according to their current state and triggered events, calling appropriate state handling functions to synchronize Job resources, including creating Pods, checking Pod status, and updating Job status. + +The controller implements a work queue system, using a hashing algorithm to distribute Jobs across different work queues for parallel processing. It also includes a delayed action mechanism that allows certain operations to execute after a specific delay, such as handling Pod failures or retries. It implements error handling and retry mechanisms, requeuing requests when operations fail and terminating Jobs and releasing resources after exceeding the maximum retry count. + +#### jobflow-controller + +The JobFlow Controller is primarily responsible for managing and coordinating the lifecycle of JobFlow CR. It works by monitoring changes to JobFlow, JobTemplate, and Job resources, handling related events, and maintaining JobFlow states. + +When a JobFlow is created or updated, the controller adds the request to a work queue and then performs appropriate state transition operations, calling corresponding handler functions to synchronize JobFlow resources, including creating dependent Jobs, checking Job status, and updating JobFlow status. + +It also handles error conditions, including retrying failed requests and recording events to help users understand the JobFlow's operational status. + +#### jobtemplate-controller + +The JobTemplate Controller is responsible for managing JobTemplate resources. It works by monitoring the creation events of JobTemplate and Job resources, handling related operations, and maintaining JobTemplate states. + +During initialization, the controller sets up various informers to monitor resource changes and creates work queues to process these events. 
+ +When a JobTemplate is created, the controller adds it to the work queue and then performs synchronization operations, which may include creating actual Job resources based on the template. Similarly, when a Job is created, the controller also checks whether related JobTemplate statuses need to be updated. + +The controller implements error handling and retry mechanisms, requeuing requests when operations fail until reaching the maximum retry count. It also records events to help users understand the processing status of JobTemplates. + +#### pg-controller + +The PodGroup Controller is a specialized controller responsible for automatically creating and managing PodGroup resources for Pods using the Volcano scheduler. It works by monitoring the creation and update events of Pods and ReplicaSets, identifying those that need PodGroups but aren't yet associated with one. + +The controller maintains a work queue, adding Pod requests that need processing. When it detects Pods using the Volcano scheduler without a specified PodGroup, the controller automatically creates corresponding PodGroup resources, ensuring these Pods can be correctly processed by the Volcano scheduling system. + +Additionally, when the workload support feature is enabled, the controller also monitors ReplicaSet resources, creating unified PodGroups for Pods belonging to the same ReplicaSet, thereby supporting more complex workload patterns. The controller can also inherit annotations from Pod owners based on configuration, implementing more flexible resource management strategies. + +#### queue-controller + +The Queue Controller is a core controller responsible for managing and maintaining queue states in the Volcano scheduling system. It works by monitoring changes to Queue, PodGroup, and Command resources, coordinating queue lifecycle and state transitions. + +The controller maintains two work queues: one for processing queue state change requests and another for handling queue commands. 
When a queue state needs to change (such as opening, closing, or synchronizing), the controller calls the appropriate state handling function to execute the state transition. The controller also maintains the mapping between queues and PodGroups, so that resource allocation and scheduling policies are applied correctly. + +When the QueueCommandSync parameter is enabled, the controller also monitors Command resources, supporting dynamic control of queue behavior through commands, such as opening or closing queues. The controller implements error handling and retry mechanisms: it requeues requests when operations fail and, once the maximum retry count is exceeded, records an event and abandons processing. This design keeps queue states consistent and reliable. + +### Admission + +#### Introduction + +Volcano Admission is a key component of the Volcano system, responsible for validating and modifying resource objects submitted to the Kubernetes cluster. It is implemented through the Kubernetes webhook mechanism: it intercepts creation, update, and deletion requests for specific resources and validates or modifies them according to predefined rules. + +Admission supports admission control for various resources, including Jobs, PodGroups, Pods, and Queues. Configuring different admission paths controls which resources are validated or modified. By default, it processes Pod resources managed by the Volcano scheduler (default name "volcano"). + +This component provides an HTTPS service to receive admission requests from the Kubernetes API server and supports health check functionality for monitoring its operational status. 
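To make the default Pod filtering concrete, the following Python sketch shows how a webhook might decide whether a Pod falls under Volcano's admission handling. It is an illustration only: `should_process_pod` is a hypothetical helper, the Pod is modeled as a plain dictionary, and the only facts taken from this document are the `.spec.schedulerName` match and the default name "volcano".

```python
# Default taken from this document: Pods whose scheduler name is "volcano".
DEFAULT_SCHEDULER_NAMES = ("volcano",)


def should_process_pod(pod, scheduler_names=DEFAULT_SCHEDULER_NAMES):
    """Return True if the Pod's spec.schedulerName is one the webhook handles."""
    scheduler = pod.get("spec", {}).get("schedulerName")
    return scheduler in scheduler_names


if __name__ == "__main__":
    batch_pod = {"metadata": {"name": "train-0"}, "spec": {"schedulerName": "volcano"}}
    other_pod = {"metadata": {"name": "web-0"}, "spec": {"schedulerName": "default-scheduler"}}
    print(should_process_pod(batch_pod))
    print(should_process_pod(other_pod))
    # Additional names can be passed in, mirroring a multi-valued scheduler-name setting.
    print(should_process_pod(other_pod, ("volcano", "default-scheduler")))
```

Passing the accepted names as a parameter reflects the list-valued `scheduler-name` default shown in the Admission Control Parameters table.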
+ +#### Parameters + +##### Kubernetes Parameters + +| Parameter Name | Description | Default | Example | +| ---------------- | -------------------------------------------------------- | ------- | ----------------------------------------- | +| `master` | Kubernetes API server address (overrides kubeconfig) | Empty | `--master=https://kubernetes.default.svc` | +| `kubeconfig` | Path to kubeconfig file with auth and master server info | Empty | `--kubeconfig=/etc/kubernetes/admin.conf` | +| `kube-api-qps` | QPS limit for communication with Kubernetes API server | 50.0 | `--kube-api-qps=100` | +| `kube-api-burst` | Burst limit for communication with Kubernetes API server | 100 | `--kube-api-burst=200` | + +##### TLS Certificate Parameters + +| Parameter Name | Description | Default | Example | +| ---------------------- | ------------------------------------------------ | ------- | --------------------------------------------- | +| `tls-cert-file` | Path to x509 certificate file for HTTPS service | Empty | `--tls-cert-file=/etc/volcano/tls.crt` | +| `tls-private-key-file` | Path to x509 private key file matching cert file | Empty | `--tls-private-key-file=/etc/volcano/tls.key` | +| `ca-cert-file` | Path to CA certificate file for HTTPS service | Empty | `--ca-cert-file=/etc/volcano/ca.crt` | + +##### Service Configuration Parameters + +| Parameter Name | Description | Default | Example | +| ---------------- | ------------------------------------------ | ------- | -------------------------- | +| `listen-address` | Address for Admission Controller to listen | Empty | `--listen-address=0.0.0.0` | +| `port` | Port for Admission Controller to use | 8443 | `--port=8443` | +| `version` | Display version information and exit | false | `--version` | + +##### Webhook Configuration Parameters + +| Parameter Name | Description | Default | Example | +| ---------------------- | --------------------------- | ------- | ------------------------------------------------------------ | +| 
`webhook-namespace` | Namespace of the webhook | Empty | `--webhook-namespace=volcano-system` | +| `webhook-service-name` | Name of the webhook service | Empty | `--webhook-service-name=volcano-admission` | +| `webhook-url` | URL of the webhook | Empty | `--webhook-url=https://volcano-admission.volcano-system:8443` | +| `admission-conf` | Path to webhook config file | Empty | `--admission-conf=/etc/volcano/admission.conf` | + +##### Admission Control Parameters + +| Parameter Name | Description | Default | Example | +| ------------------- | ---------------------------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------- | +| `enabled-admission` | Enabled admission webhook paths, separated by commas | "/jobs/mutate,/jobs/validate,/podgroups/mutate,/pods/validate,/pods/mutate,/queues/mutate,/queues/validate" | `--enabled-admission=/jobs/mutate,/pods/validate` | +| `scheduler-name` | Volcano will handle Pods with matching .spec.SchedulerName | ["volcano"] | `--scheduler-name=volcano,custom-scheduler` | + +##### Health Check Parameters + +| Parameter Name | Description | Default | Example | +| ----------------- | ---------------------------------------- | -------- | -------------------------- | +| `enable-healthz` | Whether to enable health check | false | `--enable-healthz=true` | +| `healthz-address` | Address and port for health check server | ":11251" | `--healthz-address=:11252` | + +### Vcctl + +#### Introduction + +`vcctl` is a command-line tool provided by Volcano for managing and operating resources in a Volcano cluster. It offers a set of intuitive commands that enable users to easily query, create, modify, and delete Volcano resources such as Jobs, Queues, and PodGroups. 
+ +The tool supports various operations, including listing, creating, deleting, and managing Volcano jobs, controlling job lifecycles (suspend, resume, run), querying and managing queue resources, viewing PodGroup status and details, and interacting with the Volcano scheduler. Through the command-line interface, users can quickly perform common management tasks without directly manipulating Kubernetes APIs or YAML files. This makes Volcano resource management simpler and more efficient, particularly suitable for batch processing jobs and high-performance computing scenarios. + +The `vcctl` tool adopts a command structure similar to `kubectl`, allowing Kubernetes users to quickly get started. It supports detailed help information, which users can access via `vcctl -h` or `vcctl [command] -h` to obtain detailed usage and option descriptions for each command. As an important component of the Volcano ecosystem, `vcctl` provides users with a convenient interface, simplifying the management and operation process for complex batch workloads. + +#### Parameters + +vcctl does not require any additional parameters; it can be started directly. + +### Agent + +#### Introduction + +Agent is a component that runs on Kubernetes nodes, primarily responsible for resource monitoring and oversubscription management. It significantly improves cluster resource utilization by identifying idle resources on nodes and allowing reasonable overcommitment. + +The Agent continuously monitors node resource usage, providing accurate resource information to the Volcano scheduling system to support more intelligent scheduling decisions. During resource contention, it can perform resource reclamation operations to ensure that the performance of critical workloads is not affected. + +Through deep integration with the Kubernetes CGroup system, Agent implements fine-grained resource control. 
It is an important supporting component for Volcano's efficient batch scheduling, particularly suitable for high-performance computing and machine learning scenarios. + +#### Parameters + +##### Kubernetes Parameters + +| Parameter Name | Type | Default | Description | +| ---------------------- | ------ | ------------------------ | ------------------------------------------------------------ | +| `--kube-cgroup-root` | string | `""` | Kubernetes cgroup root path. If CgroupsPerQOS is enabled, this is the root of the QOS cgroup hierarchy | +| `--kube-node-name` | string | env `KUBE_NODE_NAME` | Name of the node where the Agent is running | +| `--kube-pod-name` | string | env `KUBE_POD_NAME` | Name of the Pod where the Agent is running | +| `--kube-pod-namespace` | string | env `KUBE_POD_NAMESPACE` | Namespace of the Pod where the Agent is running | + +##### Health Check Parameters + +| Parameter Name | Type | Default | Description | +| ------------------- | ------ | ------- | ----------------------------------------- | +| `--healthz-address` | string | `""` | Address for health check server to listen | +| `--healthz-port` | int | `3300` | Port for health check server to listen | + +##### Resource Oversubscription Parameters + +| Parameter Name | Type | Default | Description | +| --------------------------- | ------ | ---------- | ------------------------------------------------------------ | +| `--oversubscription-policy` | string | `"extend"` | Oversubscription policy, determining how oversubscribed resources are reported and used. 
Default is `extend`, meaning report as extended resources | +| `--oversubscription-ratio` | int | `60` | Oversubscription ratio, determining how much idle resource can be overcommitted, in percentage | +| `--include-system-usage` | bool | `false` | Whether to consider system resource usage when calculating oversubscription resources and performing evictions | + +##### Feature Parameters + +| Parameter Name | Type | Default | Description | +| ---------------------- | -------- | ------- | ------------------------------------------------------------ | +| `--supported-features` | []string | `["*"]` | List of supported features. `*` means support all features enabled by default, `foo` means support feature named `foo`, `-foo` means don't support feature named `foo` | + +#### Network-qos + +##### Introduction + +The Network-qos plugin is a network bandwidth management solution designed specifically for Kubernetes clusters. Used in conjunction with the Agent component, it intelligently adjusts network resource allocation between different types of workloads. This plugin integrates with existing network plugins through the CNI mechanism, enabling fine-grained control over container network traffic. + +Its core functionality is to distinguish between online and offline workloads and to dynamically adjust the bandwidth limits for offline jobs based on the real-time bandwidth needs of online services. When the bandwidth usage of online services exceeds a preset watermark, the system automatically reduces the available bandwidth for offline jobs, ensuring the quality of service for critical business applications. When online service bandwidth demands are low, the system allows offline jobs to use more bandwidth resources, improving overall cluster resource utilization. + +The plugin is implemented based on Linux TC (Traffic Control) and eBPF technologies, providing efficient, low-overhead network traffic management capabilities. 
It is an important tool for ensuring service quality in hybrid deployment environments. + +##### Parameters + +| Parameter Name | Description | Default | Type | +| -------------------------- | ------------------------------------------------------------ | ------- | ------ | +| `CheckoutInterval` | Time interval for checking and updating bandwidth limits for offline jobs | None | string | +| `OnlineBandwidthWatermark` | Bandwidth threshold for online jobs, representing the maximum total bandwidth usage of all online Pods | None | string | +| `OfflineLowBandwidth` | Maximum network bandwidth that offline jobs can use when online job bandwidth usage exceeds the watermark | None | string | +| `OfflineHighBandwidth` | Maximum network bandwidth that offline jobs can use when online job bandwidth usage is below the watermark | None | string | +| `EnableNetworkQoS` | Whether to enable network QoS functionality | false | bool | + +## Helm + +For the complete list of Helm parameters, see [Volcano Helm](https://github.com/volcano-sh/volcano/blob/master/installer/helm/chart/volcano/values.yaml). diff --git a/content/zh/docs/architecture.md b/content/zh/docs/architecture.md index a5e7aa4f..56d630a7 100644 --- a/content/zh/docs/architecture.md +++ b/content/zh/docs/architecture.md @@ -14,29 +14,384 @@ linktitle = "Architecture" [menu.docs] parent = "home" weight = 2 -+++ -## Architecture Overview ++++ +## Overview -{{
}} +{{
}} -Volcano与Kubernetes天然兼容,并为高性能计算而生。它遵循Kubernetes的设计理念和风格。 +Volcano 与 Kubernetes 天然兼容,遵循其设计理念和风格,同时扩展了 Kubernetes 原生能力,为机器学习、大数据应用、科学计算和特效渲染等高性能工作负载提供完整支持机制。其架构设计充分考虑了可扩展性、高性能和易用性,建立在多年来大规模运行各种高性能工作负载的经验之上,并结合了开源社区的最佳实践。 {{
}} -Volcano由scheduler、controllermanager、admission和vcctl组成: +Volcano由四个主要组件构成:作为系统核心的 Scheduler 通过可插拔的 Action 和 Plugin 实现 Gang 调度、异构设备调度等高级特性,为批处理作业提供精细化资源分配;ControllerManager 负责管理 CRD 资源生命周期,包含 Queue、PodGroup 和 VCJob 三个控制器分别管理队列资源、Pod 组和 Volcano Job;Admission 组件对 CRD API 资源进行校验,确保作业配置符合系统要求;而 Vcctl 则作为命令行客户端工具,提供友好的接口管理和监控资源与作业。 + +Volcano 的分层架构设计使其能够无缝对接 Spark、TensorFlow、PyTorch、Flink 等主流计算框架,同时提供统一的调度能力。其模块化设计允许用户根据需求扩展功能,添加自定义的调度策略和资源管理能力。通过这种架构设计,Volcano 实现了资源的高效利用、作业的精确调度和系统的可靠运行,为高性能计算和大规模批处理提供了坚实的基础设施支持。 + +## 组件介绍 + +Volcano由以下几个组件构成: + +- **Scheduler**:通过一系列 action 和 plugin 调度 Job 并匹配最适节点,区别于Kubernetes原生调度器,支持多种Job专用调度算法 +- **Controller**:管理CRD资源生命周期,由多个控制器组成 +- **Admission**:负责CRD API资源的校验工作 +- **Vcctl**:Volcano的命令行客户端工具 +- **Agent**:运行在节点上的组件,负责资源监控和过度订阅管理。通过识别闲置资源并允许合理超卖,提高集群资源利用率。 +- **Network-qos**:管理在线和离线工作负载间的网络带宽分配,确保在线服务网络质量的同时最大化集群带宽利用率。 + +### Scheduler + +#### 介绍 + +Volcano 调度器是一个高度可配置和可扩展的 Kubernetes 调度器,专为处理复杂工作负载和特殊调度需求而设计。它提供了超越默认 Kubernetes 调度器的高级调度功能,使其特别适合高性能计算、机器学习和大数据工作负载。 + +调度器通过处理 `.spec.schedulerName` 与配置的调度器名称(默认为 "volcano")匹配的 Pod 来运行。它按照调度周期工作,评估未调度的 Pod 并根据各种调度策略和插件找到最佳的节点放置位置。 + +Volcano 调度器支持自定义插件扩展,可以通过配置文件定义调度策略,并提供了丰富的指标和健康检查功能,以确保调度系统的可靠性和可观测性。 + +#### 参数 + +##### Kubernetes参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ---------------- | ------------------------------------------ | ------ | ----------------------------------------- | +| `master` | Kubernetes API 服务器地址 | - | `--master=https://kubernetes.default.svc` | +| `kubeconfig` | kubeconfig 文件路径 | - | `--kubeconfig=/etc/kubernetes/admin.conf` | +| `kube-api-qps` | 与 Kubernetes API 服务器通信的 QPS 限制 | 2000.0 | `--kube-api-qps=1000` | +| `kube-api-burst` | 与 Kubernetes API 服务器通信的突发请求上限 | 2000 | `--kube-api-burst=1000` | + +##### TLS 证书参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ---------------------- | ------------------------------------- | ------ | --------------------------------------------- | +| `ca-cert-file` | HTTPS 的 x509 证书文件 | - | `--ca-cert-file=/etc/volcano/ca.crt` | +| 
`tls-cert-file` | HTTPS 的默认 x509 证书文件 | - | `--tls-cert-file=/etc/volcano/tls.crt` | +| `tls-private-key-file` | 与 tls-cert-file 匹配的 x509 私钥文件 | - | `--tls-private-key-file=/etc/volcano/tls.key` | + +##### 调度器配置参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ----------------- | -------------------------------------------------- | --------- | ---------------------------------------------- | +| `scheduler-name` | Volcano 处理的 Pod 的 .spec.SchedulerName | "volcano" | `--scheduler-name=volcano-scheduler` | +| `scheduler-conf` | 调度器配置文件的绝对路径 | - | `--scheduler-conf=/etc/volcano/scheduler.conf` | +| `schedule-period` | 每个调度周期之间的时间间隔 | 1s | `--schedule-period=2s` | +| `default-queue` | 作业的默认队列名称 | "default" | `--default-queue=system` | +| `priority-class` | 是否启用 PriorityClass 以提供 pod 组级别的抢占能力 | true | `--priority-class=false` | + +##### 节点选择和评分参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ---------------------------------- | ------------------------------------------------------------ | ------ | ------------------------------------------------------------ | +| `minimum-feasible-nodes` | 查找和评分的最小可行节点数 | 100 | `--minimum-feasible-nodes=50` | +| `minimum-percentage-nodes-to-find` | 查找和评分的最小节点百分比 | 5 | `--minimum-percentage-nodes-to-find=10` | +| `percentage-nodes-to-find` | 每个调度周期中要评分的节点百分比;如果 <=0,将根据集群大小计算自适应百分比 | 0 | `--percentage-nodes-to-find=20` | +| `node-selector` | Volcano 只处理带有指定标签的节点 | - | `--node-selector=volcano.sh/role:train --node-selector=volcano.sh/role:serving` | +| `node-worker-threads` | 同步节点操作的线程数 | 20 | `--node-worker-threads=30` | + +##### 插件和存储参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ---------------------- | ----------------------------------------------------- | ------ | ------------------------------------------------------------ | +| `plugins-dir` | vc-scheduler 将加载(但不激活)此目录中的自定义插件 | "" | `--plugins-dir=/etc/volcano/plugins` | +| `csi-storage` | 是否启用跟踪 CSI 驱动程序提供的可用存储容量 | false | `--csi-storage=true` | +| `ignored-provisioners` | 在计算 pod pvc 请求和抢占期间将被忽略的存储供应商列表 | - | 
`--ignored-provisioners=rancher.io/local-path --ignored-provisioners=hostpath.csi.k8s.io` | + +##### 监控和健康检查参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ----------------- | ------------------------ | -------- | -------------------------- | +| `enable-healthz` | 是否启用健康检查 | false | `--enable-healthz=true` | +| `healthz-address` | 健康检查服务器的监听地址 | ":11251" | `--healthz-address=:11252` | +| `enable-metrics` | 是否启用指标功能 | false | `--enable-metrics=true` | +| `listen-address` | HTTP 请求的监听地址 | ":8080" | `--listen-address=:8081` | + +##### 缓存和调试参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ---------------- | ------------------------------------ | ------ | ----------------------------------- | +| `cache-dumper` | 是否启用缓存转储器 | true | `--cache-dumper=false` | +| `cache-dump-dir` | 转储缓存信息到 JSON 文件时的目标目录 | "/tmp" | `--cache-dump-dir=/var/log/volcano` | +| `version` | 显示版本并退出 | false | `--version` | + +##### 领导者选举参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ----------------------- | ------------------------------------------------------------ | ---------------- | ------------------------------------- | +| `lock-object-namespace` | 锁对象的命名空间(已弃用,请使用 --leader-elect-resource-namespace) | "volcano-system" | `--lock-object-namespace=kube-system` | + +### Controller + +#### 介绍 + +Volcano Controller 是 Volcano 的核心组件,负责管理和协调 Kubernetes 集群中的批处理作业和资源。它采用多控制器架构,包含作业控制器、队列控制器、PodGroup 控制器、垃圾回收控制器等多个专用模块,共同处理不同类型的资源对象和业务逻辑。 + +与传统的 Kubernetes 控制器相比,Volcano Controller 提供了更丰富的批处理作业管理能力,支持作业生命周期管理、资源队列维护、PodGroup 协调等高级特性,特别适合高性能计算和机器学习等场景。它通过框架化设计实现了良好的可扩展性,允许用户根据需求启用或禁用特定控制器。 + +Volcano Controller 与 Kubernetes 深度集成,使用领导者选举机制确保高可用性,同时提供健康检查和指标收集功能便于系统监控。其灵活的配置选项允许用户调整工作线程数、QPS 限制和资源管理策略等参数。 + +#### 参数 + +##### Kubernetes参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ---------------- | ------------------------------------------ | ------ | ----------------------------------------- | +| `master` | Kubernetes API 服务器地址 | - | `--master=https://kubernetes.default.svc` | +| `kubeconfig` | kubeconfig 文件路径 | - | `--kubeconfig=/etc/kubernetes/admin.conf` | +| 
`kube-api-qps` | 与 Kubernetes API 服务器通信的 QPS 限制 | 50.0 | `--kube-api-qps=100` | +| `kube-api-burst` | 与 Kubernetes API 服务器通信的突发请求上限 | 100 | `--kube-api-burst=200` | + +##### TLS 证书参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ---------------------- | ------------------------------------- | ------ | --------------------------------------------- | +| `ca-cert-file` | HTTPS 的 x509 证书文件 | - | `--ca-cert-file=/etc/volcano/ca.crt` | +| `tls-cert-file` | HTTPS 的默认 x509 证书文件 | - | `--tls-cert-file=/etc/volcano/tls.crt` | +| `tls-private-key-file` | 与 tls-cert-file 匹配的 x509 私钥文件 | - | `--tls-private-key-file=/etc/volcano/tls.key` | + +##### 调度器配置参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| --------------------------- | ------------------------------------------ | --------- | ------------------------------------ | +| `scheduler-name` | Volcano 处理的 Pod 的 .spec.SchedulerName | "volcano" | `--scheduler-name=volcano-scheduler` | +| `worker-threads` | 并发同步作业操作的线程数 | 3 | `--worker-threads=5` | +| `max-requeue-num` | 作业、队列或命令在队列中重新排队的最大次数 | 15 | `--max-requeue-num=20` | +| `inherit-owner-annotations` | 创建 PodGroup 时是否继承所有者的注释 | true | `--inherit-owner-annotations=false` | + +##### 专用线程参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ----------------------------- | -------------------------- | ------ | ---------------------------------- | +| `worker-threads-for-podgroup` | 同步 PodGroup 操作的线程数 | 5 | `--worker-threads-for-podgroup=10` | +| `worker-threads-for-queue` | 同步队列操作的线程数 | 5 | `--worker-threads-for-queue=10` | +| `worker-threads-for-gc` | 回收作业的线程数 | 1 | `--worker-threads-for-gc=2` | + +##### 健康检查和监控参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ----------------- | ------------------------ | -------- | -------------------------- | +| `healthz-address` | 健康检查服务器的监听地址 | ":11251" | `--healthz-address=:11252` | +| `enable-healthz` | 是否启用健康检查 | false | `--enable-healthz=true` | +| `enable-metrics` | 是否启用指标功能 | false | `--enable-metrics=true` | +| `listen-address` | HTTP 请求的监听地址 | ":8081" | `--listen-address=:8082` | + +##### 领导者选举参数 + +| 
参数名 | 描述 | 默认值 | 示例 | +| --------------------------------- | -------------------------- | ----------------------- | -------------------------------------------------- | +| `lock-object-namespace` | 锁对象的命名空间(已弃用) | "volcano-system" | `--lock-object-namespace=kube-system` | +| `leader-elect` | 是否启用领导者选举 | - | `--leader-elect=true` | +| `leader-elect-resource-name` | 领导者选举资源名称 | "vc-controller-manager" | - | +| `leader-elect-resource-namespace` | 领导者选举资源命名空间 | - | `--leader-elect-resource-namespace=volcano-system` | + +##### 控制器管理参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ------------- | ------------------ | ----------------- | ------------------------------------------------- | +| `controllers` | 指定要启用的控制器 | "*"(所有控制器) | `--controllers=+job-controller,-queue-controller` | +| `version` | 显示版本并退出 | false | `--version` | + +##### 控制器列表 + +Volcano 支持以下控制器,可以通过 `controllers` 参数启用或禁用: + +- `gc-controller`: 垃圾回收控制器 +- `job-controller`: 作业控制器 +- `jobflow-controller`: 作业流控制器 +- `jobtemplate-controller`: 作业模板控制器 +- `pg-controller`: PodGroup 控制器 +- `queue-controller`: 队列控制器 + +使用 `*` 启用所有控制器,使用 `+[controller-name]` 显式启用特定控制器,使用 `-[controller-name]` 显式禁用特定控制器。 + +#### gc-controller + +Garbage Collector 是一个专门的控制器,负责清理已完成的 Job 资源。它通过监听 Job 的创建和更新事件,识别那些已经完成(Completed、Failed 或 Terminated)且设置了 TTL(.spec.ttlSecondsAfterFinished)的 Job。 + +控制器维护一个工作队列,将需要清理的 Job 加入队列并在适当的时间处理。它会计算 Job 完成后的经过时间,当达到或超过指定的 TTL 时,通过 API 服务器删除这些 Job。对于尚未到期的 Job,控制器会将其重新加入队列,并设置在 TTL 到期时再次处理。 + +这种机制确保了已完成的 Job 不会永久占用集群资源,同时提供了灵活的资源回收策略,使用户可以根据需要保留 Job 历史记录一段时间后自动清理。 + +#### job-controller + +Job Controller 是核心控制器,负责管理和协调 Job 资源的生命周期。它通过监听 Job、Pod、PodGroup 等资源的变化,处理相关事件并维护 Job 的状态。 + +控制器会根据 Job 的当前状态和触发的事件,调用相应的状态处理函数来同步 Job 资源,包括创建 Pod、检查 Pod 状态、更新 Job 状态等。 + +控制器实现了一个工作队列系统,使用哈希算法将 Job 分配到不同的工作队列中以实现并行处理。它还包含了一个延迟动作机制,允许某些操作在特定延迟后执行,如处理 Pod 失败或重试等情况。它还实现了错误处理和重试机制,当操作失败时会将请求重新加入队列,超过最大重试次数后会终止 Job 并释放资源。 + +#### jobflow-controller -- Scheduler -Volcano scheduler通过一系列的action和plugin调度Job,并为它找到一个最适合的节点。与Kubernetes 
default-scheduler相比,Volcano与众不同的 -地方是它支持针对Job的多种调度算法。 +jobflow-controller主要负责管理和协调 JobFlow CR的生命周期。它通过监听 JobFlow、JobTemplate 和 Job 资源的变化,处理相关事件并维护 JobFlow 的状态。 -- Controllermanager -Volcano controllermanager管理CRD资源的生命周期。它主要由**Queue ControllerManager**、 **PodGroupControllerManager**、 **VCJob -ControllerManager**构成。 +当 JobFlow 被创建或更新时,控制器会将请求加入工作队列,然后执行相应的状态转换操作,调用相应的处理函数来同步 JobFlow 资源,包括创建依赖的 Job、检查 Job 状态、更新 JobFlow 状态等。 -- Admission -Volcano admission负责对CRD API资源进行校验。 +它还负责处理错误情况,包括重试失败的请求和记录事件,以便用户了解 JobFlow 的运行状况 + +#### jobtemplate-controller + +jobTemplate Controller 是负责管理 JobTemplate 资源的控制器。它通过监听 JobTemplate 和 Job 资源的创建事件,处理相关操作并维护 JobTemplate 的状态。 + +控制器初始化时会设置各种 informer 来监听资源变化,并创建工作队列来处理这些事件。 + +当一个 JobTemplate 被创建时,控制器会将其加入工作队列,然后执行同步操作,可能包括根据模板创建实际的 Job 资源。同样地,当一个 Job 被创建时,控制器也会检查是否需要更新相关的 JobTemplate 状态。 + +控制器实现了错误处理和重试机制,当操作失败时会将请求重新加入队列,直到达到最大重试次数。它还负责记录事件,以便用户了解 JobTemplate 的处理状态。 + +#### pg-controller + +PodGroup Controller 是一个专门的控制器,负责为使用 Volcano 调度器的 Pod 自动创建和管理 PodGroup 资源。它通过监听 Pod 和 ReplicaSet 的创建和更新事件,识别那些需要 PodGroup 但尚未关联的 Pod。 + +控制器维护一个工作队列,将需要处理的 Pod 请求加入队列并处理。当检测到使用 Volcano 调度器但没有指定 PodGroup 的 Pod 时,控制器会自动创建相应的 PodGroup 资源,确保这些 Pod 能够被 Volcano 调度系统正确处理。 + +此外,当启用工作负载支持特性时,控制器还会监听 ReplicaSet 资源,为属于同一 ReplicaSet 的 Pod 创建统一的 PodGroup,从而支持更复杂的工作负载模式。控制器还可以根据配置从 Pod 所有者继承注解,实现更灵活的资源管理策略。 + +#### queue-controller + +Queue Controller 是一个核心控制器,负责管理和维护 Volcano 调度系统中的队列状态。它通过监听 Queue、PodGroup 和 Command 资源的变化,协调队列的生命周期和状态转换。 + +控制器维护两个工作队列:一个处理队列状态变更请求,另一个处理队列命令。当队列状态需要变更时(如开启、关闭或同步),控制器会调用相应的状态处理函数执行状态转换。控制器还维护了队列与 PodGroup 之间的映射关系,确保资源分配和调度策略的正确实施。 + +当启用 QueueCommandSync 参数时,控制器还会监听 Command 资源,支持通过命令方式动态控制队列行为,如开启或关闭队列。控制器实现了错误处理和重试机制,当操作失败时会将请求重新加入队列,超过最大重试次数后会记录事件并放弃处理。这种设计确保了队列状态的一致性和可靠性,是 Volcano 调度系统的重要组成部分。 + +### Admission + +#### 介绍 + +Volcano Admission 是 Volcano 系统的关键组件,负责验证和修改提交到 Kubernetes 集群的资源对象。它通过 Kubernetes 的 Webhook 机制实现,能够拦截特定资源的创建、更新和删除请求,并根据预定义的规则进行验证或修改。 + +Admission 支持多种资源的准入控制,包括 Jobs、PodGroups、Pods 和 Queues 
等。通过配置不同的准入路径,可以灵活地控制哪些资源需要进行验证或变更。默认情况下,它会处理由 Volcano 调度器(默认名为 "volcano")管理的 Pod 资源。 + +该组件提供了 HTTPS 服务以接收 Kubernetes API 服务器的准入请求,并支持健康检查功能,便于监控其运行状态。 + +#### 参数 + +##### Kubernetes 参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ---------------- | ----------------------------------------------------- | ------ | ----------------------------------------- | +| `master` | Kubernetes API 服务器的地址(覆盖 kubeconfig 中的值) | 空 | `--master=https://kubernetes.default.svc` | +| `kubeconfig` | 包含认证和主服务器位置信息的 kubeconfig 文件路径 | 空 | `--kubeconfig=/etc/kubernetes/admin.conf` | +| `kube-api-qps` | 与 Kubernetes API 服务器通信时的 QPS 限制 | 50.0 | `--kube-api-qps=100` | +| `kube-api-burst` | 与 Kubernetes API 服务器通信时的突发请求限制 | 100 | `--kube-api-burst=200` | - Vcctl Volcano vcctl是Volcano的命令行客户端工具。 +##### TLS 证书参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ---------------------- | ------------------------------------------- | ------ | --------------------------------------------- | +| `tls-cert-file` | HTTPS 服务的 x509 证书文件路径 | 空 | `--tls-cert-file=/etc/volcano/tls.crt` | +| `tls-private-key-file` | 与 `tls-cert-file` 匹配的 x509 私钥文件路径 | 空 | `--tls-private-key-file=/etc/volcano/tls.key` | +| `ca-cert-file` | HTTPS 服务的 CA 证书文件路径 | 空 | `--ca-cert-file=/etc/volcano/ca.crt` | + +##### 服务配置参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ---------------- | ------------------------------------- | ------ | -------------------------- | +| `listen-address` | Admission Controller 服务器监听的地址 | 空 | `--listen-address=0.0.0.0` | +| `port` | Admission Controller 服务器使用的端口 | 8443 | `--port=8443` | +| `version` | 显示版本信息并退出 | false | `--version` | + +##### Webhook 配置参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ---------------------- | ---------------------- | ------ | ------------------------------------------------------------ | +| `webhook-namespace` | Webhook 所在的命名空间 | 空 | `--webhook-namespace=volcano-system` | +| `webhook-service-name` | Webhook 服务的名称 | 空 | `--webhook-service-name=volcano-admission` | +| `webhook-url` | Webhook 的 URL | 空 | 
`--webhook-url=https://volcano-admission.volcano-system:8443` | +| `admission-conf` | Webhook 的配置文件路径 | 空 | `--admission-conf=/etc/volcano/admission.conf` | + +##### 准入控制参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ------------------- | ------------------------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------- | +| `enabled-admission` | 启用的准入 Webhook 路径,多个路径用逗号分隔 | "/jobs/mutate,/jobs/validate,/podgroups/mutate,/pods/validate,/pods/mutate,/queues/mutate,/queues/validate" | `--enabled-admission=/jobs/mutate,/pods/validate` | +| `scheduler-name` | Volcano 将处理 `.spec.SchedulerName` 与此参数匹配的 Pod | ["volcano"] | `--scheduler-name=volcano,custom-scheduler` | + +##### 健康检查参数 + +| 参数名 | 描述 | 默认值 | 示例 | +| ----------------- | ------------------------------ | -------- | -------------------------- | +| `enable-healthz` | 是否启用健康检查 | false | `--enable-healthz=true` | +| `healthz-address` | 健康检查服务器监听的地址和端口 | ":11251" | `--healthz-address=:11252` | + +### Vcctl + +#### 介绍 + +`vcctl` 是 Volcano 提供的命令行工具,用于管理和操作 Volcano 集群中的资源。它提供了一组直观的命令,使用户能够轻松地查询、创建、修改和删除 Volcano 资源,如作业(Job)、队列(Queue)和 PodGroup 等。 + +该工具支持多种操作,包括列出、创建、删除和管理 Volcano 作业,控制作业的生命周期(挂起、恢复、运行),查询和管理队列资源,查看 PodGroup 状态和详情,以及与 Volcano 调度器交互。通过命令行界面,用户可以快速执行常见的管理任务,而无需直接操作 Kubernetes API 或 YAML 文件。这使得 Volcano 资源管理变得更加简单和高效,特别适合于批处理作业和高性能计算场景。 + +`vcctl` 工具采用了类似于 `kubectl` 的命令结构,使 Kubernetes 用户能够快速上手。它支持详细的帮助信息,用户可以通过 `vcctl -h` 或 `vcctl [command] -h` 获取各命令的详细用法和选项说明。作为 Volcano 生态系统的重要组成部分,`vcctl` 为用户提供了一个便捷的接口,简化了复杂批处理工作负载的管理和操作流程。 + +#### 参数 + +vcctl不需要其他额外参数,直接启动即可。 + +### Agent + +#### 介绍 + +Agent 是运行在 Kubernetes 节点上的组件,主要负责资源监控和过度订阅管理。它通过识别节点上的闲置资源并允许合理超卖,显著提高集群资源利用率。 + +Agent 持续监控节点资源使用情况,为 Volcano 调度系统提供准确的资源信息,支持更智能的调度决策。在资源紧张时,它能够执行资源回收操作,确保关键工作负载的性能不受影响。 + +通过与 Kubernetes CGroup 系统的深度集成,Agent 实现了精细化的资源控制,是 Volcano 高效批处理调度的重要支撑组件,特别适合高性能计算和机器学习等场景。 + +#### 参数 + +##### Kubernetes 参数 + +| 参数名 | 类型 | 默认值 | 描述 | +| 
---------------------- | ------ | ----------------------------- | ------------------------------------------------------------ | +| `--kube-cgroup-root` | string | `""` | Kubernetes 的 cgroup 根路径。如果启用了 CgroupsPerQOS,这是 QOS cgroup 层次结构的根 | +| `--kube-node-name` | string | 环境变量 `KUBE_NODE_NAME` | Agent 运行所在的节点名称 | +| `--kube-pod-name` | string | 环境变量 `KUBE_POD_NAME` | Agent 所在 Pod 的名称 | +| `--kube-pod-namespace` | string | 环境变量 `KUBE_POD_NAMESPACE` | Agent 所在 Pod 的命名空间 | + +##### 健康检查参数 + +| 参数名 | 类型 | 默认值 | 描述 | +| ------------------- | ------ | ------ | ------------------------ | +| `--healthz-address` | string | `""` | 健康检查服务器监听的地址 | +| `--healthz-port` | int | `3300` | 健康检查服务器监听的端口 | + +##### 资源过度订阅参数 + +| 参数名 | 类型 | 默认值 | 描述 | +| --------------------------- | ------ | ---------- | ------------------------------------------------------------ | +| `--oversubscription-policy` | string | `"extend"` | 过度订阅策略,决定过度订阅资源的报告和使用方式。默认为 `extend`,表示报告为扩展资源 | +| `--oversubscription-ratio` | int | `60` | 过度订阅比率,决定有多少闲置资源可以被超卖,单位为百分比 | +| `--include-system-usage` | bool | `false` | 是否在计算过度订阅资源和执行驱逐时考虑系统资源使用情况 | + +##### 功能特性参数 + +| 参数名 | 类型 | 默认值 | 描述 | +| ---------------------- | -------- | ------- | ------------------------------------------------------------ | +| `--supported-features` | []string | `["*"]` | 支持的特性列表。`*` 表示支持所有默认启用的特性,`foo` 表示支持名为 `foo` 的特性,`-foo` 表示不支持名为 `foo` 的特性 | + +#### Network-qos + +##### 介绍 + +Network插件是一个专为 Kubernetes 集群设计的网络带宽管理解决方案,与Agent组件一同使用,旨在智能调节不同类型工作负载之间的网络资源分配。该插件通过 CNI 机制与现有网络插件链式组合,实现对容器网络流量的精细控制。 + +它的核心功能是区分在线和离线工作负载,并根据在线服务的实时带宽需求动态调整离线作业的带宽上限。当在线服务的带宽使用超过预设水位线时,系统会自动降低离线作业的可用带宽,确保关键业务服务质量;当在线服务带宽需求较低时,系统则允许离线作业使用更多带宽资源,提高集群整体资源利用率。 + +该插件基于 Linux TC(Traffic Control)和 eBPF 技术实现,提供高效、低开销的网络流量管理能力,是混合部署环境中保障服务质量的重要工具。 + +###### 参数 + +| 参数名称 | 说明 | 默认值 | 类型 | +| -------------------------- | ------------------------------------------------------------ | ------ | ------ | +| `CheckoutInterval` | 检查和更新离线作业带宽限制的时间间隔 | 无 | string | +| 
`OnlineBandwidthWatermark` | 在线作业的带宽阈值,是所有在线 Pod 带宽使用的总和上限 | 无 | string | +| `OfflineLowBandwidth` | 当在线作业带宽使用超过水位线时,离线作业可使用的最大网络带宽 | 无 | string | +| `OfflineHighBandwidth` | 当在线作业带宽使用未达到水位线时,离线作业可使用的最大网络带宽 | 无 | string | +| `EnableNetworkQoS` | 是否启用网络 QoS 功能 | false | bool | + +## Helm + +如果需要查看完整的Helm参数,可以通过这里[Volcano Helm](https://github.com/volcano-sh/volcano/blob/master/installer/helm/chart/volcano/values.yaml)获取 diff --git a/static/img/arch_3.jpg b/static/img/arch_3.jpg new file mode 100644 index 00000000..71e9a003 Binary files /dev/null and b/static/img/arch_3.jpg differ