What feature would you like to be added?
Add a new CustomResourceDefinition (CRD) called SparkCluster to enable declarative lifecycle management of Spark Standalone clusters on Kubernetes.
Why is this needed?
Currently, the Spark Operator supports only cluster-mode Spark applications via the ScheduledSparkApplication and SparkApplication CRDs. However, there is no native way to deploy and manage a long-running Spark Standalone cluster (with dedicated Master and Worker nodes) on Kubernetes using the operator.
Describe the solution you would like
Add a new CRD called SparkCluster and implement the corresponding controller logic. An example manifest might look like the following:
```yaml
apiVersion: sparkoperator.k8s.io/v1alpha1
kind: SparkCluster
metadata:
  name: spark
  namespace: default
spec:
  sparkVersion: 4.0.1
  sparkConf:
    spark.master.ui.port: "8080"
    spark.master.ui.title: "Example Spark Cluster"
  master:
    template:
      metadata:
        labels:
          spark.apache.org/version: 4.0.1
        annotations:
          spark.apache.org/version: 4.0.1
      spec:
        containers:
        - name: spark-master
          image: docker.io/library/spark:4.0.1
          imagePullPolicy: Always
          resources:
            requests:
              cpu: 1
              memory: 4Gi
            limits:
              cpu: 1
              memory: 4Gi
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
            runAsNonRoot: true
            runAsUser: 185
            runAsGroup: 185
            seccompProfile:
              type: RuntimeDefault
        securityContext:
          # fsGroup applies at the pod level, not the container level
          fsGroup: 185
  workerGroups:
  - name: group0
    replicas: 2
    template:
      metadata:
        labels:
          spark.apache.org/version: 4.0.1
        annotations:
          spark.apache.org/version: 4.0.1
      spec:
        containers:
        - name: spark-worker
          image: docker.io/library/spark:4.0.1
          imagePullPolicy: Always
          resources:
            requests:
              cpu: 1
              memory: 4Gi
            limits:
              cpu: 1
              memory: 4Gi
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
            runAsNonRoot: true
            runAsUser: 185
            runAsGroup: 185
            seccompProfile:
              type: RuntimeDefault
        securityContext:
          fsGroup: 185
  - name: group1
    replicas: 1
    template:
      metadata:
        labels:
          spark.apache.org/version: 4.0.1
        annotations:
          spark.apache.org/version: 4.0.1
      spec:
        containers:
        - name: spark-worker
          image: docker.io/library/spark:4.0.1
          imagePullPolicy: Always
          resources:
            requests:
              cpu: 2
              memory: 8Gi
            limits:
              cpu: 2
              memory: 8Gi
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
            runAsNonRoot: true
            runAsUser: 185
            runAsGroup: 185
            seccompProfile:
              type: RuntimeDefault
        securityContext:
          fsGroup: 185
```
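To make the API shape concrete, here is a minimal sketch of what the Go types backing this CRD could look like, assuming the operator's usual kubebuilder-style conventions. All type and field names below are illustrative, not a settled design:

```go
// Package v1alpha1 sketches hypothetical API types for the proposed
// SparkCluster CRD, mirroring the example manifest above.
package v1alpha1

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// SparkClusterSpec describes a Spark Standalone cluster: one master and
// one or more groups of workers, each with its own pod template.
type SparkClusterSpec struct {
	// SparkVersion is the Spark distribution version to run.
	SparkVersion string `json:"sparkVersion"`

	// SparkConf holds key/value pairs rendered into spark-defaults.conf.
	SparkConf map[string]string `json:"sparkConf,omitempty"`

	// Master configures the Spark master pod.
	Master MasterSpec `json:"master"`

	// WorkerGroups configures one or more homogeneous sets of workers.
	WorkerGroups []WorkerGroupSpec `json:"workerGroups"`
}

// MasterSpec wraps the pod template for the master.
type MasterSpec struct {
	Template corev1.PodTemplateSpec `json:"template"`
}

// WorkerGroupSpec defines one homogeneous set of worker pods.
type WorkerGroupSpec struct {
	// Name distinguishes this group, e.g. "group0".
	Name string `json:"name"`
	// Replicas is the number of worker pods in this group.
	Replicas int32 `json:"replicas"`
	// Template is the pod template shared by workers in this group.
	Template corev1.PodTemplateSpec `json:"template"`
}

// SparkCluster is the top-level custom resource.
type SparkCluster struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec              SparkClusterSpec `json:"spec"`
}
```

With types along these lines, the controller could reconcile each SparkCluster into a headless Service for the master plus one StatefulSet per worker group, with workers configured to join at spark://<master-service>:7077 (Spark's default master port); that split is one possible design, not a commitment.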
Describe alternatives you have considered
No response
Additional context
apache/spark-kubernetes-operator already supports a SparkCluster CRD.
Love this feature?
Give it a 👍. We prioritize the features with the most 👍.