Decouple Spark Operator from ingress-nginx #2781

@jonerer

What feature you would like to be added?

I would like the Spark operator's Ingress management to be decoupled from ingress-nginx.

Why is this needed?

ingress-nginx is deprecated and will reach end of life (EOL) in March 2026. After that, its usage will disappear.

Describe the solution you would like

Currently, the Ingress objects are created by the operator at runtime, using parameters passed to the Helm chart. For instance, inputting this:

  uiIngress:
    enable: true
    ingressClassName: traefik

    urlFormat: /spark

    annotations:
      kubernetes.io/tls-acme: "true"
      traefik.ingress.kubernetes.io/router.middlewares: "oauth2-proxy-oauth-errors@kubernetescrd,traefik-oauth2-proxy-auth@kubernetescrd"

Will yield an Ingress object like this:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/tls-acme: "true"
    nginx.ingress.kubernetes.io/rewrite-target: /$2
    traefik.ingress.kubernetes.io/router.middlewares: oauth2-proxy-oauth-errors@kubernetescrd,traefik-oauth2-proxy-auth@kubernetescrd
  name: spark-pi2-ui-ingress
spec:
  ingressClassName: traefik
  rules:
  - http:
      paths:
      - backend:
          service:
            name: spark-pi-ui-svc
            port:
              number: 4040
        path: /spark(/|$)(.*)
        pathType: ImplementationSpecific

Note that the "path:" field has been altered from the Helm value input, and the "rewrite-target" annotation has been added. In my setup the annotation doesn't do anything, and the path is broken: the regex form is specific to ingress-nginx (hence "ImplementationSpecific").

In my case I'm using Traefik, which has two requirements: 1) the path must not be a regex, and 2) I need to be able to create a "Middleware" object for path rewriting.
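To make those two requirements concrete, here is a rough sketch of what I would want to end up with on a Traefik cluster. The namespace `spark` and the Middleware name `spark-strip-prefix` are assumptions made up for illustration; the service name and port are taken from the generated Ingress above.

```yaml
# Sketch only: namespace and Middleware name are hypothetical.
# Older Traefik releases use apiVersion traefik.containo.us/v1alpha1 instead.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: spark-strip-prefix
  namespace: spark
spec:
  stripPrefix:
    prefixes:
      - /spark        # plain prefix, removed before forwarding to the UI
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: spark-pi-ui-ingress
  namespace: spark
  annotations:
    # Traefik references a Middleware CRD as <namespace>-<name>@kubernetescrd
    traefik.ingress.kubernetes.io/router.middlewares: spark-spark-strip-prefix@kubernetescrd
spec:
  ingressClassName: traefik
  rules:
    - http:
        paths:
          - path: /spark          # no regex, no capture groups
            pathType: Prefix
            backend:
              service:
                name: spark-pi-ui-svc
                port:
                  number: 4040
```

The point is that the path stays a plain prefix and the rewriting lives in a controller-specific object that I manage myself, rather than in an nginx annotation injected by the operator.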

I'm not suggesting that spark-operator should accommodate Traefik specifically, just that it should get out of the way and let the cluster operator do the setup that's needed for their cluster.

Suggestion:

  1. Have the "Ingress" object be created during Helm template rendering, not at runtime by the operator

  2. Don't alter the path input

  3. Make sure all requests go to the "controller", and make sure that it in turn proxies the responses from the active application Pods

    The JupyterHub operator does something similar; the Ingress points to a "proxy" Pod like so:

  rules:
  - host: <host>
    http:
      paths:
      - backend:
          service:
            name: proxy-public
            port:
              name: http
        path: /
        pathType: Prefix
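Points 1 and 2 could look roughly like the following chart template. This is a sketch under assumptions, not the chart's actual layout: it reuses the `uiIngress` values shown earlier, treats `urlFormat` as a plain path, and the backend Service name and port are hypothetical placeholders for whatever the controller would expose.

```yaml
{{- /* Sketch only: rendered by Helm, nothing is created at runtime. */}}
{{- if .Values.uiIngress.enable }}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: {{ .Release.Name }}-ui-ingress
  {{- with .Values.uiIngress.annotations }}
  annotations:
    # Only the annotations supplied by the user, nothing injected
    {{- toYaml . | nindent 4 }}
  {{- end }}
spec:
  ingressClassName: {{ .Values.uiIngress.ingressClassName }}
  rules:
    - http:
        paths:
          # The path is passed through exactly as given in the values
          - path: {{ .Values.uiIngress.urlFormat | quote }}
            pathType: Prefix
            backend:
              service:
                # Hypothetical Service in front of the controller's proxy
                name: {{ .Release.Name }}-controller-ui
                port:
                  number: 4040
{{- end }}
```

With a static Ingress like this, any controller-specific rewriting (nginx annotations, Traefik Middlewares, and so on) is left to whoever installs the chart.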

Describe alternatives you have considered

No response

Additional context

I'm not sure how important or useful the dashboard is in the spark-operator, since it's ephemeral and short-lived. This is just a suggestion from a k8s cluster operator on how to solve an existing issue.

Love this feature?

Give it a 👍 We prioritize the features with most 👍
