feat: add rest api server proposal #2517
Signed-off-by: Arsen Gumin <[email protected]>
#### Story 1
As a data engineer, I want to submit Spark jobs by sending a single HTTP request from my CI pipeline, so I don’t need to install or configure `kubectl` on my build agents.

#### Story 2
As a platform operator, I want to integrate Spark job submission into our internal web portal using REST calls, so that users can launch jobs without learning Kubernetes details.

#### Story 3
As a user without Kubernetes expertise, I want to use a familiar HTTP API to submit Spark jobs, so I don’t need direct cluster access or knowledge of `kubectl` commands.
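
To make Story 1 concrete, here is a minimal sketch of what a single-request submission from a CI step might look like. It is an illustration under assumptions, not the proposed API: the endpoint URL, the helper name `build_submit_request`, and the choice to send the `SparkApplication` manifest as the JSON body are all hypothetical, since the excerpt above does not define the request format. Only the manifest fields follow the existing `sparkoperator.k8s.io/v1beta2` `SparkApplication` resource.

```python
import json
import urllib.request

# Hypothetical endpoint: the proposal excerpt does not specify a URL scheme.
API_URL = "http://spark-operator-api.example.internal/sparkapplications"


def build_submit_request(name, namespace, main_file, image):
    """Build (but do not send) a POST request carrying a SparkApplication."""
    manifest = {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "Python",
            "mode": "cluster",
            "image": image,
            "mainApplicationFile": main_file,
            "sparkVersion": "3.5.0",
            "driver": {"cores": 1, "memory": "512m"},
            "executor": {"instances": 1, "cores": 1, "memory": "512m"},
        },
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(manifest).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_submit_request(
        name="pi",
        namespace="default",
        main_file="local:///opt/spark/examples/src/main/python/pi.py",
        image="spark:3.5.0",
    )
    # A CI job would pass `req` to urllib.request.urlopen(); here we only
    # print what would be sent, so the sketch runs without a live server.
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Because the request is built with the standard library only, a build agent would need neither `kubectl` nor a kubeconfig, which is the point of the story; authentication and error handling are left out of the sketch.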
Thank you for preparing this proposal @aagumin!
For Data Engineers and ML Engineers who would like to work with PySpark and interact with Spark Operator, but don't want to learn Kubernetes or CRDs, can't we integrate with the Kubeflow SDK?
Kubeflow SDK KEP: https://docs.google.com/document/d/1rX7ELAHRb_lvh0Y7BK1HBYAbA0zi9enB0F_358ZC58w/edit?tab=t.0
Repository: https://github.com/kubeflow/sdk
As we discussed in the proposal, we can create a dedicated SparkClient() for CRUD operations, so users can quickly create their Spark Applications and orchestrate them with Spark Operator without learning Kubernetes.
For example, folks are already working on it as part of this work: #2422
It is a great topic to discuss at the next Spark Operator call: https://bit.ly/3VGzP4n
Would love to hear your feedback, @aagumin @Shekharrajak @lresende @shravan-achar @akshaychitneni @vara-bonthu @yuchaoran2011 @bigsur0 @jacobsalway @ChenYi015 @franciscojavierarceo @astefanutti @rimolive!
Purpose of this PR
Proposed changes:
If this looks good, I can start working on the implementation.