Consolidate S3 usage in spark operator blueprint #868

@alanty

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

What is the outcome that you are trying to reach?

The Spark operator solution includes an S3 Express bucket and a standard S3 bucket, but the examples are inconsistent about where they read source data and where they write Spark event logs. The Spark History Server reads events from the standard S3 bucket, which not all of the examples use.

Describe the solution you would like

Ideally, all Spark event logs would be written to the same location, and the S3 Express bucket could be used in some examples to demonstrate its configuration and performance. The docs and code can be updated to make clear which location each example uses.
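
A rough sketch of what this could look like, so the idea is concrete. The bucket names, prefixes, and the `job_settings` helper are hypothetical and not taken from the blueprint; only the Spark property names (`spark.eventLog.enabled`, `spark.eventLog.dir`, `spark.history.fs.logDirectory`) are real. It also assumes the examples reach both buckets through the `s3a://` scheme.

```python
from typing import Dict

# Hypothetical bucket names; the blueprint's real names are not stated in this issue.
STANDARD_BUCKET = "example-spark-standard-bucket"
# S3 Express directory buckets use the "<base-name>--<zone-id>--x-s3" naming format.
EXPRESS_BUCKET = "example-spark-express--usw2-az1--x-s3"

# Single shared event log location (assumed prefix), always in the standard bucket.
EVENT_LOG_DIR = f"s3a://{STANDARD_BUCKET}/spark-event-logs/"

# The Spark History Server reads from the same location the jobs write to.
HISTORY_SERVER_CONF: Dict[str, str] = {
    "spark.history.fs.logDirectory": EVENT_LOG_DIR,
}


def job_settings(use_express: bool, data_prefix: str = "input/") -> dict:
    """Return Spark conf and input path for one example job.

    Only the source data location depends on the bucket choice;
    the event log directory is the same for every example.
    """
    data_bucket = EXPRESS_BUCKET if use_express else STANDARD_BUCKET
    return {
        "conf": {
            "spark.eventLog.enabled": "true",
            "spark.eventLog.dir": EVENT_LOG_DIR,
        },
        "input_path": f"s3a://{data_bucket}/{data_prefix}",
    }


if __name__ == "__main__":
    for use_express in (False, True):
        settings = job_settings(use_express)
        print(settings["input_path"], "->", settings["conf"]["spark.eventLog.dir"])
```

With something like this, an example that reads from the S3 Express bucket would still show up in the Spark History Server, since its events land in the standard bucket alongside everything else.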

Describe alternatives you have considered

We've debated using only one bucket or the other, but having examples for both is handy.

Additional context
