
dry run mode for --jira-ticket parameter in paasta spark-run #164


Open · wants to merge 7 commits into master

Conversation

@sidsatyelp commented Jun 4, 2025

What this change does

  1. Extracts the jira_ticket handling into a separate private method. All tests still run through the public interface get_spark_conf().
  2. Splits the requirements file in two: requirements-oss.txt with all the previous dependencies, and requirements-yelp.txt, which adds yelp-clog. We need yelp-clog to write warning messages to monk when the --jira-ticket parameter is not passed.
  3. Updates setup.py accordingly to expose the extras (see the sketch after this list). Going forward, paasta and any other dependent repo will need to specify the version as service-configuration-lib[yelp] >= 3.3.3. PaaSTA and spark_tools depend on this lib.
  4. Sets up two test targets in the Makefile: one for our yelpy environment that includes yelp-clog, and one for oss.
  5. When the flag spark.yelp.jira_ticket.enabled is false in srv-configs, logs a warning to monk with the user param passed to get_spark_conf(). This ensures paasta validate and paasta m-f-d (mark-for-deployment) work as usual.
  6. When the flag is true, blocks non-exempt users.
  7. Adds jenkins to the allowed users list (proof that this user is indeed called jenkins).
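
For context, a minimal sketch of the extras approach from items 2-4. The file names match the PR description, but the exact setup.py contents here are illustrative, not copied from this diff:

# setup.py (sketch, not the actual file in this PR)
from setuptools import find_packages, setup

# everything that previously lived in requirements.txt
with open("requirements-oss.txt") as f:
    install_requires = f.read().splitlines()

setup(
    name="service-configuration-lib",
    packages=find_packages(exclude=("tests*",)),
    install_requires=install_requires,
    extras_require={
        # installed via: pip install "service-configuration-lib[yelp] >= 3.3.3"
        "yelp": ["yelp-clog==7.2.0"],
    },
)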

Testing

When flag is disabled

# should launch with no warning but write to monk
paasta spark-run --aws-profile=dev-cloud-economics --cmd "spark-submit /code/integration_tests/s3.py"
# output: https://fluffy.yelpcorp.com/i/xZstCSFXx0L4sKBQnThv2p42DnC7vMZn.html

scribereader -e devc spark_jira_ticket -n 100
{"timestamp": 1749689450, "event": "jira_ticket_validation_warning", "level": "WARNING", "reason": "Ticket missing or invalid. See http://y/spark-jira-ticket-param", "user": "sids", "jira_ticket_provided": null}
{"timestamp": 1749693332, "event": "jira_ticket_validation_warning", "level": "WARNING", "reason": "Ticket missing or invalid. See http://y/spark-jira-ticket-param", "user": "sids", "jira_ticket_provided": "BLAH"}
{"timestamp": 1749670473, "event": "jira_ticket_validation_warning", "level": "WARNING", "reason": "Ticket missing or invalid. See http://y/spark-jira-ticket-param", "user": "sids", "jira_ticket_provided": null}

# should launch with labels attached to pods
paasta spark-run --aws-profile=dev-cloud-economics --jira-ticket=CLOUD-802 --cmd "spark-submit /code/integration_tests/s3.py"
# output: https://fluffy.yelpcorp.com/i/kLSxKxmtk2qB10mh2TTm123ZSHkCh8KN.html

# paasta validate works
paasta validate -y ~/repos/yelpsoa-configs -s spark
✓ Successfully validated schema: tron-pnw-prod.yaml
✓ Successfully validated schema: tron-pnw-devc.yaml
✓ Successfully validated schema: adhoc-pnw-prod.yaml
✓ Successfully validated schema: eks-pnw-prod.yaml
✓ Successfully validated schema: eks-pnw-devc.yaml
✓ Successfully validated schema: adhoc-pnw-devc.yaml
✓ Successfully validated schema: adhoc-pnw-prod-spark.yaml
✓ Successfully validated schema: adhoc-norcal-devc.yaml
✓ Successfully validated schema: service.yaml
✓ Successfully validated schema: kubernetes-pnw-prod.yaml
✓ Successfully validated schema: kubernetes-pnw-devc.yaml
✓ tron-pnw-prod.yaml is valid.
✓ tron-pnw-devc.yaml is valid.
✓ All PaaSTA Instances for are valid for all clusters
✓ All spark's instance names in cluster norcal-devc are unique
✓ All spark's instance names in cluster norcal-stagef are unique
✓ All spark's instance names in cluster pnw-devc are unique
✓ All spark's instance names in cluster pnw-prod are unique
✓ All spark's instance names in cluster pnw-prod-spark are unique
✓ No orphan secrets found

When flag is enabled

# should not launch
paasta spark-run --aws-profile=dev-cloud-economics --cmd "spark-submit /code/integration_tests/s3.py"
# output: https://fluffy.yelpcorp.com/i/qp6Q4200DsVGl3t9FKp7n2bMGt8l5mZ9.html

# should launch with labels attached to pods
paasta spark-run --aws-profile=dev-cloud-economics --jira-ticket=CLOUD-802 --cmd "spark-submit /code/integration_tests/s3.py"
# output: https://fluffy.yelpcorp.com/i/N40kkpV06BqpXGvvPWNsFlcdwrFk5Pn3.html

# paasta validate works 
paasta validate -y ~/repos/yelpsoa-configs -s spark
✓ Successfully validated schema: tron-pnw-prod.yaml
✓ Successfully validated schema: tron-pnw-devc.yaml
✓ Successfully validated schema: adhoc-pnw-prod.yaml
✓ Successfully validated schema: eks-pnw-prod.yaml
✓ Successfully validated schema: eks-pnw-devc.yaml
✓ Successfully validated schema: adhoc-pnw-devc.yaml
✓ Successfully validated schema: adhoc-pnw-prod-spark.yaml
✓ Successfully validated schema: adhoc-norcal-devc.yaml
✓ Successfully validated schema: service.yaml
✓ Successfully validated schema: kubernetes-pnw-prod.yaml
✓ Successfully validated schema: kubernetes-pnw-devc.yaml
✓ tron-pnw-prod.yaml is valid.
✓ tron-pnw-devc.yaml is valid.
✓ All PaaSTA Instances for are valid for all clusters
✓ All spark's instance names in cluster norcal-devc are unique
✓ All spark's instance names in cluster norcal-stagef are unique
✓ All spark's instance names in cluster pnw-devc are unique
✓ All spark's instance names in cluster pnw-prod are unique
✓ All spark's instance names in cluster pnw-prod-spark are unique
✓ No orphan secrets found

jira_ticket: The Jira ticket provided by the user
"""
# Get the jira ticket validation setting
flag_enabled = self.mandatory_default_spark_srv_conf.get('spark.yelp.jira_ticket.enabled', 'false')
Member:

are we storing a string or bool in srv-configs for this?

Member Author:

string
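
Since srv-configs stores the flag as a string, the check must compare against the literal 'true'; truthiness would be wrong because any non-empty string, including 'false', is truthy in Python. A hypothetical helper illustrating the pattern (the function name is mine, not from this PR):

def _is_jira_ticket_check_enabled(srv_conf: dict) -> bool:
    # the flag is stored as a string in srv-configs, so compare
    # string literals rather than relying on truthiness
    return srv_conf.get('spark.yelp.jira_ticket.enabled', 'false') == 'true'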

requirements.txt Outdated
@@ -4,3 +4,4 @@ pyyaml >= 3.0
typing-extensions==4.13.2
# To resolve the error: botocore 1.29.125 has requirement urllib3<1.27,>=1.25.4, but you'll have urllib3 2.0.1 which is incompatible.
urllib3==1.26.15
yelp-clog==7.2.0
Member:

I have no idea what we're doing with this requirements file - but that's probably a problem to fix later.

That said: we'll probably need to split things up into a yelpy and an oss requirements.txt and pick between them based on where tests are being run from (see Tron and PaaSTA for examples).

Additionally, we'll need to figure out what to do with https://github.com/Yelp/service_configuration_lib/blob/master/setup.py#L30-L36 - which is how apps that use this library know what dependencies to pull in.

I think we'll either need to have install_requires read from a file that we pick between in setup.py based on where we're being run from, or add yelp-clog as an extra.

@@ -217,3 +227,161 @@ def get_spark_driver_memory_overhead_mb(spark_conf: Dict[str, str]) -> float:
)
driver_mem_overhead_mb = driver_mem_mb * driver_mem_overhead_factor
return round(driver_mem_overhead_mb, 5)


def _load_default_service_configurations_for_clog() -> Optional[Dict[str, Any]]:
Member:

I'm pretty sure we can obviate all of this if we do something like what we do in paasta:

try:
    import clog
except ImportError:
    clog = None

...

if clog is None:
    print("CLog logger unavailable, exiting.", file=sys.stderr)
    return

clog.config.configure(
    scribe_host="169.254.255.254",
    scribe_port=1463,
    monk_disable=False,
    scribe_disable=False,
)
...

clog.log_line(STREAM_WE_WANT_TO_LOG_TO, MESSAGE_WE_WANT_TO_LOG)

(see https://github.com/Yelp/paasta/blob/d02ed19fd118de1439bb5aa230ce058b51b8b835/paasta_tools/oom_logger.py)

Member Author:

Thank you for this suggestion. It worked with one change; mypy flagged the port type:

service_configuration_lib/spark_config.py:1059: error: Argument "scribe_port" to "configure" has incompatible type "int"; expected "Optional[str]" [arg-type]
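
A sketch of the adjusted call, reconstructed from the error message and the snippet above. Passing the port as a string is my inference from the mypy output, the helper name is hypothetical, and the stream name is taken from the scribereader command in the Testing section:

import json
from typing import Optional

try:
    import clog
except ImportError:
    clog = None

def _emit_jira_ticket_warning(user: str, jira_ticket: Optional[str]) -> None:
    # skip silently in oss environments where yelp-clog is not installed
    if clog is None:
        return
    clog.config.configure(
        scribe_host="169.254.255.254",
        scribe_port="1463",  # str, not int: mypy expects Optional[str]
        monk_disable=False,
        scribe_disable=False,
    )
    clog.log_line(
        "spark_jira_ticket",
        json.dumps({
            "event": "jira_ticket_validation_warning",
            "user": user,
            "jira_ticket_provided": jira_ticket,
        }),
    )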

@sidsatyelp requested a review from nemacysts on June 12, 2025