
Conversation

@jimchamp
Collaborator

@jimchamp jimchamp commented Dec 26, 2025

Towards #10206

Makes the following changes to the cron wrapper script:

  • Updates the script to read its configuration from openlibrary.yml
  • Replaces job names with Sentry monitor slugs
  • Adds Sentry cron monitoring to the wrapped script (a sketch of the approach follows below)
  • Removes Sentry exception capturing for failed jobs
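
As a rough sketch of the cron-monitoring approach referenced above (not the wrapper's actual code; the DSN, monitor slug, and command are placeholders), Sentry's cron monitor can wrap the job invocation as a context manager so that check-ins are reported without manually capturing exceptions:

import subprocess

import sentry_sdk
from sentry_sdk.crons import monitor

sentry_sdk.init(dsn="https://publicKey@o0.ingest.sentry.io/0")  # DSN would come from openlibrary.yml

# monitor() sends an "in progress" check-in on entry and an "ok" or "error"
# check-in on exit, depending on whether an exception escaped the block.
with monitor(monitor_slug="example-cron-job"):
    subprocess.run(["/bin/true"], check=True)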

Special Deployment Procedures

This branch must be deployed along with internetarchive/olsystem#297

Technical

Testing

Screenshot

Stakeholders

@jimchamp jimchamp added the Needs: Special Deploy This PR will need a non-standard deploy to production label Dec 26, 2025
@jimchamp jimchamp marked this pull request as ready for review December 27, 2025 01:03
Copilot AI review requested due to automatic review settings December 27, 2025 01:03
Contributor

Copilot AI left a comment

Pull request overview

This PR updates the cron wrapper script to integrate with Sentry cron monitoring and migrate from INI-based configuration to YAML-based configuration (openlibrary.yml). The changes replace job names with Sentry monitor slugs and add native Sentry cron monitoring while removing exception capturing for failed jobs.

Key Changes:

  • Migrated configuration reading from ConfigParser (INI format) to YAML format (see the sketch after this list)
  • Integrated Sentry's cron monitor context manager for tracking job execution
  • Replaced job_name parameter with monitor_slug for better Sentry integration
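
A minimal sketch of what the YAML-based read might look like, assuming the admin.statsd_server and sentry_cron_jobs keys that appear in the hunks quoted later in this review (the file path and exact key layout here are assumptions):

import yaml

with open("/olsystem/etc/openlibrary.yml") as f:  # illustrative path
    config = yaml.safe_load(f)

# safe_load yields plain dicts, so ConfigParser-style calls such as
# config.has_section("statsd") are replaced by dict.get with defaults.
statsd_server = config.get("admin", {}).get("statsd_server", "")
sentry_cfg = config.get("sentry_cron_jobs", {})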

Comment on lines +21 to 24
statsd_host_and_port = statsd_server.split(":")
self.statsd_client = self._setup_statsd(
    statsd_host_and_port[0], statsd_host_and_port[1]
)

Copilot AI Dec 27, 2025

The statsd_server string is split without validation, which will cause an IndexError if the string is empty or doesn't contain a colon separator. Additionally, the port value needs to be converted to an integer for StatsClient. Add validation to ensure statsd_server is in the expected "host:port" format and convert the port to int before passing to _setup_statsd.

Suggested change
-statsd_host_and_port = statsd_server.split(":")
-self.statsd_client = self._setup_statsd(
-    statsd_host_and_port[0], statsd_host_and_port[1]
-)
+self.statsd_client = None
+if statsd_server and ":" in statsd_server:
+    host, port_str = statsd_server.split(":", 1)
+    try:
+        port = int(port_str)
+    except ValueError:
+        port = None
+    if host and port is not None:
+        self.statsd_client = self._setup_statsd(host, port)
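
For context, _setup_statsd is not part of this hunk; a plausible shape for it, assuming the statsd package's StatsClient (the class name and metric prefix below are illustrative, not the wrapper's actual code):

from statsd import StatsClient

class CronJobRunner:  # hypothetical stand-in for the wrapper class
    def _setup_statsd(self, host: str, port: int) -> StatsClient:
        # StatsClient expects a numeric port, hence the int() conversion suggested above.
        return StatsClient(host=host, port=port, prefix="ol.cron")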

se = RuntimeError(f"Subprocess failed: {e}\n{e.stderr}")
sentry_sdk.capture_exception(se)
self.job_failed = True
sys.stderr.write(e.stderr)

Copilot AI Dec 27, 2025

The stderr attribute may be None if the subprocess doesn't capture stderr or if the error occurs before stderr is populated. Attempting to write None to sys.stderr.write() will cause a TypeError. Add a check to ensure e.stderr is not None before writing it.

Suggested change
-sys.stderr.write(e.stderr)
+if e.stderr is not None:
+    sys.stderr.write(e.stderr)
+else:
+    sys.stderr.write(str(e))
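
To illustrate when e.stderr can be None: CalledProcessError only carries captured output if the command was run with output capture enabled, so both branches above are reachable (the command here is a placeholder):

import subprocess

try:
    # With capture_output=True, e.stderr is a string (possibly empty);
    # without it (or stderr=subprocess.PIPE), e.stderr is None.
    subprocess.run(["false"], check=True, capture_output=True, text=True)
except subprocess.CalledProcessError as e:
    print(repr(e.stderr))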

def _after_run(self):
    if self.statsd_client:
        self.job_timer.stop()
        status = "failure" if self.job_failed else "stop"

Copilot AI Dec 27, 2025

The status label "stop" for successful job completion is unclear and inconsistent with conventional metric naming. Consider using "success" instead to make the metric's meaning more explicit and align with common monitoring conventions where job outcomes are typically labeled as "success" or "failure".

Suggested change
-status = "failure" if self.job_failed else "stop"
+status = "failure" if self.job_failed else "success"

statsd_cfg = dict(config["statsd"]) if config.has_section("statsd") else None
sentry_cfg = config.get("sentry_cron_jobs", {})
statsd_server = config.get("admin", {}).get("statsd_server", "")
config = None

Copilot AI Dec 27, 2025

Setting config to None immediately after reading it creates potential issues if the config needs to be referenced later and makes the code confusing. This line appears to be an attempt at memory cleanup but is unnecessary in Python's garbage collection model. Consider removing this line as it reduces code clarity without providing benefits.

Suggested change
-config = None

def _read_config(config_path):
    if not Path(config_path).exists():
        raise FileNotFoundError("Missing cron-wrapper configuration file")

Copilot AI Dec 27, 2025

The error message "Missing cron-wrapper configuration file" is misleading because it doesn't indicate which file path was checked. Consider including the config_path in the error message to help users diagnose configuration issues more easily.

Suggested change
-raise FileNotFoundError("Missing cron-wrapper configuration file")
+raise FileNotFoundError(f"Missing cron-wrapper configuration file: {config_path}")
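
Putting the suggestion together, a complete _read_config along these lines (YAML parsing assumed, per the migration described in the overview) would raise a path-bearing error and return plain dicts:

from pathlib import Path

import yaml

def _read_config(config_path: str) -> dict:
    path = Path(config_path)
    if not path.exists():
        # Including the path makes a misconfigured crontab entry easier to diagnose.
        raise FileNotFoundError(f"Missing cron-wrapper configuration file: {config_path}")
    with path.open() as f:
        return yaml.safe_load(f)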

@mekarpeles mekarpeles self-assigned this Dec 29, 2025
@mekarpeles
Member

mekarpeles commented Dec 29, 2025

Labels

Needs: Special Deploy This PR will need a non-standard deploy to production
