This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Cicada is a centralized, distributed job scheduler for Pipelinewise schedules. It acts as a lightweight management layer between Linux CRON and executables, allowing jobs to be scheduled across multiple nodes via a central database rather than local cron.
Key architectural concepts:
- Nodes/Servers: Machines that register with Cicada and pull scheduling information from the central database. They execute cicada exec_server_schedules via cron.
- Schedules: Jobs defined in the database with cron expressions, parameters, and target servers.
- SmartScheduling: A Genetic Algorithm (GA) optimization module that shifts job start times to distribute load across a 24-hour period, avoiding resource conflicts.
make dev # Create venv with dev dependencies (black, flake8, pytest)
make # Create venv with only production dependencies
The project uses a standard Python venv setup. The Makefile is the single source of truth for build commands.
make pytest # Run all tests with coverage (must be ≥80%)
To run a single test file or specific test:
. venv/bin/activate
pytest tests/test_lib_scheduler.py -v
pytest tests/test_lib_scheduler.py::test_function_name -v
make flake8 # Lint (checks E9, F63, F7, F82 only, max line length 120)
make black # Format check (line length 120)
Black is used for code style; run it with black --line-length 120 cicada/ tests/ --diff to preview changes before committing.
cicada/lib/scheduler.py
- Central scheduling logic: retrieving schedules, managing execution state, cron parsing
- Functions like get_schedule_details(), get_all_schedule_ids_per_server(), get_server_id()
- Uses croniter for cron expression parsing
- Contains SQL queries for the main schedules and servers tables
cicada/lib/postgres.py
- Database connection management and helpers
- Connection pooling and statement execution
cicada/lib/utils.py
- Utility functions and decorators for exception handling and logging
cicada/cli.py
- Command dispatcher using argparse
- Routes subcommands to handlers in cicada/commands/
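As a rough illustration of the dispatcher pattern described above, the sketch below routes argparse subcommands to handler functions. The handler bodies, the way --schedule_id is wired, and the returned strings are assumptions made for the example; the actual cicada/cli.py may be organized differently.

```python
import argparse


def show_schedule(args):
    # Hypothetical handler; a real one would query the database.
    return f"show schedule {args.schedule_id}"


def delete_schedule(args):
    # Hypothetical handler; a real one would delete the schedule row.
    return f"delete schedule {args.schedule_id}"


def build_parser():
    """Build a parser with one subparser per command."""
    parser = argparse.ArgumentParser(prog="cicada")
    subparsers = parser.add_subparsers(dest="command", required=True)
    handlers = (("show_schedule", show_schedule), ("delete_schedule", delete_schedule))
    for name, handler in handlers:
        sub = subparsers.add_parser(name)
        sub.add_argument("--schedule_id", required=True)
        # Attach the handler so dispatch() can call it without a lookup table.
        sub.set_defaults(handler=handler)
    return parser


def dispatch(argv):
    """Parse argv and call the handler registered for the chosen subcommand."""
    args = build_parser().parse_args(argv)
    return args.handler(args)
```

Storing the handler with set_defaults keeps the routing next to each subcommand definition, so adding a command only touches one place.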
Commands are located in cicada/commands/ and implement specific operations:
- exec_server_schedules.py – Main loop executed by cron on each node; fetches and runs scheduled jobs
- upsert_schedule.py, show_schedule.py, delete_schedule.py – CRUD operations on schedules
- smart_schedule.py – Invokes GA optimization (see SmartScheduling below)
- spread_schedules.py – Distributes schedules across servers
- rollback.py – Reverts SmartScheduling changes using checkpoint history
- register_server.py, archive_schedule_log.py, ping_slack.py – Administrative operations
Located in cicada/lib/SmartScheduling/
domain.py
- Schedule dataclass: represents a scheduled job with properties:
  - schedule_id, server_id, interval_mask (cron expression)
  - frequency_minutes, median_runtime_minutes
  - shift: offset in minutes applied to the job start time
  - blocklisted: flag to exclude the schedule from GA optimization
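A minimal sketch of what such a dataclass could look like is shown below. The field names come from the list above; the field types, the default values, and the start_minutes() helper are assumptions added for illustration and are not taken from domain.py.

```python
from dataclasses import dataclass


@dataclass
class Schedule:
    schedule_id: str
    server_id: int                 # type is an assumption
    interval_mask: str             # cron expression, e.g. "0 */6 * * *"
    frequency_minutes: int
    median_runtime_minutes: float
    shift: int = 0                 # offset in minutes applied to the start time
    blocklisted: bool = False      # True excludes the job from GA optimization

    def start_minutes(self, day_minutes=1440):
        """Hypothetical helper: shifted start offsets (minutes since midnight) within one day."""
        return [
            (start + self.shift) % day_minutes
            for start in range(0, day_minutes, self.frequency_minutes)
        ]
```

For example, a job that runs every six hours with a 15-minute shift starts at minutes 15, 375, 735, and 1095 of the day.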
config.py
- GAConfig dataclass: hyperparameters for the genetic algorithm
  - num_generations, sol_per_pop, mutation_percent_genes, etc.
pygad.py
- Wraps the external pygad library (genetic algorithm)
- Fitness function: evaluates how well a shift assignment distributes load
- Implements crossover and mutation operations on shifts
evaluation.py
- Scoring logic: calculates resource contention, overlap penalties, and fitness metrics
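To make the idea of an overlap penalty concrete, here is a small sketch that builds a per-minute load profile for one day and scores it. The tuple layout, the function names, and the choice of a sum of squared loads as the penalty are all assumptions for illustration; evaluation.py may use a different metric.

```python
def load_profile(jobs, day_minutes=1440):
    """Count how many jobs are running in each minute of a day.

    Each job is a (frequency_minutes, runtime_minutes, shift) tuple.
    Assumes each frequency divides the day evenly.
    """
    load = [0] * day_minutes
    for frequency, runtime, shift in jobs:
        for start in range(shift, shift + day_minutes, frequency):
            for minute in range(start, start + runtime):
                load[minute % day_minutes] += 1  # wrap runs that cross midnight
    return load


def overlap_penalty(jobs):
    """Sum of squared per-minute load; lower means the load is spread more evenly."""
    return sum(count * count for count in load_profile(jobs))
```

Squaring the load makes two jobs in the same minute cost more than the same two jobs in separate minutes, so shifting one hourly job by 30 minutes lowers the score.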
Key tables:
- servers – Registered nodes with hostname, FQDN, IP address
- schedules – Job definitions with cron expressions, parameters, execution state
- schedule_logs – Historical execution records with runtime, status, output
- snapshots – Metadata about optimization/rollback events (reason, timestamp, server_id)
- schedule_backups – Schedule state snapshots: stores interval_mask and smart_interval_mask at each snapshot for potential rollback
- schedule_changes – Linked-list audit trail of all changes to schedules (replaces older snapshot model); each entry has previous_change_id for chain traversal, changes_delta (JSON) for what changed
Database setup SQL is in setup/db_and_user.sql and setup/schema.sql. The migration script is setup/migrate_snapshots_to_changes.sql. An example schedule setup for smart scheduling is in setup/create_test_tap_setup.
- All scheduling uses standard cron format (5 fields: minute hour dom month dow)
- croniter library parses expressions and calculates next/previous execution times
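The five-field format can be illustrated with a small standard-library matcher. This is a simplified stand-in, not how croniter works: it only handles *, single values, ranges, comma lists, and /step, and it requires both the day-of-month and day-of-week fields to match, whereas standard cron treats them as alternatives when both are restricted. All function names here are hypothetical.

```python
from datetime import datetime

# Allowed value range per field: minute, hour, day of month, month, day of week.
FIELD_RANGES = ((0, 59), (0, 23), (1, 31), (1, 12), (0, 6))


def expand_field(field, low, high):
    """Expand one cron field into the set of values it allows."""
    values = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        if base == "*":
            start, end = low, high
        elif "-" in base:
            start, end = (int(x) for x in base.split("-"))
        else:
            start = end = int(base)
        values.update(range(start, end + 1, int(step) if step else 1))
    return values


def matches(expression, moment):
    """Return True if the datetime falls on a minute the expression selects."""
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected 5 cron fields")
    minute, hour, dom, month, dow = (
        expand_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES)
    )
    cron_dow = (moment.weekday() + 1) % 7  # cron: 0 = Sunday; Python: 0 = Monday
    return (
        moment.minute in minute
        and moment.hour in hour
        and moment.day in dom
        and moment.month in month
        and cron_dow in dow
    )
```

With this sketch, the expression */15 2 * * * matches 02:30 but not 02:31, and 0 0 * * 0 matches midnight on a Sunday.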
- Jobs are executed as shell commands by exec_server_schedules
- Commands can include parameters via template substitution
- Outputs and exit codes are logged to schedule_logs table
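The execution flow above can be sketched roughly as follows. The use of string.Template for parameter substitution, the run_job name, and the shape of the returned record are assumptions for illustration; the source does not specify the template syntax or how rows are written to schedule_logs.

```python
import subprocess
from string import Template


def run_job(command_template, parameters):
    """Render the command, run it in a shell, and return a log-style record."""
    # Assumed syntax: $name placeholders filled in from the parameters mapping.
    command = Template(command_template).substitute(parameters)
    completed = subprocess.run(command, shell=True, capture_output=True, text=True)
    return {
        "command": command,
        "exit_code": completed.returncode,
        "output": completed.stdout + completed.stderr,
    }
```

A caller could then persist the returned record, for example run_job("echo $greeting", {"greeting": "hello"}), to whatever log store it uses.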
- Database connection details from config/definitions.yml (user must create from config/example.yml)
- Each command may accept CLI flags (e.g., --schedule_id, --adhoc_execute)
- Load schedules: Fetch all schedules for a server via get_schedules_per_server()
- Create Schedule objects: Convert schedule details to Schedule instances; filter unsupported schedules (irregular cron, too frequent, blocklisted)
- Run GA optimization: PyGAD evolves shifts over N generations to minimize resource conflicts
- Apply and checkpoint: Save optimized shifts back to DB; record change entry via record_schedule_change() for audit trail and rollback
Cicada supports two rollback mechanisms:
Full Rollback (--full flag):
- Sets smart_interval_mask = NULL for affected schedules, reverting to original interval_mask
- Works per-schedule or per-server
- Records a ROLLBACK_FULL change entry in schedule_changes
Rollback to Specific Change (--change-id flag):
- Uses linked-list traversal via compute_schedule_state_at_change() to reconstruct schedule state at any historical change
- Requires schedule_id and change_id
- Records a ROLLBACK_TO_CHANGE entry documenting what was restored
- Marks the target change as reverted
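The linked-list reconstruction can be pictured with the sketch below. It is not the implementation of compute_schedule_state_at_change(); the state_at_change name, the in-memory dictionary standing in for the schedule_changes table, and the assumption that each changes_delta holds the new values of the fields it changed are all hypothetical.

```python
def state_at_change(changes, change_id):
    """Rebuild a schedule's fields as they were right after change_id.

    changes maps change_id -> {"previous_change_id": ..., "changes_delta": {...}}.
    """
    chain = []
    current = change_id
    # Walk backwards through the chain until the first recorded change.
    while current is not None:
        entry = changes[current]
        chain.append(entry["changes_delta"])
        current = entry["previous_change_id"]
    state = {}
    # Replay deltas oldest first so later changes override earlier ones.
    for delta in reversed(chain):
        state.update(delta)
    return state
```

Because each entry only points at its predecessor, changes recorded after the target are never visited, which is what makes restoring an arbitrary historical state possible.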
Change History (--history flag):
- Displays complete audit trail for a schedule via get_schedule_history()
- Each entry shows reason, timestamp, and delta (what changed)
Migration Note: The old snapshots/schedule_backups model supported only the last 3 snapshots. The new schedule_changes model retains unlimited history via its linked-list structure.
Tests are in tests/ and use pytest with fixtures:
- test_functional_main.py – Integration tests for the main execution loop
- test_functional_cli_entrypoint.py – CLI command tests
- test_functional_spread_schedules.py – SmartScheduling and load distribution tests
- test_lib_scheduler.py – Unit tests for scheduler utility functions
- test_lib_postgres.py – Database connection tests
Mock fixtures often include a test PostgreSQL database or in-memory alternatives. Freezegun is used for time-based testing.
- Create a new file in cicada/commands/ with a main() function
- Import and add an entry point in cicada/cli.py
- Add tests in tests/test_functional_cli_entrypoint.py
- Edit cicada/lib/scheduler.py for core logic changes (e.g., new state transitions)
- Update cicada/lib/SmartScheduling/domain.py if Schedule validation rules change
- Update tests in test_lib_scheduler.py to cover new behavior
- Modify SQL in setup/schema.sql (note: existing deployments require migration scripts)
- Update query strings in scheduler.py and corresponding test fixtures
- PostgreSQL only: No other database is supported (versions 12.9–15.14 verified)
- No external APIs: Uses only core Python and database; runs offline
- Cron safety: Jobs execute only when registered server node is running; they respect cron expressions and database state
- Rollback support: SmartScheduling changes can be rolled back via checkpoints stored in the database
- Line length: Maximum 120 characters (enforced by Black and Flake8)
- Code coverage: Must maintain ≥80% test coverage for commits