
add config-driven scripts for htcondor #44

Open
gsharma99 wants to merge 7 commits into iris-hep:hpc_workflow_management from gsharma99:feature/config-driven-pipeline

Conversation

@gsharma99
Contributor

@gsharma99 gsharma99 commented Feb 5, 2026

Related issue #12

cc @JaySandesara

@gsharma99 gsharma99 force-pushed the feature/config-driven-pipeline branch from fb4656f to d2168f6 on February 5, 2026 02:41
@gsharma99 gsharma99 marked this pull request as draft February 5, 2026 07:43
@gsharma99 gsharma99 marked this pull request as ready for review February 5, 2026 07:46
@gsharma99 gsharma99 force-pushed the feature/config-driven-pipeline branch from c2333e6 to aab2c1b on February 6, 2026 05:33
@gsharma99 gsharma99 force-pushed the feature/config-driven-pipeline branch from b4bdfab to d58dae1 on February 6, 2026 05:45
@JaySandesara JaySandesara self-requested a review February 6, 2026 06:38
Collaborator

@JaySandesara JaySandesara left a comment


Preliminary changes suggested to fix bugs in .dag file

JOB LOAD FAIR_universe_Higgs_tautau/htcondor/job.sub
VARS LOAD CONFIG="config.pipeline.yaml" STEP="data_loader" CPUS="12" MEM="128GB" GPUS="0" DISK="64GB"
# Global Config Variable
CONFIG = config.pipeline.yaml
Collaborator


Defining a user variable doesn't seem to work in DAG files - we need to use the file name everywhere explicitly, or find some solution that works.

# --- Step 1: Data Loading ---

JOB data_loader FAIR_universe_Higgs_tautau/htcondor/job.sub
VARS data_loader STEP="data_loader" CONFIG="$(CONFIG)" CPUS="1" MEM="8GB" GPUS="0" DISK="32GB"
Collaborator


$(CONFIG) -> config.pipeline.yaml, since the variable substitution doesn't work. Same for all steps.
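
For example, applying that change to the data_loader step would give the following (a sketch only; the resource values stay whatever the step actually needs):

JOB data_loader FAIR_universe_Higgs_tautau/htcondor/job.sub
VARS data_loader STEP="data_loader" CONFIG="config.pipeline.yaml" CPUS="1" MEM="8GB" GPUS="0" DISK="32GB"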

Collaborator

@JaySandesara JaySandesara left a comment


Minor update

# --- Step 1: Data Loading ---

JOB data_loader FAIR_universe_Higgs_tautau/htcondor/job.sub
VARS data_loader STEP="data_loader" CONFIG="$(CONFIG)" CPUS="1" MEM="8GB" GPUS="0" DISK="32GB"
Collaborator


Suggested change
VARS data_loader STEP="data_loader" CONFIG="$(CONFIG)" CPUS="1" MEM="8GB" GPUS="0" DISK="32GB"
VARS data_loader STEP="data_loader" CONFIG="$(CONFIG)" CPUS="1" MEM="64GB" GPUS="0" DISK="64GB"

transfer_input_files = pyproject.toml, src, FAIR_universe_Higgs_tautau, README.md
transfer_output_files = FAIR_universe_Higgs_tautau/saved_datasets
transfer_output_remaps = "saved_datasets = ./FAIR_universe_Higgs_tautau/saved_datasets"
transfer_output_files = FAIR_universe_Higgs_tautau/saved_datasets, FAIR_universe_Higgs_tautau/output
Collaborator


Since FAIR_universe_Higgs_tautau/output is not saved in all steps, both this and FAIR_universe_Higgs_tautau/saved_datasets (and any other outputs) need to be passed as a list to the submit file. If not that, we need to have a separate .sub file for each step (maybe easiest). Could you create a unique job.sub file for each step, with names like job_data_loader.sub, etc.?
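
A sketch of the list-passing alternative, assuming we keep a single job.sub: the DAG would hand the step-specific output list to the submit file through an extra macro (the OUTPUTS name here is only illustrative), and the submit file would expand it:

# workflow.dag
VARS data_loader STEP="data_loader" CONFIG="config.pipeline.yaml" OUTPUTS="FAIR_universe_Higgs_tautau/saved_datasets, FAIR_universe_Higgs_tautau/output" CPUS="1" MEM="64GB" GPUS="0" DISK="64GB"

# job.sub
transfer_output_files = $(OUTPUTS)

The per-step .sub files suggested here are probably still the simpler option.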

Collaborator


The workflow.dag file can then have entries without the STEP input. E.g.

JOB data_loader FAIR_universe_Higgs_tautau/htcondor/job_data_loader.sub
VARS data_loader CONFIG="config.pipeline.yaml" CPUS="1" MEM="64GB" GPUS="0" DISK="64GB"

and in job_data_loader.sub, for example, we can pass the STEP argument explicitly, since it now depends only on which submit file is used:

executable = FAIR_universe_Higgs_tautau/htcondor/run_step.sh
arguments  = data_loader $(CONFIG)
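
Putting those two pieces together, a per-step submit file could look something like the sketch below. Only the executable, arguments, and transfer lines are taken from the existing files; the request_* lines are the natural place to consume the CPUS/MEM/GPUS/DISK macros from the VARS entry, and the remaining attributes (universe, file-transfer settings, log file names) are standard boilerplate assumptions for illustration:

# FAIR_universe_Higgs_tautau/htcondor/job_data_loader.sub (hypothetical sketch)
universe                = vanilla
executable              = FAIR_universe_Higgs_tautau/htcondor/run_step.sh
arguments               = data_loader $(CONFIG)
request_cpus            = $(CPUS)
request_memory          = $(MEM)
request_gpus            = $(GPUS)
request_disk            = $(DISK)
should_transfer_files   = YES
when_to_transfer_output = ON_EXIT
transfer_input_files    = pyproject.toml, src, FAIR_universe_Higgs_tautau, README.md
transfer_output_files   = FAIR_universe_Higgs_tautau/saved_datasets
output                  = data_loader.out
error                   = data_loader.err
log                     = data_loader.log
queue

Each step's .sub then hard-codes its own step name and its own transfer_output_files, so the shared STEP variable and the combined output list go away.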

+ input_features_1Jets \
+ input_features_2Jets \
+ input_features_nJets
cfg_full = load_config(args.config)
Collaborator


Suggested change
cfg_full = load_config(args.config)
cfg_full = load_config(args.config)["data_preprocessing"]

and do similar adjustments in all the other files as well. No need to load the full config - that is more error prone.

if "data_preprocessing" not in cfg_full:
raise KeyError("Config file missing 'data_preprocessing' section.")

feats = cfg_full["data_preprocessing"]["features"]
Collaborator


Suggested change
feats = cfg_full["data_preprocessing"]["features"]
feats = cfg_full["features"]

With the above change, this is how we use the config. Could you also rename cfg_full to something like config_workflow?
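
Taken together, the two suggestions would look roughly like this sketch in the preprocessing script (load_config and the argparse args come from the existing code; config_workflow is the proposed new name):

# load only the section this step needs, rather than the full config
config_workflow = load_config(args.config)["data_preprocessing"]

# feature lists are then read directly from that section
feats = config_workflow["features"]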

# Section for Data Preprocessing
data_preprocessing:

config_path: "./config.yaml"
Collaborator


Suggested change
config_path: "./config.yaml"
config_path: "./config.yml"

Collaborator


And everywhere else too - change config.yaml to config.yml
