
Input/Output Sandbox/Data Management #39

Merged
aldbr merged 41 commits into DIRACGrid:main from Loxeris:Job-endpoint-output
Dec 12, 2025

Conversation

@Loxeris
Member

@Loxeris Loxeris commented Oct 8, 2025

  • Add sandbox outputs to execution hooks, with a default implementation.
  • Data catalogs are still defined in execution hooks, and implemented in plugins.
  • LFN outputs can be overridden in the hints.
  • By default, we get the CWL outputs from the cwltool output.
  • If the output has a valid output path in the DataCatalog interface, or if it's defined in the hints, we output it as an LFN; otherwise it goes into a sandbox. This can easily be redefined in a plugin.

See #25

Contributor

@aldbr aldbr left a comment

Thanks @Loxeris!
As you can see in my comments, I don't fully understand the concepts yet.
Could you add more documentation please? (I know there is none at the moment... 😅)
And could you explain why the approach is so different from what we have in the JDL please?

@Loxeris Loxeris force-pushed the Job-endpoint-output branch 2 times, most recently from 6b8dfc5 to 0917cbd Compare October 30, 2025 15:36
@Loxeris Loxeris changed the title Output Sandbox/Data Management Input/Output Sandbox/Data Management Oct 30, 2025
@mexanick
Contributor

I'm not sure I understand the approach here. I was under the impression that we'd agreed to reuse DIRAC components as much as possible; however, I see a re-implementation of them under the new DataManagement package... Besides, I thought we agreed on @arrabito's proposal from #37 (comment):

Then I would remove from ExecutionHooksBasePlugin:
get_input_query
get_output_query
store_output
while keeping:
pre_process
post_process
data_catalog

with one caveat: instead of data catalog, it is a DataManager instance.

@Loxeris
Member Author

Loxeris commented Nov 13, 2025

I thought we agreed on @arrabito's proposal from #37 (comment):

Then I would remove from ExecutionHooksBasePlugin:
get_input_query
get_output_query
store_output
while keeping:
pre_process
post_process
data_catalog

I kept get_input_query and get_output_query when rebasing so as not to break things, but I think they should be removed.
For store_output, I'm not 100% sure whether it's better to move the code to post_process or whether it could be reused by plugins.

@arrabito
Contributor

We discussed with @Loxeris removing get_input_query from ExecutionHooksBasePlugin.

First of all, the name is misleading: it does not get a query but a list of directories where the input files are located, so if we want to keep it, I propose renaming it to get_input_paths.

Then, as a reminder, this method is used to mimic the remote execution of data-processing transformations, i.e. transformations with input files.

It builds a list of input directories where the input files of a transformation are expected to be. The submission command then creates the jobs to process these inputs as soon as they appear in the input data directories, and submits the jobs locally.

Note also that in the current implementation of the get_input_query method, transformation inputs are obtained through a hard-coded rule specific to a given workflow/use case, which is not very convenient.

Now, it's good to mimic the transformation execution locally, but I'm not sure we need to keep such complexity, i.e. dynamic job creation as soon as input files appear.

As a reminder, the DIRAC TS supports 2 ways to specify transformation inputs:

  1. through a static list of LFNs
  2. through metadata queries (which implies dynamic job creation)

So I propose that, if we want to keep transformation local execution, we could just support either transformations without input files or data-processing transformations with a static list of input files (assumed to already be present on the local host).

We could then implement 2 submission client classes for transformations, as it’s done for job submission, i.e. one for Local Submission and one for DIRAC submission (I’m going to open a separate issue about the implementation of the DIRAC submission client for transformations).

For the local submission, any input files could be specified in a dedicated field of the TransformationExecutionHooksHint (to be added alongside group_size), i.e. something like:

input-data:
  - class: File
    path: </path/to/local_file_1>
  - class: File
    path: </path/to/local_file_2>

that would be retrieved from the input metadata YAML file passed to the CLI:

dirac-cwl transformation submit <workflow_path> [--metadata-path <metadata_path>]

and get_input_query method could be removed.

Any opinion?
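A minimal sketch of how the proposed input-data field could be consumed (the field name and shape are the ones proposed above; the function itself is hypothetical):

```python
from pathlib import Path
from typing import Any

def extract_input_data(metadata: dict[str, Any]) -> list[Path]:
    """Pull the static input file list out of an already-parsed metadata
    document (e.g. the YAML passed via --metadata-path). Hypothetical sketch."""
    paths = []
    for entry in metadata.get("input-data", []):
        if entry.get("class") != "File":
            raise ValueError(f"Unsupported input-data class: {entry.get('class')}")
        # Local execution assumes the files already exist on this host.
        paths.append(Path(entry["path"]))
    return paths
```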

@arrabito
Contributor

About #39 (comment): as it concerns a revision of the transformation submission, we could probably leave the get_input_query method as it is in this PR and address this issue in a separate PR.

@aldbr
Contributor

aldbr commented Nov 24, 2025

Note also that in the current implementation of the get_input_query method, transformation inputs are obtained through a hard-coded rule specific to a given workflow/use case, which is not very convenient.

I am not sure I understand why it's not convenient; do you have an example?

Now, it's good to mimic the transformation execution locally, but I'm not sure we need to keep such complexity, i.e. dynamic job creation as soon as input files appear.

It's mostly used when we launch a production with multiple transformations.
I think it's good to show that it's feasible, but I agree that it's not ideal because the current prototype relies on threads, whereas we will depend on diracx-tasks in diracx.
But why would we get rid of it for local tests?

So I propose that, if we want to keep transformation local execution, we could just support either transformations without input files or data-processing transformations with a static list of input files (assumed to already be present on the local host).

So how do we deal with productions? Does it mean it's not so useful for you?

(I’m going to open a separate issue about the implementation of the DIRAC submission client for transformations).

Ok great!
Do we want to submit CWL transformations to DIRAC? Is that something expected on your side?

@arrabito
Contributor

Note also that in the current implementation of the get_input_query method, transformation inputs are obtained through a hard-coded rule specific to a given workflow/use case, which is not very convenient.

I am not sure I understand why it's not convenient; do you have an example?

For me it's not practical because you have to implement a specific plugin for each different workflow.
But more importantly, when we submit transformations to diracx, I guess that input queries will be handled by a dedicated task (as is done now in legacy DIRAC by the InputDataQueryAgent), so implementing the logic of getting input files in a hook plugin does not seem correct to me. When we submit productions/transformations to diracx, we will not specify the get-input-files logic in the hook plugin, or am I missing something?

Now, it's good to mimic the transformation execution locally, but I'm not sure we need to keep such complexity, i.e. dynamic job creation as soon as input files appear.

It's mostly used when we launch a production with multiple transformations. I think it's good to show that it's feasible, but I agree that it's not ideal because the current prototype relies on threads, whereas we will depend on diracx-tasks in diracx. But why would we get rid of it for local tests?

I see; it's true that for production submission it makes sense to have such a feature. However, I don't really like that the get-input-files logic is implemented in a hook plugin because, as I said before, in diracx it will be handled by a diracx task. Maybe we should think of an alternative that keeps this feature but outside the hook plugin logic.
Any idea?

So I propose that, if we want to keep transformation local execution, we could just support either transformations without input files or data-processing transformations with a static list of input files (assumed to already be present on the local host).

So how do we deal with productions? Does it mean it's not so useful for you?

You mean dealing with productions for local execution?
I think it's useful; I didn't realize that it was needed for productions, but as I said before, I think we should find an alternative implementation.

(I’m going to open a separate issue about the implementation of the DIRAC submission client for transformations).

Ok great! Do we want to submit CWL transformations to DIRAC? Is that something expected on your side?

I wrote DIRAC to mean diracx.

@mexanick
Contributor

We discussed with @Loxeris removing get_input_query from ExecutionHooksBasePlugin.

This method indeed should be removed. This kind of functionality is implemented by DIRAC's DataManager, which we mock here and which is a class member of the ExecutionHooksBasePlugin (currently called data_catalog, but this should be updated following our discussions, and this variable should point to an instance of DataManager). Right now the get_input_query and get_output_query methods are kept for consistency with previous prototyping, but we need to identify the core interface to the DataManager and keep the main customization in the pre/post-process methods.

@aldbr
Contributor

aldbr commented Nov 25, 2025

I responded into #61 (comment)

@Loxeris Loxeris marked this pull request as ready for review December 9, 2025 10:49
"""Auto-derive hook plugin identifier from class name."""
return cls.__name__

def download_lfns(
Contributor

This implementation of download_lfns violates the Single Responsibility Principle (SRP) for the execution hook plugin.

The hook's responsibility is solely job preparation metadata, executed client-side. Data management functionality, like downloading LFNs, belongs within the DataManager or FileCatalog architecture.

Implementing local downloads here creates unnecessary overhead, as data access within the job environment on the CE node is handled differently.

Please remove this logic from the hook plugin.

Member Author

This method simply calls the DataManager method for each input file.
I guess the method itself is not really needed and the code could simply be moved to pre_process.

Contributor

I don't think that having this code anywhere in the execution hook is a good idea, as it would download files locally to the submitter PC and not to the computing element where the job will run. In pre_process you may, e.g., create a list of LFNs by querying the FC (via the data manager) using the hook's parameters as query constraints, but I don't think downloading the files should happen here. @aldbr @arrabito please comment

Member Author

The execution hook and the pre-process code aren't supposed to run on the submitter PC outside of local testing, as far as I understand. They are meant to be used by the JobWrapper running on the computing element.

Contributor

but we also have hooks for transformations, which could e.g. configure how many inputs per job the transformation should have. In this case, they would run at the launch site, not on the CE, wouldn't they?

Member Author

I'm not 100% sure this should be considered an Execution Hook. For me, the hints related to transformations shouldn't need plugins or different pre/post-process steps.
Maybe the values of these hints can also be used during the pre-process and post-process of specific plugins? We might have to discuss this during the next meeting.

Contributor

I don't think that having this code anywhere in the execution hook is a good idea, as it would download files locally to the submitter PC and not to the computing element where the job will run. In pre_process you may, e.g., create a list of LFNs by querying the FC (via the data manager) using the hook's parameters as query constraints, but I don't think downloading the files should happen here. @aldbr @arrabito please comment

I think there is some misunderstanding indeed. I think we all agree that data management functionality should be delegated to a Data Manager, which is a class member of the ExecutionHooksBasePlugin. But the actual download of input files still happens in the pre-processing step, doesn't it? Just the implementation of the download should not be part of the pre_process method; it is still in this method that we would call the Data Manager's getFile to download files. Is that correct?

Now, for execution on the worker node, the Data Manager would be the DIRAC Data Manager, while for local execution it would be a dummy DataManager that deals with local files.

Is that correct?

Then, about creating a list of LFNs by querying the FC (via the data manager) using the hook's parameters as query constraints: this is specific to transformations. The implementation is again part of the Data Manager, but it happens neither on the submitter PC nor on the worker node; it happens in a TS Agent on the DIRAC server.
So for me it's not part of the pre-processing.
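The split described here could be sketched with two interchangeable Data Manager implementations; the class name and the `get_file` signature below are illustrative, not DIRAC's actual API:

```python
import shutil
from pathlib import Path

class DummyDataManager:
    """Local stand-in for the DIRAC Data Manager: LFNs resolve to files
    under a local 'file catalog' root. Purely illustrative."""

    def __init__(self, root: Path):
        self.root = root

    def get_file(self, lfn: str, dest: Path) -> Path:
        # "Download" an LFN by copying it from the local catalog root.
        src = self.root / lfn.lstrip("/")
        dest.mkdir(parents=True, exist_ok=True)
        target = dest / src.name
        shutil.copy(src, target)
        return target
```

A hook's pre_process would then call `self.data_manager.get_file(...)` without caring whether the real DIRAC implementation or this dummy one is behind it.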

Contributor

But the actual download of input files still happens in the pre-processing step, doesn't it? Just the implementation of the download should not be part of the pre_process method; it is still in this method that we would call the Data Manager's getFile to download files. Is that correct?

Yes indeed, this download_lfns method is meant to be executed within JobWrapper.pre_process(), before we execute the pre-execution commands.
I have a few comments though:

Now, for execution on the worker node, the Data Manager would be the DIRAC Data Manager, while for local execution it would be a dummy DataManager that deals with local files.
Is that correct?

From what I understand yes.

Then, about creating a list of LFNs by querying the FC (via the data manager) using the hook's parameters as query constraints: this is specific to transformations. The implementation is again part of the Data Manager, but it happens neither on the submitter PC nor on the worker node; it happens in a TS Agent on the DIRAC server.
So for me it's not part of the pre-processing.

Yes indeed

Member Author

I think this method is generic, and should go within the JobWrapper

True, I didn't think about that

Just wondering: do you implement your own method for the sake of simplicity? Because in the DIRAC JobWrapper we are using the DownloadInputData module (just in case you were not aware)

I wasn't aware, I still don't quite understand the link between InputDataResolution and DownloadInputData if I'm completely honest

Contributor

Yes, this is not easy to understand.
In vanilla DIRAC we support 2 ways of importing input data in the JobWrapper:

  • DownloadInputData: the JobWrapper downloads data from LFNs
  • InputDataProtocol: the JobWrapper streams the inputs (in LHCb, we create a PoolXMLCatalog containing, for each LFN, a list of PFNs that can be used, even in parallel, to get the data of interest)

Note: if you look carefully, you will see that these 2 classes have the same structure (same public methods and signatures).

The JobWrapper either gets the method from the job arguments (JDL), or from the InputDataResolution module, which also looks into the job arguments or the DIRAC configuration.
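The "same structure" observation can be expressed as a small shared protocol; the class and method names below are illustrative, not DIRAC's exact signatures:

```python
from typing import Protocol

class InputDataPolicy(Protocol):
    """Common shape shared by the two input-data strategies (illustrative)."""
    def execute(self, lfns: list[str]) -> dict[str, str]: ...

class DownloadPolicy:
    """Fetch each LFN into the local working directory."""
    def execute(self, lfns: list[str]) -> dict[str, str]:
        return {lfn: f"./{lfn.rsplit('/', 1)[-1]}" for lfn in lfns}

class StreamingPolicy:
    """Resolve each LFN to a remote access URL for streaming."""
    def execute(self, lfns: list[str]) -> dict[str, str]:
        return {lfn: f"root://se.example{lfn}" for lfn in lfns}
```

Because both classes satisfy the same protocol, the JobWrapper can pick either one at runtime from the job arguments or the configuration.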


Parameters
----------
output_name : str
Contributor

I do not understand the parameters. It seems that multiple files can result in a single LFN?

Member Author

this is the output name in the CWL file, referring to one or multiple files.
I think the LFN should be the folder(?) where the data is stored in the FileCatalog.

Contributor

I think it was left before just to reduce the size of the refactoring, but since you're now refactoring this part, we should update the signature (I believe it should just be removed here, as it looks like it is superseded by src_path).

Member Author

output_name is the name of the output in the CWL file, and src_path is the file or list of files associated with this output.
I use output_name in the output_paths hint to map an output to an LFN directory, and in the sandbox_output hint to know whether the files should be stored in the output sandbox.

"description": "Registry key for the metadata implementation class",
"title": "Hook Plugin",
"type": "string"
},
Contributor

this is a generated file IIRC; it should not be modified by hand but regenerated in the CI.

Member Author

Oh okay, I regenerated it using pixi run schemas, but I didn't know it was done in the CI.

Contributor

I think we should remove it from the repo, and get it only generated automatically

Member Author

I get that it will be regenerated by the CI automatically, but why should it be removed from the repo?
I think it's technically used in some workflows (i.e. test/workflows/test_meta/test_meta.cwl), even though I don't think it's mandatory.

Contributor

the schemas should eventually be published on a static website and be part of the deployment CI/CD. If needed locally, they should be generated as a post-install action, but they should not be stored in version control: this is a derived product, and only one source of truth should be provided. Leaving them committed separately may lead to divergence between the schema and the API implementation.

Member Author

This seems reasonable to me but maybe this should have its own issue and PR.

Contributor

Some tests are based on them though.

What if:

  • one modifies the ExecutionHooksHint model
  • pushes to the repo
  • CI detects that the new JSON schema is not in sync with this one and forces the user to run the pixi commands and push again, until the new JSON schema matches this one.

That would keep a good dev experience, because if you don't keep this local copy:

  • one modifies the ExecutionHooksHint
  • pushes to the repo
  • CI tests fail because the pydantic model changed
  • one has to understand what's wrong, then generate the new JSON schema and run the tests with the reference changed to point to the local JSON schema, which will not be uploaded to GitHub
  • one pushes again; tests still fail because the deployed JSON schema is the current one, not the new one

Any opinion?
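The sync check in the first scenario could be sketched as follows (`generate_cmd` stands for whatever prints the freshly generated schema to stdout, e.g. a `pixi run schemas` equivalent; the exact command is an assumption, not part of the current repo):

```python
import json
import subprocess
from pathlib import Path

def schema_in_sync(committed: Path, generate_cmd: list[str]) -> bool:
    """Regenerate the JSON schema and compare it, structurally, with the
    committed copy. Returns True when the two match."""
    result = subprocess.run(generate_cmd, capture_output=True, text=True, check=True)
    generated = json.loads(result.stdout)
    return generated == json.loads(committed.read_text())
```

CI would fail the build when this returns False, asking the developer to rerun the generation command and push again.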

return files_path


def get_lfns(input_data: dict[str, Any]) -> dict[str, Path | list[Path]]:
Contributor

if we keep get_input_query, and in general stage data during pre-processing, what's the purpose of this method? Isn't it a duplication of get_input_query?

Member Author

This method parses the inputs of the CWL file to differentiate LFNs from local files, on the client side, before submission.
This is necessary so that only LFNs end up in the lfns_input field of the JobInputModel.
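A minimal version of that client-side separation (assuming LFNs are marked with an `lfn:` prefix; the real convention in dirac-cwl may differ):

```python
from pathlib import Path

def split_inputs(files: list[str]) -> tuple[list[str], list[Path]]:
    """Split CWL file inputs into LFNs and local files.
    The 'lfn:' marker is an assumption for this sketch."""
    lfns: list[str] = []
    local: list[Path] = []
    for f in files:
        if f.startswith("lfn:"):
            lfns.append(f.removeprefix("lfn:"))
        else:
            local.append(Path(f))
    return lfns, local
```

Only the first list would then populate lfns_input; local files would go into the input sandbox instead.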

Member Author

While get_input_query builds the FileCatalog paths/LFNs of the inputs needed for a transformation.

Contributor

Sorry, I don't remember: why exactly do we need to separate lfns_input? (At some point we may indeed need them separated, to schedule the jobs where the input data is; I'm just wondering whether we should separate them on the client side.)

Member Author

It's not implemented yet, but what I understood is that the conversion to JDL may need the list of LFNs / InputData for scheduling purposes.

Contributor

@aldbr aldbr left a comment

Thanks @Loxeris for the hard work! (this is not an easy task!)

logger = logging.getLogger(__name__)


class MockSandboxStoreClient(SandboxStoreClient):
Contributor

Have you tried to "mock" the diracx sandbox store client rather than the DIRAC one? I think it would be preferable, since it exists and the new JobWrapper is meant to be integrated into diracx eventually.

https://github.com/DIRACGrid/diracx/blob/47c2b296d0b7b98504c8f142cbe5df67a321df83/diracx-api/src/diracx/api/jobs.py#L51-L125

Member Author

I didn't try, it does seem preferable indeed.

break
if res and not res["OK"]:
raise RuntimeError(
f"Could not save file {src} with LFN {str(lfn)} : {res['Message']}"
Contributor

Do we want to specify store_output() in the QueryBasedPlugin?
I haven't checked carefully (so please let me know if I'm wrong), but it seems to be a copy paste of what is defined in ExecutionHooksHint.

If it's a duplicate, I would remove it from here.

Member Author

It's nearly a copy-paste, with the exception that CWL outputs not present in output_paths are stored on the FileCatalog, at the path returned by get_output_query() if any.
If (or when) get_output_query() is deleted, it would be the same as ExecutionHooksHint and would be removed.

"""Auto-derive hook plugin identifier from class name."""
return cls.__name__

def download_lfns(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the actual downlaod of input files still happens in the pre-processing step, isn't it? Just the implementation of the download should not be part of the pre_process method but it's still in this method that we would call the Data Manager 'getFile' to download files. Is that correct?

Yes indeed, this download_lfns method aims at being executed within the JobWrapper.pre_process(), before we execute the pre execution commands.
I have a few comments though:

Now, for the execution on the worker node, the Data Manager would be the DIRAC Data Manager, while for local execution the Data Manager would be a Dummy DataManager which deals with local files.
Is that correct?

From what I understand yes.

Then about creating a list of LFNs by querying the FC (via data manager) using hook's parameters for query constraints. This is specific to transformations. The implementation it's again part of the Data Manager but it doesn't happen neither on the PC submitter, neither on the worker node, but in a TS Agent on DIRAC server.
So for me it's not part of the pre-processing.

Yes indeed

new_paths[input_name] = [paths[lfn] for lfn in paths]
return new_paths

def update_inputs(self, inputs: Any, updates: dict[str, Path | list[Path]]):
Contributor

As my previous comment, this should probably go in JobWrapper itself.

sandbox_path = Path("sandboxstore") / f"{sandbox}.tar.gz"
with tarfile.open(sandbox_path, "r:gz") as tar:
tar.extractall(job_path, filter="data")
self.execution_hooks_plugin._sandbox_store_client.downloadSandbox(
Contributor

Please try to use diracx-api methods (or at least the structure) if you can

Contributor

Out of curiosity, don't you check your environment variable to use either the fake sandbox store methods or the real ones (from diracx)?

Member Author

I will try to use the diracx methods.

I didn't need to check the environment variable, because the version is chosen during the sandbox store instantiation, before this code runs.

Contributor

Ah yes, I see. Now I think the upload of the outputs to the sandbox should also happen in the JobWrapper, because it's very generic.
So I think it makes sense to move the sandbox_store_client there. What do you think?

Member Author

I guess it's pretty generic. I don't know whether the communities need to change the content of the sandbox; if not, then moving it to the JobWrapper would be great.
Switching to the diracx implementation should remove the need for a sandbox_store_client instance entirely, as the methods are part of a module rather than a class, I think. (I'll have to find a way to mock them though.)

file.path = file.path.split("/")[-1]
if not self.execution_hooks_plugin:
raise RuntimeError("Could not download input data")
self.execution_hooks_plugin.download_lfns(arguments, job_path)
Contributor

As I said, I think for now the content of execution_hooks_plugin.download_lfns should go here; unless I'm mistaken, it's something all communities share, and I don't see why we would override it (at least for now).

for input_name, group_size in transformation_execution_hooks.group_size.items():
# Get input query
logger.info(f"\t- Getting input query for {input_name}...")
assert isinstance(transformation_metadata, QueryBasedPlugin)
Contributor

Not sure I understand why you need that. Can you explain?
It's not something we want to keep in the code, unless I'm misunderstanding something?

Contributor

Ah yes, I think I get it: it's because you moved get_input_query to QueryBasedPlugin, right?

Contributor

I would prefer to have get_input_query in ExecutionHooksHintPlugin rather than in QueryBasedPlugin to avoid that (we are experimenting with LHCb workflows on our side, and this would prevent us from executing a transformation).

Member Author

Okay, but I don't think there will be any difference between ExecutionHooksBasePlugin and QueryBasedPlugin if I move the query methods back.

Contributor

Please correct me if I'm wrong, but we would not have this isinstance line here if we move it back to ExecutionHooksBasePlugin, would we?

Member Author

You're right, it wouldn't be there.
I'm just questioning the purpose of QueryBasedPlugin if there's no difference from the base class.

Contributor

Ah yes, I see.
Well, I guess ExecutionHooksHintPlugin is expected to be abstract, unusable per se (though it is not abstract in practice).
And QueryBasedPlugin is just one concrete, simple example with no pre/post execution commands.
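That split (an abstract base, plus one trivial concrete plugin) could look like this; the class names follow the discussion, but the method set and bodies are illustrative:

```python
from abc import ABC, abstractmethod

class ExecutionHooksBasePlugin(ABC):
    """Sketch: the base class stays abstract and cannot be instantiated."""

    @abstractmethod
    def pre_process(self, job: dict) -> dict: ...

    @abstractmethod
    def post_process(self, job: dict) -> dict: ...

class QueryBasedPlugin(ExecutionHooksBasePlugin):
    """One concrete example with no extra pre/post execution commands."""

    def pre_process(self, job: dict) -> dict:
        return job  # nothing beyond the defaults

    def post_process(self, job: dict) -> dict:
        return job
```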

"description": "Registry key for the metadata implementation class",
"title": "Hook Plugin",
"type": "string"
},
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some tests are based on them though.

What if:

  • one is modifying the ExecutionHooksHint model
  • pushing to the repo
  • CI detects that the new JSON schema is not in sync with this one and force the user to run the pixi commands and push again, until the new JSON schema matches with this one.

That would allow to keep a good dev experience, because if you don't keep this local copy:

  • one modifies the ExecutionHooksHint
  • push to the repo
  • CI tests are failing because the pydantic model changed
  • one has to understand what's wrong, then generate the new JSON schema, make tests but changing the reference to point to the local JSON schema which will not be uploaded to github.
  • one pushes again, tests are still failing because the JSON schema deployed is the current one, not the new one

Any opinion?

],
)
def test_run_job_with_input_data(
cli_runner, cleanup, pi_test_files, cwl_file, inputs, destination_source_input_data
Contributor

Not 100% sure I understand the difference with test_job_run_success

Member Author

The only difference is the preparation of the filecatalog directory

Contributor

Why don't we need to prepare the filecatalog directory in the other tests?

Member Author

It's also done in the transformation tests, IIRC.
The usual test_run_job_success workflows use local files as inputs, so they are uploaded in a sandbox instead. There are no inputs expected to be on the FileCatalog for those workflows.

Contributor

Oh I see, thanks for the clarification! So maybe we could add with_input_sandbox to the name of the other test then?

@aldbr aldbr merged commit a1900b5 into DIRACGrid:main Dec 12, 2025
1 check passed
@aldbr aldbr added the points:13 TOO BIG - must split! label Dec 12, 2025
@aldbr aldbr added this to the Sprint4 milestone Dec 12, 2025
@Loxeris Loxeris deleted the Job-endpoint-output branch March 16, 2026 09:48

Labels

points:13 TOO BIG - must split!

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants