Input/Output Sandbox/Data Management#39
aldbr left a comment
Thanks @Loxeris!
As you can see in my comments, I am not fully understanding the concepts.
Could you add more documentation please? (I know there is none at the moment... 😅)
And could you explain why the approach is so different from what we have in the JDL please?
I'm not sure I understand the approach here. I was under the impression that we had agreed to reuse DIRAC components as much as possible; however, I see re-implementations of them under the new DataManagement package. Besides, I thought we agreed on @arrabito's proposal from #37 (comment), with one caveat: instead of a data catalog, it is a DataManager instance.
I kept
We discussed with @Loxeris about removing it. First of all, the name is misleading: it does not get a query but a list of directories where input files are located, so if we want to keep it I propose to rename it.

As a reminder, this method is used to mimic the remote execution of data-processing transformations, i.e. transformations with input files. It builds a list of input directories where the input files of a transformation are expected to be. The submission command then creates the different jobs to process these inputs as soon as they appear in those input data directories, and it submits the jobs locally. Note also that in the current implementation of the

Now, it's good to mimic the transformation execution locally, but I'm not sure we need to keep such complexity, i.e. dynamic job creation as soon as input files appear. As a reminder, the DIRAC TS supports 2 ways to specify transformation inputs:

So I propose that if we want to keep transformation local execution, we could just support either transformations without input files or data-processing transformations with a static list of input files (assumed to be already present on the local host). We could then implement 2 submission client classes for transformations, as is done for job submission, i.e. one for local submission and one for DIRAC submission (I'm going to open a separate issue about the implementation of the DIRAC submission client for transformations). For the local submission, the input files could be specified in a dedicated field that would be retrieved from the input metadata YAML file passed to the CLI:
Any opinion?
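The two-submission-client idea proposed above could be sketched roughly like this. All class names and the static-input-list constructor argument are assumptions for illustration, not the repo's actual API:

```python
# Sketch: a shared transformation-submission interface with a local
# implementation (static list of pre-existing input files) and a DIRAC
# implementation left as a stub. Names are hypothetical.
from abc import ABC, abstractmethod
from pathlib import Path


class TransformationSubmissionClient(ABC):
    @abstractmethod
    def submit(self, transformation: dict) -> None: ...


class LocalTransformationSubmission(TransformationSubmissionClient):
    """Runs jobs locally against a static list of input files."""

    def __init__(self, input_files: list[Path]):
        self.input_files = input_files

    def submit(self, transformation: dict) -> None:
        # The inputs are supposed to be already present on the local host
        missing = [p for p in self.input_files if not p.exists()]
        if missing:
            raise FileNotFoundError(f"Inputs not present locally: {missing}")
        # ... create one job per input group and execute it locally ...


class DiracTransformationSubmission(TransformationSubmissionClient):
    """Placeholder: real submission would go through DIRAC/diracx services."""

    def submit(self, transformation: dict) -> None:
        raise NotImplementedError
```

This mirrors the job-submission layout described in the comment: one client per execution target, sharing a single interface.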
About:
I'm not sure I understand why it's not convenient; do you have an example?
It's mostly used when we launch a production with multiple transformations.
So how do we deal with productions? Does that mean it's not so useful for you?
Ok great!
For me it's not practical because you have to implement a specific plugin for each different workflow.
I see, it's true that for production submission it makes sense to have such a feature. However, I don't really like that the get-input-files logic is implemented in a hook plugin because, as I said before, in diracx it will be handled by a diracx task. Maybe we should think of an alternative that keeps this feature but outside the hook plugin logic.
You mean dealing with productions for local execution?
I wrote DIRAC to mean diracx.
This method indeed should be removed. This kind of functionality is implemented by DIRAC's DataManager, which we mock here and which is a class member of the
I responded in #61 (comment).
        """Auto-derive hook plugin identifier from class name."""
        return cls.__name__

    def download_lfns(
This implementation of download_lfns violates the Single Responsibility Principle (SRP) for the execution hook plugin.
The hook's responsibility is solely job preparation metadata, executed client-side. Data management functionality, like downloading LFNs, belongs within the DataManager or FileCatalog architecture.
Implementing local downloads here creates unnecessary overhead, as data access within the job environment on the CE node is handled differently.
Please remove this logic from the hook plugin.
This method simply calls the DataManager method for each input. I guess the method itself is not really needed and the code could simply be moved to pre_process.
I don't think that having this code anywhere in the execution hook is a good idea, as it will download files locally to the submitter PC and not to the computing element where the job will run. In the pre-process you may e.g. create a list of LFNs by querying the FC (via data manager) using hook's parameters for query constraints, but I don't think downloading the files should happen here. @aldbr @arrabito please comment
The execution hook and the pre-process code aren't supposed to run on the submitter PC outside of local testing, as far as I understand. They are meant to be used by the JobWrapper running on the computing element.
But we also have hooks for transformations, which could e.g. configure how many inputs per job the transformation should have. In that case, they would run at the launch site, not on the CE, wouldn't they?
I'm not 100% sure this should be considered an Execution Hook. For me, the hints related to transformations shouldn't need plugins and different pre/post-processing.
Maybe the value of these hints can also be used during the pre-process and post-process of specific plugins? We might have to discuss this at the next meeting.
I don't think that having this code anywhere in the execution hook is a good idea, as it will download files locally to the submitter PC and not to the computing element where the job will run. In the pre-process you may e.g. create a list of LFNs by querying the FC (via data manager) using hook's parameters for query constraints, but I don't think downloading the files should happen here. @aldbr @arrabito please comment
I think there is some misunderstanding indeed. I think we all agree that data management functionality should be delegated to a Data Manager, which is a class member of the ExecutionHooksBasePlugin. But the actual download of input files still happens in the pre-processing step, doesn't it? Just the implementation of the download should not be part of the pre_process method; it is still in this method that we would call the Data Manager's getFile to download files. Is that correct?
Now, for the execution on the worker node, the Data Manager would be the DIRAC Data Manager, while for local execution it would be a dummy DataManager that deals with local files.
Is that correct?
Then, about creating a list of LFNs by querying the FC (via the data manager) using the hook's parameters for query constraints: this is specific to transformations. The implementation is again part of the Data Manager, but it happens neither on the submitter PC nor on the worker node; it happens in a TS Agent on the DIRAC server.
So for me it's not part of the pre-processing.
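The division of responsibilities being agreed on here could be sketched as follows. All names are assumed for illustration (the real classes live in DIRAC/diracx): pre-processing keeps orchestration only, and the transfer itself sits behind a DataManager interface, with a dummy implementation for local execution:

```python
# Sketch: pre_process delegates the download to a DataManager; locally,
# a dummy DataManager resolves LFNs under a local "filecatalog" root.
from abc import ABC, abstractmethod
from pathlib import Path


class DataManagerBase(ABC):
    @abstractmethod
    def get_file(self, lfn: str, dest_dir: Path) -> Path: ...


class DummyDataManager(DataManagerBase):
    """Local execution: LFNs are plain paths under a local catalog root."""

    def __init__(self, catalog_root: Path):
        self.catalog_root = catalog_root

    def get_file(self, lfn: str, dest_dir: Path) -> Path:
        src = self.catalog_root / lfn.lstrip("/")
        dest = dest_dir / src.name
        dest.write_bytes(src.read_bytes())  # "download" is a local copy
        return dest


def pre_process(data_manager: DataManagerBase, lfns: list[str], job_dir: Path) -> list[Path]:
    """Run in the JobWrapper before the pre-execution commands."""
    return [data_manager.get_file(lfn, job_dir) for lfn in lfns]
```

On the worker node, the same `pre_process` would receive a DIRAC-backed DataManager instead; only the injected implementation changes.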
But the actual download of input files still happens in the pre-processing step, doesn't it? Just the implementation of the download should not be part of the pre_process method; it is still in this method that we would call the Data Manager's getFile to download files. Is that correct?
Yes indeed, this download_lfns method is meant to be executed within JobWrapper.pre_process(), before we execute the pre-execution commands.
I have a few comments though:
- I think this method is generic and should go within the JobWrapper (do we want to let communities override that? I don't think so).
- Just wondering: do you implement your own method for the sake of simplicity? Because in the DIRAC JobWrapper we are using the DownloadInputData module (just in case you were not aware). At some point, we want to reuse that mechanism (LHCb does not download the inputs but streams them using another input data policy).
Now, for the execution on the worker node, the Data Manager would be the DIRAC Data Manager, while for local execution the Data Manager would be a Dummy DataManager which deals with local files.
Is that correct?
From what I understand yes.
Then about creating a list of LFNs by querying the FC (via data manager) using hook's parameters for query constraints. This is specific to transformations. The implementation it's again part of the Data Manager but it doesn't happen neither on the PC submitter, neither on the worker node, but in a TS Agent on DIRAC server.
So for me it's not part of the pre-processing.
Yes indeed
I think this method is generic, and should go within the JobWrapper
True, I didn't think about that
Just wondering: do you implement your own method for the sake of simplicity? Because in the DIRAC JobWrapper we are using the DownloadInputData module (just in case you were not aware)
I wasn't aware, I still don't quite understand the link between InputDataResolution and DownloadInputData if I'm completely honest
Yes, this is not easy to understand.
In vanilla DIRAC we support 2 ways of importing input data in the JobWrapper:
- DownloadInputData: the JobWrapper downloads data from LFNs
- InputDataProtocol: the JobWrapper streams the inputs (in LHCb, we create a PoolXMLCatalog containing, for each LFN, a list of PFNs that can be used, even in parallel, to get the data of interest)
Note: if you look carefully, you will see that these 2 classes have the same structure (same public methods and signatures).
The JobWrapper either gets the method from the job arguments (JDL), or from the InputDataResolution module, which also looks into the job arguments or the DIRAC configuration.
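The "same structure" point can be illustrated with a small sketch. Method names and return shapes here are illustrative only, not DIRAC's actual signatures: because both policies expose an identical public interface, the JobWrapper can pick one by name from the job arguments or configuration:

```python
# Sketch: two input-data policies with the same interface, selected by name.
class DownloadInputData:
    def execute(self, lfns: list[str]) -> dict:
        # Download each LFN into the local job directory
        return {"OK": True,
                "Successful": {lfn: f"./{lfn.rsplit('/', 1)[-1]}" for lfn in lfns}}


class InputDataProtocol:
    def execute(self, lfns: list[str]) -> dict:
        # Resolve each LFN to remote PFNs to be streamed; nothing is downloaded
        return {"OK": True,
                "Successful": {lfn: [f"root://se.example{lfn}"] for lfn in lfns}}


POLICIES = {"DownloadInputData": DownloadInputData,
            "InputDataProtocol": InputDataProtocol}


def resolve_inputs(policy_name: str, lfns: list[str]) -> dict:
    # The policy name would come from the JDL, the InputDataResolution
    # module, or the DIRAC configuration, as described above.
    return POLICIES[policy_name]().execute(lfns)
```

Swapping a download policy for a streaming one then requires no change in the caller, which is the point of keeping the two classes structurally identical.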
    Parameters
    ----------
    output_name : str
I do not understand the parameters. It seems that multiple files can result in a single LFN?
This is the output name in the CWL file, referring to one or multiple files.
I think the LFN should be the folder(?) where the data is stored in the FileCatalog.
I think it was left before just to reduce the size of the refactoring, but since you're now refactoring this part, we should update the signature (I believe it should simply be removed here, as it looks like it is superseded by src_path).
output_name is the name of the output in CWL, and src_path is the file or list of files associated with this output.
I have used output_name in the output_paths hint to map an output to an LFN directory, and in the sandbox_output hint to know whether the files should be stored in the output sandbox.
    "description": "Registry key for the metadata implementation class",
    "title": "Hook Plugin",
    "type": "string"
},
This is a generated file IIRC; it should not be modified but regenerated in the CI.
Oh okay, I regenerated it using pixi run schemas but I didn't know it was done in the CI.
I think we should remove it from the repo and have it generated automatically only.
I get that it will be regenerated by the CI automatically, but why should it be removed from the repo?
I think it's technically used in some workflows (i.e. test/workflows/test_meta/test_meta.cwl), even though I don't think it's mandatory.
The schemas should eventually be published on a static website and be part of the deployment CI/CD. If needed locally, they should be generated as a post-install action, but they should not be stored in version control, as they are a derived product and only one source of truth should be provided. Leaving them committed separately may lead to a divergence between the schema and the API implementation.
This seems reasonable to me, but maybe it should have its own issue and PR.
Some tests are based on them though.
What if:
- one modifies the ExecutionHooksHint model
- pushes to the repo
- CI detects that the new JSON schema is not in sync with this one and forces the user to run the pixi commands and push again, until the new JSON schema matches this one
That would allow keeping a good dev experience, because if you don't keep this local copy:
- one modifies the ExecutionHooksHint
- pushes to the repo
- CI tests fail because the pydantic model changed
- one has to understand what's wrong, then generate the new JSON schema and run the tests while changing the reference to point to the local JSON schema, which will not be uploaded to GitHub
- one pushes again; tests are still failing because the deployed JSON schema is the current one, not the new one
Any opinion?
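The CI sync check proposed above could be sketched like this. The schema contents, file path, and the stand-in for `pixi run schemas` are assumptions; in the repo, the current schema would come from the pydantic model's `model_json_schema()`:

```python
# Sketch: regenerate the JSON schema and fail the CI step if it differs
# from the committed copy.
import json
from pathlib import Path


def current_schema() -> dict:
    # Stand-in for `pixi run schemas`: in the repo this would be the
    # pydantic model's model_json_schema() output.
    return {
        "title": "ExecutionHooksHint",
        "type": "object",
        "properties": {"hook_plugin": {"title": "Hook Plugin", "type": "string"}},
    }


def schema_in_sync(committed_path: Path) -> bool:
    """True when the committed schema matches the freshly generated one."""
    try:
        committed = json.loads(committed_path.read_text())
    except FileNotFoundError:
        return False  # schema missing from the repo counts as out of sync
    return committed == current_schema()
```

A CI job would call `schema_in_sync` and exit non-zero on a mismatch, telling the developer to rerun the generation command and push again.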
src/dirac_cwl_proto/job/__init__.py
    return files_path

def get_lfns(input_data: dict[str, Any]) -> dict[str, Path | list[Path]]:
If we keep get_input_query, and in general stage data during pre-processing, what's the purpose of this method? Isn't it a duplication of get_input_query?
This method parses the inputs of the CWL file to differentiate LFNs from local files, on the client side, before submission.
This is necessary to have only LFNs in the lfns_input of the JobInputModel.
Meanwhile, get_input_query builds the paths/LFNs of the inputs on the FileCatalog needed for a transformation.
Sorry, I don't remember: why do we need to separate lfns_input exactly? (At some point we may need them separated, to schedule the jobs where the input data is; I am just wondering if we should separate them on the client side.)
It's not implemented yet, but what I understood is that the conversion to JDL may need the list of LFNs (InputData) for scheduling purposes.
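The client-side split described in this thread might look like the sketch below. The `lfn:` prefix convention and all names are assumptions; the real get_lfns works on CWL File objects:

```python
# Sketch: walk the CWL input values and separate LFNs (marked here with a
# hypothetical "lfn:" prefix) from local files, before submission.
from pathlib import Path
from typing import Any

LFN_PREFIX = "lfn:"


def split_lfns(input_data: dict[str, Any]) -> tuple[dict[str, list[str]], dict[str, list[Path]]]:
    """Return (lfn_inputs, local_inputs) keyed by CWL input name."""
    lfns: dict[str, list[str]] = {}
    local: dict[str, list[Path]] = {}
    for name, value in input_data.items():
        values = value if isinstance(value, list) else [value]
        for v in values:
            if not isinstance(v, (str, Path)):
                continue  # non-file inputs (numbers, booleans, ...): nothing to stage
            s = str(v)
            if s.startswith(LFN_PREFIX):
                lfns.setdefault(name, []).append(s[len(LFN_PREFIX):])
            else:
                local.setdefault(name, []).append(Path(s))
    return lfns, local
```

The LFN half would populate lfns_input in the JobInputModel (and later the JDL's InputData), while the local half goes into the input sandbox.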
logger = logging.getLogger(__name__)

class MockSandboxStoreClient(SandboxStoreClient):
Have you tried to "mock" the diracx sandbox store client rather than the DIRAC one? I think it would be preferable, since it exists and the new JobWrapper is meant to be integrated into diracx eventually.
I didn't try; it does seem preferable indeed.
        break
    if res and not res["OK"]:
        raise RuntimeError(
            f"Could not save file {src} with LFN {str(lfn)} : {res['Message']}"
Do we want to specify store_output() in the QueryBasedPlugin?
I haven't checked carefully (so please let me know if I'm wrong), but it seems to be a copy-paste of what is defined in ExecutionHooksHint.
If it's a duplicate, I would remove it from here.
It's nearly a copy-paste, with the exception that CWL outputs not present in output_paths are stored on the FileCatalog, with the path returned by get_output_query() if any.
If (or when) get_output_query() is deleted, it would be the same as ExecutionHooksHint and would be removed.
        new_paths[input_name] = [paths[lfn] for lfn in paths]
    return new_paths

def update_inputs(self, inputs: Any, updates: dict[str, Path | list[Path]]):
As with my previous comment, this should probably go in the JobWrapper itself.
sandbox_path = Path("sandboxstore") / f"{sandbox}.tar.gz"
with tarfile.open(sandbox_path, "r:gz") as tar:
    tar.extractall(job_path, filter="data")
self.execution_hooks_plugin._sandbox_store_client.downloadSandbox(
Please try to use diracx-api methods (or at least the structure) if you can
Out of curiosity, don't you check your environment variable to use either the fake sandbox store methods or the real ones (from diracx)?
I will try to use the diracx methods.
I didn't need to check the environment variable, because the version is chosen during the sandbox store instantiation, before this code runs.
Ah yes, I see. Now I think the upload of the outputs to the sandbox should also happen in the JobWrapper, because it's very generic.
So I think it makes sense to move the sandbox_store_client there. What do you think?
I guess it's pretty generic. I don't know whether the communities have a need to change the content of the sandbox; if not, then moving it to the JobWrapper would be great.
Switching to the diracx implementation should remove the need for a sandbox_store_client instance entirely, as the methods are just part of a module and not a class, I think (I'll have to find a way to mock them, though).
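Mocking module-level functions, as opposed to a client class, is straightforward with `unittest.mock.patch`. Here is a sketch against a stand-in module; the real diracx module name and function signatures are assumptions:

```python
# Sketch: patch a free function on a module object, the same way one would
# patch a method on a class instance.
from unittest.mock import patch
import types

# Hypothetical stand-in for a diracx-style module exposing free functions.
sandbox_api = types.SimpleNamespace(
    download_sandbox=lambda pfn, dest: f"downloaded {pfn} to {dest}",
)


def job_wrapper_step() -> str:
    """Code under test: calls the module-level function directly."""
    return sandbox_api.download_sandbox("SB:pfn", "/tmp/job")


# Patching the module attribute swaps the free function for a test double;
# the original is restored when the context manager exits.
with patch.object(sandbox_api, "download_sandbox", return_value="mocked"):
    result = job_wrapper_step()
```

In real tests, one would patch the attribute on the imported diracx module (or on the module under test that imported it) instead of a SimpleNamespace.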
file.path = file.path.split("/")[-1]
if not self.execution_hooks_plugin:
    raise RuntimeError("Could not download input data")
self.execution_hooks_plugin.download_lfns(arguments, job_path)
As I said, I think for now the content of execution_hooks_plugin.download_lfns should go here; if I'm not mistaken, it's something that all communities share, and I don't see why we would override it (at least for now).
for input_name, group_size in transformation_execution_hooks.group_size.items():
    # Get input query
    logger.info(f"\t- Getting input query for {input_name}...")
    assert isinstance(transformation_metadata, QueryBasedPlugin)
Not sure I understand why you need that, can you explain?
Because it's not something we want to keep in the code, unless I'm misunderstanding something?
Ah yes, I think I get it: it's because you moved get_input_query to QueryBasedPlugin, right?
I would prefer to have get_input_query in ExecutionHooksHintPlugin rather than in QueryBasedPlugin to avoid that (we are experimenting with LHCb workflows on our side, and this would prevent us from executing a transformation).
Okay, but I don't think there will be any difference between ExecutionHooksBasePlugin and QueryBasedPlugin if I move the query methods back.
Please correct me if I'm wrong, but we would not have this isinstance line here if we move it back to ExecutionHooksBasePlugin, would we?
You're right it wouldn't be there.
I'm just questioning the purpose of QueryBasedPlugin if there's no difference with the base class
Ah yes, I see.
Well, I guess ExecutionHooksHintPlugin is expected to be abstract, unusable per se (though it is not abstract in practice).
And QueryBasedPlugin is just one concrete, simple example with no pre/post execution commands.
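The abstract-base-plus-one-concrete-example relationship discussed here could be sketched as below. Class names are taken from the thread; the method bodies and the query layout are assumptions:

```python
# Sketch: an abstract hook base class and one minimal concrete plugin.
from abc import ABC, abstractmethod


class ExecutionHooksBasePlugin(ABC):
    """Abstract hook interface; not instantiable on its own."""

    @classmethod
    def hook_plugin(cls) -> str:
        # Auto-derive the plugin identifier from the class name
        return cls.__name__

    @abstractmethod
    def get_input_query(self, input_name: str) -> list[str]:
        """Return the input directories/LFN paths for a transformation input."""


class QueryBasedPlugin(ExecutionHooksBasePlugin):
    """Concrete example with no pre/post execution commands."""

    def get_input_query(self, input_name: str) -> list[str]:
        return [f"/vo/data/{input_name}"]  # assumed catalog layout
```

With the base class actually abstract, an `isinstance` assertion at the call site becomes unnecessary: any registered plugin is guaranteed to provide get_input_query.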
    ],
)
def test_run_job_with_input_data(
    cli_runner, cleanup, pi_test_files, cwl_file, inputs, destination_source_input_data
Not 100% sure I understand the difference with test_job_run_success.
The only difference is the preparation of the filecatalog directory.
Why don't we need to prepare the filecatalog directory in the other tests?
It's also done in the transformation tests, IIRC.
The usual test_run_job_success workflows use local files as inputs, so they are uploaded in a sandbox instead. There are no inputs that are expected to be on the FileCatalog for these workflows.
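The catalog preparation mentioned here might be sketched as a small helper that materializes the expected LFNs under a local directory standing in for the FileCatalog; the directory layout and file names are assumptions:

```python
# Sketch: create each expected LFN as an empty file under a mock catalog
# root, so LFN-based inputs resolve during a local test run.
import tempfile
from pathlib import Path


def prepare_filecatalog(root: Path, lfns: list[str]) -> list[Path]:
    """Create each LFN as an empty file under the mock catalog root."""
    created = []
    for lfn in lfns:
        target = root / lfn.lstrip("/")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
        created.append(target)
    return created


with tempfile.TemporaryDirectory() as tmp:
    catalog = Path(tmp) / "filecatalog"
    files = prepare_filecatalog(catalog, ["/vo/inputs/pi_1.txt"])
```

In the test suite this would naturally be a fixture, run before the job under test and cleaned up afterwards.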
Oh I see, thanks for the clarification! So maybe we could add with_input_sandbox in the name of the other test then?
See #25