Refactor of checkDatasetFiles #2180

## Summary
[checkDatasetFiles](https://github.com/SciCatProject/scicat-backend-next/blob/master/src/jobs/jobs.controller.utils.ts#L178-L249) is used when submitting a job to check that the requested files exist in the origdatablocks. If any file does not exist, it throws an error listing the missing file(s).

## Steps to Reproduce


## Current Behaviour
The current logic loops over the `datasetList` from the job payload, finds the corresponding origdatablocks, extracts their file paths from `dataFileList[].path` and compares them with the files requested in the job payload's `datasetList[].files`. It throws an error whenever any of the requested files is not found in the origdatablocks of its dataset.
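
The flow described above can be sketched roughly as follows. This is a simplified illustration and not the actual code in `jobs.controller.utils.ts`: it assumes one lookup per dataset, `findOrigsByDataset` is a hypothetical stand-in for the origdatablock query, and the in-memory `ORIGS` array is made-up sample data.

```js
// Made-up sample data standing in for the origdatablocks collection.
const ORIGS = [
  { datasetId: "ds1", dataFileList: [{ path: "a.h5" }, { path: "b.h5" }] },
  { datasetId: "ds1", dataFileList: [{ path: "c.h5" }] },
  { datasetId: "ds2", dataFileList: [{ path: "x.h5" }] },
];

// Hypothetical stand-in for one database lookup per dataset.
async function findOrigsByDataset(pid) {
  return ORIGS.filter((orig) => orig.datasetId === pid);
}

// Per-dataset check: one lookup for each entry of datasetList.
async function checkPerDataset(datasetList) {
  const nonExisting = {};
  for (const ds of datasetList) {
    const origs = await findOrigsByDataset(ds.pid);
    // All file paths known for this dataset, across its origdatablocks.
    const existing = new Set(origs.flatMap((o) => o.dataFileList.map((f) => f.path)));
    const missing = ds.files.filter((f) => !existing.has(f));
    if (missing.length > 0) nonExisting[ds.pid] = missing;
  }
  return nonExisting;
}
```

The point of the sketch is the number of lookups: it grows with the length of `datasetList`, which is what the proposal below tries to avoid.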

## Expected Behaviour
I believe this can be simplified to something like this (or similar):
```js
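// Gather dataset ids and all requested file paths from the job payload.
const ids = datasetList.map((ds) => ds.pid);
const requestedFiles = datasetList.flatMap((ds) => ds.files);

// Single query; assumes `where` accepts a MongoDB filter.
const origs = await this.origDatablocksService.findAll({
  where: { $and: [{ datasetId: { $in: ids } }, { "dataFileList.path": { $in: requestedFiles } }] },
});

// datasetId -> Set of existing paths (a dataset can have several origdatablocks).
// A plain Set is needed here: WeakSet cannot hold strings.
const origsDict = {};
for (const orig of origs) {
  origsDict[orig.datasetId] ??= new Set();
  for (const f of orig.dataFileList) origsDict[orig.datasetId].add(f.path);
}

const nonExisting = {};
for (const ds of datasetList) {
  const existing = origsDict[ds.pid] ?? new Set();
  const missing = ds.files.filter((f) => !existing.has(f));
  if (missing.length > 0) nonExisting[ds.pid] = missing;
}
// An empty object is truthy, so check the keys explicitly.
if (Object.keys(nonExisting).length > 0) throw nonExisting; // needs a loop for formatting here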
```

This needs only one MongoDB query, which cuts the round-trip overhead. The dataset query is removed (not sure I understood the need for it). It keeps the dicts and sets, so each file lookup is O(1) and the whole check is O(n*m) for n datasets with m files each (very similar to what's implemented already, nice!).

The main improvement is reducing the number of MongoDB queries.
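
For the formatting step left open in the snippet above, a minimal sketch could look like this. `formatMissingFiles` is a hypothetical helper and the message wording is only a suggestion; the real code would presumably pass the message to whatever exception the controller already throws.

```js
// Hypothetical helper: turn { pid: [missing files] } into one readable message.
// Accepts arrays or Sets as values, since both appear in the snippets in this issue.
function formatMissingFiles(nonExisting) {
  const lines = [];
  for (const [pid, files] of Object.entries(nonExisting)) {
    lines.push(`Dataset ${pid}: ${[...files].join(", ")}`);
  }
  // An empty string means nothing is missing.
  return lines.length === 0 ? "" : `Files not found in origdatablocks:\n${lines.join("\n")}`;
}

const message = formatMissingFiles({ ds1: ["z.h5"], ds3: ["q.h5", "r.h5"] });
if (message) console.log(message);
```

Returning an empty string for an empty object keeps the "nothing missing" case easy to test at the call site.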

If `await this.origDatablocksService.findAll` returns too many documents to hold in memory, this could be improved further by iterating with a [cursor](https://mongoosejs.com/docs/api/querycursor.html) and removing files from `datasetList[].files` as they are found.

Something like this:
```js
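// pid -> Set of requested files that have not been matched yet.
const dsDict = Object.fromEntries(datasetList.map((ds) => [ds.pid, new Set(ds.files)]));
const filter = {
  $and: [
    { datasetId: { $in: Object.keys(dsDict) } },
    { "dataFileList.path": { $in: datasetList.flatMap((ds) => ds.files) } },
  ],
};

// OrigDatablocks stands for the Mongoose model (name assumed); stream the results.
for await (const orig of OrigDatablocks.find(filter).cursor()) {
  const pending = dsDict[orig.datasetId];
  if (!pending || pending.size === 0) continue; // size is a property, not a method
  for (const f of orig.dataFileList) pending.delete(f.path);
}

// Collect leftovers only after the cursor is exhausted: a later origdatablock
// may still contain a file that an earlier one did not.
const nonExisting = {};
for (const [pid, pending] of Object.entries(dsDict)) {
  if (pending.size > 0) nonExisting[pid] = [...pending];
}
if (Object.keys(nonExisting).length > 0) throw nonExisting; // needs a loop for formatting here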
```
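
To check the bookkeeping without a database, the cursor loop can be exercised against an async generator. This is only a sketch: `fakeCursor` and its sample documents are made up and stand in for the Mongoose query cursor, which can also be consumed with `for await`; `findMissingWithCursor` is a hypothetical name.

```js
// Made-up stand-in for a query cursor: yields origdatablock documents one at a time.
async function* fakeCursor() {
  yield { datasetId: "ds1", dataFileList: [{ path: "a.h5" }] };
  yield { datasetId: "ds1", dataFileList: [{ path: "b.h5" }] };
  yield { datasetId: "ds2", dataFileList: [{ path: "x.h5" }] };
}

async function findMissingWithCursor(datasetList, cursor) {
  // Requested files per dataset; entries are removed as they are found.
  const pendingByPid = new Map(datasetList.map((ds) => [ds.pid, new Set(ds.files)]));
  for await (const orig of cursor) {
    const pending = pendingByPid.get(orig.datasetId);
    if (!pending || pending.size === 0) continue;
    for (const f of orig.dataFileList) pending.delete(f.path);
  }
  // Anything still pending was not seen in any origdatablock.
  const nonExisting = {};
  for (const [pid, pending] of pendingByPid) {
    if (pending.size > 0) nonExisting[pid] = [...pending];
  }
  return nonExisting;
}
```

Note that `ds1` is only complete after its second origdatablock has been read, which is why the leftovers are collected after the loop and not inside it.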
