
Identify last access time for a record. #287

@psaiz

There is an index in OpenSearch that holds the mapping between files and records: opendata-prod-v0.2-records-recid_mapping

It would be nice to extend that information with the last access time per file. We should also cross-check this information against the EOS dump that we get at /eos/workspace/c/cernod/dumps/opendata/latest

We have requested that the access date be added to that dump, which would make the task easier.

In the meantime, it would be nice to:

  • Set up a daily Celery task that goes through the files in the EOS dump.
  • For the files in that dump (ignoring anything under /eos/opendada//upload), identify the ones that are not in the index.
  • For those files, check whether they are new files that should be added to that index (that is, whether they exist in the database). If not, record their names in a new index (opendata-dev-v0.2-dark-files).
  • Now that the access date has been added to the dump: for files that are in the index, update the index with the last access time.
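
The steps above could be sketched roughly as follows. This is only an illustration under several assumptions: the function names, the `last_accessed` field name, and the use of the file path as the document id are all hypothetical, and the dump is assumed to be already parsed into plain paths and access times. The ignored prefix is copied verbatim from the list above. The action dictionaries follow the shape accepted by the bulk helper in opensearch-py, but nothing here talks to OpenSearch or the database.

```python
from typing import Dict, Iterable, Iterator, List, Set

# Copied verbatim from the issue text; adjust if the real path differs.
IGNORED_PREFIX = "/eos/opendada//upload"


def classify_dump_paths(
    dump_paths: Iterable[str],
    indexed_paths: Set[str],
    database_paths: Set[str],
    ignored_prefix: str = IGNORED_PREFIX,
) -> Dict[str, List[str]]:
    """Split dump paths into indexed, new (in the database only) and dark files."""
    result: Dict[str, List[str]] = {"indexed": [], "new": [], "dark": []}
    for path in dump_paths:
        if path.startswith(ignored_prefix):
            continue  # uploads are skipped entirely
        if path in indexed_paths:
            result["indexed"].append(path)
        elif path in database_paths:
            result["new"].append(path)  # should be added to the mapping index
        else:
            result["dark"].append(path)  # candidate for the dark-files index
    return result


def access_time_actions(
    access_times: Dict[str, str], index_name: str
) -> Iterator[dict]:
    """Yield one partial-update action per file (hypothetical field and id)."""
    for path, accessed in access_times.items():
        yield {
            "_op_type": "update",
            "_index": index_name,
            "_id": path,  # assumption: documents are keyed by file path
            "doc": {"last_accessed": accessed},
        }
```

A daily Celery task could call `classify_dump_paths` on the parsed dump, write the `dark` entries to opendata-dev-v0.2-dark-files, and pass the output of `access_time_actions` for the `indexed` entries to a bulk update.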
