Skip to content

Milestones

List view

  • # General Guide Since it is deployed directly from the repository, ``master`` branch will stay. We will have a separate ``main`` branch and will merge refactored code into ``main``. Once we finish an integration tests and confirms that it is doing at least the same job as the current implementation, we can set the ``main`` as a new master. TL;DR: Open a pull request into ``main``, not into ``master``. # Higher Level TODOs - [ ] Create a package ``scicat_ingestor`` and gradually move all the existing code to the package. It not for deployment so it just needs to be accessible to the entry point script (The command or a script that is used for deployment) - [ ] Setting up smaller scope tests environments - [ ] Setting up automated continuous integration tests environments - [ ] Update documentation - [ ] Turn ``main`` branch the new master branch and archive ``master`` Brainstorming what should be done for refactoring. Feel free to shoot down the ideas : D ## Encapsulating partial code into modules/functions For example, these can be split into another module...? https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/online_ingestor.py#L116-L124 ### Kafka Consumer/Message Handling As it is carrying the core-logic, we maybe better wrap the confluent kafka interfaces so that we can easily test them as well...? For example, https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/online_ingestor.py#L71-L77 This kind of error-handling can be separated into a smaller function. Or Wrap this into a function or automate this somehow...? https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/online_ingestor.py#L32-L40 ### Logging Logging seems like to require some configuration steps but despite of the configuration the logging interface ``logger.log(info/debug/error)`` should be the same. But since the configuration of the logger requires file or link handling, it'd better be tested I guess...? ### Ingestor This guy definitely seems like to need some chopping. https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/ingestor_lib.py#L222 For example, I think this should be wrapped into a separate file handling interface so that we can test them. https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/ingestor_lib.py#L258-L261 ## Redefining Interfaces ### Nexus structure path https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/ingestor_lib.py#L28-L33 This mixture of `filter` and `indexing` can maybe refactored into more explicit way...? But I also think it is already working conveniently already... hmm... ### Ingestor - Maybe not redefining, but just better type-hinting or wrapping some of the arguments into a tighter data structure. https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/ingestor_lib.py#L222 - Some of `if` statements can be merged into one if we let ``get_instrument`` to accept ``defaultInstrument`` like ``get_nested_value_with_default``. https://github.com/SciCatProject/scicat-filewriter-ingest/blob/3c5c22ea4e606719e55b7999e407f18470b4e76c/ingestor_lib.py#L384-L392 ## Documentation / Docstrings Maybe we can collect the comments into docstrings. Copilot is good at it : D And the documentation in the readme might be sufficient but we probably need to clean that up too. Here is a list of some sections that might be needed in the documentation. - Deployment Environment/Method - Whole structure (briefly, maybe mermaid graph : D) - Logging options - API dependencies - Kafka Consumer/Message Handling It seems quite straight forward but since it's the core part of it. Maybe we can document it better. Shall we write some mermaid graph : D....??

    Overdue by 1 year(s)
    Due by April 5, 2024
    9/11 issues closed