Create a single source for resources shared by data_store and data_store_serialization test cases. #894

Open
@Pushkar-Bhuse

Is your feature request related to a problem? Please describe.
The resources used to define the test cases in data_store_test.py and data_store_serialization_test.py are largely the same. For example, the structure of _type_attributes is quite similar for most entries in both files:

DataStore._type_attributes = {
        "ft.onto.base_ontology.Document": {
            "attributes": {
                "begin": {"index": 2, "type": (None, (int,))},
                "end": {"index": 3, "type": (None, (int,))},
                "payload_idx": {"index": 4, "type": (None, (int,))},
                "document_class": {"index": 5, "type": (list, (str,))},
                "sentiment": {"index": 6, "type": (dict, (str, float))},
                "classifications": {
                    "index": 7,
                    "type": (FDict, (str, Classification)),
                },
            },
            "parent_entry": "forte.data.ontology.top.Annotation",
        },
}

Describe the solution you'd like
To reduce this redundancy, there should be a central file that stores these configurations, along with a clear format for the tests to access them. Note that although the configurations look quite similar, some cases have subtle differences that are intentional. For example, in data_store_serialization_test.py, the _type_attributes for Document is given by:

"ft.onto.base_ontology.Document": {
                "attributes": {
                    "begin": {"index": 2, "type": (None, (int,))},
                    "end": {"index": 3, "type": (None, (int,))},
                    "payload_idx": {"index": 4, "type": (None, (int,))},
                    "sentiment": {"index": 5, "type": (dict, (str, float))},
                    "classifications": {
                        "index": 6,
                        "type": (FDict, (str, Classification)),
                    },
                },
                "parent_entry": "forte.data.ontology.top.Annotation",
            },

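Note that this configuration intentionally omits the document_class attribute, so the attributes after it shift down by one index (sentiment is 5 instead of 6, classifications is 6 instead of 7). The proposed solution therefore needs provisions to handle such slight changes in structure.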
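
One possible shape for such a shared resource is sketched below. It is only an illustration, not an existing Forte API: the constant DOCUMENT_TYPE_ATTRIBUTES and the helper make_type_attributes are hypothetical names, and FDict and Classification are replaced by empty stand-in classes so the snippet runs without Forte installed. The helper returns a copy of the base configuration with selected attributes omitted and the remaining indices renumbered, which reproduces the serialization-test variant shown above.

```python
import copy


# Stand-ins so the sketch runs on its own; the real tests would import
# FDict and Classification from Forte instead of defining them here.
class FDict:
    pass


class Classification:
    pass


# Hypothetical shared definition, copied from the Document entry used in
# data_store_test.py.
DOCUMENT_TYPE_ATTRIBUTES = {
    "attributes": {
        "begin": {"index": 2, "type": (None, (int,))},
        "end": {"index": 3, "type": (None, (int,))},
        "payload_idx": {"index": 4, "type": (None, (int,))},
        "document_class": {"index": 5, "type": (list, (str,))},
        "sentiment": {"index": 6, "type": (dict, (str, float))},
        "classifications": {
            "index": 7,
            "type": (FDict, (str, Classification)),
        },
    },
    "parent_entry": "forte.data.ontology.top.Annotation",
}


def make_type_attributes(base, omit=()):
    """Return a copy of `base` without the attributes named in `omit`.

    The remaining attributes keep their relative order and are renumbered
    contiguously, starting from the lowest index found in `base`.
    """
    result = copy.deepcopy(base)  # never mutate the shared definition
    attrs = result["attributes"]
    start = min((spec["index"] for spec in attrs.values()), default=0)
    kept = sorted(
        ((name, spec) for name, spec in attrs.items() if name not in omit),
        key=lambda item: item[1]["index"],
    )
    result["attributes"] = {}
    for offset, (name, spec) in enumerate(kept):
        spec["index"] = start + offset
        result["attributes"][name] = spec
    return result


# Variant for the serialization tests: same entry, minus document_class.
serialization_document = make_type_attributes(
    DOCUMENT_TYPE_ATTRIBUTES, omit=("document_class",)
)
```

Because the helper works on a deep copy, each test file can derive its own variant without affecting the other. Intentional differences that go beyond dropping attributes would need additional parameters or separately named variants.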

Additional Context

  • This is part of the data efficiency project
  • This PR should be made to the master branch.
