|
| 1 | +====================== |
| 2 | +Advanced odML features |
| 3 | +====================== |
| 4 | + |
| 5 | +Working with odML Validations |
| 6 | +============================= |
| 7 | + |
| 8 | +odML Validations are a set of pre-defined checks that are run against an odML document automatically when it is saved or loaded. A document cannot be saved if it fails a check that is classified as an Error. Most validation checks are Warnings that are meant to raise the overall data quality of the odML document. |
| 9 | + |
| 10 | +When an odML document is saved or loaded, the automatic validation prints a short report of the encountered Validation Warnings; it is up to the user whether to resolve them. The odML document provides the ``validate`` method to gain easy access to the default validations. A Validation in turn provides not only a specific description of all warnings or errors encountered within an odML document, but also direct access to every odML entity, i.e. an ``odml.Section`` or an ``odml.Property``, where an issue has been found. This enables the user to quickly access and fix an encountered issue. |
| 11 | + |
| 12 | +A minimal example shows what a workflow using default validations might look like: |
| 13 | + |
| 14 | + >>> # Create a minimal document with Section issues: name and type are not assigned |
| 15 | + >>> doc = odml.Document() |
| 16 | + >>> sec = odml.Section(parent=doc) |
| 17 | + >>> odml.save(doc, "validation_example.odml.xml") |
| 18 | + |
| 19 | +This minimal example document will be saved, but will also print the following Validation report: |
| 20 | + |
| 21 | + >>> UserWarning: The saved Document contains unresolved issues. Run the Documents 'validate' method to access them. |
| 22 | + >>> Validation found 0 errors and 2 warnings in 1 Sections and 0 Properties. |
| 23 | + |
| 24 | +To fix the encountered warnings, users can access the validation via the document's ``validate`` method: |
| 25 | + |
| 26 | + >>> validation = doc.validate() |
| 27 | + >>> for issue in validation.errors: |
| 28 | + >>>     print(issue) |
| 29 | + |
| 30 | +This shows that the validation has encountered two Warnings and displays the offending odML entity for each of them. Note that ``validation.errors`` lists Warnings as well as Errors. |
| 31 | + |
| 32 | + >>> ValidationWarning: Section[73f29acd-16ae-47af-afc7-371d57898e28] 'Section type not specified' |
| 33 | + >>> ValidationWarning: Section[73f29acd-16ae-47af-afc7-371d57898e28] 'Name not assigned' |
| 34 | + |
| 35 | +To fix the "Name not assigned" warning, the Section can be accessed via the validation entry and used to directly assign a human-readable name to the Section in the original document. Re-running the validation will show that the warning has been removed. |
| 36 | + |
| 37 | + >>> validation.errors[1].obj.name = "validation_example_section" |
| 38 | + >>> # Check that the section name has been changed in the document |
| 39 | + >>> print(doc.sections) |
| 40 | + >>> # Re-running validation |
| 41 | + >>> validation = doc.validate() |
| 42 | + >>> for issue in validation.errors: |
| 43 | + >>>     print(issue) |
| 44 | + |
| 45 | +Similarly, the second validation warning can be resolved before saving the document again. |
| 46 | + |
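Indexing into ``validation.errors`` by position depends on the order in which issues are reported. Selecting entries by their message can be more robust. The following is a minimal sketch of that idea; it uses stand-in objects instead of a real odML document so that it runs without odML installed. It assumes that each validation entry exposes the offending entity as ``obj`` (as used above) and its message as ``msg`` (an assumption here). The helper name ``find_issues`` is hypothetical.

```python
from types import SimpleNamespace


def find_issues(issues, message):
    """Return all issues whose message matches exactly (hypothetical helper)."""
    return [issue for issue in issues if issue.msg == message]


# Stand-ins for a Section and for the entries of validation.errors
section = SimpleNamespace(name="73f29acd-16ae-47af-afc7-371d57898e28", type=None)
issues = [
    SimpleNamespace(obj=section, msg="Section type not specified"),
    SimpleNamespace(obj=section, msg="Name not assigned"),
]

# Resolve each warning through the entity attached to the issue
for issue in find_issues(issues, "Name not assigned"):
    issue.obj.name = "validation_example_section"

for issue in find_issues(issues, "Section type not specified"):
    issue.obj.type = "example"

print(section.name, section.type)
```

With a real document, the same loop would run over the result of ``doc.validate().errors`` and would be followed by re-running the validation.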
| 47 | +Please note that the automatic validation is run whenever a document is saved or loaded using the ``odml.save`` and ``odml.load`` functions as well as the ``ODMLWriter`` or the ``ODMLReader`` class. The validation is not run when using any of the lower level ``xmlparser``, ``dict_parser`` or ``rdf_converter`` classes. |
| 48 | + |
| 49 | +List of available default validations |
| 50 | +------------------------------------- |
| 51 | + |
| 52 | +The following list contains the default odML validations, their messages and the suggested course of action to resolve each issue. |
| 53 | + |
| 54 | +| Validation: ``object_required_attributes`` |
| 55 | +| Message: "Missing required attribute 'xyz'" |
| 56 | +| Applies to: ``Document``, ``Section``, ``Property`` |
| 57 | +| Course of action: Add an appropriate value to attribute 'xyz' for the reported odml entity. |
| 58 | +
|
| 59 | +| Validation: ``section_type_must_be_defined`` |
| 60 | +| Message: "Section type not specified" |
| 61 | +| Applies to: ``Section`` |
| 62 | +| Course of action: Fill in the ``type`` attribute of the reported Section. |
| 63 | +
|
| 64 | +| Validation: ``section_unique_ids`` |
| 65 | +| Message: "Duplicate id in Section 'secA' and 'secB'" |
| 66 | +| Applies to: ``Section`` |
| 67 | +| Course of action: IDs have to be unique and a duplicate id was found. Assign a new id for the reported Section. |
| 68 | +
|
| 69 | +| Validation: ``property_unique_ids`` |
| 70 | +| Message: "Duplicate id in Property 'propA' and 'propB'" |
| 71 | +| Applies to: ``Property`` |
| 72 | +| Course of action: IDs have to be unique and a duplicate id was found. Assign a new id for the reported Property. |
| 73 | +
|
| 74 | +| Validation: ``section_unique_name_type`` |
| 75 | +| Message: "name/type combination must be unique" |
| 76 | +| Applies to: ``Section`` |
| 77 | +| Course of action: The combination of Section.name and Section.type has to be unique on the same level. Change either name or type of the reported Section. |
| 78 | +
|
| 79 | +| Validation: ``object_unique_name`` |
| 80 | +| Message: "Object names must be unique" |
| 81 | +| Applies to: ``Document``, ``Section``, ``Property`` |
| 82 | +| Course of action: Object names have to be unique on the same level. Change the name of the reported entity. |
| 83 | +
|
| 84 | +| Validation: ``object_name_readable`` |
| 85 | +| Message: "Name not assigned" |
| 86 | +| Applies to: ``Section``, ``Property`` |
| 87 | +| Course of action: When Section or Property names are left empty on creation or set to None, they are automatically assigned the entity's UUID. Assign a human-readable name to the reported entity. |
| 88 | +
|
| 89 | +| Validation: ``property_terminology_check`` |
| 90 | +| Message: "Property 'prop' not found in terminology" |
| 91 | +| Applies to: ``Property`` |
| 92 | +| Course of action: The reported entity is linked to a repository but the repository is not available. Check if the linked content has moved. |
| 93 | +
|
| 94 | +| Validation: ``property_dependency_check`` |
| 95 | +| Message: "Property refers to a non-existent dependency object" or "Dependency-value is not equal to value of the property's dependency" |
| 96 | +| Applies to: ``Property`` |
| 97 | +| Course of action: The reported entity depends on another Property, but this dependency has not been satisfied. Check the referenced Property and its value to resolve the issue. |
| 98 | +
|
| 99 | +| Validation: ``property_values_check`` |
| 100 | +| Message: "Tuple of length 'x' not consistent with dtype 'dtype'!" or "Property values not of consistent dtype!" |
| 101 | +| Applies to: ``Property`` |
| 102 | +| Course of action: Adjust the values or the dtype of the referenced Property. |
| 103 | +
|
| 104 | +| Validation: ``property_values_string_check`` |
| 105 | +| Message: "Dtype of property "prop" currently is "string", but might fit dtype "dtype"!" |
| 106 | +| Applies to: ``Property`` |
| 107 | +| Course of action: Check if the datatype of the referenced Property.values has been loaded correctly and change the Property.dtype if required. |
| 108 | +
|
| 109 | +| Validation: ``section_properties_cardinality`` |
| 110 | +| Message: "cardinality violated x values, y found)" |
| 111 | +| Applies to: ``Section`` |
| 112 | +| Course of action: A cardinality defined for the number of Properties of a Section does not match. Add or remove Properties until the cardinality has been satisfied or adjust the cardinality. |
| 113 | +
|
| 114 | +| Validation: ``section_sections_cardinality`` |
| 115 | +| Message: "cardinality violated x values, y found)" |
| 116 | +| Applies to: ``Section`` |
| 117 | +| Course of action: A cardinality defined for the number of Sections of a Section does not match. Add or remove Sections until the cardinality has been satisfied or adjust the cardinality. |
| 118 | +
|
| 119 | +| Validation: ``property_values_cardinality`` |
| 120 | +| Message: "cardinality violated x values, y found)" |
| 121 | +| Applies to: ``Property`` |
| 122 | +| Course of action: A cardinality defined for the number of Values of a Property does not match. Add or remove Values until the cardinality has been satisfied or adjust the cardinality. |
| 123 | +
|
| 124 | +| Validation: ``section_repository_present`` |
| 125 | +| Message: "A section should have an associated repository" or "Could not load terminology" or "Section type not found in terminology" |
| 126 | +| Applies to: ``Section`` |
| 127 | +| Course of action: Optional validation. Will report any section that does not specify a repository. Add a repository to the reported Section to resolve. |
| 128 | +
|
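To illustrate what a check such as ``section_unique_name_type`` in the list above tests for, the following minimal sketch finds sibling sections that share the same name/type combination. It is not the odML implementation; ``Section`` is a stand-in namedtuple and ``duplicate_name_type`` is a hypothetical helper.

```python
from collections import Counter, namedtuple

# Stand-in for odml.Section, reduced to the two attributes of interest
Section = namedtuple("Section", ["name", "type"])


def duplicate_name_type(siblings):
    """Return the sections whose (name, type) pair occurs more than once."""
    counts = Counter((sec.name, sec.type) for sec in siblings)
    return [sec for sec in siblings if counts[(sec.name, sec.type)] > 1]


siblings = [
    Section("session", "recording"),
    Section("session", "recording"),  # same name and type: reported
    Section("session", "stimulus"),   # same name, different type: accepted
]

for sec in duplicate_name_type(siblings):
    print("name/type combination must be unique:", sec)
```

Changing either the name or the type of one of the reported sections makes the combination unique again.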
| 129 | +Custom validations |
| 130 | +------------------ |
| 131 | + |
| 132 | +Users can write their own validations and either register them with the default validation or add them to their own validation class instance. |
| 133 | + |
| 134 | +A custom validation handler needs to ``yield`` a ``ValidationError``. See the ``validation.ValidationError`` class for details. |
| 135 | + |
| 136 | +Custom validation handlers can be registered to be applied on "odML" (the odml Document), "section" or "property". |
| 137 | + |
| 138 | + >>> import odml |
| 139 | + >>> import odml.validation as oval |
| 140 | + >>> |
| 141 | + >>> # Create an example document |
| 142 | + >>> doc = odml.Document() |
| 143 | + >>> sec_valid = odml.Section(name="Recording-20200505", parent=doc) |
| 144 | + >>> sec_invalid = odml.Section(name="Movie-20200505", parent=doc) |
| 145 | + >>> subsec = odml.Section(name="Sub-Movie-20200505", parent=sec_valid) |
| 146 | + >>> |
| 147 | + >>> # Define a validation handler that yields a ValidationError if a section name does not start with 'Recording-' |
| 148 | + >>> def custom_validation_handler(obj): |
| 149 | + >>>     validation_id = oval.IssueID.custom_validation |
| 150 | + >>>     msg = "Section name does not start with 'Recording-'" |
| 151 | + >>>     if not obj.name.startswith("Recording-"): |
| 152 | + >>>         yield oval.ValidationError(obj, msg, oval.LABEL_ERROR, validation_id) |
| 153 | + >>> |
| 154 | + >>> # Create a custom, empty validation with an odML document 'doc' |
| 155 | + >>> custom_validation = oval.Validation(doc, reset=True) |
| 156 | + >>> # Register a custom validation handler that should be applied on all Sections of a Document |
| 157 | + >>> custom_validation.register_custom_handler("section", custom_validation_handler) |
| 158 | + >>> # Run the custom validation and return a report |
| 159 | + >>> custom_validation.report() |
| 160 | + >>> # Display the errors reported by the validation |
| 161 | + >>> print(custom_validation.errors) |
| 162 | + |
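Because a validation handler is a generator, an entity passes the check when the handler yields nothing, and a single handler may yield several issues for one entity. The following sketch shows this mechanism in isolation, without odML installed. ``Issue`` is a stand-in for ``oval.ValidationError``, ``Section`` is a stand-in for ``odml.Section``, and ``run_handler`` is a hypothetical helper that only mimics how a validation might apply a handler to every registered entity.

```python
from collections import namedtuple

# Stand-ins for oval.ValidationError and odml.Section
Issue = namedtuple("Issue", ["obj", "msg", "rank"])
Section = namedtuple("Section", ["name"])


def name_prefix_handler(obj):
    """Yield one issue if the name lacks the 'Recording-' prefix."""
    if not obj.name.startswith("Recording-"):
        yield Issue(obj, "Section name does not start with 'Recording-'", "error")


def run_handler(handler, objects):
    """Apply a handler to every object and collect all yielded issues."""
    return [issue for obj in objects for issue in handler(obj)]


sections = [
    Section("Recording-20200505"),
    Section("Movie-20200505"),
    Section("Sub-Movie-20200505"),
]

found = run_handler(name_prefix_handler, sections)
for issue in found:
    print(issue.rank, issue.obj.name, issue.msg)
```

In this sketch only the two sections without the prefix are reported; the first section produces no issue at all.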
| 163 | +Defining and working with feature cardinality |
| 164 | +============================================= |
| 165 | + |
| 166 | +The odML format allows users to define a cardinality for |
| 167 | +the number of subsections and properties of Sections and |
| 168 | +the number of values a Property might have. |
| 169 | + |
| 170 | +A cardinality is checked when it is set, when its target is |
| 171 | +set and when a document is saved or loaded. If a specific |
| 172 | +cardinality is violated, a corresponding warning will be printed. |
| 173 | + |
| 174 | +Setting a cardinality |
| 175 | +--------------------- |
| 176 | + |
| 177 | +A cardinality can be set for sections or properties of sections |
| 178 | +or for values of properties. By default every cardinality is None, |
| 179 | +but it can be set to a minimum and/or a maximum number of |
| 180 | +elements. |
| 181 | + |
| 182 | +A cardinality is set via its convenience method: |
| 183 | + |
| 184 | + >>> # Set the cardinality of the properties of a Section 'sec' to |
| 185 | + >>> # a maximum of 5 elements. |
| 186 | + >>> sec = odml.Section(name="cardinality", type="test") |
| 187 | + >>> sec.set_properties_cardinality(max_val=5) |
| 188 | + |
| 189 | + >>> # Set the cardinality of the subsections of Section 'sec' to |
| 190 | + >>> # a minimum of one and a maximum of 2 elements. |
| 191 | + >>> sec.set_sections_cardinality(min_val=1, max_val=2) |
| 192 | + |
| 193 | + >>> # Set the cardinality of the values of a Property 'prop' to |
| 194 | + >>> # a minimum of 1 element. |
| 195 | + >>> prop = odml.Property(name="cardinality") |
| 196 | + >>> prop.set_values_cardinality(min_val=1) |
| 197 | + |
| 198 | + >>> # Re-set the cardinality of the values of a Property 'prop' to not set. |
| 199 | + >>> prop.set_values_cardinality() |
| 200 | + >>> # or |
| 201 | + >>> prop.val_cardinality = None |
| 202 | + |
| 203 | +Please note that a set cardinality is not enforced. Users can set fewer or more entities than a cardinality allows. Instead, whenever a cardinality is not met, a warning message is displayed, and any unmet cardinality will show up as a Validation warning whenever a document is saved or loaded. |
| 204 | + |
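The "warn, do not enforce" behaviour described above can be sketched in plain Python. This is not the odML implementation: the function name ``cardinality_warning``, the ``(min_val, max_val)`` tuple representation and the wording of the returned message are assumptions made for illustration only.

```python
def cardinality_warning(count, cardinality):
    """Return a warning string if count violates cardinality, else None.

    cardinality is None (not set) or a (min_val, max_val) tuple in which
    either bound may be None, meaning that side is unbounded.
    """
    if cardinality is None:
        return None
    min_val, max_val = cardinality
    if min_val is not None and count < min_val:
        return "cardinality violated (minimum %d, %d found)" % (min_val, count)
    if max_val is not None and count > max_val:
        return "cardinality violated (maximum %d, %d found)" % (max_val, count)
    return None


values = []                                          # a Property without values
print(cardinality_warning(len(values), (1, None)))   # minimum of 1 is not met
values.append("a")                                   # adding a value is never blocked
print(cardinality_warning(len(values), (1, None)))   # no warning any more
print(cardinality_warning(3, (1, 2)))                # maximum of 2 is exceeded
print(cardinality_warning(3, None))                  # no cardinality set
```

The check only reports; it never prevents elements from being added or removed, which mirrors the behaviour described for odML cardinalities.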
| 205 | +View odML documents in a web browser |
| 206 | +==================================== |
| 207 | + |
| 208 | +By default all odML files are saved in the XML format without the capability to view |
| 209 | +the plain files in a browser. The command line tool ``odmlview`` can be used |
| 210 | +to view saved odML files locally, but it requires starting a local server. |
| 211 | +As an alternative, odML XML files can be made viewable directly in a web browser. |
| 212 | + |
| 213 | +You can use an additional feature of the ``odml.tools.XMLWriter`` to save an odML |
| 214 | +document with an embedded default stylesheet for local viewing: |
| 215 | + |
| 216 | + >>> import odml |
| 217 | + >>> from odml.tools import XMLWriter |
| 218 | + >>> doc = odml.Document() # minimal example document |
| 219 | + >>> filename = "viewable_document.xml" |
| 220 | + >>> XMLWriter(doc).write_file(filename, local_style=True) |
| 221 | + |
| 222 | +Now you can open the resulting file 'viewable_document.xml' in any current web-browser |
| 223 | +and it will render the content of the odML file. |
| 224 | + |
| 225 | +If you want to use a custom style sheet to render an odML document instead of the default |
| 226 | +one, you can provide it as a string to the XML writer. Please note that it cannot be a |
| 227 | +full XSL stylesheet; the outermost tag of the XSL code has to be |
| 228 | +``<xsl:template match="odML"> [your custom style here] </xsl:template>``: |
| 229 | + |
| 230 | + >>> import odml |
| 231 | + >>> from odml.tools import XMLWriter |
| 232 | + >>> doc = odml.Document() # minimal example document |
| 233 | + >>> filename = "viewable_document.xml" |
| 234 | + >>> own_template = """<xsl:template match="odML"> [your custom style here] </xsl:template>""" |
| 235 | + >>> XMLWriter(doc).write_file(filename, custom_template=own_template) |
| 236 | + |
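Since the custom template has to be a single ``xsl:template`` element matching ``odML`` rather than a full stylesheet, it can be useful to check the string before passing it to the writer. The following is a minimal sketch using only the Python standard library; the helper name ``is_odml_template`` is hypothetical and not part of odML.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def is_odml_template(template):
    """Check that the string is a single xsl:template element matching 'odML'."""
    # Wrap the fragment so that the 'xsl' prefix is declared while parsing.
    wrapped = '<wrapper xmlns:xsl="%s">%s</wrapper>' % (XSL_NS, template)
    try:
        root = ET.fromstring(wrapped)
    except ET.ParseError:
        return False
    children = list(root)
    if len(children) != 1:
        return False
    elem = children[0]
    return elem.tag == "{%s}template" % XSL_NS and elem.get("match") == "odML"


good = '<xsl:template match="odML"><h1>My document</h1></xsl:template>'
bad = '<xsl:stylesheet version="1.0"><xsl:template match="odML"/></xsl:stylesheet>'

print(is_odml_template(good))
print(is_odml_template(bad))
```

The check only looks at the outermost element; it does not verify that the XSL inside the template produces the intended rendering.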
| 237 | +Please note that if the file is saved using the '.odml' extension and you are using |
| 238 | +Chrome, you will need to map the '.odml' extension to 'application/xml' in the |
| 239 | +browser's MIME type database. |
| 240 | + |
| 241 | +Also note that any style that is saved with an odML document will be lost when this |
| 242 | +document is loaded again and its content is changed. In this case the required |
| 243 | +style needs to be specified again when saving the changed file as described above. |