Draft relaxed schema compliance

# Context

There are emerging requirements to reuse the `cellxgene-schema` CLI schema and validator in scenarios that are more **relaxed** than CELLxGENE Discover's current requirements.

# Relaxation

The following sections **blue-sky** possible approaches to documenting **relaxed** requirements; however, the solution should be driven by concrete scenarios and not theory.

## Fine Granularity: Per Schema _variant_

A limited number of schema variants could be documented, such as the "cross modality schema". The curator could reuse `schema_reference` to declare which schema variant the dataset should be validated against.

---

## Fine Granularity: Per Metadata field

For each metadata field, the schema defines separate requirements for **strict** and **relaxed**.  Generally, **relaxed** will indicate that the field MUST NOT be present, but it's also possible to relax other requirements. 

---

## `uns` (Dataset Metadata)

### relaxed

<table><tbody>
    <tr>
      <th>Key</th>
      <td>relaxed</td>
    </tr>
    <tr>
      <th>Annotator</th>
      <td>Curator MAY annotate.</td>
    </tr>
    <tr>
      <th>Value</th>
        <td>
          <code>list[str]</code>. Each <code>str</code> value MUST match one of the values in the set:
          <ul>
          <li>"obs['cell_type_ontology_term_id']"</li>         
          <li>"obs['development_stage_ontology_term_id']"</li> 
          <li>...</li>
         </ul><br>If present, relaxed validation MUST be performed on the specified metadata field.
        </td>
    </tr>
</tbody></table>
<br>
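
The table above could be checked along these lines. This is only a minimal sketch: `RELAXABLE_FIELDS` and `check_relaxed_key` are hypothetical names, not part of the `cellxgene-schema` CLI, and the set holds just the two values the draft spells out because the full list is elided. `uns` is modeled as a plain `dict`.

```python
from __future__ import annotations

# Hypothetical: only the two values named in the draft; the full set is elided there.
RELAXABLE_FIELDS = {
    "obs['cell_type_ontology_term_id']",
    "obs['development_stage_ontology_term_id']",
}


def check_relaxed_key(uns: dict) -> list[str]:
    """Return error messages for uns['relaxed']; an empty list means valid."""
    if "relaxed" not in uns:
        return []  # Curator MAY annotate, so absence is not an error.
    value = uns["relaxed"]
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        return ["uns['relaxed'] MUST be a list[str]"]
    # Every entry must be one of the fields that defines a relaxed mode.
    return [
        f"uns['relaxed'] contains unsupported value {v!r}"
        for v in value
        if v not in RELAXABLE_FIELDS
    ]
```

A validator would then apply the relaxed rules only to the fields listed in `uns['relaxed']` and the strict rules to everything else.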

---

Concrete example: suppose the assay is **silver tier** _Visium Spatial Gene Expression_ and `cell_type_ontology_term_id` defines its relaxed validation as:

1. `cell_type_ontology_term_id` MUST NOT be present in `obs`
2. "obs['cell_type_ontology_term_id']" MUST be annotated in `uns['relaxed']`

Then the **silver tier** dataset would simply meet those requirements.
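
As a rough illustration, the two requirements in this example could be checked as below. The function name is hypothetical, the `obs` columns are modeled as a plain set of strings, and the `uns['relaxed']` entry is assumed to use the value form shown in the `relaxed` table.

```python
from __future__ import annotations

FIELD = "cell_type_ontology_term_id"


def check_relaxed_cell_type(obs_columns: set[str], uns: dict) -> list[str]:
    """Check the two relaxed requirements for cell_type_ontology_term_id."""
    errors = []
    # Requirement 1: the field MUST NOT be present in obs.
    if FIELD in obs_columns:
        errors.append(f"obs['{FIELD}'] MUST NOT be present")
    # Requirement 2: the field MUST be annotated in uns['relaxed'].
    if f"obs['{FIELD}']" not in uns.get("relaxed", []):
        errors.append(f"obs['{FIELD}'] MUST be annotated in uns['relaxed']")
    return errors
```

A dataset that omits the column and lists it in `uns['relaxed']` yields no errors; any other combination is reported.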

---


## Coarse Granularity: Per Dataset 

The schema documents a **relaxed** subset of the currently required fields. This subset might exclude `cell_type_ontology_term_id` or perhaps `development_stage_ontology_term_id`. If a currently required field is not included in the **relaxed** subset, then it MUST NOT be present in the dataset.

Curators annotate whether **strict** or **relaxed** validation is desired.

---

## `uns` (Dataset Metadata)

### strict

<table><tbody>
    <tr>
      <th>Key</th>
      <td>strict</td>
    </tr>
    <tr>
      <th>Annotator</th>
      <td>Curator MUST annotate.</td>
    </tr>
    <tr>
      <th>Value</th>
        <td><code>bool</code>. This MUST be <code>True</code> for <b>strict</b> validation and MUST be <code>False</code> for <b>relaxed</b> validation.</td>
    </tr>
</tbody></table>
<br>
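
The per-dataset approach might be sketched as follows. Both field sets are illustrative assumptions, since the draft does not fix the contents of the **relaxed** subset, and `check_dataset` is a hypothetical helper rather than an existing validator function.

```python
from __future__ import annotations

# Illustrative assumptions only: a small stand-in for the required obs fields
# and a hypothetical relaxed subset of them.
STRICT_FIELDS = {
    "assay_ontology_term_id",
    "cell_type_ontology_term_id",
    "development_stage_ontology_term_id",
}
RELAXED_FIELDS = {"assay_ontology_term_id"}


def check_dataset(obs_columns: set[str], uns: dict) -> list[str]:
    """Validate obs columns under the mode selected by uns['strict']."""
    strict = uns.get("strict")
    if not isinstance(strict, bool):
        return ["uns['strict'] MUST be annotated as a bool"]
    required = STRICT_FIELDS if strict else RELAXED_FIELDS
    errors = [f"obs['{f}'] is required" for f in sorted(required - obs_columns)]
    if not strict:
        # Fields outside the relaxed subset MUST NOT be present.
        forbidden = (STRICT_FIELDS - RELAXED_FIELDS) & obs_columns
        errors += [f"obs['{f}'] MUST NOT be present" for f in sorted(forbidden)]
    return errors
```

Unlike the per-field approach, a single flag switches every field at once, so a dataset cannot relax one field while keeping another strict.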

## References

* [Strict and Relaxed Mode](https://directory.apache.org/api/internal-design-guide/8-schema.html)
* [Categories of AIRR Schema Fields](https://docs.airr-community.org/en/latest/datarep/airr_schema_requirement_levels.html#categories-of-airr-schema-fields)
* [Compliance with the MiAIRR Data Standard](https://docs.airr-community.org/en/latest/datarep/airr_schema_requirement_levels.html#compliance-with-the-miairr-data-standard)
> Compliance to the MiAIRR Data Standard is currently a binary state, i.e., a data either is or is not compliant, there are not “grades” of compliance. However, additional requirements for specific use cases might be defined in the future.

