
@mbaudis
Member

@mbaudis mbaudis commented Sep 22, 2025

This PR adds an "excluded" property to "OntologyFilter" and "CustomFilter". See discussion at #63

Collaborator

@gsfk gsfk left a comment


Seems sensible, although with lots of comments: my first reaction was to be offended that alphanumeric filters were left out, although any filter that can point to fields in the model can also point to excluded, eg:

{
  "filters": [
      {"id": "phenotypicFeatures.featureType.label", "operator": "=", "value": "Diabetes mellitus"},
      {"id": "phenotypicFeatures.excluded", "operator": "=", "value": true}
  ]
}

Although this presumes a fixed way to point to fields in the model, which we don't really have. This also won't work well if your implementation allows you to search multiple phenotypic features at the same time (you won't know which one to exclude) although this could be fixed with AND/OR filters.
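The multi-feature problem can be sketched in a few lines. This assumes a hypothetical evaluator that checks each filter independently against any element of `phenotypicFeatures`; the record and both function names are made up for illustration and are not part of the Beacon specification.

```python
# Hypothetical record: diabetes is observed, facial cleft is explicitly excluded.
record = {
    "phenotypicFeatures": [
        {"featureType": {"id": "HP:0000819", "label": "Diabetes mellitus"}, "excluded": False},
        {"featureType": {"id": "HP:0002006", "label": "Facial cleft"}, "excluded": True},
    ]
}

def matches_independently(record, label):
    """Each condition may be satisfied by a different feature (the flawed reading)."""
    features = record["phenotypicFeatures"]
    has_label = any(f["featureType"]["label"] == label for f in features)
    has_excluded = any(f.get("excluded") is True for f in features)
    return has_label and has_excluded

def matches_same_feature(record, label):
    """Both conditions must hold on one and the same feature."""
    return any(
        f["featureType"]["label"] == label and f.get("excluded") is True
        for f in record["phenotypicFeatures"]
    )

print(matches_independently(record, "Diabetes mellitus"))  # True (a false positive)
print(matches_same_feature(record, "Diabetes mellitus"))   # False
```

With two separate filters, the "excluded" condition is satisfied by the facial cleft entry, so the record wrongly counts as "diabetes excluded".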

So I think the pr makes sense, at least for ontology filters, since they don't really point anywhere, although you could presumably do this:

{
  "filters": [
    {"id": "HP:0000819"},
    {"id": "phenotypicFeatures.excluded", "operator": "=", "value": true}
  ]
}

... but this has similar problems, which will get worse when we add more excluded fields to the model.

I'm not sure adding excluded makes sense for alphanumerics or custom filters. Custom filters in particular are supposed to be open, so presumably don't need fields prescribed for them. (You could write a custom filter autoimmune_diseases_excluded, which makes an extra excluded field confusing.)

From what I can tell, the PR only affects POST requests, whereas ontology filters are commonly used with GET. Is this a POST-only feature?

@mbaudis
Member Author

mbaudis commented Sep 23, 2025

@gsfk Thanks for the comments!

> I'm not sure adding excluded makes sense for alphanumerics or custom filters. Custom filters in particular are supposed to be open, so presumably don't need fields prescribed for them. (You could write a custom filter autoimmune_diseases_excluded, which makes an extra excluded field confusing.)

O.k.; I agree w/ the sentiment. I'll remove the property from custom too.

> From what I can tell, the PR only affects POST requests, whereas ontology filters are commonly used with GET. Is this a POST-only feature?

How would you do GET per filter? As mentioned in the call, and written ... previously, we need a definition of "what & how to stringify in filters". This IMO includes:

  • proper definition of alphanumerics, where the (filter) id should be treated like a CURIE prefix, i.e. separated by : - which had been documented in the online documentation but is not properly checked etc.
  • "excluded" as a ! filter prefix, which makes sense IMO but has to be agreed upon
  • etc.
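As a rough sketch of what such a stringified representation could look like: the snippet below assumes a comma-separated `filters` parameter and reads a leading `!` as `excluded: true`. The `!` prefix is only the proposal mentioned above and has not been agreed upon; the function names are hypothetical.

```python
def parse_filter_token(token):
    """Parse one stringified filter; a leading "!" is read as excluded=True (assumed syntax)."""
    token = token.strip()
    excluded = token.startswith("!")
    if excluded:
        token = token[1:]
    return {"id": token, "excluded": excluded}

def parse_filters(param):
    """Split a comma-separated GET parameter value into filter objects."""
    return [parse_filter_token(t) for t in param.split(",") if t.strip()]

print(parse_filters("HP:0000819,!HP:0002006"))
```

Here `HP:0000819` would be a regular ontology filter and `!HP:0002006` the same kind of filter with `excluded` set.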

I thought I'd made an issue for that but maybe it was only a talking point? Happy if anybody writes this up (e.g. stringified filter representation in GET requests).

... following @gsfk 's comments.
@mbaudis mbaudis changed the title from 'Adding "excluded" property to "OntologyFilter" and "CustomFilter"' to 'Adding "excluded" property to "OntologyFilter"' on Sep 25, 2025
@jrambla
Contributor

jrambla commented Oct 2, 2025

I want to double-check whether we have properly considered the actual scenario:
An "excluded" phenotypic feature is actually present in the patient's record, but it is flagged as not observed. So it is not a NULL but a "0 (zero)". Hence a simple search on such an ontology term will return a false positive (FP), as the term is in the record/document but marked as "not present" (excluded).
Therefore the filter resolution should take this into account, and this should be clearly stated in the documentation.
Is this part of this PR?

@mbaudis
Member Author

mbaudis commented Oct 2, 2025

> I want to double-check whether we have properly considered the actual scenario: An "excluded" phenotypic feature is actually present in the patient's record, but it is flagged as not observed. So it is not a NULL but a "0 (zero)". Hence a simple search on such an ontology term will return a false positive (FP), as the term is in the record/document but marked as "not present" (excluded). Therefore the filter resolution should take this into account, and this should be clearly stated in the documentation. Is this part of this PR?

myFilter, excluded:

  • myFilter has property, set to true -> match
  • myFilter has property, set to false or not set -> no match
  • myFilter has no value -> no match
  • myFilter does not have property -> false -> no match

So a match requires a positive assertion of an excluded property. So far only Phenotype, AFAIK. We can add a sentence to the description.
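The cases listed above can be written down as a small predicate. This is only a sketch of the rule for a query with `excluded: true`: the feature layout follows the `phenotypicFeatures` examples in this thread, "no value" is read here as "no entry for the term", and the function name is hypothetical.

```python
def matches_excluded_query(feature, term_id):
    """Match only if the term is present AND `excluded` is explicitly True."""
    if feature is None:                       # no value at all -> no match
        return False
    if feature.get("featureType", {}).get("id") != term_id:
        return False
    # A missing `excluded` property is treated like False -> no match.
    return feature.get("excluded", False) is True

term = "HP:0002006"
print(matches_excluded_query({"featureType": {"id": term}, "excluded": True}, term))   # True
print(matches_excluded_query({"featureType": {"id": term}, "excluded": False}, term))  # False
print(matches_excluded_query({"featureType": {"id": term}}, term))                     # False
print(matches_excluded_query(None, term))                                              # False
```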

refinement of description for `OntologyFilter.excluded`
@jrambla
Contributor

jrambla commented Nov 7, 2025

I meant that if you apply a filter on a term, without any mention of "excluded", and your phenotype contains the term BUT excluded, a typical filter request WILL return that record as having the phenotype, while it actually does not.
Example:

{
    "featureType": {  "id": "HP:0002006", "label": "Facial cleft"},
    "excluded":  true
}

A request filtering by "HP:0002006", e.g. in a MongoDB search, will return that record as a match, given that the "excluded" flag is not part of the request. Therefore, I believe we should include a clear warning in the documentation to ALWAYS (?) check for the presence of an "excluded: true" flag in the phenotype before returning the results.
I was asking for this warning to be included in this PR, to make it more comprehensive.
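The false positive can be reproduced with plain Python dictionaries. This is a sketch only: the two records and their ids (`ind-1`, `ind-2`) are invented, and the two search functions stand in for whatever an implementation actually does.

```python
# Hypothetical records: ind-1 has the term flagged as excluded, ind-2 has it observed.
records = [
    {"id": "ind-1", "phenotypicFeatures": [
        {"featureType": {"id": "HP:0002006", "label": "Facial cleft"}, "excluded": True}]},
    {"id": "ind-2", "phenotypicFeatures": [
        {"featureType": {"id": "HP:0002006", "label": "Facial cleft"}}]},
]

def naive_search(records, term_id):
    """Match on the term id alone, ignoring the `excluded` flag."""
    return [r["id"] for r in records
            if any(f["featureType"]["id"] == term_id for f in r["phenotypicFeatures"])]

def guarded_search(records, term_id):
    """Match on the term id, but skip features explicitly flagged as excluded."""
    return [r["id"] for r in records
            if any(f["featureType"]["id"] == term_id and f.get("excluded") is not True
                   for f in r["phenotypicFeatures"])]

print(naive_search(records, "HP:0002006"))    # ['ind-1', 'ind-2']
print(guarded_search(records, "HP:0002006"))  # ['ind-2']
```

The naive search reports `ind-1` as having the phenotype although the record states the opposite.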

@mbaudis
Member Author

mbaudis commented Nov 7, 2025

@jrambla Well, the definition's default: false - if implemented correctly - prevents faulty matches. The description says:

> queried property has an excluded property

However, it is difficult to avoid misuse/-interpretation by implementers, especially given varying schema interpretations in networked queries etc. ╮(╯▽╰)╭

I have changed the description - which is getting a bit lengthy, not a good sign - and also fixed a logical bug. WDYT, before I push it?

...
          The use of `true` selects for an explicit **negation** of the term, _i.e._
          only returns a match if the 
          
          * queried property has an `excluded` field 
          * the value of the `excluded` field in the is set to `true`

          A use of `excluded: false` avoids the match of positively negated
          values. Implementations MUST ensure that the `excluded` field is supported for
          the target property OR provide appropriate fallback handling of the request. For
          example, to implement the default `excluded: false` queries in MongoDB
          a request can use the `$ne` operator

          `db.test.find({"thisProperty.id": "VALUE", "thisProperty.excluded":{$ne:true}})`

          to include

          * documents with `thisProperty.excluded: false`
          * documents that do not have the `thisProperty.excluded` property

          ... as default.
...
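A minimal sketch of how a filter could be turned into such a MongoDB query document. The helper name and the `thisProperty` default are placeholders taken from the draft above, not an agreed implementation. In MongoDB, `{"$ne": True}` matches documents where the field is `false` as well as documents where the field is missing, which is why it can serve as the default. If the target property is an array of sub-documents, an `$elemMatch` would probably be needed so that both conditions apply to the same element.

```python
def build_query(term_id, excluded=False, prop="thisProperty"):
    """Build a MongoDB-style query dict for an ontology filter (illustrative only)."""
    query = {f"{prop}.id": term_id}
    if excluded:
        # Explicit negation: only documents that assert excluded == true.
        query[f"{prop}.excluded"] = True
    else:
        # Default: everything except documents that assert excluded == true.
        query[f"{prop}.excluded"] = {"$ne": True}
    return query

print(build_query("VALUE"))
print(build_query("VALUE", excluded=True))
```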

Contributor


  1. There is a sentence in the description that I don't understand: "field in the is set to true"
  2. I don't understand, or agree with, "excluded: false is equivalent to omitting the filter"
  3. I also don't agree that AlphanumericFilter can implement the functionality via operators (!)

@jrambla
Copy link
Contributor

jrambla commented Nov 24, 2025

One aspect to consider: in the last DevScout meeting, we agreed that we could let GET & POST diverge.
We'll keep GET just for simple stuff (e.g. no complex logic, no profiles, no excluded-like flags) and use POST for everything.
This could simplify some discussions, like filter serialization, although that discussion doesn't belong in this PR.

@jrambla
Contributor

jrambla commented Nov 24, 2025

I disagree that alphanumeric or custom filters don't require "excluded".
This flag is just a convenience so that a dictionary (whether an ontology or not) does not have to duplicate ALL its terms, once with and once without the "excluded" concept.
For alphanumeric filters it makes sense, as the "excluded" flag is only present in phenotypicFeatures, which are only used with the "=" operator, making it equivalent to the OntologyTerm filter.
After all, writing:

filters: [ {"id":"HP:0012469", "scope": "individuals"}]  <<<<<<  "Infantile spasms"

is much the same as writing:

filters: [
        {
            "id": "phenotypicFeature.id",
            "operator": "=",
            "value": "HP:0012469",
            "scope": "individuals"
        }
]

The same applies to the custom filter, whose purpose was to fill the gap for non-existing ontologies or terms.

filters: [
        {
            "id": "phenotypicFeature.id:myNewTermNotYetInAnyOntology",
            "scope": "individuals"
        }
]

All of them are applicable to phenotypicFeatures, hence under the "excluded" area of influence.
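To make the claimed equivalence concrete, here is a sketch that reduces the three filter shapes above to the term they address. The detection rules are assumptions made for illustration only: an `operator` key marks an alphanumeric filter, and an id whose prefix contains a `.` is read as a custom `<field path>:<term>` filter. Neither rule is defined by the specification, and the function name is hypothetical.

```python
def target_term(f):
    """Return the term a filter addresses, under the assumptions stated above."""
    if "operator" in f:                       # alphanumeric filter
        # Only "=" is comparable to an ontology-term lookup.
        return f["value"] if f["operator"] == "=" else None
    prefix, _, rest = f["id"].partition(":")
    if "." in prefix:                         # custom filter: "<field path>:<term>"
        return rest
    return f["id"]                            # ontology filter: the id is the CURIE itself

ontology = {"id": "HP:0012469", "scope": "individuals"}
alphanumeric = {"id": "phenotypicFeature.id", "operator": "=",
                "value": "HP:0012469", "scope": "individuals"}
custom = {"id": "phenotypicFeature.id:myNewTermNotYetInAnyOntology", "scope": "individuals"}

for f in (ontology, alphanumeric, custom):
    # One `excluded` flag could then be interpreted uniformly for all three shapes.
    print(target_term(f), f.get("excluded", False))
```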

@Deepthi-v-s

Hi all,

We had a discussion within our group (including Tony) regarding the "excluded" flag. We also agree with Jordi’s point that the "excluded" flag is mainly required for ontology-based filters, and that it is not necessary for alphanumeric or numeric filters. We are okay with this approach and support using the "excluded" flag for ontology terms.

Thanks & regards,
Deepthi


5 participants