feat: improve linkml error messages for regex validation #599

Open
daniel-ji wants to merge 2 commits into main from daniel-ji/improve-validation-errors

@daniel-ji daniel-ji commented Feb 9, 2026

Currently, when there is a pydantic regex validation error (for example, a pattern is not matched for an ontology id), the error message does not include the pattern / ontology that the value should match. This PR improves the error message so that users (and LLM agents) know how to fix their value.

For example, previously, an error message would have looked like this:

FAIL: "/home/daniel/cryoet-data-portal-backend/ingestion_tools/dataset_configs/10444.yaml":
        - Value error, Value error, Invalid id format: CVC_0045: datasets.0.metadata.cell_strain.id

But now, it looks like this:

FAIL: "/home/daniel/cryoet-data-portal-backend/ingestion_tools/dataset_configs/10444.yaml":
        - Value error, Value error, Invalid id format: CVC_0045. Valid formats include: ['WBStrain[0-9]{8}$', '^NCBITaxon:[0-9]+$', '^CVCL_[A-Z0-9]{4,}$', '^CC-[0-9]{4}$']. Field description: The ontology identifier for the cell strain.: datasets.0.metadata.cell_strain.id

This PR also includes a small fix to the template.yaml file.

@daniel-ji daniel-ji requested a review from uermel February 9, 2026 17:37
return False


def get_patterns(field_linkml_meta: dict) -> list[str]:

How the patterns for a given field are actually obtained: pass in the linkml metadata for that field, then search the global linkml metadata, which contains all types (including the regex patterns for the custom types we have built for the ontologies).
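A minimal sketch of that lookup, not the code in this PR. The registry layout, the `range` / `any_of` keys, the type names, and the function name `get_patterns_sketch` are all assumptions made for illustration; only the regex strings are taken from the error message in the description.

```python
from typing import Any

# Hypothetical stand-in for the global linkml type metadata.
# The type names and dict layout are assumptions; the patterns are the
# ones shown in the improved error message above.
TYPE_REGISTRY: dict[str, dict[str, Any]] = {
    "WORMBASE_STRAIN_ID": {"pattern": "WBStrain[0-9]{8}$"},
    "NCBI_TAXON_ID": {"pattern": "^NCBITaxon:[0-9]+$"},
    "CELLOSAURUS_ID": {"pattern": "^CVCL_[A-Z0-9]{4,}$"},
    "PLAIN_STRING": {},  # a type without a regex pattern
}


def get_patterns_sketch(
    field_meta: dict[str, Any],
    type_registry: dict[str, dict[str, Any]] = TYPE_REGISTRY,
) -> list[str]:
    """Collect regex patterns for every type a field may take."""
    # A field is assumed to name its type either directly ("range")
    # or as a list of alternatives ("any_of").
    type_names: list[str] = []
    if "range" in field_meta:
        type_names.append(field_meta["range"])
    for alternative in field_meta.get("any_of", []):
        if "range" in alternative:
            type_names.append(alternative["range"])

    # Look each type up in the registry and keep its pattern, if any,
    # preserving order and skipping duplicates.
    patterns: list[str] = []
    for name in type_names:
        pattern = type_registry.get(name, {}).get("pattern")
        if pattern and pattern not in patterns:
            patterns.append(pattern)
    return patterns
```

Under these assumptions, a field whose metadata lists two ontology types yields both patterns, and a field whose type has no pattern yields an empty list.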

return patterns


def add_regex_error_augmenter(

Returns a decorator that wraps the class so that when regex validation errors happen, their error messages are caught, augmented with more details, and then raised again.
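A rough sketch of the catch, augment, and re-raise idea, using a plain Python class instead of a pydantic model so that it runs with the standard library alone. The names `augment_regex_errors_sketch` and `CellStrain`, the decorator's parameters, and the choice to wrap `__init__` are assumptions for illustration; the actual `add_regex_error_augmenter` signature is not shown in this PR excerpt.

```python
import functools
import re

# Patterns and description copied from the example error message above.
CELL_STRAIN_PATTERNS = [
    "WBStrain[0-9]{8}$",
    "^NCBITaxon:[0-9]+$",
    "^CVCL_[A-Z0-9]{4,}$",
    "^CC-[0-9]{4}$",
]
CELL_STRAIN_DESCRIPTION = "The ontology identifier for the cell strain."


def augment_regex_errors_sketch(patterns: list[str], description: str):
    """Return a class decorator that adds detail to id-format errors."""

    def decorator(cls):
        original_init = cls.__init__

        @functools.wraps(original_init)
        def wrapped_init(self, *args, **kwargs):
            try:
                original_init(self, *args, **kwargs)
            except ValueError as exc:
                # Only augment the regex-style errors; pass others through.
                if "Invalid id format" not in str(exc):
                    raise
                raise ValueError(
                    f"{exc}. Valid formats include: {patterns}. "
                    f"Field description: {description}"
                ) from exc

        cls.__init__ = wrapped_init
        return cls

    return decorator


@augment_regex_errors_sketch(CELL_STRAIN_PATTERNS, CELL_STRAIN_DESCRIPTION)
class CellStrain:
    """Hypothetical stand-in for a generated model class."""

    def __init__(self, id: str):
        if not any(re.search(pattern, id) for pattern in CELL_STRAIN_PATTERNS):
            raise ValueError(f"Invalid id format: {id}")
        self.id = id
```

With this sketch, `CellStrain("CVC_0045")` raises a `ValueError` whose message lists the accepted patterns and the field description, while an id that matches one of the patterns constructs normally.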
