Conversation
📝 WalkthroughWalkthroughAdded Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs). Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (4)
ms-imaging/1.0.0-dev/ms-imaging.yaml (4)
382-396: Redundantnot availablein values list.
allow_not_available: truealready permits this value. Includingnot availablein thevalueslist (line 390) is redundant and may cause confusion about how special values are validated. The same applies tocomment[line scan direction](line 413).Consider removing
not availablefrom the values lists to keep the schema consistent with other columns.Proposed fix
values: - top down - bottom up - left right - right left - - not available error_level: warning🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 382 - 396, The values validator for the field listing sequence directions currently includes the literal 'not available' while allow_not_available: true already permits that special value; remove 'not available' from the values array under the validators block (the values list containing 'top down', 'bottom up', 'left right', 'right left', 'not available') and likewise remove 'not available' from the analogous values list for comment[line scan direction], leaving allow_not_available: true in place so the special value is still allowed but not redundantly enumerated.
503-518: Pattern validator is effectively a non-empty check.The pattern
^.+$only validates that the value is non-empty, which is already implied by the column being defined. This is acceptable for free-text identifiers, but if more structure is expected (e.g., alphanumeric with underscores), consider a more specific pattern.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 503 - 518, The pattern validator on the comment[tissue section] field uses '^.+$' which only enforces non-empty input; update the validator for the comment[tissue section] entry to either remove the redundant pattern validator or replace it with a more specific regex that matches the expected identifier format (for example an alphanumeric/underscore pattern like '^[A-Za-z0-9_\\-]+$' or whichever structure is appropriate); modify the validators block for the comment[tissue section] field (validator_name: pattern and its params) so it either omits the pattern or supplies the tighter regex and update the params.description and examples to reflect the chosen constraint.
549-557: Pattern doesn't support ISO 8601 timezone offsets.The description mentions ISO 8601 format, but the pattern doesn't allow timezone designators (e.g.,
2025-07-01T09:30:00Zor2025-07-01T09:30:00+00:00). If timezone-aware timestamps should be accepted, consider extending the pattern.Extended pattern supporting timezone (if needed)
- pattern: '^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2})?)?$|^not available$' + pattern: '^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2})?(Z|[+-]\d{2}:\d{2})?)?$|^not available$'🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 549 - 557, The current pattern used in the validator (validator_name: pattern, params.pattern) rejects ISO 8601 timezone designators; update the regex in params.pattern to accept an optional timezone component (either 'Z' or an offset like +HH:MM / -HH:MM / +HH) after the time portion, and adjust the params.description/examples to reflect timezone-aware examples (e.g., 2025-07-01T09:30:00Z and 2025-07-01T09:30:00+02:00); locate and modify the pattern string in the validator params to add the optional timezone group and update examples accordingly.
96-129: Consider clarifying the distinction between synonymous values.The values list includes both full chemical names and their abbreviations (e.g.,
2,5-dihydroxybenzoic acidandDHB). While user-friendly for input, this may lead to inconsistent data across datasets.Also, the distinction between
noneandnot applicableis unclear —nonemight imply a MALDI experiment without matrix, whilenot applicableis for non-MALDI methods. Consider adding guidance in the description.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 96 - 129, Update the characteristics[matrix] description and validators to clarify synonymous entries and the difference between "none" and "not applicable": in the description for characteristics[matrix] (and/or its validators block) state that abbreviations (e.g., "DHB", "CHCA", "SA", "DHAP", "THAP", "DAN", "9-AA") are accepted as synonyms of their full chemical names and will be normalized to the canonical full name on ingest, and explicitly define that "not applicable" is for non-MALDI methods while "none" indicates a MALDI experiment where no matrix was applied; additionally consider adding a validator note or params comment in the validators -> values section to recommend normalization of synonyms and to warn when "none" is used for MALDI entries.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@ms-imaging/1.0.0-dev/ms-imaging.yaml`:
- Around line 576-580: The ontology reference is inconsistent: the field name
"comment[washing method]" currently has ontology_accession set to IMS:1002402
while the description references "IMS:1002400 hierarchy"; fix by making them
match—either change ontology_accession to IMS:1002400 to match the description,
or update the description text to reference IMS:1002402 (and adjust the
parent/child phrasing accordingly) so the accession and description refer to the
same IMS term.
- Around line 38-40: The manifest generator currently looks for the wrong key
name "exclusive_with" (in scripts/generate_manifest.py where the generator
checks for exclusive_with at the two spots around the handlers at lines
referenced in the review) so mutual-exclusion entries from templates are
dropped; update all checks and references in generate_manifest.py to use the
schema field name "mutually_exclusive_with" instead of "exclusive_with"
(including any parsing, merging, or validation branches that read or propagate
that key) so the generator copies template constraints into the produced
manifest; ensure both occurrences referenced in the review are changed and any
unit/test that asserts exclusive behavior is updated to use the new key name.
- Around line 608-659: Update the ontology accession for the field named
comment[resolving power]: change the ontology_accession value from MS:1000011 to
MS:1000800 in the comment[resolving power] block so it references the correct
PSI-MS term (mass resolving power); leave the comment[mass resolution] entry
(which uses MS:1000011) unchanged.
- Around line 417-418: Replace the invalid IMS:1000232 term used in two places:
for comment[matrix application method] change the ontology accession from
IMS:1000232 to MS:1000833 (PSI-MS "matrix application type"); for
characteristics[tissue modification] remove the IMS ontology reference and
reassign this field to a specimen/sample-prep ontology (e.g., OBI) by replacing
IMS:1000232 with the correct OBI accession for "tissue modification" (or an
appropriate sample-prep ontology term) so the term is sourced from a
specimen/sample-prep ontology rather than IMS.
---
Nitpick comments:
In `@ms-imaging/1.0.0-dev/ms-imaging.yaml`:
- Around line 382-396: The values validator for the field listing sequence
directions currently includes the literal 'not available' while
allow_not_available: true already permits that special value; remove 'not
available' from the values array under the validators block (the values list
containing 'top down', 'bottom up', 'left right', 'right left', 'not available')
and likewise remove 'not available' from the analogous values list for
comment[line scan direction], leaving allow_not_available: true in place so the
special value is still allowed but not redundantly enumerated.
- Around line 503-518: The pattern validator on the comment[tissue section]
field uses '^.+$' which only enforces non-empty input; update the validator for
the comment[tissue section] entry to either remove the redundant pattern
validator or replace it with a more specific regex that matches the expected
identifier format (for example an alphanumeric/underscore pattern like
'^[A-Za-z0-9_\\-]+$' or whichever structure is appropriate); modify the
validators block for the comment[tissue section] field (validator_name: pattern
and its params) so it either omits the pattern or supplies the tighter regex and
update the params.description and examples to reflect the chosen constraint.
- Around line 549-557: The current pattern used in the validator
(validator_name: pattern, params.pattern) rejects ISO 8601 timezone designators;
update the regex in params.pattern to accept an optional timezone component
(either 'Z' or an offset like +HH:MM / -HH:MM / +HH) after the time portion, and
adjust the params.description/examples to reflect timezone-aware examples (e.g.,
2025-07-01T09:30:00Z and 2025-07-01T09:30:00+02:00); locate and modify the
pattern string in the validator params to add the optional timezone group and
update examples accordingly.
- Around line 96-129: Update the characteristics[matrix] description and
validators to clarify synonymous entries and the difference between "none" and
"not applicable": in the description for characteristics[matrix] (and/or its
validators block) state that abbreviations (e.g., "DHB", "CHCA", "SA", "DHAP",
"THAP", "DAN", "9-AA") are accepted as synonyms of their full chemical names and
will be normalized to the canonical full name on ingest, and explicitly define
that "not applicable" is for non-MALDI methods while "none" indicates a MALDI
experiment where no matrix was applied; additionally consider adding a validator
note or params comment in the validators -> values section to recommend
normalization of synonyms and to warn when "none" is used for MALDI entries.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: fc1729c6-a1d8-4c25-9368-a8cdad7450fb
📒 Files selected for processing (2)
.gitignorems-imaging/1.0.0-dev/ms-imaging.yaml
| mutually_exclusive_with: | ||
| - ms-proteomics | ||
| - affinity-proteomics |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, find and check the generate_manifest.py file
find . -name "generate_manifest.py" -type f 2>/dev/null | head -5Repository: bigbio/sdrf-templates
Length of output: 95
🏁 Script executed:
#!/bin/bash
# Search for field handling in the codebase
rg "exclusive_with|mutually_exclusive_with" --type-list | grep -E "\.py|\.yaml"Repository: bigbio/sdrf-templates
Length of output: 454
🏁 Script executed:
#!/bin/bash
# Look at the manifest generator implementation
fd "generate_manifest.py" -type f -exec cat {} \;Repository: bigbio/sdrf-templates
Length of output: 235
🏁 Script executed:
#!/bin/bash
# Search for exclusive_with and mutually_exclusive_with in all files
rg "exclusive_with|mutually_exclusive_with" --no-headingRepository: bigbio/sdrf-templates
Length of output: 1191
🏁 Script executed:
#!/bin/bash
# Read the manifest generator file
wc -l scripts/generate_manifest.pyRepository: bigbio/sdrf-templates
Length of output: 97
🏁 Script executed:
#!/bin/bash
# Read the generate_manifest.py file to see context around exclusive_with handling
cat -n scripts/generate_manifest.py | head -176Repository: bigbio/sdrf-templates
Length of output: 6987
🏁 Script executed:
#!/bin/bash
# Look at the schema definition for this field
rg -A 5 "mutually_exclusive_with" sdrf-template.schema.json | head -30Repository: bigbio/sdrf-templates
Length of output: 293
🏁 Script executed:
#!/bin/bash
# Check if exclusive_with is mentioned in the schema
rg "exclusive_with" sdrf-template.schema.jsonRepository: bigbio/sdrf-templates
Length of output: 97
Manifest generator must use mutually_exclusive_with to match schema.
The manifest generator checks for exclusive_with (lines 64-65, 119-120 of scripts/generate_manifest.py), but the schema defines this field as mutually_exclusive_with. Since templates correctly use the schema-defined field name, mutual exclusivity constraints are silently dropped from the generated manifest.
Update the manifest generator to check for mutually_exclusive_with instead of exclusive_with to align with the schema and ensure template constraints are properly propagated.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 38 - 40, The manifest
generator currently looks for the wrong key name "exclusive_with" (in
scripts/generate_manifest.py where the generator checks for exclusive_with at
the two spots around the handlers at lines referenced in the review) so
mutual-exclusion entries from templates are dropped; update all checks and
references in generate_manifest.py to use the schema field name
"mutually_exclusive_with" instead of "exclusive_with" (including any parsing,
merging, or validation branches that read or propagate that key) so the
generator copies template constraints into the produced manifest; ensure both
occurrences referenced in the review are changed and any unit/test that asserts
exclusive behavior is updated to use the new key name.
| - name: comment[matrix application method] | ||
| ontology_accession: IMS:1000232 |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
IMS:1000232 ontology definition imaging mass spectrometry
💡 Result:
-
IMS:1000232is not a term in the imzML “Imaging MS Ontology” (imagingMS.obo, ontology idims, data-version 1.1.0). A search within the canonicalimagingMS.oboin theimzML/imzMLGitHub repo finds noIMS:1000232entry. [1]- If you meant
MS:1000232(PSI-MS CV), that is a different ontology/prefix (PSI-MS usesMS:accessions, notIMS:).
- If you meant
-
Definition of “imaging mass spectrometry” (IMS / MSI) (conceptual, not the missing
IMS:1000232term):- “Mass spectrometry imaging is the method of scanning a sample of interest and generating an ‘image’ of the intensity distribution of a specific analyte/ion.” [2]
- A peer‑reviewed description similarly characterizes IMS as using mass spectra acquired across spatial positions to visualize spatial molecular distributions. [3]
Sources
[1] imzML GitHub — imagingMS.obo (Imaging Mass Spectrometry Ontology) https://github.com/imzML/imzML/blob/master/imagingMS.obo
[2] MS‑Imaging.org — Background (imzML) https://www.ms-imaging.org/imzml/background/
[3] “imzML: Imaging Mass Spectrometry Markup Language…” (PubMed) https://pubmed.ncbi.nlm.nih.gov/21063949/
🏁 Script executed:
# Check the exact lines in the file to confirm IMS:1000232 usage
cat -n ms-imaging/1.0.0-dev/ms-imaging.yaml | sed -n '354p;418p'Repository: bigbio/sdrf-templates
Length of output: 150
🏁 Script executed:
# Search for all IMS:1000232 references in the file
rg 'IMS:1000232' ms-imaging/1.0.0-dev/ms-imaging.yaml -nRepository: bigbio/sdrf-templates
Length of output: 144
🌐 Web query:
MS:1000232 PSI-MS ontology definition
💡 Result:
PSI-MS / Mass Spectrometry Ontology term MS:1000232
- Preferred label/name: peak intensity
- IRI:
http://purl.obolibrary.org/obo/MS_1000232 - Status: obsolete
- Obsolete reason / replacement: “This term was made obsolete because it was replaced by base peak intensity (
MS:1000505).” [1]
Source: Mass Spectrometry Ontology (PSI-MS) generated documentation entry for MS_1000232. [1]
🌐 Web query:
IMS ontology "tissue modification" imaging mass spectrometry
💡 Result:
In imaging mass spectrometry, “IMS ontology” usually refers to the imzML controlled vocabulary imagingMS.obo (also called the Imaging MS Ontology / Mass Spectrometry Imaging Ontology), which defines instrument/imaging-run metadata terms (e.g., pixel size, stage position, scan pattern, etc.) used inside imzML files—not biological concepts like tissue processing. The imzML project describes these imaging-specific CV parameters as being stored in imagingMS.obo. [1]
If you’re looking for a term like “tissue modification” (e.g., fixation, embedding, derivatization, washing, antigen retrieval, etc.), that kind of sample-prep / specimen metadata is generally outside the scope of the imzML imagingMS.obo vocabulary; tooling around imzML commonly notes that sample metadata would require additional ontologies outside the package’s scope. [2]
Practical next steps:
- Use the CardinalIO ontology helpers to confirm what exists in the imzML “ims” CV (e.g.,
find_terms("tissue", "ims")/find_terms("modif", "ims")), and you’ll likely find nothing relevant for “tissue modification,” because the CV is imaging-run–focused. [2] - Model “tissue modification” with a separate ontology (often OBI and/or other specimen/sample-prep vocabularies) and keep it alongside imzML metadata rather than inside imagingMS.obo. [2]
Sources
[1] MS-Imaging.org imzML background (imzML imaging-specific controlled vocabulary stored in imagingMS.obo) (ms-imaging.org)
[2] CardinalIO manual (imzML-related ontologies provided; sample metadata noted as outside scope / would require other ontologies) (bioc.r-universe.dev)
Citations:
- 1: https://www.ms-imaging.org/imzml/background/
- 2: https://bioc.r-universe.dev/CardinalIO/doc/manual.html
🌐 Web query:
IMS ontology "matrix application method" imaging mass spectrometry
💡 Result:
In the HUPO-PSI Mass Spectrometry (PSI-MS) controlled vocabulary/ontology (often used to annotate MALDI/IMS metadata), what you’re looking for as “matrix application method” is captured as:
matrix application type— MS:1000833 (IRI:http://purl.obolibrary.org/obo/MS_1000833)- Relationship: part_of some
MALDI matrix application - Child terms (i.e., the actual application “methods”) include:
sprayed MALDI matrix preparation(MS:1000838)printed MALDI matrix preparation(MS:1000837)spin coating MALDI sample preparation(MS:1001992)dried droplet MALDI matrix preparationprecoated MALDI sample platesandwich MALDI sample preparation,overlayer MALDI sample preparation,vacuum drying MALDI sample preparation, etc. (akswnc7.informatik.uni-leipzig.de)
- Relationship: part_of some
Why it matters in imaging mass spectrometry (IMS / MALDI-MSI): matrix application is widely noted as a critical sample-prep step, and the method of matrix application influences achievable spatial resolution and data quality. (pmc.ncbi.nlm.nih.gov)
Citations:
- 1: https://akswnc7.informatik.uni-leipzig.de/dstreitmatter/archivo/purl.obolibrary.org/obo--ms--owl/2022.03.24-101928/obo--ms--owl_type%3DgeneratedDocu.html
- 2: https://pmc.ncbi.nlm.nih.gov/articles/PMC3864574/?utm_source=openai
IMS:1000232 is not a valid term in the IMS ontology and must be corrected at both locations.
- Line 354 (
characteristics[tissue modification]): "Tissue modification" is outside the scope of the IMS ontology, which covers only imaging-run metadata. Use a specimen/sample-prep ontology (e.g., OBI) instead. - Line 418 (
comment[matrix application method]): ReplaceIMS:1000232withMS:1000833(PSI-MS: "matrix application type"), which is the correct term for MALDI matrix application methods in imaging mass spectrometry.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 417 - 418, Replace the
invalid IMS:1000232 term used in two places: for comment[matrix application
method] change the ontology accession from IMS:1000232 to MS:1000833 (PSI-MS
"matrix application type"); for characteristics[tissue modification] remove the
IMS ontology reference and reassign this field to a specimen/sample-prep
ontology (e.g., OBI) by replacing IMS:1000232 with the correct OBI accession for
"tissue modification" (or an appropriate sample-prep ontology term) so the term
is sourced from a specimen/sample-prep ontology rather than IMS.
| - name: comment[washing method] | ||
| ontology_accession: IMS:1002402 | ||
| description: >- | ||
| Wash protocol applied to the tissue section prior to MSI acquisition. | ||
| IMS:1002400 hierarchy (e.g. Carnoy's wash, chloroform wash, ethanol wash). |
There was a problem hiding this comment.
Inconsistent ontology reference.
The ontology_accession is IMS:1002402 but the description references "IMS:1002400 hierarchy". Align these by either updating the accession to the parent term or correcting the description.
Proposed fix (if IMS:1002402 is the correct accession)
description: >-
Wash protocol applied to the tissue section prior to MSI acquisition.
- IMS:1002400 hierarchy (e.g. Carnoy's wash, chloroform wash, ethanol wash).
+ IMS:1002402 (e.g. Carnoy's wash, chloroform wash, ethanol wash).
Affects which lipid/metabolite classes are detected. Use 'not applicable'
if no washing was performed.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 576 - 580, The ontology
reference is inconsistent: the field name "comment[washing method]" currently
has ontology_accession set to IMS:1002402 while the description references
"IMS:1002400 hierarchy"; fix by making them match—either change
ontology_accession to IMS:1002400 to match the description, or update the
description text to reference IMS:1002402 (and adjust the parent/child phrasing
accordingly) so the accession and description refer to the same IMS term.
| - name: comment[resolving power] | ||
| ontology_accession: MS:1000011 | ||
| description: Detector resolving power at a reference m/z. Format '{value} at {mz} m/z'. Use 'not available' if unrecorded. Corresponds to METASPACE Detector_Resolving_Power field. | ||
| requirement: optional | ||
| allow_not_applicable: false | ||
| allow_not_available: true | ||
| validators: | ||
| - validator_name: pattern | ||
| params: | ||
| pattern: '^\d+(\.\d+)? at \d+(\.\d+)? m/z$|^not available$' | ||
| description: Resolving power at reference m/z or 'not available' | ||
| examples: | ||
| - 140000 at 200 m/z | ||
| - 50000 at 400 m/z | ||
| - 30000 at 1000 m/z | ||
| - not available | ||
|
|
||
| - name: comment[normalization method] | ||
| ontology_accession: IMS:1002600 | ||
| description: Spectral intensity normalization method applied during data processing. Use 'not applicable' if no normalization was applied. | ||
| requirement: optional | ||
| allow_not_applicable: true | ||
| allow_not_available: true | ||
| validators: | ||
| - validator_name: values | ||
| params: | ||
| values: | ||
| - TIC | ||
| - RMS | ||
| - median | ||
| - none | ||
| - not applicable | ||
| - not available | ||
| error_level: warning | ||
| description: Spectral normalization method | ||
|
|
||
| - name: comment[mass resolution] | ||
| ontology_accession: MS:1000011 | ||
| description: Mass resolution of the instrument at the reported m/z. MS:1000011. Format '{value} ppm', '{value} Da', or '{value} mDa'. Use 'not available' if unrecorded. | ||
| requirement: optional | ||
| allow_not_applicable: false | ||
| allow_not_available: true | ||
| validators: | ||
| - validator_name: number_with_unit | ||
| params: | ||
| units: [ppm, Da, mDa] | ||
| error_level: warning | ||
| description: Mass resolution with unit | ||
| examples: | ||
| - 5 ppm | ||
| - 0.5 Da | ||
| - 50 mDa |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
PSI-MS MS:1000011 mass resolution ontology definition
💡 Result:
PSI-MS term MS:1000011 (“mass resolution”) — definition: “Smallest mass difference between two equal magnitude peaks so that the valley between them is a specified fraction of the peak height.” [1]
🌐 Web query:
PSI-MS ontology resolving power term accession
💡 Result:
In the PSI-MS (MS) ontology, the appropriate term for “resolving power” is mass resolving power, accession MS:1000800. [1], [2]
Note: an older/duplicate “resolving power” wording is treated as obsolete/redundant and points to MS:1000800. [2]
Sources:
[1] imzML specification showing example MS:1000800 (mass resolving power)
[2] PSI-MS ontology listing indicating replacement by mass resolving power (MS:1000800)
Change comment[resolving power] ontology_accession from MS:1000011 to MS:1000800.
The PSI-MS ontology defines MS:1000011 as "mass resolution" (peak separation ability), while MS:1000800 is the correct accession for "mass resolving power" (dimensionless ratio). The comment[resolving power] field at line 609 should use MS:1000800, not MS:1000011.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 608 - 659, Update the
ontology accession for the field named comment[resolving power]: change the
ontology_accession value from MS:1000011 to MS:1000800 in the comment[resolving
power] block so it references the correct PSI-MS term (mass resolving power);
leave the comment[mass resolution] entry (which uses MS:1000011) unchanged.
There was a problem hiding this comment.
Pull request overview
Adds a new technology-layer SDRF template for mass spectrometry imaging (MSI) experiments (ms-imaging), intended to extend sample-metadata with MSI acquisition and sample-prep columns and validation rules.
Changes:
- Introduces
ms-imagingtemplate v1.0.0-devwith MSI-specific columns/validators for MALDI/DESI/SIMS. - Updates
.gitignoreto ignore.DS_Store.
Reviewed changes
Copilot reviewed 1 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
ms-imaging/1.0.0-dev/ms-imaging.yaml |
New MSI technology template extending sample-metadata with spatial acquisition + instrument/prep metadata and validation. |
.gitignore |
Ignores macOS .DS_Store files. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| contributors: | ||
| - Lisa Boatner (lboatner@chanzuckerberg.com) - Chan Zuckerberg Initiative | ||
| version: 1.0.0-dev | ||
| extends: sample-metadata |
There was a problem hiding this comment.
extends is inconsistent with the rest of the repo: other templates pin/constraint the parent (e.g., extends: sample-metadata@>=1.0.0). Using plain sample-metadata may make inheritance ambiguous when sample-metadata versions change; consider switching to sample-metadata@>=1.0.0 for consistency.
| extends: sample-metadata | |
| extends: sample-metadata@>=1.0.0 |
| - validator_name: combination_of_columns_no_duplicate_validator | ||
| params: | ||
| column_name: | ||
| - source name | ||
| - assay name | ||
| column_name_warning: | ||
| - source name | ||
| - assay name |
There was a problem hiding this comment.
This template defines combination_of_columns_no_duplicate_validator for source name + assay name, but that same uniqueness check already exists in the inherited base template. Consider removing this validator here (or extending it with additional columns if you intended a stricter uniqueness constraint) to avoid redundant validation/reporting.
| - validator_name: combination_of_columns_no_duplicate_validator | |
| params: | |
| column_name: | |
| - source name | |
| - assay name | |
| column_name_warning: | |
| - source name | |
| - assay name |
| - assay name | ||
| columns: | ||
| - name: technology type | ||
| ontology_accession: EFO:0010003 |
There was a problem hiding this comment.
technology type uses ontology_accession: EFO:0010003, but the same column in base, ms-proteomics, and affinity-proteomics uses PRIDE:0000663. If tooling relies on this accession for consistent column semantics, consider keeping PRIDE:0000663 here as well (you can still override allowed values via the values validator).
| ontology_accession: EFO:0010003 | |
| ontology_accession: PRIDE:0000663 |
| description: Line scan direction (IMS:1000480 horizontal or IMS:1000481 vertical) | ||
|
|
||
| - name: comment[matrix application method] | ||
| ontology_accession: IMS:1000232 |
There was a problem hiding this comment.
comment[matrix application method] is assigned ontology_accession: IMS:1000232, which is also used for characteristics[tissue modification]. Given the different concepts, this looks like a copy/paste error; consider updating to the correct IMS accession for matrix application method (or omitting ontology_accession if it’s not well-defined).
| ontology_accession: IMS:1000232 |
| - name: comment[washing method] | ||
| ontology_accession: IMS:1002402 | ||
| description: >- | ||
| Wash protocol applied to the tissue section prior to MSI acquisition. | ||
| IMS:1002400 hierarchy (e.g. Carnoy's wash, chloroform wash, ethanol wash). | ||
| Affects which lipid/metabolite classes are detected. Use 'not applicable' | ||
| if no washing was performed. | ||
| requirement: optional | ||
| allow_not_applicable: true | ||
| allow_not_available: true | ||
| validators: | ||
| - validator_name: values | ||
| params: | ||
| values: | ||
| - Carnoy's wash | ||
| - chloroform wash | ||
| - ethanol wash | ||
| - graded ethanol series | ||
| - ammonium formiate wash | ||
| - none | ||
| - not applicable | ||
| - not available | ||
| error_level: warning | ||
| description: Tissue wash protocol (IMS:1002400 hierarchy) | ||
|
|
There was a problem hiding this comment.
In comment[washing method], the text references the IMS:1002400 hierarchy, but the column’s ontology_accession is IMS:1002402. Also, the guidance says to use not applicable when no washing was performed, but the allowed values include none—these should be made consistent to avoid confusing annotators.
| - chloroform wash | ||
| - ethanol wash | ||
| - graded ethanol series | ||
| - ammonium formiate wash |
There was a problem hiding this comment.
Typo in allowed value: ammonium formiate wash should be ammonium formate wash.
| - ammonium formiate wash | |
| - ammonium formate wash |
|
@lmboat I love the idea; a couple of things before doing the full review of the template:
This is absolutly fantastic and thanks for this PR. |
|
@lmboat thank you so much for the template in MS- image. But looking at the template you mentioned in bigbio/proteomics-sample-metadata#450 matrix is tagged as comment[matrix] but for the pull request you have proposed as characteristics[matrix]. Just want to make sure it's not an error. |
@nithujohn good catch! I missed the fields in the unmerged example. comment[matrix] should be a comment rather than characteristics since the matrix is a sample preparation/analytical parameter rather than an intrinsic biological property of the sample. I will update the PR to reflect this, thank you! |
Summary
Related to bigbio/proteomics-sample-metadata#450 and bigbio/proteomics-sample-metadata#451.
This PR adds
ms-imaging, a new technology-layer SDRF template for mass spectrometry imaging (MSI) experiments, building on the community discussion in those threads.This is a first attempt intended to start the discussion.
Template design
technology;usable_alone: true, combines with sample templates (human,vertebrates, etc.)sample-metadata(notms-proteomics) to avoid inheriting solution-phase columnsms-proteomics,affinity-proteomicsimaging mass spectrometry)Key columns
technology typeimaging mass spectrometry/MALDI imaging mass spectrometry/DESI imaging mass spectrometry/SIMS imaging mass spectrometry. NTR needed for EFO/MS ontology support.comment[pixel size x/y]comment[imaging ion source]ontology: ms, parent MS:1000008.comment[analyzer]ontology: ms, parent MS:1000443.comment[scan pattern]not applicablefor non-raster instruments (FT-ICR, SIMS).comment[scan type]continuous(IMS:1000030) /processed(IMS:1000031).not applicablefor non-imzML data.comment[mz range start/end]not availablefor continuous imzML (m/z axis in binary ibd).characteristics[matrix]not applicablefor DESI/SIMS.characteristics[sample stabilization]comment[linescan sequence]comment[section thickness]not applicablefor non-tissue substrates.comment[laser shots per spectrum]not applicablefor DESI/SIMS.comment[mounting method]comment[washing method]Ontology approach
The IMS ontology (
imagingMS.obo) is not currently indexed in OLS. Therefore:valuesenum orpatternvalidators, with IMS accessions cited in column descriptionsimaging ion source,instrument,analyzer) useontology: msand are validated against OLS as in existing templatesAn NTR to PSI-MS is needed to add IMS imaging acquisition parameters (scan pattern, pixel size, linescan sequence) under the PSI-MS hierarchy to enable
ontology: msvalidation. This is noted in the template documentation.Known limitations
comment[region of interest]) may be needed in a future version.characteristics[matrix]accepts both full compound names and abbreviations (e.g.DHBand2,5-dihydroxybenzoic acid). Both forms are common in the community; a future version could canonicalize one form.technology typevalues are not yet backed by EFO or MS ontology terms (NTR needed).Files
ms-imaging/1.0.0-dev/ms-imaging.yamlNotes
proteomics-sample-metadataif helpful.Summary by CodeRabbit
New Features
Chores