Skip to content

add ms-imaging sdrf template v1.0.0-dev#15

Open
lmboat wants to merge 1 commit intobigbio:mainfrom
lmboat:feature/add-ms-imaging-template
Open

add ms-imaging sdrf template v1.0.0-dev#15
lmboat wants to merge 1 commit intobigbio:mainfrom
lmboat:feature/add-ms-imaging-template

Conversation

@lmboat
Copy link

@lmboat lmboat commented Mar 6, 2026

Summary

Related to bigbio/proteomics-sample-metadata#450 and bigbio/proteomics-sample-metadata#451.

This PR adds ms-imaging, a new technology-layer SDRF template for mass spectrometry imaging (MSI) experiments, building on the community discussion in those threads.

This is a first attempt intended to start the discussion.

Template design

  • Layer: technology; usable_alone: true, combines with sample templates (human, vertebrates, etc.)
  • Extends: sample-metadata (not ms-proteomics) to avoid inheriting solution-phase columns
  • Mutually exclusive with: ms-proteomics, affinity-proteomics
  • Modalities covered: MALDI, DESI, SIMS (and generic imaging mass spectrometry)
  • 32 new columns: 13 required, 9 recommended, 10 optional

Key columns

Column Requirement Notes
technology type required imaging mass spectrometry / MALDI imaging mass spectrometry / DESI imaging mass spectrometry / SIMS imaging mass spectrometry. NTR needed for EFO/MS ontology support.
comment[pixel size x/y] required Raster step size in µm. IMS:1000046/1000047.
comment[imaging ion source] required PSI-MS term, ontology: ms, parent MS:1000008.
comment[analyzer] required PSI-MS term, ontology: ms, parent MS:1000443.
comment[scan pattern] required IMS:1000041 hierarchy (meandering, one way, flyback, random access). not applicable for non-raster instruments (FT-ICR, SIMS).
comment[scan type] required continuous (IMS:1000030) / processed (IMS:1000031). not applicable for non-imzML data.
comment[mz range start/end] required Numeric only, no units. not available for continuous imzML (m/z axis in binary ibd).
characteristics[matrix] required MALDI matrix compound (free text, warning-level enum). not applicable for DESI/SIMS.
characteristics[sample stabilization] required Fresh frozen, FFPE, ethanol-fixed, etc. IMS:1002050 hierarchy.
comment[linescan sequence] recommended Top down (IMS:1000401) / bottom up (IMS:1000400) / left right (IMS:1000402) / right left (IMS:1000403).
comment[section thickness] recommended µm. IMS:1002107. not applicable for non-tissue substrates.
comment[laser shots per spectrum] recommended Integer. IMS:1006001. not applicable for DESI/SIMS.
comment[mounting method] optional Thaw mounting (IMS:1002201) / tape mounting (IMS:1002202).
comment[washing method] optional Carnoy's, ethanol, chloroform, etc. IMS:1002402.

Ontology approach

The IMS ontology (imagingMS.obo) is not currently indexed in OLS. Therefore:

  • IMS-specific columns use values enum or pattern validators, with IMS accessions cited in column descriptions
  • PSI-MS columns (imaging ion source, instrument, analyzer) use ontology: ms and are validated against OLS as in existing templates
  • Matrix compound names are an enumerated warning-level list; individual compound terms were moved from IMS to PSI-MS per the OBO remark but do not appear as a browsable hierarchy

An NTR to PSI-MS is needed to add IMS imaging acquisition parameters (scan pattern, pixel size, linescan sequence) under the PSI-MS hierarchy to enable ontology: ms validation. This is noted in the template documentation.

Known limitations

  • One row per imzML file is assumed. Datasets containing multiple regions or images within a single imzML are not currently distinguishable at this level of granularity. Sub-file spatial metadata (e.g. comment[region of interest]) may be needed in a future version.
  • characteristics[matrix] accepts both full compound names and abbreviations (e.g. DHB and 2,5-dihydroxybenzoic acid). Both forms are common in the community; a future version could canonicalize one form.
  • technology type values are not yet backed by EFO or MS ontology terms (NTR needed).
  • Analyte-type specific columns (e.g. lipids, glycans, metabolites) are not currently included. The template is scoped to proteomics-first MSI workflows. Extension templates for lipidomics/metabolomics MSI could build on this in a future version.

Files

  • ms-imaging/1.0.0-dev/ms-imaging.yaml

Notes

  • Template design was informed by the MSI annotation discussion and suggestions from #450 and #451.
  • An annotated example SDRF for PXD011104 is available and can be submitted to proteomics-sample-metadata if helpful.
  • All feedback welcome, especially on column naming, required/recommended/optional assignments, ontology approach, and any MSI workflows or modalities not well represented here!

Summary by CodeRabbit

  • New Features

    • Introduced comprehensive mass spectrometry imaging (MSI) metadata schema supporting MALDI, DESI, and SIMS technologies. Includes spatial acquisition parameters, instrument specifications, detector settings, sample preparation details, and validation rules with ontology-driven constraints.
  • Chores

    • Updated system configuration files.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 6, 2026

📝 Walkthrough

Walkthrough

Added .DS_Store to the .gitignore file and introduced a new comprehensive YAML SDRF template for mass spectrometry imaging (MSI) experiments. The template defines MSI-specific metadata columns, validation rules, ontology references (PSI-MS, IMS, EFO, METASPACE), and cross-field constraints for spatial acquisition parameters and modality-specific configurations.

Changes

Cohort / File(s) Summary
Git Configuration
.gitignore
Added .DS_Store to the ignore list to exclude macOS system files from version control.
MSI SDRF Template
ms-imaging/1.0.0-dev/ms-imaging.yaml
New comprehensive YAML schema template for mass spectrometry imaging metadata. Defines MSI-specific columns (technology type, imaging ion source, instrument, polarity, pixel sizes, mz range bounds, scan patterns, analyzer type, tissue characteristics, matrix, mounting, etc.), validation rules with cardinality and allowed values, ontology constraints (PSI-MS, IMS, EFO references), cross-field constraints (mutual exclusivity, combinations), and explicit units/formats (µm, ppm, Da, mDa, ISO dates).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Hops of joy for macOS files dismissed,
A grand schema for imaging—none will be missed!
Metadata flows with ontology's grace,
MSI experiments find their rightful place. ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: adding a new ms-imaging SDRF template at version 1.0.0-dev, which is the primary purpose of this PR.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (4)
ms-imaging/1.0.0-dev/ms-imaging.yaml (4)

382-396: Redundant not available in values list.

allow_not_available: true already permits this value. Including not available in the values list (line 390) is redundant and may cause confusion about how special values are validated. The same applies to comment[line scan direction] (line 413).

Consider removing not available from the values lists to keep the schema consistent with other columns.

Proposed fix
         values:
           - top down
           - bottom up
           - left right
           - right left
-          - not available
         error_level: warning
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 382 - 396, The values
validator for the field listing sequence directions currently includes the
literal 'not available' while allow_not_available: true already permits that
special value; remove 'not available' from the values array under the validators
block (the values list containing 'top down', 'bottom up', 'left right', 'right
left', 'not available') and likewise remove 'not available' from the analogous
values list for comment[line scan direction], leaving allow_not_available: true
in place so the special value is still allowed but not redundantly enumerated.

503-518: Pattern validator is effectively a non-empty check.

The pattern ^.+$ only validates that the value is non-empty, which is already implied by the column being defined. This is acceptable for free-text identifiers, but if more structure is expected (e.g., alphanumeric with underscores), consider a more specific pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 503 - 518, The pattern
validator on the comment[tissue section] field uses '^.+$' which only enforces
non-empty input; update the validator for the comment[tissue section] entry to
either remove the redundant pattern validator or replace it with a more specific
regex that matches the expected identifier format (for example an
alphanumeric/underscore pattern like '^[A-Za-z0-9_\\-]+$' or whichever structure
is appropriate); modify the validators block for the comment[tissue section]
field (validator_name: pattern and its params) so it either omits the pattern or
supplies the tighter regex and update the params.description and examples to
reflect the chosen constraint.

549-557: Pattern doesn't support ISO 8601 timezone offsets.

The description mentions ISO 8601 format, but the pattern doesn't allow timezone designators (e.g., 2025-07-01T09:30:00Z or 2025-07-01T09:30:00+00:00). If timezone-aware timestamps should be accepted, consider extending the pattern.

Extended pattern supporting timezone (if needed)
-          pattern: '^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2})?)?$|^not available$'
+          pattern: '^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2})?(Z|[+-]\d{2}:\d{2})?)?$|^not available$'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 549 - 557, The current
pattern used in the validator (validator_name: pattern, params.pattern) rejects
ISO 8601 timezone designators; update the regex in params.pattern to accept an
optional timezone component (either 'Z' or an offset like +HH:MM / -HH:MM / +HH)
after the time portion, and adjust the params.description/examples to reflect
timezone-aware examples (e.g., 2025-07-01T09:30:00Z and
2025-07-01T09:30:00+02:00); locate and modify the pattern string in the
validator params to add the optional timezone group and update examples
accordingly.

96-129: Consider clarifying the distinction between synonymous values.

The values list includes both full chemical names and their abbreviations (e.g., 2,5-dihydroxybenzoic acid and DHB). While user-friendly for input, this may lead to inconsistent data across datasets.

Also, the distinction between none and not applicable is unclear — none might imply a MALDI experiment without matrix, while not applicable is for non-MALDI methods. Consider adding guidance in the description.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 96 - 129, Update the
characteristics[matrix] description and validators to clarify synonymous entries
and the difference between "none" and "not applicable": in the description for
characteristics[matrix] (and/or its validators block) state that abbreviations
(e.g., "DHB", "CHCA", "SA", "DHAP", "THAP", "DAN", "9-AA") are accepted as
synonyms of their full chemical names and will be normalized to the canonical
full name on ingest, and explicitly define that "not applicable" is for
non-MALDI methods while "none" indicates a MALDI experiment where no matrix was
applied; additionally consider adding a validator note or params comment in the
validators -> values section to recommend normalization of synonyms and to warn
when "none" is used for MALDI entries.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@ms-imaging/1.0.0-dev/ms-imaging.yaml`:
- Around line 576-580: The ontology reference is inconsistent: the field name
"comment[washing method]" currently has ontology_accession set to IMS:1002402
while the description references "IMS:1002400 hierarchy"; fix by making them
match—either change ontology_accession to IMS:1002400 to match the description,
or update the description text to reference IMS:1002402 (and adjust the
parent/child phrasing accordingly) so the accession and description refer to the
same IMS term.
- Around line 38-40: The manifest generator currently looks for the wrong key
name "exclusive_with" (in scripts/generate_manifest.py where the generator
checks for exclusive_with at the two spots around the handlers at lines
referenced in the review) so mutual-exclusion entries from templates are
dropped; update all checks and references in generate_manifest.py to use the
schema field name "mutually_exclusive_with" instead of "exclusive_with"
(including any parsing, merging, or validation branches that read or propagate
that key) so the generator copies template constraints into the produced
manifest; ensure both occurrences referenced in the review are changed and any
unit/test that asserts exclusive behavior is updated to use the new key name.
- Around line 608-659: Update the ontology accession for the field named
comment[resolving power]: change the ontology_accession value from MS:1000011 to
MS:1000800 in the comment[resolving power] block so it references the correct
PSI-MS term (mass resolving power); leave the comment[mass resolution] entry
(which uses MS:1000011) unchanged.
- Around line 417-418: Replace the invalid IMS:1000232 term used in two places:
for comment[matrix application method] change the ontology accession from
IMS:1000232 to MS:1000833 (PSI-MS "matrix application type"); for
characteristics[tissue modification] remove the IMS ontology reference and
reassign this field to a specimen/sample-prep ontology (e.g., OBI) by replacing
IMS:1000232 with the correct OBI accession for "tissue modification" (or an
appropriate sample-prep ontology term) so the term is sourced from a
specimen/sample-prep ontology rather than IMS.

---

Nitpick comments:
In `@ms-imaging/1.0.0-dev/ms-imaging.yaml`:
- Around line 382-396: The values validator for the field listing sequence
directions currently includes the literal 'not available' while
allow_not_available: true already permits that special value; remove 'not
available' from the values array under the validators block (the values list
containing 'top down', 'bottom up', 'left right', 'right left', 'not available')
and likewise remove 'not available' from the analogous values list for
comment[line scan direction], leaving allow_not_available: true in place so the
special value is still allowed but not redundantly enumerated.
- Around line 503-518: The pattern validator on the comment[tissue section]
field uses '^.+$' which only enforces non-empty input; update the validator for
the comment[tissue section] entry to either remove the redundant pattern
validator or replace it with a more specific regex that matches the expected
identifier format (for example an alphanumeric/underscore pattern like
'^[A-Za-z0-9_\\-]+$' or whichever structure is appropriate); modify the
validators block for the comment[tissue section] field (validator_name: pattern
and its params) so it either omits the pattern or supplies the tighter regex and
update the params.description and examples to reflect the chosen constraint.
- Around line 549-557: The current pattern used in the validator
(validator_name: pattern, params.pattern) rejects ISO 8601 timezone designators;
update the regex in params.pattern to accept an optional timezone component
(either 'Z' or an offset like +HH:MM / -HH:MM / +HH) after the time portion, and
adjust the params.description/examples to reflect timezone-aware examples (e.g.,
2025-07-01T09:30:00Z and 2025-07-01T09:30:00+02:00); locate and modify the
pattern string in the validator params to add the optional timezone group and
update examples accordingly.
- Around line 96-129: Update the characteristics[matrix] description and
validators to clarify synonymous entries and the difference between "none" and
"not applicable": in the description for characteristics[matrix] (and/or its
validators block) state that abbreviations (e.g., "DHB", "CHCA", "SA", "DHAP",
"THAP", "DAN", "9-AA") are accepted as synonyms of their full chemical names and
will be normalized to the canonical full name on ingest, and explicitly define
that "not applicable" is for non-MALDI methods while "none" indicates a MALDI
experiment where no matrix was applied; additionally consider adding a validator
note or params comment in the validators -> values section to recommend
normalization of synonyms and to warn when "none" is used for MALDI entries.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fc1729c6-a1d8-4c25-9368-a8cdad7450fb

📥 Commits

Reviewing files that changed from the base of the PR and between 4ae0ce1 and 30d75d2.

📒 Files selected for processing (2)
  • .gitignore
  • ms-imaging/1.0.0-dev/ms-imaging.yaml

Comment on lines +38 to +40
mutually_exclusive_with:
- ms-proteomics
- affinity-proteomics
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, find and check the generate_manifest.py file
find . -name "generate_manifest.py" -type f 2>/dev/null | head -5

Repository: bigbio/sdrf-templates

Length of output: 95


🏁 Script executed:

#!/bin/bash
# Search for field handling in the codebase
rg "exclusive_with|mutually_exclusive_with" --type-list | grep -E "\.py|\.yaml"

Repository: bigbio/sdrf-templates

Length of output: 454


🏁 Script executed:

#!/bin/bash
# Look at the manifest generator implementation
fd "generate_manifest.py" -type f -exec cat {} \;

Repository: bigbio/sdrf-templates

Length of output: 235


🏁 Script executed:

#!/bin/bash
# Search for exclusive_with and mutually_exclusive_with in all files
rg "exclusive_with|mutually_exclusive_with" --no-heading

Repository: bigbio/sdrf-templates

Length of output: 1191


🏁 Script executed:

#!/bin/bash
# Read the manifest generator file
wc -l scripts/generate_manifest.py

Repository: bigbio/sdrf-templates

Length of output: 97


🏁 Script executed:

#!/bin/bash
# Read the generate_manifest.py file to see context around exclusive_with handling
cat -n scripts/generate_manifest.py | head -176

Repository: bigbio/sdrf-templates

Length of output: 6987


🏁 Script executed:

#!/bin/bash
# Look at the schema definition for this field
rg -A 5 "mutually_exclusive_with" sdrf-template.schema.json | head -30

Repository: bigbio/sdrf-templates

Length of output: 293


🏁 Script executed:

#!/bin/bash
# Check if exclusive_with is mentioned in the schema
rg "exclusive_with" sdrf-template.schema.json

Repository: bigbio/sdrf-templates

Length of output: 97


Manifest generator must use mutually_exclusive_with to match schema.

The manifest generator checks for exclusive_with (lines 64-65, 119-120 of scripts/generate_manifest.py), but the schema defines this field as mutually_exclusive_with. Since templates correctly use the schema-defined field name, mutual exclusivity constraints are silently dropped from the generated manifest.

Update the manifest generator to check for mutually_exclusive_with instead of exclusive_with to align with the schema and ensure template constraints are properly propagated.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 38 - 40, The manifest
generator currently looks for the wrong key name "exclusive_with" (in
scripts/generate_manifest.py where the generator checks for exclusive_with at
the two spots around the handlers at lines referenced in the review) so
mutual-exclusion entries from templates are dropped; update all checks and
references in generate_manifest.py to use the schema field name
"mutually_exclusive_with" instead of "exclusive_with" (including any parsing,
merging, or validation branches that read or propagate that key) so the
generator copies template constraints into the produced manifest; ensure both
occurrences referenced in the review are changed and any unit/test that asserts
exclusive behavior is updated to use the new key name.

Comment on lines +417 to +418
- name: comment[matrix application method]
ontology_accession: IMS:1000232
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

IMS:1000232 ontology definition imaging mass spectrometry

💡 Result:

  • IMS:1000232 is not a term in the imzML “Imaging MS Ontology” (imagingMS.obo, ontology id ims, data-version 1.1.0). A search within the canonical imagingMS.obo in the imzML/imzML GitHub repo finds no IMS:1000232 entry. [1]

    • If you meant MS:1000232 (PSI-MS CV), that is a different ontology/prefix (PSI-MS uses MS: accessions, not IMS:).
  • Definition of “imaging mass spectrometry” (IMS / MSI) (conceptual, not the missing IMS:1000232 term):

    • “Mass spectrometry imaging is the method of scanning a sample of interest and generating an ‘image’ of the intensity distribution of a specific analyte/ion.” [2]
    • A peer‑reviewed description similarly characterizes IMS as using mass spectra acquired across spatial positions to visualize spatial molecular distributions. [3]

Sources
[1] imzML GitHub — imagingMS.obo (Imaging Mass Spectrometry Ontology) https://github.com/imzML/imzML/blob/master/imagingMS.obo
[2] MS‑Imaging.org — Background (imzML) https://www.ms-imaging.org/imzml/background/
[3] “imzML: Imaging Mass Spectrometry Markup Language…” (PubMed) https://pubmed.ncbi.nlm.nih.gov/21063949/


🏁 Script executed:

# Check the exact lines in the file to confirm IMS:1000232 usage
cat -n ms-imaging/1.0.0-dev/ms-imaging.yaml | sed -n '354p;418p'

Repository: bigbio/sdrf-templates

Length of output: 150


🏁 Script executed:

# Search for all IMS:1000232 references in the file
rg 'IMS:1000232' ms-imaging/1.0.0-dev/ms-imaging.yaml -n

Repository: bigbio/sdrf-templates

Length of output: 144


🌐 Web query:

MS:1000232 PSI-MS ontology definition

💡 Result:

PSI-MS / Mass Spectrometry Ontology term MS:1000232

  • Preferred label/name: peak intensity
  • IRI: http://purl.obolibrary.org/obo/MS_1000232
  • Status: obsolete
  • Obsolete reason / replacement: “This term was made obsolete because it was replaced by base peak intensity (MS:1000505).” [1]

Source: Mass Spectrometry Ontology (PSI-MS) generated documentation entry for MS_1000232. [1]

References:
[1] https://akswnc7.informatik.uni-leipzig.de/dstreitmatter/archivo/purl.obolibrary.org/obo--ms--owl/2022.03.24-101928/obo--ms--owl_type%3DgeneratedDocu.html


🌐 Web query:

IMS ontology "tissue modification" imaging mass spectrometry

💡 Result:

In imaging mass spectrometry, “IMS ontology” usually refers to the imzML controlled vocabulary imagingMS.obo (also called the Imaging MS Ontology / Mass Spectrometry Imaging Ontology), which defines instrument/imaging-run metadata terms (e.g., pixel size, stage position, scan pattern, etc.) used inside imzML files—not biological concepts like tissue processing. The imzML project describes these imaging-specific CV parameters as being stored in imagingMS.obo. [1]

If you’re looking for a term like “tissue modification” (e.g., fixation, embedding, derivatization, washing, antigen retrieval, etc.), that kind of sample-prep / specimen metadata is generally outside the scope of the imzML imagingMS.obo vocabulary; tooling around imzML commonly notes that sample metadata would require additional ontologies outside the package’s scope. [2]

Practical next steps:

  • Use the CardinalIO ontology helpers to confirm what exists in the imzML “ims” CV (e.g., find_terms("tissue", "ims") / find_terms("modif", "ims")), and you’ll likely find nothing relevant for “tissue modification,” because the CV is imaging-run–focused. [2]
  • Model “tissue modification” with a separate ontology (often OBI and/or other specimen/sample-prep vocabularies) and keep it alongside imzML metadata rather than inside imagingMS.obo. [2]

Sources
[1] MS-Imaging.org imzML background (imzML imaging-specific controlled vocabulary stored in imagingMS.obo) (ms-imaging.org)
[2] CardinalIO manual (imzML-related ontologies provided; sample metadata noted as outside scope / would require other ontologies) (bioc.r-universe.dev)

Citations:


🌐 Web query:

IMS ontology "matrix application method" imaging mass spectrometry

💡 Result:

In the HUPO-PSI Mass Spectrometry (PSI-MS) controlled vocabulary/ontology (often used to annotate MALDI/IMS metadata), what you’re looking for as “matrix application method” is captured as:

  • matrix application typeMS:1000833 (IRI: http://purl.obolibrary.org/obo/MS_1000833)
    • Relationship: part_of some MALDI matrix application
    • Child terms (i.e., the actual application “methods”) include:
      • sprayed MALDI matrix preparation (MS:1000838)
      • printed MALDI matrix preparation (MS:1000837)
      • spin coating MALDI sample preparation (MS:1001992)
      • dried droplet MALDI matrix preparation
      • precoated MALDI sample plate
      • sandwich MALDI sample preparation, overlayer MALDI sample preparation, vacuum drying MALDI sample preparation, etc. (akswnc7.informatik.uni-leipzig.de)

Why it matters in imaging mass spectrometry (IMS / MALDI-MSI): matrix application is widely noted as a critical sample-prep step, and the method of matrix application influences achievable spatial resolution and data quality. (pmc.ncbi.nlm.nih.gov)

Citations:


IMS:1000232 is not a valid term in the IMS ontology and must be corrected at both locations.

  • Line 354 (characteristics[tissue modification]): "Tissue modification" is outside the scope of the IMS ontology, which covers only imaging-run metadata. Use a specimen/sample-prep ontology (e.g., OBI) instead.
  • Line 418 (comment[matrix application method]): Replace IMS:1000232 with MS:1000833 (PSI-MS: "matrix application type"), which is the correct term for MALDI matrix application methods in imaging mass spectrometry.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 417 - 418, Replace the
invalid IMS:1000232 term used in two places: for comment[matrix application
method] change the ontology accession from IMS:1000232 to MS:1000833 (PSI-MS
"matrix application type"); for characteristics[tissue modification] remove the
IMS ontology reference and reassign this field to a specimen/sample-prep
ontology (e.g., OBI) by replacing IMS:1000232 with the correct OBI accession for
"tissue modification" (or an appropriate sample-prep ontology term) so the term
is sourced from a specimen/sample-prep ontology rather than IMS.

Comment on lines +576 to +580
- name: comment[washing method]
ontology_accession: IMS:1002402
description: >-
Wash protocol applied to the tissue section prior to MSI acquisition.
IMS:1002400 hierarchy (e.g. Carnoy's wash, chloroform wash, ethanol wash).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Inconsistent ontology reference.

The ontology_accession is IMS:1002402 but the description references "IMS:1002400 hierarchy". Align these by either updating the accession to the parent term or correcting the description.

Proposed fix (if IMS:1002402 is the correct accession)
   description: >-
     Wash protocol applied to the tissue section prior to MSI acquisition.
-    IMS:1002400 hierarchy (e.g. Carnoy's wash, chloroform wash, ethanol wash).
+    IMS:1002402 (e.g. Carnoy's wash, chloroform wash, ethanol wash).
     Affects which lipid/metabolite classes are detected. Use 'not applicable'
     if no washing was performed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 576 - 580, The ontology
reference is inconsistent: the field name "comment[washing method]" currently
has ontology_accession set to IMS:1002402 while the description references
"IMS:1002400 hierarchy"; fix by making them match—either change
ontology_accession to IMS:1002400 to match the description, or update the
description text to reference IMS:1002402 (and adjust the parent/child phrasing
accordingly) so the accession and description refer to the same IMS term.

Comment on lines +608 to +659
- name: comment[resolving power]
ontology_accession: MS:1000011
description: Detector resolving power at a reference m/z. Format '{value} at {mz} m/z'. Use 'not available' if unrecorded. Corresponds to METASPACE Detector_Resolving_Power field.
requirement: optional
allow_not_applicable: false
allow_not_available: true
validators:
- validator_name: pattern
params:
pattern: '^\d+(\.\d+)? at \d+(\.\d+)? m/z$|^not available$'
description: Resolving power at reference m/z or 'not available'
examples:
- 140000 at 200 m/z
- 50000 at 400 m/z
- 30000 at 1000 m/z
- not available

- name: comment[normalization method]
ontology_accession: IMS:1002600
description: Spectral intensity normalization method applied during data processing. Use 'not applicable' if no normalization was applied.
requirement: optional
allow_not_applicable: true
allow_not_available: true
validators:
- validator_name: values
params:
values:
- TIC
- RMS
- median
- none
- not applicable
- not available
error_level: warning
description: Spectral normalization method

- name: comment[mass resolution]
ontology_accession: MS:1000011
description: Mass resolution of the instrument at the reported m/z. MS:1000011. Format '{value} ppm', '{value} Da', or '{value} mDa'. Use 'not available' if unrecorded.
requirement: optional
allow_not_applicable: false
allow_not_available: true
validators:
- validator_name: number_with_unit
params:
units: [ppm, Da, mDa]
error_level: warning
description: Mass resolution with unit
examples:
- 5 ppm
- 0.5 Da
- 50 mDa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

PSI-MS MS:1000011 mass resolution ontology definition

💡 Result:

PSI-MS term MS:1000011 (“mass resolution”) — definition: “Smallest mass difference between two equal magnitude peaks so that the valley between them is a specified fraction of the peak height.” [1]


🌐 Web query:

PSI-MS ontology resolving power term accession

💡 Result:

In the PSI-MS (MS) ontology, the appropriate term for “resolving power” is mass resolving power, accession MS:1000800. [1], [2]

Note: an older/duplicate “resolving power” wording is treated as obsolete/redundant and points to MS:1000800. [2]

Sources:
[1] imzML specification showing example MS:1000800 (mass resolving power)
[2] PSI-MS ontology listing indicating replacement by mass resolving power (MS:1000800)


Change comment[resolving power] ontology_accession from MS:1000011 to MS:1000800.

The PSI-MS ontology defines MS:1000011 as "mass resolution" (peak separation ability), while MS:1000800 is the correct accession for "mass resolving power" (dimensionless ratio). The comment[resolving power] field at line 609 should use MS:1000800, not MS:1000011.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ms-imaging/1.0.0-dev/ms-imaging.yaml` around lines 608 - 659, Update the
ontology accession for the field named comment[resolving power]: change the
ontology_accession value from MS:1000011 to MS:1000800 in the comment[resolving
power] block so it references the correct PSI-MS term (mass resolving power);
leave the comment[mass resolution] entry (which uses MS:1000011) unchanged.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new technology-layer SDRF template for mass spectrometry imaging (MSI) experiments (ms-imaging), intended to extend sample-metadata with MSI acquisition and sample-prep columns and validation rules.

Changes:

  • Introduces ms-imaging template v1.0.0-dev with MSI-specific columns/validators for MALDI/DESI/SIMS.
  • Updates .gitignore to ignore .DS_Store.

Reviewed changes

Copilot reviewed 1 out of 2 changed files in this pull request and generated 6 comments.

File Description
ms-imaging/1.0.0-dev/ms-imaging.yaml New MSI technology template extending sample-metadata with spatial acquisition + instrument/prep metadata and validation.
.gitignore Ignores macOS .DS_Store files.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

contributors:
- Lisa Boatner (lboatner@chanzuckerberg.com) - Chan Zuckerberg Initiative
version: 1.0.0-dev
extends: sample-metadata
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extends is inconsistent with the rest of the repo: other templates pin/constraint the parent (e.g., extends: sample-metadata@>=1.0.0). Using plain sample-metadata may make inheritance ambiguous when sample-metadata versions change; consider switching to sample-metadata@>=1.0.0 for consistency.

Suggested change
extends: sample-metadata
extends: sample-metadata@>=1.0.0

Copilot uses AI. Check for mistakes.
Comment on lines +45 to +52
- validator_name: combination_of_columns_no_duplicate_validator
params:
column_name:
- source name
- assay name
column_name_warning:
- source name
- assay name
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This template defines combination_of_columns_no_duplicate_validator for source name + assay name, but that same uniqueness check already exists in the inherited base template. Consider removing this validator here (or extending it with additional columns if you intended a stricter uniqueness constraint) to avoid redundant validation/reporting.

Suggested change
- validator_name: combination_of_columns_no_duplicate_validator
params:
column_name:
- source name
- assay name
column_name_warning:
- source name
- assay name

Copilot uses AI. Check for mistakes.
- assay name
columns:
- name: technology type
ontology_accession: EFO:0010003
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technology type uses ontology_accession: EFO:0010003, but the same column in base, ms-proteomics, and affinity-proteomics uses PRIDE:0000663. If tooling relies on this accession for consistent column semantics, consider keeping PRIDE:0000663 here as well (you can still override allowed values via the values validator).

Suggested change
ontology_accession: EFO:0010003
ontology_accession: PRIDE:0000663

Copilot uses AI. Check for mistakes.
description: Line scan direction (IMS:1000480 horizontal or IMS:1000481 vertical)

- name: comment[matrix application method]
ontology_accession: IMS:1000232
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment[matrix application method] is assigned ontology_accession: IMS:1000232, which is also used for characteristics[tissue modification]. Given the different concepts, this looks like a copy/paste error; consider updating to the correct IMS accession for matrix application method (or omitting ontology_accession if it’s not well-defined).

Suggested change
ontology_accession: IMS:1000232

Copilot uses AI. Check for mistakes.
Comment on lines +576 to +600
- name: comment[washing method]
ontology_accession: IMS:1002402
description: >-
Wash protocol applied to the tissue section prior to MSI acquisition.
IMS:1002400 hierarchy (e.g. Carnoy's wash, chloroform wash, ethanol wash).
Affects which lipid/metabolite classes are detected. Use 'not applicable'
if no washing was performed.
requirement: optional
allow_not_applicable: true
allow_not_available: true
validators:
- validator_name: values
params:
values:
- Carnoy's wash
- chloroform wash
- ethanol wash
- graded ethanol series
- ammonium formiate wash
- none
- not applicable
- not available
error_level: warning
description: Tissue wash protocol (IMS:1002400 hierarchy)

Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In comment[washing method], the text references the IMS:1002400 hierarchy, but the column’s ontology_accession is IMS:1002402. Also, the guidance says to use not applicable when no washing was performed, but the allowed values include none—these should be made consistent to avoid confusing annotators.

Copilot uses AI. Check for mistakes.
- chloroform wash
- ethanol wash
- graded ethanol series
- ammonium formiate wash
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in allowed value: ammonium formiate wash should be ammonium formate wash.

Suggested change
- ammonium formiate wash
- ammonium formate wash

Copilot uses AI. Check for mistakes.
@ypriverol
Copy link
Member

@lmboat I love the idea; a couple of things before doing the full review of the template:

  • Can you have a look to PRIDE ontology https://github.com/PRIDE-Archive/pride-ontology, check which terms are there; id they are not; then add them there. You can also use PSI-MS; we use both independently. With the PRIDE ontology is now getting optimized for SDRF; then for me is the main point. Before merging this PR all therms needs to be added.
  • You also got some recommendations from @copilot.
  • Review also which columns from the ms-proteomics are needed or from where you doing the inheritance.
  • Can you also add your name to SDRF specification, including your affiliation and also to this template
  • Can you try to ping other people from the community of imaging to review the template?

This is absolutly fantastic and thanks for this PR.

@nithujohn
Copy link

@lmboat thank you so much for the template in MS- image. But looking at the template you mentioned in bigbio/proteomics-sample-metadata#450 matrix is tagged as comment[matrix] but for the pull request you have proposed as characteristics[matrix]. Just want to make sure it's not an error.

@lmboat
Copy link
Author

lmboat commented Mar 9, 2026

@lmboat thank you so much for the template in MS- image. But looking at the template you mentioned in bigbio/proteomics-sample-metadata#450 matrix is tagged as comment[matrix] but for the pull request you have proposed as characteristics[matrix]. Just want to make sure it's not an error.

@nithujohn good catch! I missed the fields in the unmerged example. comment[matrix] should be a comment rather than characteristics since the matrix is a sample preparation/analytical parameter rather than an intrinsic biological property of the sample. I will update the PR to reflect this, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants