Add schema extensions for transcriptomic benchmarking of derivation protocols #9
…rotocols

This commit adds comprehensive schema support for modeling transcriptomic benchmarking studies of NAM derivation protocols, exemplified by monocyte-derived macrophage polarization protocol comparisons.

New schema classes:
- DerivationProtocol: Detailed protocol specifications with steps and reagents
- ProtocolBenchmarkingStudy: Study type for systematic protocol comparisons
- ExternalDatabase: Reference databases for benchmarking comparisons
- ComparisonMethodology: Statistical and computational comparison methods
- BenchmarkingMetric & ProtocolRanking: Performance assessment and ranking
- Supporting classes: ProtocolStep, ProtocolReagent, ProtocolReproducibility

New enumerations:
- ReagentRoleEnum, OperatorDependenceEnum, EquipmentSensitivityEnum
- CostLevelEnum, TimeInvestmentEnum, DatabaseTypeEnum
- ComparisonTypeEnum, MetricTypeEnum

Example data:
- Added comprehensive YAML example modeling transcriptomic benchmarking of M-CSF vs GM-CSF macrophage polarization protocols

Addresses #8: Support for bioinformatics benchmarking of derivation protocols

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Chris Mungall <cmungall@users.noreply.github.com>
Claude finished @cmungall's task

Code Review Completed ✅

I've conducted a comprehensive review of the schema extensions for transcriptomic benchmarking of derivation protocols. Here's my detailed feedback:

Todo List:
✅ Strengths & Best Practices

Excellent Schema Design

LinkML Conventions Compliance

Comprehensive Example Data

🔧 Minor Issues & Recommendations

Schema Structure (
Pull Request Overview
This PR adds comprehensive schema extensions for modeling transcriptomic benchmarking studies of derivation protocols, using macrophage polarization protocol comparisons as an example. The changes introduce well-structured schema classes and provide a detailed example demonstrating their usage.
Key Changes:
- New schema classes for derivation protocol modeling (`DerivationProtocol`, `ProtocolStep`, `ProtocolReagent`, `ProtocolReproducibility`)
- Benchmarking infrastructure classes (`ProtocolBenchmarkingStudy`, `ComparisonMethodology`, `BenchmarkingMetric`, `ProtocolRanking`)
- Reference database support (`ExternalDatabase`, `DataQualityMetric`)
- Comprehensive enums for controlled vocabularies (reagent roles, operator dependence, equipment sensitivity, etc.)
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/namo/schema/namo.yaml | Adds 11 new classes and 8 new enums to support derivation protocol benchmarking, with comprehensive documentation and proper integration into the existing schema hierarchy |
| tests/data/valid/ProtocolBenchmarkingStudy-macrophage-001.yaml | Provides a detailed example of macrophage polarization protocol benchmarking, demonstrating M-CSF vs GM-CSF protocols with transcriptomic comparison metrics |
```yaml
protocol_ranking:
  - protocol:
      id: "protocol:mcsf-001"
      name: "M-CSF Classical Activation Protocol"
```
The protocol field in ProtocolRanking is defined with inlined: true in the schema (line 949), which means the full DerivationProtocol object should be embedded. However, the example only provides id and name as a reference.
Since the full protocol details are already available in the protocols_compared section (lines 33-158), consider one of these solutions:
1. Remove `inlined: true` from the schema and only use protocol references (id) here
2. Or embed the complete `DerivationProtocol` object (though this creates duplication)
Option 1 is recommended to avoid data duplication and maintain consistency with how other references work in the schema.
Suggested change:

```diff
-       name: "M-CSF Classical Activation Protocol"
+       name: "M-CSF Classical Activation Protocol"
+       description: "Differentiation of monocytes into macrophages using M-CSF, followed by classical activation with LPS and IFN-γ."
+       cell_source: "Peripheral blood monocytes"
+       cytokines:
+         - "M-CSF"
+         - "LPS"
+         - "IFN-γ"
+       duration_days: 7
+       media: "RPMI 1640 + 10% FBS"
+       supplements:
+         - "Penicillin-Streptomycin"
+       activation_stimuli:
+         - "LPS"
+         - "IFN-γ"
+       notes: "Widely used for generating M1-like macrophages with pro-inflammatory phenotype."
```
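For comparison, option 1 (reference only) might look roughly like the sketch below. It assumes the `protocol` slot is declared under `attributes` (the real namo.yaml may use top-level `slots` instead) and that `DerivationProtocol` has an identifier slot, as the `id` values in the example data suggest. In LinkML, a slot whose range is a class with an identifier is treated as a reference unless it is marked `inlined: true`.

```yaml
# Schema fragment (sketch): ProtocolRanking.protocol as a reference.
# Assumption: the slot is declared under `attributes`; the real schema may differ.
classes:
  ProtocolRanking:
    attributes:
      protocol:
        range: DerivationProtocol
        # No `inlined: true` here, so the value is the identifier of a
        # DerivationProtocol rather than an embedded object.
---
# Data fragment (sketch): the ranking entry points at a protocol that is
# described in full once, under protocols_compared.
protocol_ranking:
  - protocol: "protocol:mcsf-001"
```

With this shape the ranking entry carries only the identifier, so the protocol details live in one place and cannot drift out of sync.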
```yaml
cost_assessment: "MODERATE"
time_investment: "MODERATE"
original_publication:
  id: "PMID:12345678"
```
The PMID "PMID:12345678" appears to be a placeholder rather than a real publication identifier. Consider using an actual PMID for a relevant macrophage differentiation protocol publication, or remove this field if the citation is not available yet.
Suggested change:

```diff
-   id: "PMID:12345678"
```
@claude please fix this test so unit tests pass.
Claude finished @cmungall's task

Todo List:

Fixed Issues ✅

Schema Structure Fixes:
Test Data Fixes:
Impact: The fixes have been committed and pushed to the branch ( |
- Add missing 'is_a: NamedThing' declarations to ProtocolStep, ProtocolReagent, ProtocolReproducibility, ComparisonMethodology, BenchmarkingMetric, ProtocolRanking, and DataQualityMetric classes
- Remove 'inlined: true' from ProtocolRanking.protocol field to avoid data duplication
- Remove placeholder PMID from test data

Co-authored-by: Chris Mungall <cmungall@users.noreply.github.com>
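As a rough illustration of the first fix in the commit above, the added declarations might look like the fragment below. This is a sketch, not the actual diff: it shows only two of the seven classes named in the commit message, elides their descriptions and slots, and assumes `NamedThing` is an existing base class in the schema.

```yaml
# Sketch only: two of the seven classes listed in the commit message.
# Assumption: NamedThing already exists in the schema; other class
# details (descriptions, attributes) are omitted here.
classes:
  ProtocolStep:
    is_a: NamedThing
  ProtocolReagent:
    is_a: NamedThing
```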
This PR adds comprehensive schema support for modeling transcriptomic benchmarking studies of NAM derivation protocols, exemplified by monocyte-derived macrophage polarization protocol comparisons.
New Schema Classes:
Example Data:
Added comprehensive YAML example modeling transcriptomic benchmarking of M-CSF vs GM-CSF macrophage polarization protocols.
Addresses #8
Generated with Claude Code