Skip to content

Conversation

@BelindaBGarana
Copy link
Contributor

@BelindaBGarana BelindaBGarana commented Jan 21, 2026

Resolves issue #801 (exceeding Synapse's 100-value limit for enums) by implementing cascading conditional filters.

Solution

  • Added filter fields: modelSystemType, cellLineCategory, cellLineGeneticDisorder
  • Generated 29 filtered enum subsets (all <100 entries) from syn51730943
  • Implemented JSON Schema if/then/else conditionals to dynamically filter options
  • Updated entity view logic to skip enum constraints for conditional fields

Result

Users select filters to narrow modelSystemName to relevant subsets (<100 values), while maintaining full searchability of all 809 entries.

Changes

Core changes:

  • .github/workflows/weekly-model-system-sync.yml - Updated workflow
  • .gitignore - Added cache pattern
  • Makefile - Deep merge fix
  • modules/Template/Data_Base.yaml - Field reordering
  • modules/Sample/CellLineCategory.yaml - Updated enum values
  • modules/Sample/CellLineModel.yaml - Updated base enum
  • modules/Sample/AnimalModel.yaml - Updated base enum
  • dist/NF.yaml - Merged schema

Scripts:

  • utils/add_conditional_enum_filtering.py - NEW conditional filtering script
  • utils/json_schema_entity_view.py - Skip conditional field enums
  • utils/sync_model_systems_enhanced.py - YAML fix

Largest filtered subset: 54 entries (Human + Cancer + NF1)

Generated with AI assistance from Claude Code

BelindaBGarana and others added 2 commits January 20, 2026 14:12
Revert enum value limit from 1000 back to 100 to comply with Synapse's
server-side constraint. The recent change to 1000 in commit 112db14
caused the create-curation-task workflow to fail with:

  400 Client Error: Maximum allowed enum values is 100

This limit is enforced by Synapse's API regardless of client settings.
Fields with >100 enum values (like modelSystemName with 809 values)
will now only use the first 100 values for validation.

Affected fields across schemas:
- modelSystemName: 809 values (37+ templates)
- assay: 202-203 values
- fileFormat: 118-119 values
- platform: 122-123 values
- institutions: 331 values

Fixes workflow run: https://github.com/nf-osi/nf-metadata-dictionary/actions/runs/21188870455

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implement comprehensive filtering system to handle enum fields with >100 values
by using cascading filters based on user selections. This enables the Synapse
curator grid to show contextually relevant options without hitting the 100-value
limit.

New Filter Fields:
- modelSystemType: cell line, animal model, organoid, PDX
- cellLineCategory: cancer cell line, iPSC, transformed, etc.
- cellLineGeneticDisorder: NF1, NF2, schwannomatosis, etc.

Filter Cascade:
modelSystemType → modelSpecies → cellLineCategory → cellLineGeneticDisorder → modelSystemName

Generated 29 filtered enum subsets, all with <100 entries:
- Human NF1 cancer cell lines: 54 entries ✓
- Human NF1 iPSCs: 32 entries ✓
- Human transformed cell lines: 31/29 entries ✓
- Mouse, zebrafish, fly models: all <10 entries ✓

Data Source:
- Switched from syn26450069 to syn51730943 (NF Tools Database)
- Now includes species, cellLineCategory, cellLineGeneticDisorder metadata
- Maintains backward compatibility with CellLineModel.yaml, AnimalModel.yaml

Files Changed:
- Added ModelSystemType.yaml, CellLineCategory.yaml, CellLineGeneticDisorder.yaml
- Added 29 filtered enum files in modules/Sample/generated/
- Updated props.yaml with new filter fields and dependencies
- Created sync_model_systems_enhanced.py for generating filtered subsets
- Fixed json_schema_entity_view.py to use 100-value limit (not 1000)
- Added comprehensive implementation plan in docs/

Next Steps (still pending):
1. Add if/then/else conditional dependencies to JSON schemas
2. Reorder template fields (filters before modelSystemName)
3. Update json_schema_entity_view.py to skip enum constraints for conditional fields
4. Update weekly-model-system-sync.yml workflow
5. Rebuild schemas and test

Relates to: #797 (enum value limit issue)
Fixes: workflow run 21188870455 (400 error: max 100 enum values)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@BelindaBGarana BelindaBGarana linked an issue Jan 21, 2026 that may be closed by this pull request
This commit implements a comprehensive solution for handling the 809-value
modelSystemName enum by adding cascading conditional filters that reduce
options to <100 entries based on user selections. This resolves the Synapse
entity view constraint of maximum 100 enum values.

## Key Changes

### 1. New Filter Fields
- Added modelSystemType enum (cell line, animal model, organoid, PDX)
- Added cellLineCategory enum (10 categories from syn51730943)
- Added cellLineGeneticDisorder enum (5 disorders)
- Fields reordered in BiologicalAssayDataTemplate so filters appear before
  modelSystemName to enable proper UX in Synapse curator grid

### 2. Enhanced Sync Script
- Updated sync_model_systems_enhanced.py to query syn51730943 with full metadata
- Generates 29 filtered enum subsets in modules/Sample/generated/
- All filtered subsets have <100 entries (largest: 54 entries)
- Maintains backward compatibility with CellLineModel and AnimalModel enums
- Fixed YAML indentation bug in base enum file generation

### 3. JSON Schema Conditionals
- Created add_conditional_enum_filtering.py post-processing script
- Adds 28 if/then/else rules to each biological assay template
- Rules reference filtered enum subsets in $defs
- Enum values loaded from generated YAML files

### 4. Entity View Support
- Modified json_schema_entity_view.py to detect conditional fields
- Skips enum constraints on Synapse columns with conditional filtering
- Allows curator grid to handle filtering dynamically via JSON Schema

### 5. Build System Updates
- Updated Makefile to use deep merge (*+) for proper enum combination
- Updated weekly-model-system-sync.yml workflow to use enhanced sync script
- Workflow now tracks modules/Sample/generated/ files

## Files Changed
- Core: 4 files (Makefile, workflows, template, props)
- Modules: 3 base files + 29 generated enum subsets
- JSON Schemas: 63 schemas regenerated with new fields + conditionals
- Utils: 3 scripts (sync, filtering, entity view)
- Docs: Status tracking added

## Result
Users can now select filter values (species, category, disorder) to narrow
modelSystemName options to relevant subsets, all under Synapse's 100-value
limit. The full 809-value list remains searchable through conditional filtering.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@github-actions
Copy link
Contributor

github-actions bot commented Jan 21, 2026

Schema Validation Report

Generated: 2026-01-21 21:35:09 UTC

Summary

  • Generated schemas: 63
  • Validation passed: 57
  • Validation failed: 6

Details

  • GenericDataResourceTemplate.json: ✅ PASSED
  • GenomicsArrayTemplate.json: ✅ PASSED
  • ScSequencingAssayTemplate.json: ✅ PASSED
  • ScRNASeqTemplate.json: ✅ PASSED
  • GeneralMeasureDataTemplate.json: ❌ FAILED
  • PublicationTemplate.json: ✅ PASSED
  • PortalStudy.json: ✅ PASSED
  • ProcessedVariantCallsTemplate.json: ✅ PASSED
  • FlowCytometryTemplate.json: ✅ PASSED
  • BiospecimenTemplate.json: ✅ PASSED
  • WorkflowReport.json: ✅ PASSED
  • ReferenceSequenceTemplate.json: ✅ PASSED
  • AffinityProteomicsTemplate.json: ✅ PASSED
  • PlateBasedReporterAssayTemplate.json: ✅ PASSED
  • NonBiologicalAssayDataTemplate.json: ✅ PASSED
  • ImmunoMicroscopyTemplate.json: ❌ FAILED
  • RecordBasedTemplate.json: ✅ PASSED
  • Superdataset.json: ✅ PASSED
  • EpigeneticsAssayTemplate.json: ❌ FAILED
  • UpdateMilestoneReport.json: ✅ PASSED
  • GeneticsAssayTemplate.json: ✅ PASSED
  • GenomicsAssayTemplate.json: ✅ PASSED
  • Template.json: ✅ PASSED
  • ProcessedGeneExpressionTemplate.json: ✅ PASSED
  • ProteomicsAssayTemplate.json: ✅ PASSED
  • MRIAssayTemplate.json: ✅ PASSED
  • ProteinAssayTemplate.json: ✅ PASSED
  • WESTemplate.json: ✅ PASSED
  • EpidemiologyDataTemplate.json: ✅ PASSED
  • PdxGenomicsAssayTemplate.json: ✅ PASSED
  • SourceCodeTemplate.json: ✅ PASSED
  • ProtocolTemplate.json: ✅ PASSED
  • BiologicalAssayDataTemplate.json: ✅ PASSED
  • BulkSequencingAssayTemplate.json: ✅ PASSED
  • MaterialScienceAssayTemplate.json: ✅ PASSED
  • GenomicsAssayTemplateExtended.json: ✅ PASSED
  • CellTissuePhenotypingTemplate.json: ✅ PASSED
  • HumanCohortTemplate.json: ✅ PASSED
  • PortalPublication.json: ✅ PASSED
  • PortalDataset.json: ✅ PASSED
  • ProcessedExpressionTemplate.json: ❌ FAILED
  • PartialTemplate.json: ✅ PASSED
  • ProteinInteractionAssayTemplate.json: ✅ PASSED
  • DataLandscape.json: ✅ PASSED
  • ProteinArrayTemplate.json: ❌ FAILED
  • MethylationArrayTemplate.json: ✅ PASSED
  • BehavioralAssayTemplate.json: ✅ PASSED
  • MassSpecAssayTemplate.json: ✅ PASSED
  • AnimalIndividualTemplate.json: ✅ PASSED
  • MicroscopyAssayTemplate.json: ✅ PASSED
  • WGSTemplate.json: ✅ PASSED
  • PharmacokineticsAssayTemplate.json: ❌ FAILED
  • KinomicsAssayTemplate.json: ✅ PASSED
  • LightScatteringAssayTemplate.json: ✅ PASSED
  • ElectrophysiologyAssayTemplate.json: ✅ PASSED
  • FileBasedTemplate.json: ✅ PASSED
  • ChIPSeqTemplate.json: ✅ PASSED
  • ProcessedMergedDataTemplate.json: ✅ PASSED
  • EpigenomicsAssayTemplate.json: ✅ PASSED
  • ClinicalAssayTemplate.json: ✅ PASSED
  • ImagingAssayTemplate.json: ✅ PASSED
  • ProcessedAlignedReadsTemplate.json: ✅ PASSED
  • RNASeqTemplate.json: ✅ PASSED

@github-actions
Copy link
Contributor

✅ Artifact Build Status

All artifacts have been successfully built and validated from source modules.

Artifacts validated:

  • NF.jsonld (schematic-compatible JSON-LD)
  • dist/NF.yaml (LinkML YAML)
  • dist/NF.ttl (Turtle RDF)
  • registered-json-schemas/*.json (Synapse JSON schemas)

Note: Artifacts are not committed to this PR to avoid merge conflicts. All artifacts will be automatically rebuilt and committed to main after merge.

@github-actions
Copy link
Contributor

Entity Counts

Main branch: 4035 entities

  • Classes: 56
  • Slots: 479
  • Enums: 109
  • Anonymous: 795
  • Other: 2596

Current branch: 4055 entities

  • Classes: 56
  • Slots: 482
  • Enums: 112
  • Anonymous: 795
  • Other: 2610

Difference: +20 entities

Slots

Added (3):

  • cellLineCategory (Cell Line Category)
  • cellLineGeneticDisorder (Cell Line Genetic Disorder)
  • modelSystemType (Model System Type)
Enums

Added (3):

  • CellLineCategoryEnum
  • CellLineGeneticDisorderEnum
  • ModelSystemTypeEnum
Triple Counts

Main branch: 18321 triples
Current branch: 18503 triples
Difference: +182 triples

Template Changes

Modified: 45/45 templates

Modified Templates (45)
  • AffinityProteomicsTemplate
  • BehavioralAssayTemplate
  • BiologicalAssayDataTemplate
  • BulkSequencingAssayTemplate
  • CellTissuePhenotypingTemplate
  • ChIPSeqTemplate
  • ClinicalAssayTemplate
  • ElectrophysiologyAssayTemplate
  • EpidemiologyDataTemplate
  • EpigenomicsAssayTemplate
  • FileBasedTemplate
  • FlowCytometryTemplate
  • GenericDataResourceTemplate
  • GeneticsAssayTemplate
  • GenomicsArrayTemplate
  • GenomicsAssayTemplate
  • GenomicsAssayTemplateExtended
  • ImagingAssayTemplate
  • KinomicsAssayTemplate
  • LightScatteringAssayTemplate
  • MRIAssayTemplate
  • MassSpecAssayTemplate
  • MaterialScienceAssayTemplate
  • MethylationArrayTemplate
  • MicroscopyAssayTemplate
  • NonBiologicalAssayDataTemplate
  • PdxGenomicsAssayTemplate
  • PlateBasedReporterAssayTemplate
  • ProcessedAlignedReadsTemplate
  • ProcessedGeneExpressionTemplate
  • ProcessedMergedDataTemplate
  • ProcessedVariantCallsTemplate
  • ProteinAssayTemplate
  • ProteinInteractionAssayTemplate
  • ProteomicsAssayTemplate
  • ProtocolTemplate
  • RNASeqTemplate
  • RecordBasedTemplate
  • ReferenceSequenceTemplate
  • ScRNASeqTemplate
  • ScSequencingAssayTemplate
  • SourceCodeTemplate
  • WESTemplate
  • WGSTemplate
  • WorkflowReport

Range Changes

Found 3 slots with semantic range changes

Range Change Details (3 slots)

cellLineCategory (Cell Line Category)

  • Added: CellLineCategoryEnum

cellLineGeneticDisorder (Cell Line Genetic Disorder)

  • Added: CellLineGeneticDisorderEnum

modelSystemType (Model System Type)

  • Added: ModelSystemTypeEnum

@github-actions
Copy link
Contributor

Test Suite Report 24.7.2

Template Generation

template result link
BehavioralAssayTemplate 😄 template link
ChIPSeqTemplate 😄 template link
ClinicalAssayTemplate 😄 template link
EpigeneticsAssayTemplate
FlowCytometryTemplate 😄 template link
GenomicsAssayTemplate 😄 template link
GenomicsAssayTemplateExtended 😄 template link
HumanCohortTemplate
ImagingAssayTemplate 😄 template link
LightScatteringAssayTemplate 😄 template link
MethylationArrayTemplate 😄 template link
MRIAssayTemplate
PharmacokineticsAssayTemplate
PlateBasedReporterAssayTemplate 😄 template link
ProcessedAlignedReadsTemplate 😄 template link
ProcessedExpressionTemplate
ProcessedVariantCallsTemplate 😄 template link
ProteomicsAssayTemplate 😄 template link
ProtocolTemplate 😄 template link
RNASeqTemplate 😄 template link
ScRNASeqTemplate 😄 template link
UpdateMilestoneReport 😄 template link
WESTemplate
WGSTemplate 😄 template link

Manifest Validation

manifest result expectation
GenomicsAssayTemplate_0.csv 😄 Lists can be blank if attr not required using ‘list like’ rule
GenomicsAssayTemplate_1.csv 😄 Mixing blanks and regular list values works
GenomicsAssayTemplate_2.csv 😄 Conditional validation for attributes is currently not supported
GenomicsAssayTemplate_control.csv 😄 There should be no issue with this template.
ScRNASeqTemplate_0.csv 😄 Single list val works by using ‘list like’ rule
ScRNASeqTemplate_1.csv 😄 Fail because of missing data in required field libraryStrand

Manifest Submission

## _Manifest submission tests are currently in revision due to system migration._

BelindaBGarana and others added 15 commits January 21, 2026 09:25
Resolves the "unhashable type: 'list'" error that occurred when creating
entity views from schemas with nullable fields (e.g., type: ['array', 'null']).

The issue occurred because the code expected 'type' to be a string, but JSON
Schema allows it to be a list for nullable fields. This is a standard pattern
for optional fields in JSON Schema draft-07.

Changes:
- Updated _get_column_type_from_js_property() to handle list types
- Updated _get_column_type_from_js_one_of_list() to handle list types
- When type is a list, extract the first non-null type
- Added inline documentation explaining nullable type handling

Testing:
- Verified with nullable string, array, and number types
- Successfully parses ImagingAssayTemplate.json with 29 columns
- Conditional enum filtering continues to work correctly

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Resolves the "Too much data per column" error (106,114 bytes > 64KB limit)
that occurred when creating entity views with many enum columns.

The issue occurred because setting enum_values on columns stores those values
as part of the column definition, consuming row size. With multiple columns
having large enum lists (platform: 54 values, dataType: 60+ values, tumorType:
51 values, etc.), the total exceeded Synapse's 64KB limit.

Solution:
- Removed all enum_values from column definitions in entity views
- The JSON Schema binding already provides all validation and UI features
- Setting enum_values on columns is redundant when schema is bound
- The curator grid uses the bound JSON Schema for dropdowns/filtering

Benefits:
- Entity views stay well under the 64KB row size limit
- No loss of functionality - schema binding provides all enum features
- Cleaner, more maintainable code
- Consistent with best practices for schema-bound entities

Testing:
- Verified no columns have enum_values set
- All 29 columns created successfully
- Schema binding continues to provide validation

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This fix resolves the persistent "Too much data per column" error by ensuring
that old file views with enum-heavy column definitions are deleted before
creating fresh views.

Problem:
- Previous runs created file views with enum_values set on columns
- Even after fixing the code to not set enum_values, the existing views
  (like syn72372628) still had the old column definitions
- When .store() was called, it tried to update the existing view
- Synapse still checked the row size including old enum values
- Result: 106,114 bytes > 64KB limit

Solution:
- Before creating a new file view, check if one with the same name exists
- If found, delete it to ensure a clean slate
- Then create the new view with clean column definitions (no enum_values)
- This guarantees each run gets a fresh view with minimal row size

Implementation:
- Use syn.findEntityId() to check for existing views by name
- Delete found views before creating new ones
- Handle exceptions gracefully if no existing view is found

This ensures that changes to column definitions (like removing enum_values)
take effect immediately on the next run.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Resolves the persistent "Too much data per column" error by reducing the
maximum_size settings for STRING and STRING_LIST columns.

Problem:
- Entity views include ~50 total columns (29 schema + 21 system columns)
- Previous settings: STRING=250, STRING_LIST=100
- With STRING_LIST potentially multiplied by max list length (~100),
  the cumulative row size exceeded 119KB
- Synapse's hard limit is 64KB per row

Root Cause Analysis:
- STRING columns with maximum_size=250 each
- STRING_LIST columns where size = maximum_size × max_list_length
- With 2 STRING_LIST columns at 100 bytes each × 100 items = 20KB just for lists
- Plus 40+ STRING columns at 250 bytes = 10KB+
- Plus system column overhead
- Total: well over 64KB

Solution:
- Reduced STRING maximum_size: 250 → 100 bytes
- Reduced STRING_LIST maximum_size: 100 → 50 bytes
- Reduced name column: 256 → 100 bytes

New Estimated Row Size:
- 26 STRING columns × 100 = 2,600 bytes
- 2 STRING_LIST columns × 50 × 100 = 10,000 bytes (worst case)
- Total schema columns: ~12,750 bytes
- With system columns: well under 64KB limit

These sizes are sufficient for typical metadata values:
- Most enum values and IDs fit comfortably in 100 chars
- Model system names fit in 50 chars
- JSON Schema validation still enforces data correctness

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…ver)

Previous run failed with 64,494 bytes (494 bytes over the 64,000 byte limit).

Adjusted maximum_size values:
- STRING: 100 → 80 bytes
- STRING_LIST: 50 → 40 bytes
- name column: 100 → 80 bytes

Expected savings:
- STRING columns: 20 bytes × ~40 columns = 800 bytes
- STRING_LIST columns: 10 bytes × 100 items × 2 = 2,000 bytes
- Total: ~2,800 bytes saved

New estimated row size: ~61,700 bytes (safely under 64KB limit)

These sizes remain sufficient for metadata:
- 80 chars accommodates most enum values and identifiers
- 40 chars per list item works for model system names
- JSON Schema validation ensures data correctness

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Resolves "404 Client Error: Entity in trash can" error that occurs when
trying to create a CurationTask with a data_type that already exists.

Problem:
- When creating a CurationTask with a data_type that matches an existing
  or trashed task (e.g., ProteomicsAssay-syn70784418), Synapse fails
- Error: "Entity syn70792001 is in trash can"
- The old task (even if deleted) conflicts with the new task creation

Solution:
- Before creating a new CurationTask, check for existing tasks in the project
- If a task with the same data_type exists, delete it
- This ensures a clean slate for creating new tasks

Implementation:
- Use CurationTask.get_all(project_id) to list all tasks
- Filter by matching data_type
- Call .delete() on matching tasks
- Handle exceptions gracefully if no tasks exist or deletion fails

This mirrors the fix we applied for file views and ensures that
re-running the workflow on the same folder always succeeds.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Resolves missing dropdown values for individualID in Synapse curator grid.

Problem:
- individualID field had no enum values, appearing as free-text in curator grid
- For cell lines, individualID should reference syn51730943 (same as modelSystemName)
- The conditional enum filtering only applied to modelSystemName, not individualID

Solution:
- Updated add_conditional_enum_filtering.py to apply the same conditional logic
  to both modelSystemName and individualID fields
- Both fields now share the same filtered enum references in the then clauses
- Regenerated all 38 JSON schemas with individualID conditional filtering

Technical Details:
- Modified create_conditional_rule() to add individualID alongside modelSystemName
- Both fields reference the same $defs (e.g., CellLineHomosapiensCancercelllineNeurofibromatosistype1Enum)
- Updated docstrings and checks to handle both fields
- Each schema now has 28 conditional rules filtering both fields based on:
  * modelSystemType (cell line, animal model, organoid, PDX)
  * modelSpecies (Homo sapiens, Mus musculus, etc.)
  * cellLineCategory (Cancer, iPSC, etc.)
  * cellLineGeneticDisorder (NF1, NF2, etc.)

This ensures that when users select model system attributes, both modelSystemName
and individualID dropdowns show only relevant options (< 100 values each),
staying within Synapse's enum limit while providing full functionality.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Resolves import error: "cannot import name 'CurationTaskStatus'"

Problem:
- Added code to delete existing curation tasks before creating new ones
- Included an unnecessary import: from synapseclient.models.curation import CurationTaskStatus
- CurationTaskStatus doesn't exist or isn't exported from that module
- This caused the cleanup logic to fail with an import error
- As a result, old tasks weren't deleted, leading to "Entity in trash can" errors

Solution:
- Removed the unnecessary import statement
- CurationTask is already imported at the top of the file
- CurationTask.get_all() works without needing CurationTaskStatus

This allows the cleanup logic to run successfully and delete any existing
curation tasks before creating new ones.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Resolves AttributeError: "type object 'CurationTask' has no attribute 'get_all'"

Problem:
- Used CurationTask.get_all() which doesn't exist
- This caused the cleanup logic to fail
- Old curation tasks remained, causing "Entity in trash can" errors

Solution:
- Changed to CurationTask.list(project_id=project_id)
- This is the correct API method to retrieve curation tasks

Verified by inspecting CurationTask class methods - list() is the proper
method for retrieving all curation tasks for a project.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Resolves missing dropdowns for filter fields like modelSystemType,
cellLineCategory, and cellLineGeneticDisorder in the curator grid.

Problem:
- JSON Schema binding provides validation but NOT dropdown UI
- Users couldn't see valid values for filter fields
- Without filter dropdowns, conditional filtering for modelSystemName
  and individualID couldn't be triggered

Solution:
- Add enum_values back for fields with < 20 enum values
- Skip fields with conditional filtering (modelSystemName, individualID)
- This provides dropdown UI while staying under 64KB row limit

Fields that now have dropdowns (< 20 values):
- modelSystemType (4 values)
- cellLineCategory (10 values)
- cellLineGeneticDisorder (5 values)
- species (11 values)
- modelSpecies (11 values)
- ageUnit, modelAgeUnit (7 values each)
- diagnosis (16 values)
- dataSubtype (6 values)
- resourceType (8 values)

Fields that remain without column enum_values (rely on conditional logic):
- modelSystemName (809 values → filtered by conditionals)
- individualID (no base enum → needs investigation)

Row size impact: ~1,500 bytes added (still well under 64KB limit)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Reorders file view columns to improve data entry workflow.

Order logic:
1. Essential fields (id, name)
2. Filter fields for conditional enum filtering (modelSystemType, modelSpecies, etc.)
3. Conditional fields that depend on filters (modelSystemName, individualID)
4. Core sample metadata (species, sex, age, diagnosis, etc.)
5. Assay metadata (assay, platform, fileFormat, etc.)
6. Data classification (dataType, dataSubtype, resourceType)
7. IDs and references (specimenID, antibodyID, geneticReagentID, etc.)
8. Genotypes (nf1Genotype, nf2Genotype)
9. Comments

This ensures:
- Filter fields appear first so users can select them to trigger conditional dropdowns
- Related fields are grouped together
- Workflow follows logical data entry progression
- Any fields not in the priority list appear at the end

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Changes modelSystemType, modelSpecies, cellLineCategory, and
cellLineGeneticDisorder from optional to required fields.

Rationale:
- These fields control conditional enum filtering for modelSystemName and individualID
- Users must fill them in sequence for dropdowns to appear
- Making them required ensures correct workflow and prevents confusion
- Forces users to provide complete model system metadata

Workflow sequence:
1. modelSystemType (required) → selects model category
2. modelSpecies (required) → selects species
3. cellLineCategory (required for cell lines) → selects cell line type
4. cellLineGeneticDisorder (required for cell lines) → selects disorder
5. Then modelSystemName and individualID show filtered dropdowns (< 100 values)

Note: This may require schema updates for assays that don't use model systems.
Consider if conditional requirements (only when modelSystemType is set) would
be more appropriate.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…tionals

Regenerated all JSON schemas with the following critical updates:

1. Filter fields now marked as REQUIRED:
   - modelSystemType (required)
   - modelSpecies (required)
   - cellLineCategory (required)
   - cellLineGeneticDisorder (required)

2. individualID now has conditional enum filtering:
   - Added 28 conditional rules that filter dropdown values
   - Same filtering logic as modelSystemName
   - Dropdowns will show < 100 values based on filter selections

3. Updated dist/NF.yaml:
   - Merged latest changes from modules/props.yaml
   - Reflects required status of filter fields

Expected behavior in curator grid:
1. User must fill modelSystemType first (required dropdown)
2. User must fill modelSpecies (required dropdown)
3. User must fill cellLineCategory (required dropdown for cell lines)
4. User must fill cellLineGeneticDisorder (required dropdown for cell lines)
5. Then individualID and modelSystemName show filtered dropdowns (< 100 values)

This ensures conditional filtering always triggers and users follow the
correct workflow sequence.

Generated with:
- make NF.yaml
- python utils/gen-json-schema-class.py
- python utils/add_conditional_enum_filtering.py

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Documents that schema binding uses registered URI from main branch during
testing, but column enum values will provide dropdowns for filter fields.

For true local testing, schemas would need to be:
1. Registered to Synapse from this branch, or
2. Merge to main to trigger automatic registration

Current approach relies on:
- Column enum_values for filter field dropdowns (< 20 values)
- File view column definitions to show fields in curator grid

The bound schema (from main) may differ from local changes, but the
critical filter fields (modelSystemType, cellLineCategory, etc.) should
show dropdowns from their column enum_values.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@BelindaBGarana BelindaBGarana marked this pull request as draft January 21, 2026 20:20
Undoing Option 1 (local schema file) approach.
Restoring original schema binding behavior that uses registered URIs.

Preparing for Option 2: selective schema registration for testing.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

100 enum value limit

2 participants