
Update imaging-data-commons skill from v1.4.0 to v1.6.3 upstream #158

Open
fedorov wants to merge 2 commits into K-Dense-AI:main from fedorov:update-idc-skill-v1.6.2

Conversation

fedorov (Contributor) commented May 8, 2026

[1.6.3] - 2026-05-09

Added

  • ct_index, mr_index, pt_index tables (idc-index 0.12.3 / idc-index-data 24.2.0): modality-specific acquisition and reconstruction parameter indices, one row per series, all joining on SeriesInstanceUID
    • ct_index (21 columns): pixel spacing, slice thickness, kVp, convolution kernel, tube current min/max (dose-modulated), exposure, spiral pitch, scan options
    • mr_index (22 columns): field strength, scanning sequence, TE (array for multi-echo), TR, flip angle, DiffusionBValue (array for DWI), pixel bandwidth, receive coil, number of temporal positions
    • pt_index (21 columns): radionuclide, injected dose, reconstruction method, decay/scatter/attenuation correction, frame duration (array for dynamic PET), number of time slices
  • SQL query patterns for all three new tables in references/sql_patterns.md
  • Join column entries for ct_index, mr_index, pt_index in references/index_tables_guide.md and SKILL.md
  • Parquet file entries for ct_index.parquet, mr_index.parquet, pt_index.parquet in references/parquet_access_guide.md
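As an illustration of the join described above, here is a minimal sketch that uses Python's built-in sqlite3 module with toy tables standing in for index and ct_index. Only SeriesInstanceUID, SliceThickness, and KVP are names taken from this changelog; the Modality column, the sample values, and the use of SQLite in place of the DuckDB engine behind idc-index are assumptions made to keep the example self-contained.

```python
import sqlite3

# Toy stand-ins for the real tables. "index" is quoted because it is a
# reserved word in SQLite; the sample rows are purely illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE "index" (SeriesInstanceUID TEXT PRIMARY KEY, Modality TEXT);
    CREATE TABLE ct_index (
        SeriesInstanceUID TEXT PRIMARY KEY,
        SliceThickness REAL,
        KVP REAL
    );
    INSERT INTO "index" VALUES ('1.2.3', 'CT'), ('1.2.4', 'MR'), ('1.2.5', 'CT');
    INSERT INTO ct_index VALUES ('1.2.3', 1.25, 120.0), ('1.2.5', 5.0, 100.0);
    """
)

# Acquisition parameters live in ct_index, so the primary index is joined
# to it on SeriesInstanceUID (one row per series on both sides).
THIN_SLICE_CT = """
    SELECT i.SeriesInstanceUID, c.SliceThickness, c.KVP
    FROM "index" AS i
    JOIN ct_index AS c ON i.SeriesInstanceUID = c.SeriesInstanceUID
    WHERE c.SliceThickness <= 2.0
    ORDER BY i.SeriesInstanceUID
"""
rows = conn.execute(THIN_SLICE_CT).fetchall()
print(rows)
```

The same join shape should carry over to mr_index and pt_index, since all three tables are keyed on SeriesInstanceUID.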

Changed

  • Added concrete indices_overview code example showing how to search for a column across all tables and read column schemas without fetching the table; directly addresses the failure mode where agents query index for modality-specific parameters (SliceThickness, KVP, etc.) instead of using ct_index/mr_index/pt_index
  • Added troubleshooting entry "Column not found in index table" with a working indices_overview search snippet and join example, covering common acquisition/reconstruction parameters that live in the modality-specific index tables
  • Updated idc-index reference to 0.12.3
  • Clarified download_from_selection API: added explicit warning that it takes filter keyword arguments (not a DataFrame), comparison table vs download_dicom_series (which has a different first-argument order), and restructured the download example as a step-by-step query → extract UIDs → pass list flow
  • Documented download_dicom_series as an alternative download method with its own signature (seriesInstanceUID as first arg, then downloadDir)
  • Reduced redundancy and duplication in SKILL.md for cleaner reading
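The query → extract UIDs → pass list flow mentioned above might look roughly like the sketch below. RecordingClient is a hypothetical stand-in that only records calls, so the example runs without idc-index or network access. The keyword names downloadDir and seriesInstanceUID reflect this changelog's description of the two methods and should be checked against the installed idc-index version.

```python
from typing import Iterable, List, Mapping


def extract_series_uids(rows: Iterable[Mapping[str, str]]) -> List[str]:
    """Collect unique SeriesInstanceUID values, preserving first-seen order."""
    seen = {}
    for row in rows:
        seen.setdefault(row["SeriesInstanceUID"], None)
    return list(seen)


def download_selection(client, rows, download_dir: str) -> List[str]:
    """Pass filter keyword arguments (a list of UIDs), not the query result."""
    uids = extract_series_uids(rows)
    client.download_from_selection(downloadDir=download_dir, seriesInstanceUID=uids)
    return uids


class RecordingClient:
    """Hypothetical stand-in for the idc-index client; it only records calls."""

    def __init__(self):
        self.calls = []

    def download_from_selection(self, **kwargs):
        self.calls.append(("download_from_selection", kwargs))

    def download_dicom_series(self, seriesInstanceUID, downloadDir):
        # Note the different order: the series UID comes first here.
        self.calls.append(("download_dicom_series", (seriesInstanceUID, downloadDir)))


# Rows as they might come back from a query (duplicates included on purpose).
query_rows = [
    {"SeriesInstanceUID": "1.2.3"},
    {"SeriesInstanceUID": "1.2.5"},
    {"SeriesInstanceUID": "1.2.3"},
]
client = RecordingClient()
uids = download_selection(client, query_rows, "./data")
client.download_dicom_series(uids[0], "./data")
```

With a pandas DataFrame as the query result, the rows could be produced with `df.to_dict("records")`, or the UID list taken directly from the SeriesInstanceUID column.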

[1.6.2] - 2026-05-08

Changed

  • Moved version_metadata_index to second position in Available Tables (right after index) to surface it alongside the primary index
  • Moved prior_versions_index to last position in Available Tables; updated description to clarify it contains only removed/superseded series and should not be queried for current data
  • Added explicit Best Practices rule prohibiting web search for IDC data content questions; idc-index DuckDB queries are always authoritative — web sources are stale
  • Removed "Loaded" column from Available Tables and replaced with an unconditional rule: always call client.fetch_index("table_name") before querying any table; fetch_index() is idempotent for all tables including auto-loaded ones, so no exceptions are needed
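A small wrapper can make the "always fetch before querying" rule hard to forget. The sketch below assumes the client exposes fetch_index (named in this changelog) and a sql_query method for running SQL; sql_query and the FakeClient class are assumptions used only so the example runs on its own.

```python
class FakeClient:
    """Hypothetical stand-in that logs calls instead of touching real data."""

    def __init__(self):
        self.log = []

    def fetch_index(self, table_name):
        self.log.append(("fetch_index", table_name))

    def sql_query(self, sql):
        self.log.append(("sql_query", sql))
        return []


def query_table(client, table_name, sql):
    """Fetch the table unconditionally, then run the query against it.

    Because fetch_index is described as idempotent, calling it every time
    is safe even for tables that are already loaded.
    """
    client.fetch_index(table_name)
    return client.sql_query(sql)


client = FakeClient()
result = query_table(client, "ct_index", "SELECT COUNT(*) FROM ct_index")
```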

[1.6.1] - 2026-05-08

Added

  • series_init_idc_version and series_revised_idc_version columns in primary index table (idc-index-data 24.1.0): expose the IDC version when each series was first added and last revised, enabling version-aware filtering
  • version_metadata_index table: maps each IDC version number to its release timestamp; requires client.fetch_index("version_metadata_index")
  • Tests for new index columns and version_metadata_index (61 total, up from 55)

Changed

  • Updated to idc-index 0.12.2 (idc-index-data 24.1.0); IDC data version remains v24
  • analysis_results_index column renames (idc-index-data 24.1.0): Updated → updated, Description → description

[1.6.0] - 2026-05-07

Added

  • tests/test_bq_snippets.py: BigQuery snippet validation using bq query --dry_run — 33 tests covering all SQL examples in references/bigquery_guide.md (dicom_all, original_collections_metadata, segmentations, quantitative_measurements, qualitative_measurements, private elements, and clinical tables); skips automatically when bq CLI is unavailable or unauthenticated
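One way such a dry-run check could be structured is sketched below. This is not the code in tests/test_bq_snippets.py. The function names are hypothetical, and unlike the real tests this sketch does not distinguish an unauthenticated bq CLI from an invalid query.

```python
import shutil
import subprocess


def bq_dry_run_command(sql):
    """Build a bq invocation that validates SQL without running the query."""
    return ["bq", "query", "--dry_run", "--use_legacy_sql=false", sql]


def validate_snippet(sql):
    """Return True or False for validity, or None when bq is not installed.

    A test runner would typically turn None into a skip rather than a failure.
    """
    if shutil.which("bq") is None:
        return None
    result = subprocess.run(bq_dry_run_command(sql), capture_output=True, text=True)
    return result.returncode == 0


command = bq_dry_run_command("SELECT 1")
```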

Security

  • Fixed auto-upgrade subprocess call to pin idc-index to REQUIRED_VERSION (was "idc-index", now f"idc-index=={REQUIRED_VERSION}"), ensuring the installed version always matches the tested version declared in the frontmatter
  • Added network access transparency note to Overview documenting expected external endpoints (GCS, S3, BigQuery, DICOMweb proxy, Google Healthcare API) and clarifying that no credentials or environment variables are accessed by the skill
  • Added tested-with version comment to optional dependency install block (pandas>=1.5, numpy>=1.23, pydicom>=2.3)
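The pinning fix can be pictured as follows. Only the f"idc-index=={REQUIRED_VERSION}" requirement string comes from this changelog; the surrounding pip invocation, the function name, and the version value shown are assumptions for illustration.

```python
import sys

# Illustrative value; the skill declares the tested version in its frontmatter.
REQUIRED_VERSION = "0.12.3"


def pinned_install_command(version=REQUIRED_VERSION):
    """Build a pip command that installs exactly the tested idc-index version."""
    # Before the fix the requirement was the bare string "idc-index", which
    # lets pip pick whatever release is newest at install time.
    return [sys.executable, "-m", "pip", "install", f"idc-index=={version}"]


command = pinned_install_command()
print(command[-1])
```

Such a list could be passed to `subprocess.run(command, check=True)`; building it as a list rather than a shell string avoids quoting issues.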

Changed

  • Updated frontmatter description to be directive about skill triggering: now explicitly instructs invocation for IDC-related queries even without the word "IDC" in the prompt
  • Extracted "Batch Processing and Filtering" (section 6) from SKILL.md to references/use_cases.md (Use Case 5); replaced inline code block with a 2-sentence summary and pointer
  • Extracted "Integration with Analysis Pipelines" (section 9) from SKILL.md to references/use_cases.md (Use Case 6); replaced inline pydicom/SimpleITK code blocks with a 2-sentence summary and pointer
  • SKILL.md reduced from 865 → 775 lines (−90 lines); references/use_cases.md expanded from 187 → 278 lines
  • Updated to idc-index 0.12.1 (idc-index-data 24.0.4, IDC data version v24)
  • IDC v24 adds 15 new collections (161 → 176), ~39K new series, ~4 TB new data (99.27 TB total, 85,682 cases)
  • Updated collections_index column names to snake_case (idc-index-data 24.0.0 breaking change):
    CancerTypes → cancer_types, TumorLocations → tumor_locations,
    Subjects → subjects, Species → species, Sources → sources,
    SupportingData → supporting_data, Program → program_id
  • Updated analysis_results_index column names to snake_case (idc-index-data 24.0.4 breaking change):
    Subjects → subjects, Collections → collections, Modalities → modalities
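For code written against the old column names, the two snake_case renames above can be captured as plain mappings. The helper below is a hypothetical migration aid, not part of the skill; the mappings themselves are taken from the lists above.

```python
# Old → new column names for collections_index (idc-index-data 24.0.0).
COLLECTIONS_INDEX_RENAMES = {
    "CancerTypes": "cancer_types",
    "TumorLocations": "tumor_locations",
    "Subjects": "subjects",
    "Species": "species",
    "Sources": "sources",
    "SupportingData": "supporting_data",
    "Program": "program_id",
}

# Old → new column names for analysis_results_index (idc-index-data 24.0.4).
ANALYSIS_RESULTS_INDEX_RENAMES = {
    "Subjects": "subjects",
    "Collections": "collections",
    "Modalities": "modalities",
}


def rename_columns(row, renames):
    """Return a copy of row with old keys replaced; unknown keys pass through."""
    return {renames.get(key, key): value for key, value in row.items()}


old_row = {"collection_id": "example", "CancerTypes": "Lung", "Program": "TCIA"}
new_row = rename_columns(old_row, COLLECTIONS_INDEX_RENAMES)
print(new_row)
```

With pandas, the same mappings can be applied through `df.rename(columns=COLLECTIONS_INDEX_RENAMES)`.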

[1.5.0] - 2026-04-08

Added

  • volume_geometry_index table documentation: 3D geometry validation for single-frame CT, MR, and PT series; boolean checks (orientation, spacing, dimensions, slice positions) and composite regularly_spaced_3d_volume flag; join via SeriesInstanceUID
  • rtstruct_index table documentation: RT Structure Set metadata (total ROIs, ROI names, generation algorithms, interpreted types, referenced image series UID); join via SeriesInstanceUID
  • New reference guide references/parquet_access_guide.md: direct DuckDB queries against public GCS Parquet files without installing idc-index; URL pattern, available files, and query examples for main index, volume_geometry_index, and rtstruct_index
  • SQL patterns for volume_geometry_index and rtstruct_index in references/sql_patterns.md
  • Detailed documentation for BigQuery-only derived tables in references/bigquery_guide.md:
    • segmentations: per-segment anatomy with full schema, column descriptions, and queries for discovering structures, filtering by coded concept, and linking to SR measurements; note on gap vs seg_index in idc-index
    • quantitative_measurements: radiomics and clinical numeric measurements from DICOM SR TID1500 (volume, diameter, shape descriptors, texture, intensity statistics); full schema with column descriptions and query examples
    • qualitative_measurements: coded assessments from DICOM SR TID1500 (malignancy rating, calcification, texture, margin); full schema with column descriptions and query examples
    • measurement_groups: parent grouping table for SR measurements
    • Combined example joining all three derived tables for LIDC-IDRI nodule analysis (malignancy + volume + diameter)
  • SKILL.md section 7 now explicitly lists per-segment anatomy search, quantitative SR measurements, and qualitative SR measurements as BigQuery-only use cases with no idc-index equivalent

Changed

  • Updated to idc-index 0.11.14 (idc-index-data 23.10.1)
  • Added SOPClassUID and TransferSyntaxUID columns to Key Columns Reference in references/index_tables_guide.md
  • Added Direct Parquet Access entry to Data Access Options table and pointer in SKILL.md
  • Added parquet_access_guide.md to Quick Navigation table in SKILL.md

fedorov changed the title from "Update imaging-data-commons skill from v1.4.0 to v1.6.2 upstream" to "Update imaging-data-commons skill from v1.4.0 to v1.6.3 upstream" on May 10, 2026