Project Summary: PyBIDS to bids2table Migration Analysis

What We Did

  1. Set up analysis repository with 10 major neuroimaging projects as submodules
  2. Analyzed PyBIDS usage across 8 projects (2 don't use PyBIDS)
  3. Identified usage patterns for 14 different PyBIDS methods/features
  4. Created comprehensive migration guide with three migration paths
  5. Designed compatibility layer architecture for bids2table
  6. Developed implementation plan with phased approach and timeline

Key Findings

Usage Statistics (8 projects analyzed)

  • Total PyBIDS method calls: ~145 occurrences
  • Projects using PyBIDS: 8/10 (80%)
    • Using: fmriprep, smriprep, nibabies, mriqc, qsiprep, fitlins, niworkflows, templateflow
    • Not using: neurosynth, bids-apps-example

Top Methods by Frequency

  1. BIDSLayout() - 51 uses (100% of PyBIDS projects)
  2. layout.get_metadata() - 35 uses (50% of projects)
  3. layout.get() - 34 uses (75% of projects)
  4. layout.get_sessions() - 8 uses (38% of projects)
  5. layout.get_subjects() - 7 uses (63% of projects)

Core vs Specialized

  • Critical methods (Phase 1): 83% of all usage
  • High-value methods (Phase 2): 14% of usage
  • Specialized methods (Phase 3+): 3% of usage

Insight: Supporting the 5-6 core methods in Phases 1 and 2 covers 97% of observed usage (83% + 14%).
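As a quick check on these figures, the Phase 1 share can be recomputed from the per-method counts listed under "Top Methods by Frequency". This sketch assumes that Phase 1 consists of BIDSLayout(), layout.get(), and layout.get_metadata(), and it treats the approximate total of 145 calls as exact.

```python
# Call counts copied from "Top Methods by Frequency".
TOTAL_CALLS = 145  # the summary reports ~145; treated as exact for this check

PHASE1_COUNTS = {
    "BIDSLayout()": 51,
    "layout.get_metadata()": 35,
    "layout.get()": 34,
}
OTHER_TOP_COUNTS = {
    "layout.get_sessions()": 8,
    "layout.get_subjects()": 7,
}


def share(count, total=TOTAL_CALLS):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


phase1_calls = sum(PHASE1_COUNTS.values())
top5_calls = phase1_calls + sum(OTHER_TOP_COUNTS.values())

phase1_share = share(phase1_calls)
top5_share = share(top5_calls)

print(f"Phase 1 methods: {phase1_calls} calls, {phase1_share}%")
print(f"Top five methods: {top5_calls} calls, {top5_share}%")
```

The three Phase 1 methods account for 120 of the ~145 calls, which is where the 83% figure comes from.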

Recommendations

Strategy: Three-Path Migration

Path 1: Compatibility Layer (Fastest - 1 line change)

# Change this:
from bids.layout import BIDSLayout

# To this:
from bids2table.compat import BIDSLayout

# Everything else stays the same!

Best for: Quick wins, large codebases, conservative teams
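For teams that want to trial the swap without committing to it, one possible pattern is an import fallback. The helper below is a hypothetical sketch (`import_first` is not part of either library), and `bids2table.compat` is the module proposed in this summary, not something that exists today.

```python
import importlib


def import_first(candidates, attr):
    """Return `attr` from the first module in `candidates` that imports and provides it."""
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Module (or its parent package) is not installed; try the next one.
            continue
        if hasattr(module, attr):
            return getattr(module, attr)
    raise ImportError(f"none of {list(candidates)} provides {attr!r}")


# Intended use once the proposed compat module exists (assumption):
# BIDSLayout = import_first(["bids2table.compat", "bids.layout"], "BIDSLayout")
```

Listing the compat module first means a pipeline picks it up when it is installed and falls back to PyBIDS otherwise.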

Path 2: Native bids2table (Best Performance)

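import bids2table as b2t

# Index the dataset once, then convert the index to a pandas DataFrame
tab = b2t.index_dataset('/path/to/dataset')
df = tab.to_pandas()

# Query with pandas: T1w files for subject 01
# ('file_path' is the column name assumed here; confirm it against df.columns)
files = df[(df['sub'] == '01') & (df['suffix'] == 'T1w')]['file_path'].tolist()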

Best for: New code, performance-critical applications, pandas-savvy developers

Path 3: Hybrid (Pragmatic)

Use the compat layer for complex operations (such as fieldmap association) and native b2t for simple queries.

Best for: Gradual migration, learning curve management

Deliverables

Documentation Created

  1. PYBIDS_USAGE_ANALYSIS.md (9 methods analyzed)

    • Detailed analysis of each PyBIDS method
    • Usage patterns and examples
    • Common parameters and metadata fields
    • Project-specific notes
    • Priority rankings
  2. MIGRATION_GUIDE.md (Complete migration reference)

    • Method-by-method migration instructions
    • Three approaches for each method (Old → Compat → Native)
    • Advanced patterns (caching, derivatives, multi-dataset)
    • Compatibility layer implementation sketch
    • Performance comparisons
    • Testing strategies
  3. IMPLEMENTATION_PLAN.md (4-week execution plan)

    • Architecture and design principles
    • Phased implementation (5 phases)
    • Testing strategy
    • Success criteria
    • Risk assessment
    • Timeline and milestones
  4. UPDATED_ANALYSIS.md (Post-validation)

    • Analysis of 3 additional repositories
    • New methods discovered (Query.NONE, Query.ANY, etc.)
    • Validation of approach
    • Updated priorities
    • Recommendations

Architecture Design

Proposed bids2table.compat Module

bids2table/
└── compat/                  (NEW - optional import)
    ├── __init__.py          (exports BIDSLayout, Query, BIDSFile)
    ├── layout.py            (BIDSLayout wrapper class)
    ├── query.py             (Query.OPTIONAL, Query.NONE, Query.ANY)
    ├── bidsfile.py          (BIDSFile wrapper for get_entities())
    └── fieldmaps.py         (Complex fieldmap association logic)

Design Principles

  1. Optional, not core - Separate submodule, not part of main API
  2. Thin wrapper - Delegates to native b2t functions
  3. Educational - Shows both compat and native approaches
  4. Performance-conscious - Leverages b2t's speed (~20x faster)
  5. Deprecation path - Easy to sunset when PyBIDS retires
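To make the "thin wrapper" idea concrete, here is a minimal sketch of how a PyBIDS-style `.get()` with `Query` sentinels could sit on top of a tabular index. Everything in it is hypothetical: the class name `CompatLayout`, the short entity keys (`sub`, `ses`), and the list of dicts standing in for the native b2t index are illustration only, not the proposed implementation.

```python
from enum import Enum


class Query(Enum):
    """Sentinels mirroring the Query helpers named in the module layout."""
    NONE = "none"          # entity must be absent
    ANY = "any"            # entity must be present, with any value
    OPTIONAL = "optional"  # entity may be present or absent


class CompatLayout:
    """Hypothetical wrapper: PyBIDS-style filtering over an in-memory index."""

    def __init__(self, records):
        # Each record is a dict of entities plus a "path" key.
        self._records = list(records)

    def get(self, **filters):
        """Return paths of records matching every entity filter."""
        return [r["path"] for r in self._records if self._matches(r, filters)]

    @staticmethod
    def _matches(record, filters):
        for entity, wanted in filters.items():
            value = record.get(entity)
            if wanted is Query.OPTIONAL:
                continue
            if wanted is Query.NONE:
                if value is not None:
                    return False
            elif wanted is Query.ANY:
                if value is None:
                    return False
            elif isinstance(wanted, (list, tuple, set)):
                if value not in wanted:
                    return False
            elif value != wanted:
                return False
        return True


records = [
    {"sub": "01", "ses": "pre", "suffix": "T1w",
     "path": "sub-01/ses-pre/anat/sub-01_ses-pre_T1w.nii.gz"},
    {"sub": "01", "ses": None, "suffix": "bold",
     "path": "sub-01/func/sub-01_task-rest_bold.nii.gz"},
    {"sub": "02", "ses": None, "suffix": "T1w",
     "path": "sub-02/anat/sub-02_T1w.nii.gz"},
]
layout = CompatLayout(records)
no_session_t1w = layout.get(suffix="T1w", ses=Query.NONE)
```

Keeping the wrapper to plain filtering like this is what would make it easy to sunset later: all state lives in the index, not in the compat class.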

Implementation Roadmap

Phase 1: Core Infrastructure (Week 1) - CRITICAL

  • BIDSLayout class with initialization and caching
  • .get() method with entity filtering
  • .get_metadata() wrapper
  • Basic tests

Deliverable: MVP that enables simple migrations
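The `.get_metadata()` wrapper is the least trivial Phase 1 item, because BIDS metadata follows the inheritance principle: JSON sidecars higher in the directory tree apply to files below them, and more specific sidecars override more general ones. The sketch below shows a simplified version of that merge using only the standard library. The function names are hypothetical, and the matching rule (same suffix, sidecar entities a subset of the file's entities) is a simplification of the full specification.

```python
import json
from pathlib import Path


def split_name(path):
    """Split 'sub-01_task-rest_bold.nii.gz' into ({'sub': '01', 'task': 'rest'}, 'bold')."""
    stem = Path(path).name.split(".", 1)[0]
    *pairs, suffix = stem.split("_")
    entities = dict(p.split("-", 1) for p in pairs if "-" in p)
    return entities, suffix


def inherited_metadata(data_file, root):
    """Merge applicable JSON sidecars from the dataset root down to the file's directory."""
    data_file, root = Path(data_file), Path(root)
    entities, suffix = split_name(data_file)

    # Directories from the dataset root down to the file's own directory.
    parts = data_file.parent.relative_to(root).parts
    directories = [root.joinpath(*parts[:i]) for i in range(len(parts) + 1)]

    merged = {}
    for directory in directories:
        candidates = []
        for sidecar in directory.glob(f"*{suffix}.json"):
            side_entities, side_suffix = split_name(sidecar)
            # A sidecar applies if every entity it names matches the data file.
            if side_suffix == suffix and side_entities.items() <= entities.items():
                candidates.append((len(side_entities), sidecar))
        # Within a directory, apply less specific sidecars first.
        for _, sidecar in sorted(candidates):
            merged.update(json.loads(sidecar.read_text()))
    return merged
```

Because deeper and more specific sidecars are applied last, their values win, which is the behavior a compat wrapper would need to preserve.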

Phase 2: Entity Access (Week 1-2) - HIGH VALUE

  • BIDSFile class with .get_entities()
  • .get_subjects() and .get_sessions()
  • Query helpers (OPTIONAL, NONE, ANY)
  • Integration tests

Deliverable: Full support for entity-based workflows
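As an illustration of the entity-access surface, the sketch below parses entities out of a BIDS filename and derives a unique-value listing from them. `BIDSFileSketch` and `unique_values` are hypothetical names, and the returned keys use the short BIDS forms (`sub`, `ses`) instead of the long names PyBIDS reports, so a real compat layer would need a key mapping on top.

```python
import re
from dataclasses import dataclass

# One "key-value" pair inside a BIDS filename, e.g. "sub-01" or "task-rest".
ENTITY_RE = re.compile(r"(?P<key>[A-Za-z]+)-(?P<value>[A-Za-z0-9+]+)")


@dataclass(frozen=True)
class BIDSFileSketch:
    """Hypothetical stand-in for a BIDSFile wrapper."""
    path: str

    def get_entities(self):
        """Parse entities, suffix, and extension from the filename."""
        name = self.path.rsplit("/", 1)[-1]
        stem, _, ext = name.partition(".")
        *pairs, suffix = stem.split("_")
        entities = {}
        for pair in pairs:
            match = ENTITY_RE.fullmatch(pair)
            if match:
                entities[match["key"]] = match["value"]
        entities["suffix"] = suffix
        entities["extension"] = f".{ext}" if ext else ""
        return entities


def unique_values(files, entity):
    """Sorted unique values of one entity, e.g. the subject list for entity='sub'."""
    values = set()
    for f in files:
        found = f.get_entities().get(entity)
        if found is not None:
            values.add(found)
    return sorted(values)


files = [
    BIDSFileSketch("sub-01/ses-pre/func/sub-01_ses-pre_task-rest_bold.nii.gz"),
    BIDSFileSketch("sub-02/anat/sub-02_T1w.nii.gz"),
]
subjects = unique_values(files, "sub")
sessions = unique_values(files, "ses")
```

In a real implementation these listings would more likely come from unique values of the index columns than from re-parsing filenames; the parsing here only keeps the example self-contained.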

Phase 3: Specialized Features (Week 2-3) - NICE TO HAVE

  • .get_fieldmap() - Complex fieldmap association
  • .build_path() - Path construction
  • .get_fmapids() - Fieldmap ID retrieval
  • Advanced tests

Deliverable: Feature parity with common PyBIDS usage
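Fieldmap association is flagged as the complex piece of this phase. A simple starting point, in the spirit of "start simple, iterate", is matching on the `IntendedFor` metadata field, which lists target files relative to the subject directory. The sketch below is a hypothetical helper under that assumption; it ignores BIDS URIs and the `B0FieldIdentifier`/`B0FieldSource` mechanism, both of which a complete implementation would need to handle.

```python
def fieldmaps_for(target, fmap_metadata, subject_dir):
    """Return fieldmap paths whose IntendedFor entry lists `target`.

    target:        path of the file to correct, e.g. "sub-01/func/..._bold.nii.gz"
    fmap_metadata: dict mapping fieldmap path -> its sidecar metadata dict
    subject_dir:   subject directory that IntendedFor paths are relative to
    """
    prefix = subject_dir.rstrip("/") + "/"
    relative_target = target[len(prefix):] if target.startswith(prefix) else target

    matches = []
    for fmap_path, metadata in fmap_metadata.items():
        intended = metadata.get("IntendedFor", [])
        # IntendedFor may be a single string or a list of strings.
        if isinstance(intended, str):
            intended = [intended]
        if relative_target in intended:
            matches.append(fmap_path)
    return sorted(matches)


fmap_metadata = {
    "sub-01/fmap/sub-01_phasediff.nii.gz": {
        "IntendedFor": ["func/sub-01_task-rest_bold.nii.gz"],
    },
    "sub-01/fmap/sub-01_dir-AP_epi.nii.gz": {
        "IntendedFor": "dwi/sub-01_dwi.nii.gz",
    },
}
bold_fmaps = fieldmaps_for(
    "sub-01/func/sub-01_task-rest_bold.nii.gz", fmap_metadata, "sub-01"
)
```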

Phase 4: Polish & Testing (Week 3-4)

  • Performance optimization
  • Comparison tests with PyBIDS
  • Documentation finalization
  • Real-world validation (niworkflows, fmriprep)

Deliverable: Production-ready compatibility layer
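The comparison tests could take the form of a parity check that runs the same queries against a reference layout and a candidate layout and reports any disagreement. The sketch below assumes only that both objects expose `.get(**filters)` returning a list of paths; `compare_layouts` and the `ListLayout` stand-in are hypothetical and exist just to make the example runnable without either library installed.

```python
class ListLayout:
    """Stand-in layout for demonstration: filters a list of entity dicts."""

    def __init__(self, records):
        self._records = list(records)

    def get(self, **filters):
        return [
            r["path"]
            for r in self._records
            if all(r.get(key) == value for key, value in filters.items())
        ]


def compare_layouts(reference, candidate, queries):
    """Return one report per query where the two layouts disagree."""
    mismatches = []
    for query in queries:
        expected = set(reference.get(**query))
        actual = set(candidate.get(**query))
        if expected != actual:
            mismatches.append({
                "query": query,
                "missing": sorted(expected - actual),
                "unexpected": sorted(actual - expected),
            })
    return mismatches


records = [
    {"sub": "01", "suffix": "T1w", "path": "sub-01/anat/sub-01_T1w.nii.gz"},
    {"sub": "02", "suffix": "T1w", "path": "sub-02/anat/sub-02_T1w.nii.gz"},
]
reference = ListLayout(records)
candidate = ListLayout(records[:1])  # deliberately drops subject 02

report = compare_layouts(
    reference, candidate, [{"suffix": "T1w"}, {"sub": "01"}]
)
```

Comparing as sets keeps the check independent of result ordering, which may legitimately differ between the two implementations.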

Success Metrics

Minimum Viable Product (1 week)

  • Drop-in replacement for 80% of PyBIDS usage
  • Core methods working (Layout, .get(), .get_metadata())
  • Unit tests pass
  • Basic documentation

Full Feature Set (4 weeks)

  • 95%+ PyBIDS method coverage
  • Fieldmap association working
  • All integration tests pass
  • Performance >10x faster than PyBIDS
  • At least one pipeline migrated successfully
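The ">10x faster" criterion needs a repeatable measurement. One minimal approach, sketched here with the standard library's `timeit`, is to time two zero-argument callables and report the ratio of their best runs. The `speedup` helper is hypothetical, and the two lambdas are placeholder workloads standing in for "index with PyBIDS" and "index with b2t".

```python
import timeit


def speedup(baseline, candidate, repeat=5, number=10):
    """Return how many times faster `candidate` runs than `baseline`.

    Uses the minimum over `repeat` trials, which is less sensitive to
    background noise than the mean.
    """
    baseline_time = min(timeit.repeat(baseline, repeat=repeat, number=number))
    candidate_time = min(timeit.repeat(candidate, repeat=repeat, number=number))
    return baseline_time / candidate_time


# Placeholder workloads; replace with the real indexing calls when benchmarking.
slow_workload = lambda: sum(range(200_000))
fast_workload = lambda: sum(range(1_000))

ratio = speedup(slow_workload, fast_workload)
print(f"candidate is {ratio:.1f}x faster than baseline")
```

For a fair comparison, both callables should include the same work (for example, cold indexing with no cache on either side).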

Adoption Goals (3-6 months)

  • niworkflows migrated (affects all nipreps pipelines)
  • 3+ pipelines using compat layer in production
  • Performance benchmarks published
  • Community feedback incorporated

Next Steps

Immediate (This Week)

  1. ✅ Complete usage analysis (DONE)
  2. ✅ Validate approach with additional repos (DONE)
  3. ✅ Create migration guide (DONE)
  4. ✅ Design implementation plan (DONE)
  5. 🔲 Present findings to b2t maintainers
  6. 🔲 Get feedback on compat layer approach

Short-term (Weeks 1-2)

  1. 🔲 Create feature branch: feat/pybids-compat
  2. 🔲 Implement Phase 1 (MVP)
  3. 🔲 Write unit tests
  4. 🔲 Get early user feedback

Medium-term (Weeks 3-4)

  1. 🔲 Implement Phases 2-3 (full features)
  2. 🔲 Integration testing with real pipelines
  3. 🔲 Performance benchmarking
  4. 🔲 Documentation completion

Long-term (Months 1-3)

  1. 🔲 Merge to main branch
  2. 🔲 Release as optional extra: pip install bids2table[compat]
  3. 🔲 Migrate niworkflows (highest leverage)
  4. 🔲 Support other pipelines in migration
  5. 🔲 Gather adoption metrics

Risk Mitigation

Technical Risks

  • Fieldmap logic complexity: Start simple, iterate based on real needs
  • API differences: Comprehensive testing, clear documentation of differences
  • Performance overhead: Profile and optimize, keep wrapper thin

Adoption Risks

  • User resistance: Provide multiple migration paths, excellent docs
  • Maintenance burden: Keep compat layer minimal, deprecate eventually
  • Breaking changes: Version pin dependencies, thorough testing

Resources

Repository Structure

b2t-pybids/
├── README.md                           (Project overview)
├── PYBIDS_USAGE_ANALYSIS.md           (Detailed method analysis)
├── MIGRATION_GUIDE.md                  (Complete migration reference)
├── IMPLEMENTATION_PLAN.md              (Execution roadmap)
├── UPDATED_ANALYSIS.md                 (Post-validation findings)
├── SUMMARY.md                          (This file)
├── projects/                           (10 analyzed projects + 2 library submodules)
│   ├── fmriprep/
│   ├── smriprep/
│   ├── nibabies/
│   ├── mriqc/
│   ├── qsiprep/
│   ├── fitlins/
│   ├── niworkflows/
│   ├── templateflow/
│   ├── neurosynth/
│   ├── bids-apps-example/
│   ├── pybids/                        (Reference implementation)
│   └── bids2table/                    (Target library)
└── datasets/
    └── bids-examples/                  (Test datasets)

Key Documents

  • Analysis: PYBIDS_USAGE_ANALYSIS.md, UPDATED_ANALYSIS.md
  • Migration: MIGRATION_GUIDE.md
  • Implementation: IMPLEMENTATION_PLAN.md
  • Testing: Use datasets/bids-examples/* for validation

External References

Questions for Stakeholders

For b2t Maintainers

  1. Is a compat layer acceptable in the main repo?
  2. Alternative: separate package bids2table-pybids-compat?
  3. Should fieldmap logic be in core b2t or compat only?
  4. Preferred caching convention (location, naming)?
  5. Can we add PyBIDS as optional test dependency?

For Pipeline Maintainers (nipreps, fitlins, etc.)

  1. Would you adopt a compat layer if it's drop-in compatible?
  2. Timeline for migration (urgent, nice-to-have, not interested)?
  3. Any PyBIDS features we missed in analysis?
  4. Performance requirements (how fast is fast enough)?
  5. Support needs (documentation, migration help)?

Conclusion

This analysis demonstrates:

  1. Feasibility: 97% of PyBIDS usage is covered by 5-6 core methods
  2. Value: 20x performance improvement possible with b2t
  3. Strategy: Three-path approach maximizes adoption
  4. Roadmap: Clear 4-week implementation plan
  5. Impact: Migrating niworkflows affects all downstream pipelines

Recommendation: Proceed with the compat layer implementation. Start with the MVP (Week 1), gather feedback, and iterate based on real-world testing.

Expected Outcome: A drop-in PyBIDS replacement that is faster and simpler, and that lays the foundation for eventual PyBIDS retirement.


Status: ✅ Analysis Phase Complete
Next: Present to b2t maintainers, get green light for implementation
Timeline: 4 weeks to production-ready compat layer
Impact: Enable migration of 8+ major neuroimaging pipelines