Title
BIDS-SQL: Persistent and Remote Metadata Querying
Short description and the goals for the OHBM BrainHack
This project addresses the scalability bottlenecks in current BIDS data handling. While pybids is the gold standard for single-study analysis, it lacks a native way to persist and query metadata across multiple large-scale datasets simultaneously. Currently, researchers must re-index data into memory for every session, which is slow and resource-heavy.
We are building a persistent BIDS metadata warehouse (using SQLite or PostgreSQL) designed to aggregate and index entire collections of datasets. Our goal is to enable complex, remote, SQL-based querying that allows you to search across massive cohorts (e.g., all T1w images across three different OpenNeuro studies) without needing to mirror the files locally or reload the index.
Brainhack Goals:
- Translate the current pybids layout into a robust SQL schema.
- Implement a persistent storage backend that eliminates costly startup re-indexing.
- Support metadata queries against remote BIDS data (stored on servers or in the cloud) without mirroring files locally.
- (Stretch goal) Optimize query paths for complex subsetting that are currently slow or difficult in existing tools.
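To make the goals above concrete, here is a minimal sketch of what a persistent, cross-dataset index could look like, using Python's built-in `sqlite3` module. The table names (`datasets`, `files`), the column names, the helper functions, and the sample rows are all hypothetical and chosen only for illustration. They do not reflect the actual BIDS-SQL schema or the pybids internals.

```python
import sqlite3

# Hypothetical schema (an assumption, not the project's actual design):
# one row per dataset, one row per file, with a few BIDS entities as columns.
SCHEMA = """
CREATE TABLE datasets (
    dataset_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE
);
CREATE TABLE files (
    file_id    INTEGER PRIMARY KEY,
    dataset_id INTEGER NOT NULL REFERENCES datasets(dataset_id),
    path       TEXT NOT NULL,
    subject    TEXT,
    session    TEXT,
    suffix     TEXT
);
CREATE INDEX idx_files_suffix ON files(suffix);
"""


def build_demo_db(db_path=":memory:"):
    """Create the illustrative schema and insert a few made-up rows.

    Passing a file path instead of ":memory:" would persist the index on
    disk, so later sessions could reuse it without re-indexing.
    """
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO datasets (dataset_id, name) VALUES (?, ?)",
        [(1, "study-a"), (2, "study-b")],
    )
    conn.executemany(
        "INSERT INTO files (dataset_id, path, subject, session, suffix) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            (1, "sub-01/anat/sub-01_T1w.nii.gz", "01", None, "T1w"),
            (1, "sub-01/func/sub-01_task-rest_bold.nii.gz", "01", None, "bold"),
            (2, "sub-01/ses-01/anat/sub-01_ses-01_T1w.nii.gz", "01", "01", "T1w"),
        ],
    )
    conn.commit()
    return conn


def find_files(conn, suffix):
    """Return (dataset name, file path) pairs for one suffix across all datasets."""
    cursor = conn.execute(
        "SELECT d.name, f.path "
        "FROM files AS f JOIN datasets AS d ON d.dataset_id = f.dataset_id "
        "WHERE f.suffix = ? "
        "ORDER BY d.name, f.path",
        (suffix,),
    )
    return cursor.fetchall()


conn = build_demo_db()
t1w = find_files(conn, "T1w")
for dataset_name, path in t1w:
    print(dataset_name, path)
```

Because only metadata is stored, a query like this returns file locations without the image files themselves having to be present locally. A PostgreSQL backend would follow the same pattern with a different driver and connection string.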
Link to the Project
https://github.com/paul188/BIDS-SQL
Image/Logo for the OHBM brainhack website
https://drive.google.com/uc?export=download&id=1FJmXy1_H4yOVF7uqnoTTm729eRcAH2PC
Project lead
Paul Johannssen,
@github: paul188,
@discord: paul1887113
Affiliation: University Clinic Bonn, Germany
Main Hub
Bordeaux
Link to the Project pitch
No response
Other hubs covered by the leaders
Skills
- Python (Intermediate: essential for API and integration)
- SQL & Database Experience (Basic to Intermediate)
- BIDS Specification (Solid understanding of the BIDS structure)
- C++ (Optional: for optimizing heavy queries)
Recommended tutorials for new contributors
Good first issues
No response
Twitter summary
No response
Short name for the Discord chat channel (~15 chars)
bids-sql
Please read and follow the OHBM Code of Conduct