|
1 | 1 | # bids2table |
| 2 | + |
2 | 3 | [](https://github.com/childmindresearch/bids2table/actions/workflows/ci.yaml?query=branch%3Amain) |
3 | 4 | [](https://childmindresearch.github.io/bids2table/bids2table) |
4 | 5 | [](https://codecov.io/gh/childmindresearch/bids2table) |
@@ -100,7 +101,6 @@ As an example, here we index all datasets on [OpenNeuro](https://openneuro.org/) |
100 | 101 | |
101 | 102 | Using 8 threads, we can index all ~1400 OpenNeuro datasets (1.2M files) in less than 15 minutes. |
102 | 103 | |
103 | | - |
104 | 104 | ### Indexing datasets from python |
105 | 105 | |
106 | 106 | You can also index datasets using the Python API. |
@@ -129,3 +129,27 @@ pq.write_table(tab, "ds000224.parquet") |
129 | 129 | # Convert to a pandas dataframe. |
130 | 130 | df = tab.to_pandas(types_mapper=pd.ArrowDtype) |
131 | 131 | ``` |
| 132 | + |
| 133 | +### Indexing with a custom BIDS schema |
| 134 | + |
| 135 | +By default, `bids2table` uses the BIDS schema bundled with `bidsschematools`. |
| 136 | +Pass a `schema=` argument to `index_dataset`, `batch_index_dataset`, |
| 137 | +`get_arrow_schema`, `get_column_names`, or `validate_bids_entities` to use a |
| 138 | +different schema. The argument may be a path to a schema directory, a string |
| 139 | +URI accepted by `bidsschematools.schema.load_schema`, or a pre-loaded |
| 140 | +`bidsschematools.types.Namespace`. |
| 141 | + |
| 142 | +```python |
| 143 | +import bidsschematools.schema |
| 144 | +import bids2table as b2t2 |
| 145 | + |
| 146 | +# Use a pre-loaded schema (e.g. when indexing several datasets that share one). |
| 147 | +schema = bidsschematools.schema.load_schema() |
| 148 | +tab = b2t2.index_dataset("bids-examples/ds102", schema=schema) |
| 149 | + |
| 150 | +# Or pass a path to a custom schema directory. |
| 151 | +tab = b2t2.index_dataset("/data/ds001", schema="/path/to/custom-schema") |
| 152 | +``` |
| 153 | + |
| 154 | +Different `schema` arguments may be used for different calls within the same |
| 155 | +process; per-call schemas propagate to worker processes when `max_workers > 0`. |