Commit 7841265

Merge pull request #40 from s-ccs/config-files-readjust
Config files readjust
2 parents fc08310 + 0cc634b commit 7841265

17 files changed: +177 −136 lines changed

README.md

Lines changed: 4 additions & 2 deletions
@@ -20,7 +20,9 @@ This package automates the conversion of EEG recordings (xdf files) to BIDS (Bra
 git clone https://github.com/s-ccs/LSLAutoBIDS.git
 ```
 ### **Step 2: Install the package**
+Go to the cloned directory and install the package using pip.
 ```
+cd LSLAutoBIDS
 pip3 install lslautobids
 ```
 It is advised to install the package in a separate environment (e.g. using `conda` or `virtualenv`).
@@ -39,13 +41,13 @@ The package requires the recorded XDF data to be organized in a specific directo


 - The `projects` root location is the root directory where all the EEG raw recordings (`.xdf` files) are stored, e.g. `projects/sub-A/ses-001/eeg/sub-A_ses-001_task-foo.xdf`.
-- The (optional) `project_stimulus` root location is the directory where the experiment files (e.g. `.py`, `.oxexp`) and behavioral files (e.g. eye-tracking recordings, lab notebook, participant forms, etc.) are stored.
+- The (optional) `project_other` root location is the directory where the experiment files (e.g. `.py`, `.oxexp`) and behavioral files (e.g. eye-tracking recordings, lab notebook, participant forms, etc.) are stored.
 - The `bids` root location is the directory where the converted BIDS data is stored, along with the source data and code files which we want to version control using `Datalad`.

 > [!IMPORTANT]
 > Please follow the BIDS data organization guidelines when storing the neuroimaging data for this package. The BIDS conversion is based on the recommended directory/file structure. You can only change the location of the root directories according to your preference; you must strictly follow the naming convention for the project and subject subdirectories.

-Here you will find the recommended directory structure for storing the project data (recorded, stimulus and converted data) in the [data_organization](docs/data_organization.md) file.
+Here you will find the recommended directory structure for storing the project data (recorded, other and converted data) in the [data_organization](docs/data_organization.md) file.


 ### **Step 4: Generate the configuration files**

docs/about.md

Lines changed: 0 additions & 3 deletions
This file was deleted.

docs/data_organization.md

Lines changed: 8 additions & 8 deletions
@@ -1,22 +1,22 @@
 # How the data is organized

-In this project, we are using a sample xdf file along with the corresponding stimulus files to demonstrate how the data inside the `projectname` folder is organized. This data should be organized in a specific way:
+In this project, we are using a sample xdf file along with the corresponding other files to demonstrate how the data inside the `projectname` folder is organized. This data should be organized in a specific way:

 ### Recommended Project Organization Structure

 For convenience, we have provided a recommended project organization structure for the root directories to organize the data better.


 > [!IMPORTANT]
-> The recommended directory structure is not generated automatically. The user needs to create the directories and store the recorded and stimulus data in them before running the conversion.
+> The recommended directory structure is not generated automatically. The user needs to create the directories and store the recorded and other data in them before running the conversion.

 The dataset (both recorded and converted) is stored in the parent `data` directory. The `data` directory has three subdirectories under which the entire project is stored. The recommended directory structure is as follows:
 ```
 data
 ├── bids # Converted BIDS data
 ├── projectname1
 ├── projectname2
-├── project_stimulus # Experimental/Behavioral files
+├── project_other # Experimental/Behavioral files
 ├── projectname1
 ├── projectname2
 ├── projects
@@ -26,7 +26,7 @@ data

 ```

-Here `./data/projects/`, `./data/project_stimulus/`, `./data/bids/` are the root project directories. Each of these root directories will have a project name directory inside it, and each project directory will have a subdirectory for each subject.
+Here `./data/projects/`, `./data/project_other/`, `./data/bids/` are the root project directories. Each of these root directories will have a project name directory inside it, and each project directory will have a subdirectory for each subject.


 ## Projects Folder
@@ -52,7 +52,7 @@ Filename Convention for the raw data files :
 - **tasklabel** - `duration, mscoco, ...`
 - **runlabel** - `001, 002, 003, ...` (needs to be an integer)

-## Project Stimulus Folder
+## Project Other Folder

 This folder contains the experimental and behavioral files which we also store in the dataverse. The folder structure is as follows:

@@ -66,15 +66,15 @@ This folder contains the experimental and behavioral files which we also store i
 └── behavioral_files (lab notebook, CSV, EDF file, etc.)

 - **projectname** - any descriptive name for the project
-- **experiment** - contains the experimental files for the project. Eg: showStimulus.m, showStimulus.py
+- **experiment** - contains the experimental files for the project. Eg: showOther.m, showOther.py
 - **data** - contains the behavioral files for the corresponding subject. Eg: experimentalParameters.csv, eyetrackingdata.edf, results.tsv.


 You can get the filename convention for the data files [here](https://bids-standard.github.io/bids-starter-kit/folders_and_files/files.html#modalities).

 ## BIDS Folder

-This folder contains the converted BIDS data files and other files we want to version control using `Datalad`. Since we are storing the entire dataset in the dataverse, we also store the raw xdf files and the associated stimulus/behavioral files in the dataverse. The folder structure is as follows:
+This folder contains the converted BIDS data files and other files we want to version control using `Datalad`. Since we are storing the entire dataset in the dataverse, we also store the raw xdf files and the associated other/behavioral files in the dataverse. The folder structure is as follows:
 ```
 └── bids
 └── projectname/
@@ -90,7 +90,7 @@ This folder contains the converted BIDS data files and other files we want to ve
 ├── sub-001_ses-001_task-Duration_run-001_eeg.eeg
 .........
 └── beh
-└── behavioral files
+└── behavioral files (other files)
 └── misc
 └── experimental files (this needs to be stored in zip format)
 └── sourcedata

docs/developers_documentation.md

Lines changed: 39 additions & 12 deletions
@@ -5,7 +5,7 @@

 LSLAutoBIDS is a Python tool series designed to automate the following tasks sequentially:
 - Convert recorded XDF files to BIDS format
-- Integrate the EEG data with non-EEG data (e.g., behavioral, stimulus) for the complete dataset
+- Integrate the EEG data with non-EEG data (e.g., behavioral, other) for the complete dataset
 - Datalad integration for version control for the integrated dataset
 - Upload the dataset to Dataverse
 - Provide a command-line interface for cloning, configuring, and running the conversion process
@@ -17,7 +17,7 @@ LSLAutoBIDS is a Python tool series designed to automate the following tasks seq
 - DataLad integration for version control
 - Dataverse integration for data sharing
 - Configurable project management
-- Support for stimulus and behavioral data in addition to EEG data
+- Support for behavioral data (non-EEG files) in addition to EEG data
 - Comprehensive logging and validation for BIDS compliance


@@ -55,6 +55,9 @@ LSLAutoBIDS is a Python tool series designed to automate the following tasks seq
 - [2. Logging Configuration (`config_logger.py`)](#2-logging-configuration-config_loggerpy)
 - [3. Utility Functions (`utils.py`)](#3-utility-functions-utilspy)

+- [Testing](#testing)
+  - [Running Tests](#running-tests)
+

 ## Architecture - TODO

@@ -84,7 +87,7 @@ The configuration system manages Dataverse and project-specific settings using
 #### 1. Dataverse and Project Root Configuration (`gen_dv_config.py`)

 This module generates a global configuration file for Dataverse and project root directories. This is a one-time setup per system. The file is stored in `~/.config/lslautobids/autobids_config.yaml` and contains:
-- Paths for the BIDS, projects, and stimulus directories: these let users specify where their EEG data, stimulus data, and converted BIDS data are stored on their system. The paths should be relative to the home directory of your system and given as strings.
+- Paths for the BIDS, projects, and project_other directories: these let users specify where their EEG data, behavioral data, and converted BIDS data are stored on their system. The paths should be relative to the home directory of your system and given as strings.

 - Dataverse connection details: the base URL, API key, and parent dataverse name for uploading datasets. The base URL is the URL of the Dataverse server (e.g. https://darus.uni-stuttgart.de), the API key is your personal API token for authentication (found in your Dataverse account settings), and the parent dataverse name is the name of the dataverse under which datasets will be created (it can be found in the URL of the dataverse page, just after 'dataverse/'). For example, if the URL is `https://darus.uni-stuttgart.de/dataverse/simtech_pn7_computational_cognitive_science`, then the parent dataverse name is `simtech_pn7_computational_cognitive_science`.

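For orientation, a global config along these lines could be expected; every key name in the sketch below is an assumption for illustration, not the verified schema produced by `gen_dv_config.py`:

```yaml
# Hypothetical sketch of ~/.config/lslautobids/autobids_config.yaml.
# All key names here are illustrative assumptions, not the tool's actual schema.
bids_root: data/bids                     # converted BIDS datasets (relative to home)
projects_root: data/projects             # raw .xdf recordings
project_other_root: data/project_other   # behavioral / experiment files
dataverse:
  base_url: https://darus.uni-stuttgart.de
  api_key: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
  parent_dataverse_name: simtech_pn7_computational_cognitive_science
```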
@@ -189,15 +192,15 @@ The pipeline is designed to ensure:

 2. EEG recordings are converted to BIDS format using MNE and validated against the BIDS standard.

-3. Behavioral and experimental metadata (also called stimulus files in general) are included and checked against project expectations.
+3. Behavioral and experimental metadata (called "other" files in the context of this project) are included and checked against project expectations.

 4. Project metadata is populated (`dataset_description.json`). This is required as part of the BIDS standard.

 5. The dataset is registered in Dataverse and optionally pushed/uploaded automatically.

 #### 1. Entry Point (`bids_process_and_upload()`)

-- Reads the project configuration (`<project_name>_config.toml`) to check if a stimulus computer was used (`stimulusComputerUsed: true`).
+- Reads the project configuration (`<project_name>_config.toml`) to check if a non-EEG ("other") computer was used (`otherFilesUsed: true`).

 - Iterates over each processed file and extracts identifiers. For example, for a file named `sub-001_ses-001_task-Default_run-001_eeg.xdf`, it extracts:

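The identifier extraction described above can be sketched with a regular expression; this is a hedged sketch of the idea, and the actual helper in LSLAutoBIDS may use a different pattern or API:

```python
import re

# Hypothetical sketch of BIDS-style identifier extraction from an XDF filename;
# the real implementation in LSLAutoBIDS may differ.
FILENAME_RE = re.compile(
    r"sub-(?P<subject>[^_]+)"
    r"_ses-(?P<session>[^_]+)"
    r"_task-(?P<task>[^_]+)"
    r"_run-(?P<run>\d+)"
    r"_eeg\.xdf$"
)

def extract_identifiers(filename: str) -> dict:
    """Return subject/session/task/run identifiers from a BIDS-style XDF filename."""
    match = FILENAME_RE.search(filename)
    if match is None:
        raise ValueError(f"Not a recognized XDF filename: {filename}")
    return match.groupdict()

ids = extract_identifiers("sub-001_ses-001_task-Default_run-001_eeg.xdf")
print(ids)  # {'subject': '001', 'session': '001', 'task': 'Default', 'run': '001'}
```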
@@ -246,7 +249,7 @@ This function handles the core conversion of XDF files to BIDS format and cons

 - Load `.xdf` with `create_raw_xdf()` (see section below).

-- Apply anonymization (daysback_min + anonymization_number from the project TOML config).
+- Apply anonymization (daysback_min + anonymizationNumber from the project TOML config).

 - Write EEG data into the BIDS folder via `write_raw_bids()`.

@@ -261,7 +264,7 @@ This function handles the core conversion of XDF files to BIDS format and cons
 - 0: BIDS conversion done but validation failure

 #### 3. Copy Source Files (`copy_source_files_to_bids()`)
-This function ensures that the original source files (EEG and stimulus/behavioral files) are also part of our dataset. These files can't be directly converted to BIDS format, but we give the user the option to include them in the BIDS directory structure in a pseudo-BIDS format for completeness.
+This function ensures that the original source files (EEG and other/behavioral files) are also part of our dataset. These files can't be directly converted to BIDS format, but we give the user the option to include them in the BIDS directory structure in a pseudo-BIDS format for completeness.

 - Copies the .xdf into the following structure:
 `<BIDS_ROOT>/sourcedata/sub-XXX/ses-YYY/sub-XXX_ses-YYY_task-Name_run-ZZZ_eeg.xdf`
@@ -270,13 +273,13 @@ This function ensures that the original source files (EEG and stimulus/behaviora

 - If a file already exists, logs a message and skips copying.

-If stimulusComputerUsed=True in the project config file:
+If otherFilesUsed=True in the project config file:

 1. Behavioral files are copied via `_copy_behavioral_files()`.

-- Validates required files against the TOML config (`ExpectedStimulusFiles`). In this config we add the extensions of the expected stimulus files. For example, in our test project we use an EyeLink 1000 Plus eye tracker, which generates .edf and .csv files, so we add these extensions as required stimulus files. We also have mandatory lab notebook and participant info files in .tsv format.
+- Validates required files against the TOML config (`OtherFilesInfo`). In this config we add the extensions of the expected other files. For example, in our test project we use an EyeLink 1000 Plus eye tracker, which generates .edf and .csv files, so we add these extensions as required other files. We also have mandatory lab notebook and participant info files in .tsv format.
 - Renames files to include the sub-XXX_ses-YYY_ prefix if missing.
-- Deletes files in the stimulus directory that are not listed in `ExpectedStimulusFiles` in the project config file. It doesn't delete from the source directory, only from our BIDS dataset.
+- Deletes files in the project_other directory that are not listed in `OtherFilesInfo` in the project config file. It doesn't delete from the source directory, only from our BIDS dataset.

 2. Experimental files are copied via `_copy_experiment_files()`.

@@ -285,7 +288,7 @@ If stimulusComputerUsed=True in project config file:
 - Compresses into experiment.tar.gz.
 - Removes the uncompressed folder.

-There is a flag in the `lslautobids run` command called `--redo_stim_pc` which, when specified, forces overwriting of existing stimulus and experiment files in the BIDS dataset. This is useful if there are updates or corrections to the stimulus/behavioral data that need to be reflected in the BIDS dataset.
+There is a flag in the `lslautobids run` command called `--redo_other_pc` which, when specified, forces overwriting of existing other and experiment files in the BIDS dataset. This is useful if there are updates or corrections to the other/behavioral data that need to be reflected in the BIDS dataset.

 #### 4. Create Raw XDF (`create_raw_xdf()`)
 This function reads the XDF file and creates an MNE Raw object. It performs the following steps:
@@ -364,7 +367,7 @@ This module handles the creation of a new dataset in Dataverse using the `pyData
 #### 2. Linking DataLad to Dataverse (`link_datalad_dataverse.py`)
 This module links the local DataLad dataset to the remote Dataverse dataset as a sibling. The function performs the following steps:
 1. It first checks whether the Dataverse dataset was already created in a previous run or was just created in the current run (flag==0). If flag==0, it proceeds to link the DataLad dataset to Dataverse.
-2. It runs the command `datalad add-sibling-dataverse dataverse_base_url doi_id`. This command adds the Dataverse as a sibling to the local DataLad dataset, allowing for synchronization and data management between the two. For lslautobids, we currently only allow depositing data to Dataverse. In a future version, we shall also add user-controlled options for adding other siblings like GitHub, GitLab, etc.
+2. It runs the command `datalad add-sibling-dataverse dataverse_base_url doi_id`. This command adds the Dataverse as a sibling to the local DataLad dataset, allowing for synchronization and data management between the two. For lslautobids, we currently only allow depositing data to Dataverse. In a future version, we shall also add user-controlled options for adding other siblings like GitHub, GitLab, OpenNeuro, AWS, etc.

 We chose Dataverse as it serves as both a repository and a data sharing platform, making it suitable for our needs. It also integrates well with DataLad and allows sharing datasets with collaborators or the public.

@@ -402,3 +405,27 @@ This module contains various utility functions used across the application.
 3. `write_toml_file`: Writes a dictionary to a TOML file.


+## Testing
+
+The testing framework uses `pytest` to validate the functionality of the core components.
+
+- The tests are located in the `tests/` directory and cover various modules including configuration generation, file processing, BIDS conversion, DataLad integration, and Dataverse interaction. (Work in progress)
+
+- The test directory contains:
+  - `test_utils`: Directory containing utility functions needed across multiple test files.
+  - `testcases`: Directory containing all the tests, one subdirectory per test - `test_<test_name>`.
+    - Each `test_<test_name>` directory contains a `data` folder with sample data for that test and a `test_<test_name>.py` file with the actual test cases.
+  - `run_all_tests.py`: A script to run all the tests in the `testcases` directory sequentially.
+
+Tests will be added continuously as new features are added and existing features are updated.
+
+### Running Tests
+
+To run the tests, execute the following from the repository root:
+`python tests/run_all_tests.py`
+
+These tests ensure that each component functions as expected and that the overall pipeline works seamlessly. These tests will also be triggered automatically on each push or PR to the main repository using GitHub Actions.
+
+## Miscellaneous Points
+- To date, only EEG data is supported for BIDS conversion. Other modalities, such as eye-tracking, are not yet supported in the BIDS format. Hence, LSLAutoBIDS relies on semi-BIDS data structures for those data and uses user-definable regular expressions to match expected data files. A planned future feature is to give users more flexibility, especially in naming/sorting non-standard files. Currently, the user can only specify the expected file extensions for other/behavioral data; matching files are automatically renamed to include the sub-XXX_ses-YYY_ prefix if missing and copied into a pseudo-BIDS folder structure like `<BIDS_ROOT>/sourcedata/sub-XXX/ses-YYY/`, `<BIDS_ROOT>/misc/experiment.tar.gz`, etc.
+