
Commit 3800238

changes to meta creation, now runs one entire sequence post-saving all data

1 parent dd724b6

File tree

8 files changed: +9156 additions, -3306 deletions

README.md

Lines changed: 6 additions & 7 deletions
@@ -25,7 +25,7 @@ code/
 path_logic.py       # Optional helper to mirror generated outputs onto the BOOST file server

 data/               # Subject-level caches (obs/int sites, then subject/task/data|plot)
-meta/               # Auto-saved aggregate CSVs (master_acc, cc_master, ps_master, mem_master, wl_master)
+meta/               # Aggregate CSVs rebuilt via META_RECREATE (cc_master, mem_master, ps_master, wl_master[_wide])
 group/plots/        # Example construct plots for quick reference
 requirements.txt    # Python dependencies for QC + plotting
 run.py              # Flask placeholder (not yet active)
@@ -40,7 +40,7 @@ run.py # Flask placeholder (not yet active)
 - `MEM_QC` inspects FN/SM performance with RT + accuracy rollups.
 - `WL_QC` orchestrates fuzzy matching against version-specific keys, handling WL and DWL simultaneously.
 4. **Visualize** `plot_utils` generates construct-appropriate figures (per-condition counts, RT distributions, WL learning curves, etc.).
-5. **Persist** `SAVE_EVERYTHING` stores per-participant CSVs and plots under `data/<study>/<site>/<subject>/<task>/`. `Handler._persist_all_masters()` writes aggregate CSVs into `meta/` on every successful task run to keep analytics in sync.
+5. **Persist** `SAVE_EVERYTHING` stores per-participant CSVs and plots under `data/<study>/<site>/<subject>/<task>/`. Once the task artifacts are saved, `META_RECREATE` is invoked for every domain so the aggregate CSVs in `meta/` stay synchronized with the subject-level cache.

 ## Supported Tasks
 | Construct | Tasks | Notes |
@@ -73,11 +73,10 @@ python code/main_handler.py all
 python code/main_handler.py AF
 ```

-Outputs land under `data/` using the subject -> task folder pattern enforced by `SAVE_EVERYTHING`. Every run also refreshes the aggregated CSVs in `meta/`:
-- `master_acc.csv`: high-level accuracy summaries for PS/MEM tasks.
+Outputs land under `data/` using the subject -> task folder pattern enforced by `SAVE_EVERYTHING`. Every run also refreshes the aggregated CSVs in `meta/` via `META_RECREATE`:
 - `cc_master.csv`: condition-level accuracy + mean RT for CC tasks.
-- `ps_master.csv`: per-block correct counts for PS tasks.
 - `mem_master.csv`: joined counts/RT/accuracy for FN/SM.
+- `ps_master.csv`: per-block correct counts for PS tasks.
 - `wl_master_wide.csv` & `wl_master.csv`: wide vs flattened WL summaries combining WL + DWL submissions.

 ## Visual Artifacts
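The aggregate CSVs listed in the hunk above are now rebuilt wholesale from the subject-level cache rather than patched row by row. A minimal sketch of that rebuild pattern follows; `rebuild_master`, the `summary.csv` filename, and the exact folder layout are illustrative assumptions, not the repo's actual API:

```python
import csv
from pathlib import Path


def rebuild_master(data_root: Path, task: str, out_csv: Path) -> int:
    """Recreate one aggregate CSV from every subject-level cache file.

    Unlike an upsert, a full rebuild cannot leave stale rows: the master
    reflects exactly what exists under data/<study>/<site>/<subject>/<task>/.
    """
    rows = []
    # Hypothetical layout: data/<study>/<site>/<subject>/<task>/summary.csv
    for summary in sorted(data_root.glob(f"*/*/*/{task}/summary.csv")):
        with summary.open(newline="") as fh:
            rows.extend(csv.DictReader(fh))
    if rows:
        out_csv.parent.mkdir(parents=True, exist_ok=True)
        with out_csv.open("w", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=rows[0].keys())
            writer.writeheader()
            writer.writerows(rows)
    return len(rows)
```

Because the master is regenerated from disk on every run, deleting or rerunning a subject automatically propagates to `meta/` without any row bookkeeping.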
@@ -95,13 +94,13 @@ Outputs land under `data/` using the subject -> task folder pattern enforced by
 ## Extending the Pipeline
 1. Add the new task code and study IDs to `Handler.IDs`.
 2. Implement construct logic under `code/data_processing/` (reuse helpers in `utils.py` when possible).
-3. Register the new branch in `Handler.choose_construct()` and add persistence hooks for master CSVs.
+3. Register the new branch in `Handler.choose_construct()` and extend `META_RECREATE` if new aggregate metrics are required.
 4. Document the task behavior and update tests/fixtures to reflect the new data expectations.

 ## Troubleshooting
 - **No data returned from JATOS**: confirm the study IDs in `Handler.IDs` and that your token has access; adjust the `days_ago` window if you are backfilling.
 - **Missing session folders**: ensure input CSVs include `session` or `session_number`. `SAVE_EVERYTHING` uses those columns to label artifacts.
-- **WL metrics look stale**: WL and DWL write to the same `wl_master` rows via `_upsert_wl_master`; make sure both tasks are run for each session to populate delay scores.
+- **WL metrics look stale**: Rerun both WL and DWL so their subject CSVs exist before `META_RECREATE` rebuilds the wide/flat summaries.

 ## License & Data Privacy
 This repository processes sensitive participant responses. Keep tokens, raw exports, and downstream artifacts off public machines. Add new temp/output folders to `.gitignore` as needed to avoid leaking data.
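Per the commit message, the pipeline now runs one entire `META_RECREATE` sequence after all data is saved, instead of updating masters inline during each task. A toy sketch of that ordering, with illustrative function names that are not the repo's actual API:

```python
def run_task(task, save_fn, meta_rebuilders):
    """Persist a task's artifacts, then rebuild every aggregate master once.

    save_fn        -- callable standing in for the SAVE_EVERYTHING step
    meta_rebuilders -- {domain: callable} standing in for META_RECREATE
    """
    save_fn(task)  # 1. save all subject-level CSVs/plots first
    results = {}
    for domain, rebuild in meta_rebuilders.items():
        results[domain] = rebuild()  # 2. one rebuild pass per domain
    return results
```

Keeping the rebuilds after the save step guarantees that each master CSV is derived from a complete on-disk cache, never from a partially written task run.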

code/data_processing/save_utils.py

Lines changed: 7 additions & 2 deletions
@@ -11,9 +11,14 @@ def __init__(self):
         self.sessions = {}  # Dictionary to track session numbers for each subjectID

     def _get_folder(self, subjID):
-        if 7000 <= int(subjID) < 8000:
+        try:
+            subj_int = int(subjID)
+        except (TypeError, ValueError):
+            # Default to NE intervention for malformed IDs (logged upstream)
+            return 'int', 'NE'
+        if 7000 <= subj_int < 8000:
             return 'obs', 'UI'
-        elif 8000 <= int(subjID) < 9000:
+        elif 8000 <= subj_int < 9000:
             return 'int', 'UI'
         else:
             return 'int', 'NE'
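The hardened ID parsing in the diff above can be exercised in isolation. This standalone copy mirrors the new `_get_folder` logic as a free function (the ID ranges and return tuples come straight from the diff):

```python
def get_folder(subjID):
    """Map a subject ID to a (folder, site) pair, tolerating malformed IDs."""
    try:
        subj_int = int(subjID)
    except (TypeError, ValueError):
        # Default to NE intervention for malformed IDs (logged upstream)
        return 'int', 'NE'
    if 7000 <= subj_int < 8000:
        return 'obs', 'UI'   # 7000-7999: UI observational
    elif 8000 <= subj_int < 9000:
        return 'int', 'UI'   # 8000-8999: UI intervention
    else:
        return 'int', 'NE'   # everything else: NE intervention
```

Converting `subjID` once up front means a `None` or non-numeric ID falls back to the default bucket instead of raising mid-save, which is what the old double `int(subjID)` calls risked.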
