Commit eac9c45

Refactor logging in orchestrator.py to use warnings for feature export errors and update test_orchestrator.py to remove commented-out tests
1 parent 33ed7c9 commit eac9c45

3 files changed

Lines changed: 87 additions & 89 deletions

README.md

Lines changed: 83 additions & 82 deletions
@@ -13,60 +13,6 @@ A Python toolkit for analysis of graphomotor data collected via Curious.
 
 Welcome to `graphomotor`, a specialized Python library for analyzing graphomotor data collected via [Curious](https://www.gettingcurious.com/). This toolkit provides comprehensive tools for processing, analyzing, and visualizing data from various graphomotor assessment tasks including spiral drawing, trails making, alphabetic writing, digit symbol substitution, and the Rey-Osterrieth Complex Figure Test.
 
-## Development Progress
-
-⚠️ **This package is under active development.** Currently, the focus is on the Spiral task. After finalizing feature extraction, the next steps will involve implementing both preprocessing and visualization for this task. Once these parts are in place, we plan to extend support to other tasks.
-
-| Task | Preprocessing | Feature Extraction | Visualization |
-| :--- | :---: | :---: | :---: |
-| Spiral | ![Spiral: Preprocessing Pending](https://img.shields.io/badge/pending-red) | ![Spiral: Feature Extraction In Progress](https://img.shields.io/badge/in_progress-yellow) | ![Spiral: Visualization Pending](https://img.shields.io/badge/pending-red) |
-| Rey-Osterrieth Complex Figure | ![Rey-Osterrieth: Preprocessing Pending](https://img.shields.io/badge/pending-red) | ![Rey-Osterrieth: Feature Extraction Pending](https://img.shields.io/badge/pending-red) | ![Rey-Osterrieth: Visualization Pending](https://img.shields.io/badge/pending-red) |
-| Alphabetic Writing | ![Alphabetic Writing: Preprocessing Pending](https://img.shields.io/badge/pending-red) | ![Alphabetic Writing: Feature Extraction Pending](https://img.shields.io/badge/pending-red) | ![Alphabetic Writing: Visualization Pending](https://img.shields.io/badge/pending-red) |
-| Digit Symbol Substitution | ![Digit Symbol Substitution: Preprocessing Pending](https://img.shields.io/badge/pending-red) | ![Digit Symbol Substitution: Feature Extraction Pending](https://img.shields.io/badge/pending-red) | ![Digit Symbol Substitution: Visualization Pending](https://img.shields.io/badge/pending-red) |
-| Trails Making | ![Trails Making: Preprocessing Pending](https://img.shields.io/badge/pending-red) | ![Trails Making: Feature Extraction Pending](https://img.shields.io/badge/pending-red) | ![Trails Making: Visualization Pending](https://img.shields.io/badge/pending-red) |
-
-## Data Format Requirements
-
-⚠️ **This implementation requires data to adhere to a specific format matching the standard output from [Curious drawing responses](https://mindlogger.atlassian.net/servicedesk/customer/portal/3/article/859242501).**
-
-When exporting drawing data from Curious, you typically receive the following files:
-
-- **report.csv**: Contains the participants' actual responses.
-- **activity_user_journey.csv**: Logs the entire journey through the activity, including button actions like "Next", "Skip", "Back", and "Undo", regardless of whether a response was provided.
-- **drawing-responses-{date}.zip**: A ZIP archive with raw drawing response CSV files for each participant (e.g., `drawing-responses-Mon May 29 2023.zip`).
-- **media-responses-{date}.zip**: A ZIP archive containing SVG files for the drawing responses (e.g., `media-responses-Mon May 29 2023.zip`).
-- **trails-responses-{date}.zip**: A ZIP archive with raw trail making response CSV files (if there are any) for each participant (e.g., `trails-responses-Mon May 29 2023.zip`).
-
-For Spiral tasks, the toolkit uses only the CSV files from the drawing responses ZIP. Support for additional tasks will be added in future releases.
-
-### File Naming Convention
-
-Your spiral data files must follow this naming convention:
-
-```text
-[5123456]a7f3b2e9-d4c8-f1a6-e5b9-c2d7f8a3e6b4-spiral_trace1_Dom.csv
-```
-
-Where:
-
-- **Participant ID**: Must be enclosed in brackets `[]` and be a 7-digit number starting with `5` (e.g., `[5123456]`) that matches the `target_secret_id` column in the **report.csv** file.
-- **Activity Submission ID**: Must be a 32-character hexadecimal string (e.g., `18f2-45ea-a1e4-2334e07cc706`) that matches the `id` column in the **report.csv** file.
-- **Task**: Must be one of the following that matches the `item` column in the **report.csv** file:
-  - `spiral_trace1_Dom` through `spiral_trace5_Dom` (dominant hand tracing tasks)
-  - `spiral_trace1_NonDom` through `spiral_trace5_NonDom` (non-dominant hand tracing tasks)
-  - `spiral_recall1_Dom` through `spiral_recall3_Dom` (dominant hand recall tasks)
-  - `spiral_recall1_NonDom` through `spiral_recall3_NonDom` (non-dominant hand recall tasks)
-
-### Data Format
-
-Your spiral data CSV file must contain the following columns:
-
-```text
-line_number, x, y, UTC_Timestamp, seconds, epoch_time_in_seconds_start
-```
-
-This format represents the standard output from [Curious drawing responses data dictionary](https://mindlogger.atlassian.net/servicedesk/customer/portal/3/article/596082739).
-
 ## Feature Extraction Capabilities
 
 The toolkit extracts clinically relevant metrics from digitized drawing data. Currently implemented features include:
@@ -92,6 +38,8 @@ pip install git+https://github.com/childmindresearch/graphomotor
 
 ## Quick Start
 
+> **⚠️ This implementation requires data to adhere to a specific format matching the standard output from [Curious drawing responses](https://mindlogger.atlassian.net/servicedesk/customer/portal/3/article/859242501).**
+
 Currently, `graphomotor` is available as an importable Python library. CLI functionality is planned for future releases.
 
 ### Extracting Features from Spiral Drawing Data
@@ -110,32 +58,35 @@ features_df = orchestrator.run_pipeline(
     input_path=input_file
 )
 
+# Features are returned as a pandas DataFrame with source file as index
+print(f"Extracted features: {list(features_df.columns)}")
+
+# Access the single file's data (features_df has one row)
+file_path = features_df.index[0]
+print(f"File: {file_path}")
+print(f"Participant: {features_df.loc[file_path, 'participant_id']}")
+print(f"Task: {features_df.loc[file_path, 'task']}")
+print(f"Hand: {features_df.loc[file_path, 'hand']}")
+print(f"Duration: {features_df.loc[file_path, 'duration']}")
+```
+
+```python
 # Option 2: Save to a directory with auto-generated filename
 # Creates a CSV file with auto-generated name in the specified directory
 # Format: {participant_id}_{task}_{hand}_features_{YYYYMMDD_HHMM}.csv
 features_df = orchestrator.run_pipeline(
     input_path=input_file,
     output_path="path/to/output/directory"
 )
+```
 
+```python
 # Option 3: Save to a specific CSV file
 # Features will be saved to the specified file path
 features_df = orchestrator.run_pipeline(
     input_path=input_file,
     output_path="path/to/features.csv"
 )
-
-# Features are returned as a pandas DataFrame with source file as index
-print(f"Successfully processed {len(features_df)} file")
-print(f"Extracted features: {list(features_df.columns)}")
-
-# Access the single file's data (features_df has one row)
-file_path = features_df.index[0]
-print(f"File: {file_path}")
-print(f"Participant: {features_df.loc[file_path, 'participant_id']}")
-print(f"Task: {features_df.loc[file_path, 'task']}")
-print(f"Hand: {features_df.loc[file_path, 'hand']}")
-print(f"Duration: {features_df.loc[file_path, 'duration']}")
 ```
 
 #### Batch Processing
@@ -152,42 +103,92 @@ features_df = orchestrator.run_pipeline(
     input_path=input_dir,
 )
 
+# Features are returned as a pandas DataFrame with source files as index
+# Columns include: participant_id, task, hand, start_time, and calculated features
+print(f"Successfully processed {len(features_df)} files")
+
+# Access metadata and features for a specific file
+for file_path in features_df.index:
+    print(f"File: {file_path}")
+    print(f"Participant: {features_df.loc[file_path, 'participant_id']}")
+    print(f"Task: {features_df.loc[file_path, 'task']}")
+    print(f"Hand: {features_df.loc[file_path, 'hand']}")
+    print(f"Duration: {features_df.loc[file_path, 'duration']}")
+
+```
+
+```python
 # Option 2: Save to a directory with auto-generated filename
 # Creates a single consolidated CSV file with auto-generated name
 # Format: batch_features_{YYYYMMDD_HHMM}.csv
 features_df = orchestrator.run_pipeline(
     input_path=input_dir,
     output_path="path/to/output/directory"
 )
+```
 
+```python
 # Option 3: Save to a specific CSV file (single consolidated file)
 # All features will be written to one specified file
 features_df = orchestrator.run_pipeline(
     input_path=input_dir,
     output_path="path/to/consolidated_features.csv"
 )
+```
 
-# Features are returned as a pandas DataFrame with source files as index
-# Columns include: participant_id, task, hand, start_time, and calculated features
-print(f"Successfully processed {len(features_df)} files")
+For detailed configuration options and additional parameters, refer to the [`run_pipeline` documentation](https://childmindresearch.github.io/graphomotor/graphomotor/core/orchestrator.html#run_pipeline).
 
-# Access metadata and features for a specific file
-for file_path in features_df.index:
-    print(f"File: {file_path}")
-    print(f"Participant: {features_df.loc[file_path, 'participant_id']}")
-    print(f"Task: {features_df.loc[file_path, 'task']}")
-    print(f"Hand: {features_df.loc[file_path, 'hand']}")
-    print(f"Duration: {features_df.loc[file_path, 'duration']}")
+## Development Progress
+
+⚠️ **This package is under active development.** Currently, the focus is on the Spiral task. After finalizing feature extraction, the next steps will involve implementing both preprocessing and visualization for this task. Once these parts are in place, we plan to extend support to other tasks.
+
+| Task | Preprocessing | Feature Extraction | Visualization |
+| :--- | :---: | :---: | :---: |
+| Spiral | ![Spiral: Preprocessing Pending](https://img.shields.io/badge/pending-red) | ![Spiral: Feature Extraction In Progress](https://img.shields.io/badge/in_progress-yellow) | ![Spiral: Visualization Pending](https://img.shields.io/badge/pending-red) |
+| Rey-Osterrieth Complex Figure | ![Rey-Osterrieth: Preprocessing Pending](https://img.shields.io/badge/pending-red) | ![Rey-Osterrieth: Feature Extraction Pending](https://img.shields.io/badge/pending-red) | ![Rey-Osterrieth: Visualization Pending](https://img.shields.io/badge/pending-red) |
+| Alphabetic Writing | ![Alphabetic Writing: Preprocessing Pending](https://img.shields.io/badge/pending-red) | ![Alphabetic Writing: Feature Extraction Pending](https://img.shields.io/badge/pending-red) | ![Alphabetic Writing: Visualization Pending](https://img.shields.io/badge/pending-red) |
+| Digit Symbol Substitution | ![Digit Symbol Substitution: Preprocessing Pending](https://img.shields.io/badge/pending-red) | ![Digit Symbol Substitution: Feature Extraction Pending](https://img.shields.io/badge/pending-red) | ![Digit Symbol Substitution: Visualization Pending](https://img.shields.io/badge/pending-red) |
+| Trails Making | ![Trails Making: Preprocessing Pending](https://img.shields.io/badge/pending-red) | ![Trails Making: Feature Extraction Pending](https://img.shields.io/badge/pending-red) | ![Trails Making: Visualization Pending](https://img.shields.io/badge/pending-red) |
+
+## Data Format Requirements
+
+When exporting drawing data from Curious, you typically receive the following files:
+
+- **report.csv**: Contains the participants' actual responses.
+- **activity_user_journey.csv**: Logs the entire journey through the activity, including button actions like "Next", "Skip", "Back", and "Undo", regardless of whether a response was provided.
+- **drawing-responses-{date}.zip**: A ZIP archive with raw drawing response CSV files for each participant (e.g., `drawing-responses-Mon May 29 2023.zip`).
+- **media-responses-{date}.zip**: A ZIP archive containing SVG files for the drawing responses (e.g., `media-responses-Mon May 29 2023.zip`).
+- **trails-responses-{date}.zip**: A ZIP archive with raw trail making response CSV files (if there are any) for each participant (e.g., `trails-responses-Mon May 29 2023.zip`).
+
+For Spiral tasks, the toolkit uses only the CSV files from the drawing responses ZIP. Support for additional tasks will be added in future releases.
+
+### File Naming Convention
 
-# Or work with the DataFrame directly
-print(f"Mean duration across all files: {features_df['duration'].astype(float).mean()}")
-print(f"Spiral with highest linear velocity: {features_df['linear_velocity_median'].astype(float).idxmax()}")
+Your spiral data files must follow this naming convention:
 
-# Easy filtering and grouping by metadata
-print(f"Files with dominant hand: {len(features_df[features_df['hand'] == 'Dom'])}")
+```text
+[5123456]a7f3b2e9-d4c8-f1a6-e5b9-c2d7f8a3e6b4-spiral_trace1_Dom.csv
 ```
 
-For detailed configuration options and additional parameters, refer to the [`run_pipeline` documentation](https://childmindresearch.github.io/graphomotor/graphomotor/core/orchestrator.html#run_pipeline).
+Where:
+
+- **Participant ID**: Must be enclosed in brackets `[]` and be a 7-digit number starting with `5` (e.g., `[5123456]`) that matches the `target_secret_id` column in the **report.csv** file.
+- **Activity Submission ID**: Must be a 32-character hexadecimal string (e.g., `18f2-45ea-a1e4-2334e07cc706`) that matches the `id` column in the **report.csv** file.
+- **Task**: Must be one of the following that matches the `item` column in the **report.csv** file:
+  - `spiral_trace1_Dom` through `spiral_trace5_Dom` (dominant hand tracing tasks)
+  - `spiral_trace1_NonDom` through `spiral_trace5_NonDom` (non-dominant hand tracing tasks)
+  - `spiral_recall1_Dom` through `spiral_recall3_Dom` (dominant hand recall tasks)
+  - `spiral_recall1_NonDom` through `spiral_recall3_NonDom` (non-dominant hand recall tasks)
+
+### Data Format
+
+Your spiral data CSV file must contain the following columns:
+
+```text
+line_number, x, y, UTC_Timestamp, seconds, epoch_time_in_seconds_start
+```
+
+This format represents the standard output from [Curious drawing responses data dictionary](https://mindlogger.atlassian.net/servicedesk/customer/portal/3/article/596082739).
 
 ## Future Directions
 
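The file naming convention in the README diff above can be checked mechanically. The following is a minimal sketch, not part of `graphomotor`: `SPIRAL_FILENAME` and `parse_spiral_filename` are hypothetical names, and the pattern assumes the submission ID is a lowercase, hyphenated UUID (32 hexadecimal digits in 8-4-4-4-12 groups), as in the example filename.

```python
from __future__ import annotations

import re

# Hypothetical pattern (not from graphomotor) for the convention quoted in the README:
# [participant_id]submission_id-task.csv
SPIRAL_FILENAME = re.compile(
    r"^\[(?P<participant_id>5\d{6})\]"  # 7 digits starting with 5, in brackets
    r"(?P<submission_id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})-"  # assumed UUID shape
    r"(?P<task>spiral_(?:trace[1-5]|recall[1-3])_(?P<hand>Dom|NonDom))"
    r"\.csv$"
)


def parse_spiral_filename(filename: str) -> dict[str, str] | None:
    """Return the filename's components, or None if it breaks the convention."""
    match = SPIRAL_FILENAME.match(filename)
    return match.groupdict() if match else None


example = "[5123456]a7f3b2e9-d4c8-f1a6-e5b9-c2d7f8a3e6b4-spiral_trace1_Dom.csv"
parsed = parse_spiral_filename(example)
```

Here `parsed["task"]` holds the full item name and `parsed["hand"]` the `Dom`/`NonDom` suffix; a filename with a participant ID that does not start with `5`, or with `recall4`, yields `None`.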

src/graphomotor/core/orchestrator.py

Lines changed: 4 additions & 4 deletions
@@ -146,13 +146,13 @@ def export_features_to_csv(
         results_df.to_csv(output_file)
         logger.debug(f"Features saved successfully to {output_file}")
     except Exception as e:
-        logger.error(f"Failed to save features to {output_file}: {str(e)}")
+        logger.warning(f"Failed to save features to {output_file}: {str(e)}")
 
 
 def _run_file(
     input_path: pathlib.Path,
     feature_categories: list[FeatureCategories],
-    spiral_config: config.SpiralConfig | None,
+    spiral_config: config.SpiralConfig,
 ) -> dict[str, str]:
     """Process a single file for feature extraction.
 
@@ -180,7 +180,7 @@ def _run_file(
 def _run_directory(
     input_path: pathlib.Path,
     feature_categories: list[FeatureCategories],
-    spiral_config: config.SpiralConfig | None,
+    spiral_config: config.SpiralConfig,
 ) -> list[dict[str, str]]:
     """Process all CSV files in a directory and its subdirectories.
 
@@ -228,7 +228,7 @@ def _run_directory(
             results.append(features)
             logger.debug(f"Successfully processed {csv_file.name}")
         except Exception as e:
-            logger.warning(f"Failed to process {csv_file.name}: {str(e)}")
+            logger.error(f"Failed to process {csv_file.name}: {str(e)}")
             failed_files.append(csv_file.name)
             continue
 
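The last hunk above shows a log-and-continue pattern: a file that fails is logged at warning level, recorded in `failed_files`, and skipped so the rest of the batch still runs. A self-contained sketch of that pattern follows. It is an illustration only; `process_batch` and `fake_extract` are hypothetical stand-ins, not functions from `orchestrator.py`.

```python
import logging

logger = logging.getLogger("batch_sketch")


def process_batch(file_names, process_one):
    """Apply process_one to each name; log and skip failures instead of aborting."""
    results, failed_files = [], []
    for name in file_names:
        try:
            results.append(process_one(name))
        except Exception as e:
            # Logged at WARNING level, mirroring the change in the diff;
            # the batch carries on with the remaining files.
            logger.warning(f"Failed to process {name}: {e}")
            failed_files.append(name)
            continue
    return results, failed_files


def fake_extract(name):
    """Hypothetical stand-in for feature extraction; rejects one bad file."""
    if name == "bad.csv":
        raise ValueError("missing required columns")
    return {"source_file": name}


results, failed_files = process_batch(["a.csv", "bad.csv", "b.csv"], fake_extract)
```

After the call, `results` holds the two successful entries and `failed_files` names the one that was skipped, so a caller can report partial success rather than losing the whole batch.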

tests/unit/test_orchestrator.py

Lines changed: 0 additions & 3 deletions
@@ -66,9 +66,6 @@ def test_validate_feature_categories_mixed(caplog: pytest.LogCaptureFixture) ->
     assert "meaning_of_life" in caplog.text
 
 
-# Tests for extract_features()
-
-
 @pytest.mark.parametrize(
     "feature_categories, expected_feature_number",
     [
