Commit f66a787

Merge pull request #36 from childmindresearch/alperkent/issue28
feature/issue-28/implement-batch-processing
2 parents ba62c7b + 3e5783e commit f66a787

12 files changed

Lines changed: 578 additions & 356 deletions

README.md

Lines changed: 128 additions & 57 deletions
@@ -11,11 +11,136 @@ A Python toolkit for analysis of graphomotor data collected via Curious.
1111
[![LGPL--2.1 License](https://img.shields.io/badge/license-LGPL--2.1-blue.svg)](https://github.com/childmindresearch/graphomotor/blob/main/LICENSE)
1212
[![Documentation](https://img.shields.io/badge/api-docs-blue)](https://childmindresearch.github.io/graphomotor)
1313

14-
Welcome to `graphomotor`, a specialized Python library for analyzing graphomotor data collected via [Curious](https://www.gettingcurious.com/). This toolkit provides comprehensive tools for processing, analyzing, and visualizing data from various graphomotor assessment tasks including spiral drawing, trails making, alphabetic writing, digit symbol substitution, and the Rey-Osterrieth Complex Figure Test.
14+
Welcome to `graphomotor`, a specialized Python library for analyzing graphomotor data collected via [Curious](https://www.gettingcurious.com/). This toolkit aims to provide comprehensive tools for processing, analyzing, and visualizing data from various graphomotor assessment tasks, including spiral drawing, trails making, alphabetic writing, digit symbol substitution, and the Rey-Osterrieth Complex Figure Test.
1515

16-
## Development Progress
16+
> ⚠️ **This package is under active development.** Currently, the focus is on the spiral drawing task. After finalizing feature extraction, the next steps will involve implementing both preprocessing and visualization for this task. Once these parts are in place, we plan to extend support to other tasks.
17+
18+
## Feature Extraction Capabilities
19+
20+
The toolkit extracts clinically relevant metrics from digitized drawing data. Currently implemented features include:
21+
22+
- **Temporal Features**: Task completion duration.
23+
- **Velocity Features**: Velocity analysis including linear, radial, and angular velocity components with statistical measures (sum, median, variation, skewness, kurtosis).
24+
- **Distance Features**: Spatial accuracy measurements using Hausdorff distance metrics with temporal normalizations and segment-specific analysis.
25+
- **Drawing Error Features**: Area under the curve (AUC) calculations between drawn paths and ideal reference trajectories to quantify spatial accuracy.
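To make the Distance Features entry above more concrete, here is a minimal sketch of a symmetric Hausdorff distance between a drawn path and a reference path. It is not taken from the `graphomotor` source: the function names and sample points are hypothetical, and the library's actual implementation (including its temporal normalizations and segment handling) may differ.

```python
import math

Point = tuple[float, float]


def directed_hausdorff(a: list[Point], b: list[Point]) -> float:
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a: list[Point], b: list[Point]) -> float:
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical sample data: a drawn path that deviates slightly from a straight reference.
drawn = [(0.0, 0.0), (1.0, 0.2), (2.0, 0.0)]
reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

print(hausdorff(drawn, reference))
```

The brute-force form is quadratic in the number of points, which is acceptable for illustration; for long recordings a spatial index or a library routine would likely be preferable.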
26+
27+
## Installation
28+
29+
Install the graphomotor package from PyPI:
30+
31+
```sh
32+
pip install graphomotor
33+
```
34+
35+
Or install the latest development version directly from GitHub:
36+
37+
```sh
38+
pip install git+https://github.com/childmindresearch/graphomotor
39+
```
40+
41+
## Quick Start
42+
43+
> ⚠️ This library **requires input data to adhere to a specific format** matching the standard output from [Curious drawing responses](https://mindlogger.atlassian.net/servicedesk/customer/portal/3/article/859242501). See more details in the [Data Format Requirements](#data-format-requirements) section below.
44+
45+
### Extracting Features from Spiral Drawing Data
46+
47+
#### Single File Processing
48+
49+
```python
50+
from graphomotor.core import orchestrator
51+
52+
# Path to your spiral drawing data file
53+
input_file = "path/to/your/spiral_data.csv"
54+
55+
# Option 1: Process file without saving any CSV file
56+
# Only return the DataFrame with extracted features
57+
features_df = orchestrator.run_pipeline(
58+
    input_path=input_file
59+
)
60+
61+
# Features are returned as a pandas DataFrame with source file as index
62+
print(f"Extracted features: {list(features_df.columns)}")
63+
64+
# Access the single file's data (features_df has one row)
65+
file_path = features_df.index[0]
66+
print(f"File: {file_path}")
67+
print(f"Participant: {features_df.loc[file_path, 'participant_id']}")
68+
print(f"Task: {features_df.loc[file_path, 'task']}")
69+
print(f"Hand: {features_df.loc[file_path, 'hand']}")
70+
print(f"Duration: {features_df.loc[file_path, 'duration']}")
71+
```
72+
73+
```python
74+
# Option 2: Save to a directory with auto-generated filename
75+
# Creates a CSV file with auto-generated name in the specified directory
76+
# Format: {participant_id}_{task}_{hand}_features_{YYYYMMDD_HHMM}.csv
77+
features_df = orchestrator.run_pipeline(
78+
    input_path=input_file,
79+
    output_path="path/to/output/directory"
80+
)
81+
```
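The auto-generated filename pattern in Option 2 can be illustrated with a short sketch. The helper name `build_feature_filename` and the sample participant, task, and hand values are hypothetical, and the `%Y%m%d_%H%M` format string is an assumed reading of `{YYYYMMDD_HHMM}`; the library may build the name differently.

```python
from datetime import datetime


def build_feature_filename(participant_id: str, task: str, hand: str, when: datetime) -> str:
    """Build a name following {participant_id}_{task}_{hand}_features_{YYYYMMDD_HHMM}.csv."""
    stamp = when.strftime("%Y%m%d_%H%M")  # assumed mapping of YYYYMMDD_HHMM
    return f"{participant_id}_{task}_{hand}_features_{stamp}.csv"


# Hypothetical values, chosen only to show the resulting shape of the name.
example = build_feature_filename("5123456", "spiral_trace1", "Dom", datetime(2025, 1, 31, 9, 5))
print(example)  # 5123456_spiral_trace1_Dom_features_20250131_0905.csv
```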
82+
83+
```python
84+
# Option 3: Save to a specific CSV file
85+
# Features will be saved to the specified file path
86+
features_df = orchestrator.run_pipeline(
87+
    input_path=input_file,
88+
    output_path="path/to/features.csv"
89+
)
90+
```
91+
92+
#### Batch Processing
93+
94+
```python
95+
from graphomotor.core import orchestrator
96+
97+
# Path to directory containing multiple spiral drawing data files
98+
input_dir = "path/to/your/spiral_data_directory"
99+
100+
# Option 1: Process files without saving any CSV files
101+
# Only return the DataFrame with extracted features
102+
features_df = orchestrator.run_pipeline(
103+
    input_path=input_dir,
104+
)
105+
106+
# Features are returned as a pandas DataFrame with source files as index
107+
# Columns include: participant_id, task, hand, start_time, and calculated features
108+
print(f"Successfully processed {len(features_df)} files")
109+
110+
# Access metadata and features for each processed file
111+
for file_path in features_df.index:
112+
    print(f"File: {file_path}")
113+
    print(f"Participant: {features_df.loc[file_path, 'participant_id']}")
114+
    print(f"Task: {features_df.loc[file_path, 'task']}")
115+
    print(f"Hand: {features_df.loc[file_path, 'hand']}")
116+
    print(f"Duration: {features_df.loc[file_path, 'duration']}")
17117

18-
⚠️ **This package is under active development.** Currently, the focus is on the Spiral task. After finalizing feature extraction, the next steps will involve implementing both preprocessing and visualization for this task. Once these parts are in place, we plan to extend support to other tasks.
118+
```
119+
120+
```python
121+
# Option 2: Save to a directory with auto-generated filename
122+
# Creates a single consolidated CSV file with auto-generated name
123+
# Format: batch_features_{YYYYMMDD_HHMM}.csv
124+
features_df = orchestrator.run_pipeline(
125+
    input_path=input_dir,
126+
    output_path="path/to/output/directory"
127+
)
128+
```
129+
130+
```python
131+
# Option 3: Save to a specific CSV file (single consolidated file)
132+
# All features will be written to one specified file
133+
features_df = orchestrator.run_pipeline(
134+
    input_path=input_dir,
135+
    output_path="path/to/consolidated_features.csv"
136+
)
137+
```
138+
139+
Currently, `graphomotor` is available as an importable Python library. CLI functionality is planned for future releases.
140+
141+
For detailed configuration options and additional parameters, refer to the [`run_pipeline` documentation](https://childmindresearch.github.io/graphomotor/graphomotor/core/orchestrator.html#run_pipeline).
142+
143+
## Development Progress
19144

20145
| Task | Preprocessing | Feature Extraction | Visualization |
21146
| :--- | :---: | :---: | :---: |
@@ -27,8 +152,6 @@ Welcome to `graphomotor`, a specialized Python library for analyzing graphomotor
27152

28153
## Data Format Requirements
29154

30-
⚠️ **This implementation requires data to adhere to a specific format matching the standard output from [Curious drawing responses](https://mindlogger.atlassian.net/servicedesk/customer/portal/3/article/859242501).**
31-
32155
When exporting drawing data from Curious, you typically receive the following files:
33156

34157
- **report.csv**: Contains the participants' actual responses.
@@ -67,58 +190,6 @@ line_number, x, y, UTC_Timestamp, seconds, epoch_time_in_seconds_start
67190

68191
This format matches the standard output described in the [Curious drawing responses data dictionary](https://mindlogger.atlassian.net/servicedesk/customer/portal/3/article/596082739).
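As a rough illustration of the format requirement, the sketch below checks a CSV header against the column names shown in the hunk context above (`line_number, x, y, UTC_Timestamp, seconds, epoch_time_in_seconds_start`). The `missing_columns` helper and the sample row are hypothetical; this is not how `graphomotor` itself validates input, only one way a file could be pre-checked before calling `run_pipeline`.

```python
import csv
import io

# Column names taken from the README excerpt; treat the exact list as an assumption.
EXPECTED_COLUMNS = [
    "line_number",
    "x",
    "y",
    "UTC_Timestamp",
    "seconds",
    "epoch_time_in_seconds_start",
]


def missing_columns(csv_text: str) -> list[str]:
    """Return the expected column names that are absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    header = [name.strip() for name in next(reader, [])]
    return [column for column in EXPECTED_COLUMNS if column not in header]


# Hypothetical sample content with a complete header and one made-up row.
sample = (
    "line_number, x, y, UTC_Timestamp, seconds, epoch_time_in_seconds_start\n"
    "0, 50.0, 50.0, 1700000000.0, 0.0, 1700000000.0\n"
)

print(missing_columns(sample))        # []
print(missing_columns("x, y\n1, 2\n"))
```

An empty list means every expected column is present; anything else names the columns to fix before processing.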
69192

70-
## Feature Extraction Capabilities
71-
72-
The toolkit extracts clinically relevant metrics from digitized drawing data. Currently implemented features include:
73-
74-
- **Temporal Features**: Task completion duration.
75-
- **Velocity Features**: Velocity analysis including linear, radial, and angular velocity components with statistical measures (sum, median, variation, skewness, kurtosis).
76-
- **Distance Features**: Spatial accuracy measurements using Hausdorff distance metrics with temporal normalizations and segment-specific analysis.
77-
- **Drawing Error Features**: Area under the curve (AUC) calculations between drawn paths and ideal reference trajectories to quantify spatial accuracy.
78-
79-
## Installation
80-
81-
Install the graphomotor package from PyPI:
82-
83-
```sh
84-
pip install graphomotor
85-
```
86-
87-
Or install the latest development version directly from GitHub:
88-
89-
```sh
90-
pip install git+https://github.com/childmindresearch/graphomotor
91-
```
92-
93-
## Quick Start
94-
95-
Currently, `graphomotor` is available as an importable Python library. CLI functionality is planned for future releases.
96-
97-
### Extracting Features from Spiral Drawing Data
98-
99-
```python
100-
from graphomotor.core import orchestrator
101-
102-
# Path to your spiral drawing data file
103-
input_file = "path/to/your/spiral_data.csv"
104-
105-
# Directory where extracted features will be saved
106-
output_dir = "path/to/output/directory"
107-
108-
# Run the analysis pipeline
109-
features = orchestrator.run_pipeline(
110-
    input_path=input_file,
111-
    output_path=output_dir
112-
)
113-
114-
# Features are returned as a dictionary and saved as CSV
115-
print(f"Successfully extracted {len(features)} feature categories")
116-
```
117-
118-
For detailed configuration options and additional parameters, refer to the [`run_pipeline` documentation](https://childmindresearch.github.io/graphomotor/graphomotor/core/orchestrator.html#run_pipeline).
119-
120-
> **Note:** Currently, only single file processing is supported, with batch processing planned for future releases.
121-
122193
## Future Directions
123194

124195
The Graphomotor Study Toolkit is under active development. For more detailed information about upcoming features and development plans, please refer to our [GitHub Issues](https://github.com/childmindresearch/graphomotor/issues) page.

pyproject.toml

Lines changed: 2 additions & 1 deletion
@@ -13,7 +13,8 @@ dependencies = [
1313
"pandas>=2.2.3",
1414
"pydantic>=2.11.1",
1515
"scipy>=1.15.2",
16-
"shapely>=2.1.0"
16+
"shapely>=2.1.0",
17+
"tqdm>=4.66.0"
1718
]
1819

1920
[dependency-groups]

src/graphomotor/core/models.py

Lines changed: 2 additions & 1 deletion
@@ -18,7 +18,8 @@ class Spiral(pydantic.BaseModel):
1818
- id: Unique identifier for the participant,
1919
- hand: Hand used ('Dom' for dominant, 'NonDom' for non-dominant),
2020
- task: Task name,
21-
- start_time: Start time of drawing.
21+
- start_time: Start time of drawing,
22+
- source_path: Path to the source CSV file.
2223
"""
2324

2425
model_config = pydantic.ConfigDict(arbitrary_types_allowed=True)
