Commit 18fb47a

Merge branch 'main' into paper/JOSS-submission
2 parents e989880 + 2165b3e commit 18fb47a

19 files changed

Lines changed: 326 additions & 127 deletions

README.md

Lines changed: 11 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -36,7 +36,7 @@ The main processing pipeline of the wristpy module can be described as follows:
3636
- ***Data imputation*** In the special case when dealing with the Actigraph `idle_sleep_mode == enabled`, the gaps in acceleration are filled in after calibration, to avoid biasing the calibration phase.
3737
- **Metrics Calculation**: Calculates various activity metrics on the calibrated data, namely ENMO (Euclidean norm, minus one), MAD (mean amplitude deviation) <sup>1</sup>, Actigraph activity counts<sup>2</sup>, MIMS (monitor-independent movement summary) unit <sup>3</sup>, and angle-Z (angle of acceleration relative to the *x-y* plane).
3838
- **Non-wear detection**: We find periods of non-wear based on the acceleration data. Specifically, the standard deviation of the acceleration values in a given time window, along each axis, is used as a threshold to decide `wear` or `not wear`. Additionally, we can use the temperature sensor, when available, to augment the acceleration data. This is used in the CTA (combined temperature and acceleration) algorithm <sup>4</sup>, and in the `skdh` DETACH algorithm <sup>5</sup>. Furthermore, ensemble classification of non-wear periods is possible by providing a list (of any length) of non-wear algorithm options.
39-
- **Sleep Detection**: Using the HDCZ<sup>6</sup> and HSPT<sup>7</sup> algorithms to analyze changes in arm angle we are able to find periods of sleep. We find the sleep onset-wakeup times for all sleep windows detected. Any sleep periods that overlap with detected non-wear times are removed, and any remaining sleep periods shorter than 15 minutes (default value) are removed.
39+
- **Sleep Detection**: Using the HDCZ<sup>6</sup> and HSPT<sup>7</sup> algorithms to analyze changes in arm angle we are able to find periods of sleep. We find the sleep onset-wakeup times for all sleep windows detected. Any sleep periods that overlap with detected non-wear times are removed, and any remaining sleep periods shorter than 15 minutes (default value) are removed. Additionally, the SIB (sustained inactivity bouts) and the SPT (sleep period time) windows are provided as part of the output to aid in sleep metric post-processing.
4040
- **Physical activity levels**: Using the chosen physical activity metric (aggregated into time bins, 5 second default) we compute activity levels into the following categories: [`inactive`, `light`, `moderate`, `vigorous`]. The threshold values can be defined by the user, while the default values are chosen based on the specific activity metric and the values found in the literature <sup>8-10</sup>.
4141
- **Data output**: The output results can be saved in `.csv` or `.parquet` data formats, with the run-time configuration parameters saved in a `.json` dictionary.
4242
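Reviewer note on the sleep-detection bullet: the two filtering rules it describes (drop sleep periods that overlap non-wear, then drop anything shorter than 15 minutes) can be sketched in plain Python. This is only an illustration, not wristpy's implementation: `filter_sleep_windows` and `overlaps` are hypothetical names, windows are assumed to be `(onset, wakeup)` tuples, and an overlapping window is assumed to be dropped whole rather than trimmed.

```python
from datetime import datetime, timedelta


def overlaps(a_start, a_end, b_start, b_end):
    """Return True when two half-open intervals [start, end) share any time."""
    return a_start < b_end and b_start < a_end


def filter_sleep_windows(sleep_windows, nonwear_windows,
                         min_duration=timedelta(minutes=15)):
    """Hypothetical helper: apply the two removal rules from the README bullet."""
    kept = []
    for onset, wakeup in sleep_windows:
        # Rule 1: discard windows that overlap any non-wear period
        # (assumption: the whole window is dropped, not trimmed).
        if any(overlaps(onset, wakeup, nw_start, nw_end)
               for nw_start, nw_end in nonwear_windows):
            continue
        # Rule 2: discard remaining windows shorter than the minimum duration.
        if wakeup - onset < min_duration:
            continue
        kept.append((onset, wakeup))
    return kept


night = datetime(2024, 1, 1, 22, 0)
sleep_windows = [
    (night, night + timedelta(hours=7)),  # long, no non-wear overlap
    (night + timedelta(hours=10), night + timedelta(hours=10, minutes=5)),  # too short
    (night + timedelta(hours=14), night + timedelta(hours=15)),  # overlaps non-wear
]
nonwear_windows = [
    (night + timedelta(hours=14, minutes=30), night + timedelta(hours=16)),
]
filtered = filter_sleep_windows(sleep_windows, nonwear_windows)
```

Only the first window survives both rules in this toy input.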

@@ -102,8 +102,11 @@ results = orchestrator.run(
102102
physical_activity_metric = results.physical_activity_metric
103103
anglez = results.anglez
104104
physical_activity_levels = results.physical_activity_levels
105-
nonwear_array = results.nonwear_epoch
106-
sleep_windows = results.sleep_windows_epoch
105+
nonwear_array = results.nonwear_status
106+
sleep_windows = results.sleep_status
107+
sib_periods = results.sib_periods
108+
spt_periods = results.spt_periods
109+
107110
```
108111
#### Running entire directories:
109112
```Python
@@ -131,8 +134,11 @@ subject1 = results_dict['subject1']
131134
physical_activity_metric = subject1.physical_activity_metric
132135
anglez = subject1.anglez
133136
physical_activity_levels = subject1.physical_activity_levels
134-
nonwear_array = subject1.nonwear_epoch
135-
sleep_windows = subject1.sleep_windows_epoch
137+
nonwear_array = subject1.nonwear_status
138+
sleep_windows = subject1.sleep_status
139+
sib_periods = subject1.sib_periods
140+
spt_periods = subject1.spt_periods
141+
136142
```
137143

138144
### Using Wristpy Through Docker

dockerfile

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
FROM python:3.11-buster
1+
FROM python:3.11-bookworm
22

33
WORKDIR /app
44
COPY . /app/

docs/wristpy_tutorial.md

Lines changed: 25 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,7 @@ results = orchestrator.run(
3333
This runs the processing pipeline with all the default arguments, creates an output `.csv` file and a `.json` file with the pipeline configuration parameters, and creates a `results` object that contains the various output metrics (namely, the specified physical activity metric, angle-z, physical activity classification values, non-wear status, and sleep status).
3434

3535

36-
The orchestrator can also process entire directories. The call to the orchestrator remains largely the same but now output is expected to be a directory and the desired filetype for the saved files **must** be specified:
36+
The orchestrator can also process entire directories. The call to the orchestrator remains largely the same, but now the output is expected to be a directory, and the desired filetype for the saved files can be specified through the `output_filetype` argument (default value is ".csv"):
3737

3838
```python
3939
from wristpy.core import orchestrator
@@ -45,6 +45,28 @@ results = orchestrator.run(
4545
)
4646
```
4747

48+
If users would prefer to process specific files instead of entire directories, we recommend looping through a list of file names. The following code snippet will save the results objects into a dictionary, and the output files into the desired directory:
49+
50+
```python
51+
from wristpy.core import orchestrator
52+
import pathlib
53+
54+
file_path = pathlib.Path("/path/to/data/")
55+
output_dir = pathlib.Path("/path/to/save/dir/")
56+
file_names = [pathlib.Path("file1.gt3x"), pathlib.Path("file2.gt3x"), pathlib.Path("file3.gt3x")]
57+
results_dict = {}
58+
59+
for file in file_names:
60+
input_path = file_path / file
61+
output_path = output_dir / (file.stem + ".csv")
62+
result = orchestrator.run(
63+
input=input_path,
64+
output=output_path,
65+
)
66+
results_dict[file.stem] = result
67+
```
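Reviewer note on deriving per-file output paths in a loop like this one: with `pathlib`, the `/` operator binds tighter than `+`, so the file name has to be assembled before it is joined onto the directory. The sketch below uses only the standard library; the paths are placeholders.

```python
import pathlib

output_dir = pathlib.Path("/path/to/save/dir")
file = pathlib.Path("file1.gt3x")

# Build the file name first, then join it onto the directory.
by_name = output_dir / (file.stem + ".csv")

# Equivalent for simple names: join first, then swap the suffix.
by_suffix = (output_dir / file.name).with_suffix(".csv")

# Without parentheses Python evaluates (output_dir / file.stem) + ".csv",
# and a Path cannot be added to a str.
try:
    output_dir / file.stem + ".csv"
    raised = False
except TypeError:
    raised = True
```

Both forms give the same path here; `with_suffix` replaces only the last suffix, so the two can differ for stems that themselves contain dots.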
68+
69+
4870

4971

5072
We can visualize some of the outputs within the `results` object, directly, with the following scripts:
@@ -74,6 +96,8 @@ We can also view and process these outputs from the saved `.csv` output file:
7496
import polars as pl
7597
import matplotlib.pyplot as plt
7698

99+
output_results = pl.read_csv('path/to/save/file_name.csv', try_parse_dates=True)
100+
77101
activity_mapping = {
78102
"inactive": 0,
79103
"light": 1,

src/wristpy/core/cli.py

Lines changed: 18 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@
66

77
import typer
88

9-
from wristpy.core import config
9+
from wristpy.core import config, exceptions
1010

1111
logger = config.get_logger()
1212
app = typer.Typer(
@@ -74,11 +74,10 @@ def main(
7474
help="Path where data will be saved. Supports .csv and .parquet formats.",
7575
),
7676
output_filetype: OutputFileType = typer.Option(
77-
None,
77+
".csv",
7878
"-O",
7979
"--output-filetype",
80-
help="Format for save files when processing directories. "
81-
"Leave as None when processing single files.",
80+
help="Format for save files when processing directories. ",
8281
),
8382
calibrator: Calibrator = typer.Option(
8483
Calibrator.none,
@@ -150,17 +149,21 @@ def main(
150149
calibrator_value = None if calibrator == Calibrator.none else calibrator.value
151150

152151
logger.debug("Running wristpy. arguments given: %s", locals())
153-
orchestrator.run(
154-
input=input,
155-
output=output,
156-
calibrator=calibrator_value,
157-
activity_metric=activity_metric.value,
158-
thresholds=None if thresholds is None else thresholds,
159-
epoch_length=epoch_length,
160-
nonwear_algorithm=nonwear_algorithms, # type: ignore[arg-type] # Covered by NonwearAlgorithm Enum class
161-
verbosity=log_level,
162-
output_filetype=output_filetype.value if output_filetype else None, # type: ignore[arg-type] # Covered by OutputFileType Enum class
163-
)
152+
try:
153+
orchestrator.run(
154+
input=input,
155+
output=output,
156+
calibrator=calibrator_value,
157+
activity_metric=activity_metric.value,
158+
thresholds=None if thresholds is None else thresholds,
159+
epoch_length=epoch_length,
160+
nonwear_algorithm=nonwear_algorithms, # type: ignore[arg-type] # Covered by NonwearAlgorithm Enum class
161+
verbosity=log_level,
162+
output_filetype=output_filetype.value,
163+
)
164+
except exceptions.EmptyDirectoryError as e:
165+
typer.echo(f"Error: {e}", err=True)
166+
raise typer.Exit(1)
164167

165168

166169
if __name__ == "__main__":
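Reviewer note on the new `try`/`except` in `main`: the pattern is to catch a domain-specific error at the CLI boundary, print a short message to stderr, and exit non-zero instead of showing a traceback. A minimal standard-library sketch of the same idea follows; `EmptyDirectoryError` and `run_pipeline` here are hypothetical stand-ins for the wristpy names, and a returned exit code stands in for `typer.Exit(1)`.

```python
import sys


class EmptyDirectoryError(Exception):
    """Hypothetical stand-in for wristpy's exceptions.EmptyDirectoryError."""


def run_pipeline(files):
    """Hypothetical stand-in for orchestrator.run."""
    if not files:
        raise EmptyDirectoryError("Directory contains no .gt3x or .bin files.")
    return len(files)


def main(files):
    """Return a process exit code instead of letting the traceback escape."""
    try:
        run_pipeline(files)
    except EmptyDirectoryError as e:
        # Short, user-facing message on stderr; non-zero code signals failure.
        print(f"Error: {e}", file=sys.stderr)
        return 1
    return 0


exit_code_empty = main([])
exit_code_ok = main(["subject1.gt3x"])
```

Catching only the one expected exception keeps genuine bugs visible as tracebacks.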

src/wristpy/core/exceptions.py

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -40,3 +40,9 @@ class InvalidFileTypeError(LoggedException):
4040
"""Wristpy did not expect this file extension."""
4141

4242
pass
43+
44+
45+
class EmptyDirectoryError(LoggedException):
46+
"""No .gt3x or .bin files were found in the directory."""
47+
48+
pass
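Reviewer note on how this exception is raised: the orchestrator hunk in this commit collects `*.gt3x` and `*.bin` files with `itertools.chain` and raises `EmptyDirectoryError` when none are found. The sketch below reproduces that check in isolation. It is simplified on purpose: the class derives from plain `Exception` rather than `LoggedException` (whose behaviour is not shown in this diff), and `find_input_files` is a hypothetical helper name.

```python
import itertools
import pathlib
import tempfile


class EmptyDirectoryError(Exception):
    """Simplified stand-in: plain Exception instead of LoggedException."""


def find_input_files(directory):
    """Hypothetical helper: list supported files or raise if there are none."""
    files = sorted(
        itertools.chain(directory.glob("*.gt3x"), directory.glob("*.bin"))
    )
    if not files:
        raise EmptyDirectoryError(
            f"Directory {directory} contains no .gt3x or .bin files."
        )
    return files


with tempfile.TemporaryDirectory() as tmp:
    tmp_path = pathlib.Path(tmp)

    # An empty directory triggers the custom error.
    try:
        find_input_files(tmp_path)
        empty_raised = False
    except EmptyDirectoryError:
        empty_raised = True

    # Unsupported extensions are ignored; supported ones are returned.
    (tmp_path / "subject1.gt3x").touch()
    (tmp_path / "notes.txt").touch()
    found = [p.name for p in find_input_files(tmp_path)]
```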

src/wristpy/core/orchestrator.py

Lines changed: 41 additions & 37 deletions
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@
1313
calibration,
1414
idle_sleep_mode_imputation,
1515
metrics,
16-
nonwear_utils,
16+
processing_utils,
1717
)
1818

1919
logger = config.get_logger()
@@ -33,7 +33,7 @@ def run(
3333
activity_metric: Literal["enmo", "mad", "ag_count", "mims"] = "enmo",
3434
nonwear_algorithm: Sequence[Literal["ggir", "cta", "detach"]] = ["ggir"],
3535
verbosity: int = logging.WARNING,
36-
output_filetype: Optional[Literal[".csv", ".parquet"]] = None,
36+
output_filetype: Literal[".csv", ".parquet"] = ".csv",
3737
) -> Union[writers.OrchestratorResults, Dict[str, writers.OrchestratorResults]]:
3838
"""Runs main processing steps for wristpy on single files, or directories.
3939
@@ -59,8 +59,8 @@ def run(
5959
activity_metric: The metric to be used for physical activity categorization.
6060
nonwear_algorithm: The algorithm to be used for nonwear detection.
6161
verbosity: The logging level for the logger.
62-
output_filetype: Specifies the data format for the save files. Must be None when
63-
processing files, must be a valid file type when processing directories.
62+
output_filetype: Specifies the data format for the save files. Only used when
63+
processing directories.
6464
6565
Returns:
6666
All calculated data in a save ready format as a Results object or as a
@@ -69,8 +69,7 @@ def run(
6969
Raises:
7070
ValueError: If the physical activity thresholds are not unique or not in
7171
ascending order.
72-
ValueError: If processing a file and the output_filetype is not None
73-
ValueError: If output is None but output_filetype is not None.
72+
7473
7574
References:
7675
[1] Hildebrand, M., et al. (2014). Age group comparability of raw accelerometer
@@ -105,13 +104,6 @@ def run(
105104
raise ValueError(message)
106105

107106
if input.is_file():
108-
if output_filetype is not None:
109-
raise ValueError(
110-
"When processing single files, output_filetype should be None - "
111-
"the file type will be determined from the output path."
112-
)
113-
logger.debug("Input is file, forwarding to run_file with output=%s", output)
114-
115107
return _run_file(
116108
input=input,
117109
output=output,
@@ -147,14 +139,13 @@ def _run_directory(
147139
epoch_length: float = 5,
148140
nonwear_algorithm: Sequence[Literal["ggir", "cta", "detach"]] = ["ggir"],
149141
verbosity: int = logging.WARNING,
150-
output_filetype: Optional[Literal[".csv", ".parquet"]] = None,
142+
output_filetype: Literal[".csv", ".parquet"] = ".csv",
151143
activity_metric: Literal["enmo", "mad", "ag_count", "mims"] = "enmo",
152144
) -> Dict[str, writers.OrchestratorResults]:
153145
"""Runs main processing steps for wristpy on directories.
154146
155147
The run_directory() function will execute the run_file() function on entire
156-
directories. The input and output (if any) paths must directories. An
157-
output_filetype must be specified if and only if an output is given. Output file
148+
directories. The input and output (if any) paths must be directories. Output file
158149
names will be derived from input file names.
159150
160151
@@ -192,9 +183,6 @@ def _run_directory(
192183
Activity: Retrospective Observational Data Analysis Study JMIR Mhealth Uhealth
193184
2022;10(7):e38077 URL: https://mhealth.jmir.org/2022/7/e38077 DOI: 10.2196/38077
194185
"""
195-
if output is None and output_filetype is not None:
196-
raise ValueError("If no output is given, output_filetype must be None.")
197-
198186
if output is not None:
199187
if output.is_file():
200188
raise ValueError(
@@ -209,12 +197,14 @@ def _run_directory(
209197
file_names = list(itertools.chain(input.glob("*.gt3x"), input.glob("*.bin")))
210198

211199
if not file_names:
212-
raise FileNotFoundError(f"Directory {input} contains no .gt3x or .bin files.")
200+
raise exceptions.EmptyDirectoryError(
201+
f"Directory {input} contains no .gt3x or .bin files."
202+
)
213203

214204
results_dict = {}
215205
for file in file_names:
216206
output_file_path = (
217-
output / pathlib.Path(file.stem).with_suffix(output_filetype) # type: ignore[arg-type] # if output is defined, so is output_filetype
207+
output / pathlib.Path(file.stem).with_suffix(output_filetype)
218208
if output
219209
else None
220210
)
@@ -302,15 +292,6 @@ def _run_file(
302292
if output is not None:
303293
writers.OrchestratorResults.validate_output(output=output)
304294

305-
parameters_dictionary = {
306-
"thresholds": list(thresholds),
307-
"calibrator": calibrator,
308-
"epoch_length": epoch_length,
309-
"activity_metric": activity_metric,
310-
"nonwear_algorithm": list(nonwear_algorithm),
311-
"input_file": str(input),
312-
}
313-
314295
if calibrator is not None and calibrator not in ["ggir", "gradient"]:
315296
msg = (
316297
f"Invalid calibrator: {calibrator}. Choose: 'ggir', 'gradient'. "
@@ -352,17 +333,14 @@ def _run_file(
352333
dynamic_range=watch_data.dynamic_range,
353334
)
354335

355-
sleep_detector = analytics.GgirSleepDetection(anglez)
356-
sleep_windows = sleep_detector.run_sleep_detection()
357-
358-
nonwear_array = nonwear_utils.get_nonwear_measurements(
336+
nonwear_array = processing_utils.get_nonwear_measurements(
359337
calibrated_acceleration=calibrated_acceleration,
360338
temperature=watch_data.temperature,
361339
non_wear_algorithms=nonwear_algorithm,
362340
)
363341

364-
nonwear_epoch = nonwear_utils.nonwear_array_cleanup(
365-
nonwear_array=nonwear_array,
342+
nonwear_epoch = processing_utils.synchronize_measurements(
343+
data_measurement=nonwear_array,
366344
reference_measurement=activity_measurement,
367345
epoch_length=epoch_length,
368346
)
@@ -371,14 +349,40 @@ def _run_file(
371349
activity_measurement, thresholds
372350
)
373351

352+
sleep_detector = analytics.GgirSleepDetection(anglez)
353+
sleep_parameters = sleep_detector.run_sleep_detection()
374354
sleep_array = analytics.sleep_cleanup(
375-
sleep_windows=sleep_windows, nonwear_measurement=nonwear_epoch
355+
sleep_windows=sleep_parameters.sleep_windows, nonwear_measurement=nonwear_epoch
356+
)
357+
spt_windows = analytics.sleep_bouts_cleanup(
358+
sleep_parameter=sleep_parameters.spt_windows,
359+
nonwear_measurement=nonwear_epoch,
360+
time_reference_measurement=activity_measurement,
361+
epoch_length=epoch_length,
376362
)
363+
sib_periods = analytics.sleep_bouts_cleanup(
364+
sleep_parameter=sleep_parameters.sib_periods,
365+
nonwear_measurement=nonwear_epoch,
366+
time_reference_measurement=activity_measurement,
367+
epoch_length=epoch_length,
368+
)
369+
370+
parameters_dictionary = {
371+
"thresholds": list(thresholds),
372+
"calibrator": calibrator,
373+
"epoch_length": epoch_length,
374+
"activity_metric": activity_metric,
375+
"nonwear_algorithm": list(nonwear_algorithm),
376+
"input_file": str(input),
377+
}
378+
377379
results = writers.OrchestratorResults(
378380
physical_activity_metric=activity_measurement,
379381
anglez=anglez,
380382
physical_activity_levels=physical_activity_levels,
381383
sleep_status=sleep_array,
384+
sib_periods=sib_periods,
385+
spt_periods=spt_windows,
382386
nonwear_status=nonwear_epoch,
383387
processing_params=parameters_dictionary,
384388
)

src/wristpy/io/writers/writers.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,8 @@ class OrchestratorResults(pydantic.BaseModel):
2323
physical_activity_levels: models.Measurement
2424
nonwear_status: models.Measurement
2525
sleep_status: models.Measurement
26+
sib_periods: models.Measurement
27+
spt_periods: models.Measurement
2628
processing_params: Optional[Dict[str, Any]] = None
2729

2830
def save_results(self, output: pathlib.Path) -> None:
