Commit 7dc1057

133 directory processing (#142)
* Changing Run to run directories, old run is now run_file.
* tests
* Directory processing with generators, adjusting old modules.
* testing and new data class
* Removing unused code, changing cli to match new dataclass
* error message change, using set for comparison
* renamed class to resultsdictionary.
* name change
* Using dict instead of custom dataclass
* doc string edits, writing in missing directories, removing directorynotfound error.
* ruff error
* Changes, and tutorial update
* errors, doc strings, additional tests.
* arg-type ignore
* variable assignment mistake
* Breaking up run_dir and run_file. Adding additional changes.
* Doc string, ruff error
* doc string
* doc strings
* Read me quick start update
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* final comment.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent d0e9226 commit 7dc1057

10 files changed

Lines changed: 429 additions & 112 deletions

README.md

Lines changed: 37 additions & 1 deletion
@@ -50,12 +50,19 @@ pip install wristpy
 ## Quick start
 
 ### Using Wristpy through the command-line:
+#### Run single files:
 ```sh
 wristpy /input/file/path.gt3x -o /save/path/file_name.csv -c gradient
 ```
 
+#### Run entire directories:
+```sh
+wristpy /path/to/files/input_dir -o /path/to/files/output_dir -c gradient -O .csv
+```
+
 ### Using Wristpy through a python script or notebook:
 
+#### Running single files:
 ```Python
 
 from wristpy.core import orchestrator
@@ -72,13 +79,42 @@ results = orchestrator.run(
     calibrator='gradient', # Choose between 'ggir', 'gradient', or 'none'
 )
 
-#Data availble in results object
+#Data available in results object
 enmo = results.enmo
 anglez = results.anglez
 physical_activity_levels = results.physical_activity_levels
 nonwear_array = results.nonwear_epoch
 sleep_windows = results.sleep_windows_epoch
 ```
+#### Running entire directories:
+```Python
+
+from wristpy.core import orchestrator
+
+# Define input file path and output location
+
+input_path = '/path/to/files/input_dir'
+output_path = '/path/to/files/output_dir'
+
+# Run the orchestrator
+# Specify the output file type, support for saving as .csv and .parquet
+results_dict = orchestrator.run(
+    input=input_path,
+    output=output_path,
+    calibrator='gradient', # Choose between 'ggir', 'gradient', or 'none'
+    output_filetype = '.csv'
+)
+
+
+#Data available in dictionry of results.
+subject1 = results_dict['subject1']
+
+enmo = subject1.enmo
+anglez = subject1.anglez
+physical_activity_levels = subject1.physical_activity_levels
+nonwear_array = subject1.nonwear_epoch
+sleep_windows = subject1.sleep_windows_epoch
+```
 
 ## References
 1. van Hees, V.T., Sabia, S., Jones, S.E. et al. Estimating sleep parameters
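
The directory examples in this README change suggest that each input file becomes one entry in the returned dictionary and one saved file. The sketch below illustrates that mapping under two assumptions that this excerpt does not confirm: keys are file stems (as `results_dict['subject1']` hints), and only `.gt3x` files are collected. `plan_directory_outputs` is a hypothetical helper, not part of wristpy.

```python
import pathlib
import tempfile
from typing import Iterator, Tuple

# Assumption: only .gt3x inputs are collected, since that is the one
# extension the README examples show.
INPUT_SUFFIX = ".gt3x"


def plan_directory_outputs(
    input_dir: pathlib.Path, output_dir: pathlib.Path, output_filetype: str
) -> Iterator[Tuple[str, pathlib.Path, pathlib.Path]]:
    """Yield (key, input file, output file) lazily, one input at a time."""
    for input_file in sorted(input_dir.iterdir()):
        if input_file.suffix != INPUT_SUFFIX:
            continue  # Skip anything that is not an accelerometer file.
        key = input_file.stem  # e.g. "subject1" for subject1.gt3x
        yield key, input_file, output_dir / f"{key}{output_filetype}"


# Demo on a throwaway directory with two recordings and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    input_dir = pathlib.Path(tmp) / "input_dir"
    input_dir.mkdir()
    for name in ("subject1.gt3x", "subject2.gt3x", "notes.txt"):
        (input_dir / name).touch()

    output_dir = pathlib.Path(tmp) / "output_dir"
    plan = {
        key: out_file.name
        for key, _, out_file in plan_directory_outputs(input_dir, output_dir, ".csv")
    }

print(plan)
```

Using a generator means nothing is processed until the caller iterates, which keeps memory use flat when a directory holds many recordings.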

docs/wristpy_tutorial.md

Lines changed: 14 additions & 0 deletions
@@ -30,6 +30,20 @@ results = orchestrator.run(
 ```
 This runs the processing pipeline with all the default arguments, creates an output `.csv` file, and will create a `results` object that contains the various output metrics (namely, enmo, angle-z, physical activity values, non-wear detection, sleep detection).
 
+The orchestrator can also process entire directories. The call to the orchestrator remains largely the same but now output is expected to be a directory and the desired filetype for the saved files **must** be specified:
+
+```python
+from wristpy.core import orchestrator
+
+results = orchestrator.run(
+    input = '/path/to/input/dir',
+    output = '/path/to/output/dir',
+    output_filetype = ".csv"
+)
+```
+
+
+
 We can visualize some of the outputs within the `results` object, directly, with the following scripts:
 
 Plot the ENMO across the entire data set:

src/wristpy/__main__.py

Lines changed: 4 additions & 9 deletions
@@ -1,16 +1,11 @@
 """Main function for wristpy."""
 
-from wristpy.core import cli, orchestrator
+from wristpy.core import cli
 
 
-def run_main() -> orchestrator.Results:
-    """Main entry point to wristpy.
-
-    Returns:
-        A Results object containing enmo, anglez, physical activity levels, nonwear
-        detection, and sleep detection.
-    """
-    return cli.main()
+def run_main() -> None:
+    """Main entry point to wristpy."""
+    cli.main()
 
 
 if __name__ == "__main__":

src/wristpy/core/cli.py

Lines changed: 17 additions & 4 deletions
@@ -3,7 +3,7 @@
 import argparse
 import logging
 import pathlib
-from typing import List, Optional
+from typing import List, Optional, Tuple, cast
 
 from wristpy.core import config, orchestrator
 
@@ -35,6 +35,15 @@ def parse_arguments(args: Optional[List[str]] = None) -> argparse.Namespace:
         help="Path where data will be saved. Supports .csv and .parquet formats.",
     )
 
+    parser.add_argument(
+        "-O",
+        "--output_filetype",
+        type=str,
+        default=None,
+        help="Format for save files when processing directories. "
+        "Leave as None when processing single files.",
+    )
+
     parser.add_argument(
         "-c",
         "--calibrator",
@@ -83,7 +92,9 @@ def parse_arguments(args: Optional[List[str]] = None) -> argparse.Namespace:
     return parser.parse_args(args)
 
 
-def main(args: Optional[List[str]] = None) -> orchestrator.Results:
+def main(
+    args: Optional[List[str]] = None,
+) -> None:
     """Runs wristpy orchestrator with command line arguments.
 
     Args:
@@ -120,10 +131,12 @@ def main(args: Optional[List[str]] = None) -> orchestrator.Results:
 
     logger.debug("Running wristpy. arguments given: %s", arguments)
 
-    return orchestrator.run(
+    orchestrator.run(
         input=arguments.input,
         output=arguments.output,
-        thresholds=tuple(arguments.thresholds),
+        thresholds=cast(Tuple[float, float, float], tuple(arguments.thresholds)),
         calibrator=None if arguments.calibrator == "none" else arguments.calibrator,
         epoch_length=None if arguments.epoch_length == 0 else arguments.epoch_length,
+        verbosity=log_level,
+        output_filetype=arguments.output_filetype,
     )
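
To see how the new `-O` / `--output_filetype` flag behaves next to `-o`, here is a stripped-down, standalone parser. It is a sketch, not the real `parse_arguments`: the `wristpy-sketch` program name and the `pathlib.Path` types on `input` and `--output` are assumptions, and every other wristpy option is omitted.

```python
import argparse
import pathlib
from typing import List, Optional


def parse_arguments(args: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse only the input, -o and -O arguments (illustrative subset)."""
    parser = argparse.ArgumentParser(prog="wristpy-sketch")
    # Assumption: paths are converted to pathlib.Path by the parser.
    parser.add_argument("input", type=pathlib.Path)
    parser.add_argument("-o", "--output", type=pathlib.Path, default=None)
    # Same flag names, type and default as the argument added in this diff.
    parser.add_argument("-O", "--output_filetype", type=str, default=None)
    return parser.parse_args(args)


# Directory invocation, mirroring the README example.
dir_args = parse_arguments(
    ["/path/to/files/input_dir", "-o", "/path/to/files/output_dir", "-O", ".csv"]
)
# Single-file invocation: -O is simply left out and stays None.
file_args = parse_arguments(
    ["/input/file/path.gt3x", "-o", "/save/path/file_name.csv"]
)

print(dir_args.output_filetype, file_args.output_filetype)
```

Because the default is `None`, the single-file command line stays unchanged, and only directory runs need the extra flag.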

src/wristpy/core/exceptions.py

Lines changed: 0 additions & 6 deletions
@@ -40,9 +40,3 @@ class InvalidFileTypeError(LoggedException):
     """Wristpy did not expect this file extension."""
 
     pass
-
-
-class DirectoryNotFoundError(LoggedException):
-    """Output save path not found."""
-
-    pass
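
Removing `DirectoryNotFoundError` fits with the `output.parent.mkdir(parents=True, exist_ok=True)` call in `save_results` (see `src/wristpy/core/models.py` in this commit): a missing output directory is created rather than reported. The snippet below shows that `pathlib` behaviour in isolation. `prepare_output` is a hypothetical name used only for this illustration.

```python
import pathlib
import tempfile


def prepare_output(output: pathlib.Path) -> pathlib.Path:
    """Create every missing parent directory of the output file path."""
    # parents=True builds intermediate directories; exist_ok=True makes
    # the call safe to repeat when the directory is already there.
    output.parent.mkdir(parents=True, exist_ok=True)
    return output


with tempfile.TemporaryDirectory() as tmp:
    target = pathlib.Path(tmp) / "new" / "nested" / "subject1.csv"
    existed_before = target.parent.exists()
    prepare_output(target)
    prepare_output(target)  # Second call does not raise.
    exists_after = target.parent.is_dir()

print(existed_before, exists_after)
```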

src/wristpy/core/models.py

Lines changed: 64 additions & 0 deletions
@@ -1,11 +1,19 @@
 """Internal data model."""
 
+import pathlib
 from typing import Optional
 
 import numpy as np
 import polars as pl
+import pydantic
 from pydantic import BaseModel, field_validator
 
+from wristpy.core import config, exceptions
+
+VALID_FILE_TYPES = (".csv", ".parquet")
+
+logger = config.get_logger()
+
 
 class Measurement(BaseModel):
     """A single measurement of a sensor and its corresponding time."""
@@ -125,3 +133,59 @@ def validate_acceleration(cls, v: Measurement) -> Measurement:
         if v.measurements.ndim != 2 or v.measurements.shape[1] != 3:
             raise ValueError("acceleration must be a 2D array with 3 columns")
         return v
+
+
+class OrchestratorResults(pydantic.BaseModel):
+    """Dataclass containing results of orchestrator.run()."""
+
+    enmo: Measurement
+    anglez: Measurement
+    physical_activity_levels: Measurement
+    nonwear_epoch: Measurement
+    sleep_windows_epoch: Measurement
+
+    def save_results(self, output: pathlib.Path) -> None:
+        """Convert to polars and save the dataframe as a csv or parquet file.
+
+        Args:
+            output: The path and file name of the data to be saved. as either a csv or
+                parquet files.
+
+        """
+        logger.debug("Saving results.")
+        self.validate_output(output=output)
+        output.parent.mkdir(parents=True, exist_ok=True)
+
+        results_dataframe = pl.DataFrame(
+            {"time": self.enmo.time}
+            | {name: value.measurements for name, value in self}
+        )
+
+        if output.suffix == ".csv":
+            results_dataframe.write_csv(output, separator=",")
+        elif output.suffix == ".parquet":
+            results_dataframe.write_parquet(output)
+        else:
+            raise exceptions.InvalidFileTypeError(
+                f"File type must be one of {VALID_FILE_TYPES}"
+            )
+
+        logger.debug("results saved in: %s", output)
+
+    @classmethod
+    def validate_output(cls, output: pathlib.Path) -> None:
+        """Validates that the output path exists and is a valid format.
+
+        Args:
+            output: the name of the file to be saved, and the directory it will
+                be saved in. Must be a .csv or .parquet file.
+
+        Raises:
+            InvalidFileTypeError:If the output file path ends with any extension other
+                than csv or parquet.
+        """
+        if output.suffix not in VALID_FILE_TYPES:
+            raise exceptions.InvalidFileTypeError(
+                f"The extension: {output.suffix} is not supported."
+                "Please save the file as .csv or .parquet",
+            )
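
Two idioms in `save_results` and `validate_output` are easy to try without polars or pydantic: the suffix check against `VALID_FILE_TYPES`, and the dict union (`|`, Python 3.9+) that puts a `time` column ahead of the metric columns. The following is a rough sketch using plain lists. `is_supported_output` and `build_columns` are hypothetical helpers, and the sample values are made up for illustration.

```python
import pathlib
from typing import Dict, List

VALID_FILE_TYPES = (".csv", ".parquet")


def is_supported_output(output: pathlib.Path) -> bool:
    """Return True when the path ends in one of the supported extensions."""
    return output.suffix in VALID_FILE_TYPES


def build_columns(
    time: List[int], metrics: Dict[str, List[float]]
) -> Dict[str, list]:
    """Merge a time column with metric columns, time first."""
    # Dict union keeps the left operand's keys first, then adds the rest.
    return {"time": time} | metrics


columns = build_columns(
    [0, 5, 10],
    {"enmo": [0.01, 0.02, 0.03], "anglez": [12.0, 13.5, 11.2]},
)

print(list(columns))
print(is_supported_output(pathlib.Path("out/subject1.csv")))
print(is_supported_output(pathlib.Path("out/subject1.txt")))
```

In the committed code the right-hand dict comes from iterating the pydantic model, which yields `(field_name, value)` pairs, so each field becomes one column.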
