Merged
Changes from 2 commits
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -11,7 +11,8 @@ requires-python = ">=3.12"
dependencies = [
    "pandas>=2.2.3",
    "pydantic>=2.11.1",
-   "scipy>=1.15.2"
+   "scipy>=1.15.2",
+   "shapely>=2.1.0"
]

[dependency-groups]
36 changes: 36 additions & 0 deletions src/graphomotor/features/drawing_error.py
@@ -0,0 +1,36 @@
"""Feature extraction module for drawing error-based metrics in spiral drawing data."""

import numpy as np
from shapely import geometry, ops
Collaborator

Check Shapely's license and how it affects using their package under our license. I'm not sure what the BSD-3-Clause license entails.

Collaborator Author

BSD-3-Clause looks pretty permissive, so I don't think there would be any issues:

BSD 3-Clause License

Copyright (c) 2007, Sean C. Gillies. 2019, Casper van der Wel. 2007-2022, Shapely Contributors.
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice, this
    list of conditions and the following disclaimer.

  2. Redistributions in binary form must reproduce the above copyright notice,
    this list of conditions and the following disclaimer in the documentation
    and/or other materials provided with the distribution.

  3. Neither the name of the copyright holder nor the names of its
    contributors may be used to endorse or promote products derived from
    this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


from graphomotor.core import models


def calculate_area_under_curve(
    drawn_spiral: models.Spiral, reference_spiral: np.ndarray
) -> dict:
    """Calculate the area between drawn and reference spirals.

    This function measures the deviation between drawn and reference spirals by
    computing the enclosed area between them using the shapely library. Lower values
    indicate better adherence to the template. The algorithm works by creating polygons
    that connect spiral endpoints, finding intersections between lines, and calculating
    the total area of the resulting polygons.

    Args:
        drawn_spiral: The spiral drawn by the subject.
        reference_spiral: The reference spiral.

    Returns:
        Dictionary containing the area under curve metric
    """
    spiral = drawn_spiral.data[["x", "y"]].values
    line_drawn = geometry.LineString(spiral)
    line_reference = geometry.LineString(reference_spiral)
    first_segment = geometry.LineString([spiral[0], reference_spiral[0]])
    last_segment = geometry.LineString([spiral[-1], reference_spiral[-1]])
    merged_line = ops.unary_union(
        [line_drawn, line_reference, first_segment, last_segment]
    )
    polygons = list(ops.polygonize(merged_line))
    return {"area_under_curve": sum(p.area for p in polygons)}
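To make the polygonize approach in the docstring concrete, here is a minimal sketch on a toy case where the answer can be checked by hand. The helper name `area_between` and the coordinates are hypothetical and not part of this PR; the sketch only assumes the same `shapely.geometry` and `shapely.ops` calls used above.

```python
from shapely import geometry, ops


def area_between(
    drawn: list[tuple[float, float]], reference: list[tuple[float, float]]
) -> float:
    """Sum the areas enclosed between two polylines (hypothetical helper)."""
    line_drawn = geometry.LineString(drawn)
    line_reference = geometry.LineString(reference)
    # Close the shape by joining the two start points and the two end points.
    first_segment = geometry.LineString([drawn[0], reference[0]])
    last_segment = geometry.LineString([drawn[-1], reference[-1]])
    # unary_union splits the lines at every intersection, so polygonize
    # can assemble the resulting pieces into closed rings.
    merged = ops.unary_union(
        [line_drawn, line_reference, first_segment, last_segment]
    )
    return sum(polygon.area for polygon in ops.polygonize(merged))


# The drawn line crosses the reference at (1, 0), leaving two triangles
# of area 0.5 each on opposite sides of the reference.
crossing_area = area_between([(0, 1), (2, -1)], [(0, 0), (2, 0)])

# Two parallel lines enclose a 2 x 1 rectangle.
parallel_area = area_between([(0, 1), (2, 1)], [(0, 0), (2, 0)])
```

Because the areas of all polygons are summed as unsigned values, regions on either side of the reference add up rather than cancel, which is the behavior an error metric needs.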
34 changes: 34 additions & 0 deletions tests/unit/test_drawing_error.py
@@ -0,0 +1,34 @@
"""Test cases for drawing_error.py functions."""

import numpy as np
import pandas as pd
import pytest
from scipy import integrate

from graphomotor.core import models
from graphomotor.features import drawing_error


def test_calculate_area_under_curve(monkeypatch: pytest.MonkeyPatch) -> None:
    """Test that the area under the curve is calculated correctly."""
    x = np.linspace(-np.pi / 2, 6 * np.pi / 4, 100)
Collaborator

Why such complicated x-axis data points? If you just use 0 to 2π, I'm pretty sure the AUC of the difference is a nice, easy calculation (with an exact value).

Collaborator Author

[image]

I basically wanted to test that the shapely method can handle intersecting lines. The area is equal to 8, but I wanted to confirm that with the integrate function.
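For reference, the value of 8 can also be worked out by hand. This short derivation uses only the two curves and the interval from the test, with $6\pi/4 = 3\pi/2$, so the interval spans one full period of the sine:

```latex
\begin{aligned}
\int_{-\pi/2}^{3\pi/2} \bigl|\sin x - \sin(x + \pi)\bigr| \, dx
  &= \int_{-\pi/2}^{3\pi/2} \bigl|\sin x + \sin x\bigr| \, dx
     && \text{since } \sin(x + \pi) = -\sin x \\
  &= 2 \int_{-\pi/2}^{3\pi/2} |\sin x| \, dx \\
  &= 2 \cdot 4 = 8
     && \text{since } \int |\sin x| \, dx = 4 \text{ over any full period.}
\end{aligned}
```

The curves cross at $x = 0$ and $x = \pi$ inside the interval, so the enclosed region splits into several pieces whose unsigned areas add up to this total.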

    y1 = np.sin(x)
    y2 = np.sin(x + np.pi)

    expected_area, _ = integrate.quad(
Collaborator

This raises the question: if we can use integration to find the area under the curve here, why do we need the shapely method?

Collaborator Author (@alperkent, Apr 22, 2025)

We can use integration for this test, which uses only sine waves, but the spiral data is a lot trickier. My initial goal was to convert the Cartesian coordinates to polar and just use np.trapz, but there are issues with the polar conversion in the beginning segment of the data and with resampling the reference spiral at the same angles as the drawn spiral, so I decided not to do this. I will explain more during our meeting tomorrow.
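The comment does not spell out what the polar-conversion issues were, so the following is only an illustration of one generic difficulty, not a reconstruction of the author's analysis. It assumes a clean, hypothetical Archimedean spiral with three turns and shows that `np.arctan2` loses the turn count, which then has to be recovered with `np.unwrap`.

```python
import numpy as np

# Hypothetical ideal spiral r = theta with three full turns (not PR data).
theta_true = np.linspace(0.0, 6.0 * np.pi, 600)
x = theta_true * np.cos(theta_true)
y = theta_true * np.sin(theta_true)

# arctan2 only returns angles in [-pi, pi], so every turn of the spiral
# collapses onto the same range and the angle is no longer monotonic.
theta_wrapped = np.arctan2(y, x)

# np.unwrap restores a continuous angle, but only while consecutive
# samples differ by less than pi.
theta_unwrapped = np.unwrap(theta_wrapped)
```

On this clean curve the unwrapped angle matches the true one. Unwrapping relies on small angular steps between consecutive samples, and close to the origin a small positional change corresponds to a large change in angle, which is one plausible reason the first segment of a hand-drawn spiral would be harder to convert.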

        lambda x: np.abs(np.sin(x) - np.sin(x + np.pi)), -np.pi / 2, 6 * np.pi / 4
    )

    class MockSpiral:
Collaborator

A unit test shouldn't define a dedicated class.

Collaborator Author

I actually couldn't find a better way of doing this test, so I left it like this hoping you might suggest a better way lol

        def __init__(self, data: pd.DataFrame, metadata: dict) -> None:
            self.data = data
            self.metadata = metadata

    monkeypatch.setattr(models, "Spiral", MockSpiral)
Collaborator

No part of this mocking is needed?

Collaborator Author (@alperkent, Apr 22, 2025)

I want the feature extraction functions to take the spiral object rather than only the needed columns of the dataframe, because I think that makes it easier to call all of them with the same inputs in the orchestrator. I did not want to pass the whole dataframe instead of the spiral object because some features that we extract later might need metadata such as task name or handedness. We could also decide to compute a feature like task duration first and, instead of calculating it separately for every other feature that uses it, just add it to the spiral object.
Should I just use the valid spiral fixture and change its data to this sine wave data? Instantiating a spiral object without the proper columns and metadata just throws errors.


    calculated_area = drawing_error.calculate_area_under_curve(
        models.Spiral(data=pd.DataFrame({"x": x, "y": y1}), metadata={}),
        np.array([x, y2]).T,
    )["area_under_curve"]

    assert np.isclose(calculated_area, expected_area, rtol=1e-3)
37 changes: 37 additions & 0 deletions uv.lock

Some generated files are not rendered by default.