@luisheb luisheb commented May 14, 2025

Describe the proposed changes

Product metric for mixed data

In applications involving heterogeneous data, such as combinations of scalar values and functional observations, it is often useful to define a metric that captures distances across all components in a consistent way. This pull request adds a class implementing a weighted product metric that combines multiple data types, each potentially requiring a different metric.

This class generalizes the idea of computing a norm over a product space, where each component may be associated with a different scale or importance. It supports common functional data representations (FDataGrid, FDataBasis), numeric arrays, and pandas.DataFrame objects that mix them.

Classes and Functions

The metric is defined as a weighted $L^p$ norm of component-wise distances. Each component may be assigned its own metric and weight, allowing fine-grained control over the overall distance calculation. This is especially useful in machine learning tasks that involve mixed input types, such as classification or clustering over functional and scalar features.
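To illustrate the idea, here is a minimal NumPy-only sketch, not the implementation in this pull request. It assumes the weights multiply the $p$-th powers of the component distances, i.e. $d(x, y) = \left(\sum_i w_i \, d_i(x_i, y_i)^p\right)^{1/p}$; the class added here may place the weights differently. All function names below (`abs_distance`, `make_l2_curve_distance`, `weighted_product_distance`) are hypothetical and are not part of the package API.

```python
import numpy as np


def abs_distance(a, b):
    """Distance between two scalars (hypothetical helper)."""
    return abs(float(a) - float(b))


def make_l2_curve_distance(grid):
    """Build an L2 distance for curves sampled on a shared grid (hypothetical helper)."""
    grid = np.asarray(grid, dtype=float)

    def distance(f, g):
        sq = (np.asarray(f, dtype=float) - np.asarray(g, dtype=float)) ** 2
        # Trapezoidal rule for the integral of the squared difference.
        integral = np.sum((sq[1:] + sq[:-1]) * np.diff(grid)) / 2.0
        return float(np.sqrt(integral))

    return distance


def weighted_product_distance(x, y, metrics, weights=None, p=2.0):
    """Weighted L^p combination of component-wise distances.

    Assumed convention: (sum_i w_i * d_i(x_i, y_i)**p) ** (1 / p),
    with the maximum of w_i * d_i used when p is infinite.
    """
    if not (len(x) == len(y) == len(metrics)):
        raise ValueError("x, y and metrics must have the same length")
    if p < 1:
        raise ValueError("p must be at least 1")

    if weights is None:
        weights = np.ones(len(metrics))
    weights = np.asarray(weights, dtype=float)
    if weights.shape != (len(metrics),):
        raise ValueError("weights must have one entry per component")
    if np.any(weights < 0):
        raise ValueError("weights must be non-negative")

    # One distance per component, each computed with its own metric.
    dists = np.array(
        [metric(a, b) for metric, a, b in zip(metrics, x, y)],
        dtype=float,
    )

    if np.isinf(p):
        return float(np.max(weights * dists))
    return float(np.sum(weights * dists**p) ** (1.0 / p))


# Example: one scalar component and one curve component.
grid = np.linspace(0.0, 1.0, 101)
metrics = [abs_distance, make_l2_curve_distance(grid)]
x = (1.0, np.zeros_like(grid))
y = (4.0, np.full_like(grid, 4.0))
d = weighted_product_distance(x, y, metrics)
```

In this example the component distances are 3 (scalars) and approximately 4 (curves), so with unit weights and $p = 2$ the combined distance is close to 5. Setting a weight to zero removes that component from the result.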

Functional wrappers for ease of use are also provided, along with a default metric that infers the appropriate behavior depending on the input type.
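One way such type-based inference could work is single dispatch on the type of the first argument. The sketch below is only an assumption about the general approach, not the code in this pull request. It covers plain numbers and NumPy arrays and leaves out functional data types. The name `default_distance` is hypothetical.

```python
from functools import singledispatch

import numpy as np


@singledispatch
def default_distance(a, b):
    """Fallback for types without a registered default metric."""
    raise TypeError(f"No default metric for {type(a).__name__}")


@default_distance.register(float)
@default_distance.register(int)
def _(a, b):
    # Scalars: absolute difference.
    return abs(float(a) - float(b))


@default_distance.register(np.ndarray)
def _(a, b):
    # Numeric arrays: Euclidean distance.
    return float(np.linalg.norm(a - b))
```

A caller can then use `default_distance` for any component without naming a metric, and support for a new type is added by registering one more function.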

Checklist before requesting a review

  • I have performed a self-review of my code
  • The code conforms to the style used in this package
  • The code is fully documented and typed (type-checked with Mypy)
  • I have added thorough tests for the new/changed functionality

@luisheb luisheb changed the title Feature/product metric Product metric May 14, 2025
@luisheb luisheb marked this pull request as ready for review June 18, 2025 23:49