Add Graded RBP metric class #1032
base: main
Changes from 5 commits
bbbb33f
a155e40
314437e
cc78e47
9544ad7
11b733a
```diff
@@ -36,6 +36,27 @@ def rank_biased_precision(
     return rbp / normalization


+def graded_rank_biased_precision(
+    relevance: np.ndarray, weights: np.ndarray, normalization: float = 1.0
+) -> float:
+    """
+    Compute graded rank-biased precision.
+
+    Args:
+        relevance:
+            Float array of relevance/grade scores at each position
+        weights:
+            Weight for each item position (same length as relevance)
+        normalization:
+            Optional normalization factor, defaults to 1.0
+
+    Returns:
+        Graded RBP score
+    """
+    score = np.sum(weights * relevance).item()
+    return score / normalization
+
+
 class RBP(ListMetric, RankingMetricBase):
     """
     Evaluate recommendations with rank-biased precision :cite:p:`rbp`.

@@ -63,6 +84,9 @@ class RBP(ListMetric, RankingMetricBase):
     in the paper; however, RBP with high patience should be no worse than nDCG
     (and perhaps even better) in this regard.

+    This metric class supports relevance grades :math:`r_{ui} \\in [0, 1]`
+    via an optional ``grade_field``.
+
     In recommender evaluation, we usually have a small test set, so the maximum
     achievable RBP is significantly less than the theoretical maximum, and is a
     function of the number of test items. With ``normalize=True``, the RBP

@@ -99,6 +123,8 @@ class RBP(ListMetric, RankingMetricBase):
     patience: float
     normalize: bool
     weight_field: str | None
+    grade_field: str | None
+    unknown_grade: float

     def __init__(
         self,

@@ -109,6 +135,8 @@ def __init__(
         patience: float = 0.85,
         normalize: bool = False,
         weight_field: str | None = None,
+        grade_field: str | None = None,
+        unknown_grade: float = 0.25,
     ):
         super().__init__(n, k=k)
         self.patience = patience
```
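For reference, the helper added in the first hunk is essentially a weighted dot product over graded relevance. A standalone sketch (the grade values below are hypothetical; the position weights follow the usual RBP form `(1 - p) * p**i` for patience `p`):

```python
import numpy as np


def graded_rank_biased_precision(
    relevance: np.ndarray, weights: np.ndarray, normalization: float = 1.0
) -> float:
    # Weighted sum of graded relevance, optionally normalized.
    score = np.sum(weights * relevance).item()
    return score / normalization


# RBP position weights for patience p: (1 - p) * p**i at 0-indexed rank i.
p = 0.85
weights = (1 - p) * p ** np.arange(5)
grades = np.array([1.0, 0.5, 0.0, 0.25, 1.0])  # hypothetical relevance grades
score = graded_rank_biased_precision(grades, weights)
```

With binary grades (0/1) and these weights, this reduces to the existing `rank_biased_precision` computation.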
```diff
@@ -117,13 +145,16 @@ def __init__(
         self.weight = weight
         self.normalize = normalize
         self.weight_field = weight_field
+        self.grade_field = grade_field
+        self.unknown_grade = unknown_grade

     @property
     def label(self):
+        base = "RBP" if self.grade_field is None else "GradedRBP"
         if self.n is not None:
-            return f"RBP@{self.n}"
+            return f"{base}@{self.n}"
         else:
-            return "RBP"
+            return base
```
Member (on lines 152 to +157): Good.
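The revised `label` logic can be exercised in isolation with a minimal stand-in class (not the real metric class, just the property from the diff):

```python
class LabelDemo:
    """Minimal stand-in reproducing only the diff's label logic."""

    def __init__(self, n=None, grade_field=None):
        self.n = n
        self.grade_field = grade_field

    @property
    def label(self):
        # Graded mode changes the base name; n adds an @n suffix.
        base = "RBP" if self.grade_field is None else "GradedRBP"
        if self.n is not None:
            return f"{base}@{self.n}"
        else:
            return base


print(LabelDemo(n=10, grade_field="rating").label)  # GradedRBP@10
print(LabelDemo().label)  # RBP
```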
```diff
     @override
     def measure_list(self, recs: ItemList, test: ItemList) -> float:

@@ -134,8 +165,6 @@ def measure_list(self, recs: ItemList, test: ItemList) -> float:
         if nrel == 0:
             return np.nan

-        good = recs.isin(test)
-
         if self.weight_field is not None:
             # use custom weights from field
             weights = recs.field(self.weight_field)

@@ -158,4 +187,17 @@ def measure_list(self, recs: ItemList, test: ItemList) -> float:
         else:
             normalization = np.sum(weights).item()

-        return rank_biased_precision(good, weights, normalization)
+        # Binary relevance
+        if self.grade_field is None:
+            good = recs.isin(test)
+            return rank_biased_precision(good, weights, normalization)
+
+        # Graded relevance
+        if self.grade_field not in test._fields:
+            raise ValueError(f"Grade field '{self.grade_field}' not found in test ItemList")
+
+        grades = test.field(self.grade_field)
```
Member (on lines +196 to +199): We usually should not poke around inside other classes' private data —
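One way to honor that review note is to rely on a public accessor rather than `test._fields`. A toy sketch with a fake stand-in class; whether the real `ItemList.field()` returns `None` for a missing field is an assumption here, not verified against the actual API:

```python
import numpy as np


class FakeItemList:
    """Hypothetical stand-in for ItemList, for illustration only."""

    def __init__(self, **fields):
        self._data = fields

    def field(self, name):
        # Public accessor: returns None when the field is absent.
        return self._data.get(name)


test = FakeItemList(grade=np.array([1.0, 0.5]))
grades = test.field("grade")
if grades is None:  # replaces the `name not in test._fields` check
    raise ValueError("grade field not found in test ItemList")
```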
```diff
+        grade_map = dict(zip(test.ids(), grades))
+        relevance = np.array([grade_map.get(item, self.unknown_grade) for item in recs.ids()])
```
Contributor: Missing test items get `unknown_grade` with a default of 0.25, and the tests also check for that same default. Was that intentional? It doesn't seem right to me.

Author (Contributor): Good point. It probably makes sense to split this into two tests: one that verifies the default behavior, and another that verifies the parameter is actually being used.
```diff

+        return graded_rank_biased_precision(relevance, weights, normalization)
```
Comment: need to add `grade_field` to the `Args:` documentation.