[BUG] geometric_mean_score with average='macro' #1096

Open
@vocarvalho

Description

Hello, and first of all thank you for the package. I believe there is an error in the computation of geometric_mean_score when the average='macro' option is used. When the problem is multiclass it works correctly (see below).

##################################################
#multiclass
import numpy as np
from imblearn.metrics import geometric_mean_score
from sklearn.metrics import recall_score

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 1, 2]

#for each label
print('-----------------')
print('correct: ',geometric_mean_score(y_true, y_pred, average=None))

#macro
print('-----------------')
vet = geometric_mean_score(y_true, y_pred, average=None)
print('correct: ',np.mean(vet))
print('correct: ',geometric_mean_score(y_true, y_pred, average='macro'))

#Answers
#-----------------
#correct:  [1.         0.61237244 0.61237244]
#-----------------
#correct:  0.7415816237971963
#correct:  0.7453559924999299
##################################################

However, when the problem is binary, it works incorrectly (see below). What I think it is doing is computing the g-mean from the macro-averaged TPR and TNR (see the example in the code).

##################################################
#binary
import numpy as np
from imblearn.metrics import geometric_mean_score
from sklearn.metrics import recall_score

y_true = [0, 0, 1, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1]

#for each label
print('-----------------')
print('correct: ',geometric_mean_score(y_true, y_pred, average=None))

#macro: wrong, it should be the average of the scores, i.e., np.mean(vet)
print('-----------------')
vet = geometric_mean_score(y_true, y_pred, average=None)
print('correct: ',np.mean(vet))
#wrong
print('incorrect: ',geometric_mean_score(y_true, y_pred, average='macro'))

#what I think it is doing: computing the g-mean from the macro-averaged TPR and TNR
print('-----------------')
#class 0 as the interesting class
TPR_0 = sensitivity = recall_score(y_true, y_pred, average='binary', pos_label=0)
TNR_1 = specificity = recall_score(y_true, y_pred, average='binary', pos_label=1)

#class 1 as the interesting class
TPR_1 = sensitivity = recall_score(y_true, y_pred, average='binary', pos_label=1)
TNR_0 = specificity = recall_score(y_true, y_pred, average='binary', pos_label=0)

macro_TPR = (TPR_0 + TPR_1)/2
macro_TNR = (TNR_0 + TNR_1)/2
print(macro_TPR)
print(macro_TNR)

gmean = np.sqrt(macro_TPR * macro_TNR)
print('incorrect: ',gmean)

#Answers
#-----------------
#correct:  [0.57735027 0.57735027]
#-----------------
#correct:  0.5773502691896257
#incorrect:  0.6666666666666666
#-----------------
#0.6666666666666666
#0.6666666666666666
#incorrect:  0.6666666666666666
##################################################
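In the meantime, here is a minimal workaround sketch, assuming that what average='macro' should return is the unweighted mean of the per-class scores given by average=None (the helper name gmean_macro is only illustrative):

##################################################
#workaround sketch
import numpy as np
from imblearn.metrics import geometric_mean_score

def gmean_macro(y_true, y_pred):
    #unweighted mean of the per-class geometric mean scores
    per_class = geometric_mean_score(y_true, y_pred, average=None)
    return np.mean(per_class)

y_true = [0, 0, 1, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1]
print(gmean_macro(y_true, y_pred))  #0.5773502691896257
##################################################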

I would like to hear back on whether my reasoning is correct. I hope I have helped in some way. Regards.
