
Some questions about the AUCM loss #67

Closed
@ghost

Description

Dear developers,

After studying your work, I have the following questions:

  1. Why do we need to convert the square loss into an SPP (saddle-point) problem? What problems arise if we directly minimize $E[(f(x_i)-a)^2]+E[(f(x_j)-b)^2]+(m+b-a)^2$, as in the code below? (For the equivalence I have in mind, see the identity after this list.)

```python
loss = self.mean((y_pred - self.a)**2 * pos_mask) \
     + self.mean((y_pred - self.b)**2 * neg_mask) \
     + (self.margin + self.mean(y_pred * neg_mask)
        - self.mean(y_pred * pos_mask))**2
```

  2. I noticed that in your code, the $a$ in $E[(f(x_i)-a)^2]$ and the $b$ in $E[(f(x_j)-b)^2]$ use the variables self.a and self.b directly, while the $a$ and $b$ in $(m+b-a)$ use the sample means self.mean(y_pred*pos_mask) and self.mean(y_pred*neg_mask). I would like to know the reason for this. (A minimal sketch of my reading follows the list.)

```python
loss = self.mean((y_pred - self.a)**2 * pos_mask) \
     + self.mean((y_pred - self.b)**2 * neg_mask) \
     + 2*self.alpha*(self.margin + self.mean(y_pred * neg_mask)
                     - self.mean(y_pred * pos_mask)) \
     - self.alpha**2
```

  3. Can I view the design of the margin loss as transforming the square loss $(m+f(x_j)-f(x_i))^2$ into $\max(0,\, m+f(x_j)-f(x_i))^2$, which allows $m+f(x_j)$ to be less than or equal to $f(x_i)$, whereas the square loss drives them to be exactly equal? When the loss value is 0, is there a potential problem that the gradient cannot update the model? (A toy check follows the list.)

  4. In the demo you provided, the test AUC of the AUCM loss optimized with PESG on CIFAR-10 reaches 0.9245, while the value reported in your paper "Large-scale Robust Deep AUC Maximization" is 0.715±0.008. Is this because something has been updated?
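
For question 1, the equivalence I have in mind is the standard conjugate identity (writing $c = m+b-a$ for short):

$$\max_{\alpha \in \mathbb{R}} \left\{ 2\alpha c - \alpha^2 \right\} = c^2, \qquad \alpha^* = c,$$

so the SPP (min-max) objective and the direct square objective share the same optimal value; my question is what goes wrong in optimization when the reformulation is skipped.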
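
For question 2, here is a minimal, self-contained sketch of how I read the computation; `masked_mean`, `pos_mask`/`neg_mask`, and the toy data are my own stand-ins, not the library's API:

```python
import torch

def masked_mean(t, mask):
    # Mean over the entries selected by mask (my stand-in for self.mean).
    return (t * mask).sum() / mask.sum().clamp(min=1)

torch.manual_seed(0)
y_pred = torch.rand(8, requires_grad=True)               # scores f(x)
y_true = torch.tensor([1., 0., 1., 0., 0., 1., 0., 1.])  # labels
pos_mask, neg_mask = y_true, 1.0 - y_true

a = torch.zeros((), requires_grad=True)      # variable in E[(f(x_i)-a)^2]
b = torch.zeros((), requires_grad=True)      # variable in E[(f(x_j)-b)^2]
alpha = torch.zeros((), requires_grad=True)  # dual variable of the SPP
margin = 1.0

# The first two terms use the variables a and b; the 2*alpha*(...) term
# uses the batch sample means of the scores instead.
loss = masked_mean((y_pred - a)**2, pos_mask) \
     + masked_mean((y_pred - b)**2, neg_mask) \
     + 2*alpha*(margin + masked_mean(y_pred, neg_mask)
                - masked_mean(y_pred, pos_mask)) \
     - alpha**2
loss.backward()  # descent on (w, a, b), ascent on alpha in PESG
```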
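
For question 3, a toy check of the zero-gradient concern (the scores and margin below are made up):

```python
import torch

# Once m + f(x_j) - f(x_i) <= 0, the squared hinge max(0, .)^2 is flat,
# so no gradient flows back to either score.
f_i = torch.tensor(2.0, requires_grad=True)  # score of a positive example
f_j = torch.tensor(0.5, requires_grad=True)  # score of a negative example
m = 1.0

loss = torch.clamp(m + f_j - f_i, min=0.0) ** 2
loss.backward()
print(loss.item(), f_i.grad.item(), f_j.grad.item())  # loss and both grads are zero
```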

I would be grateful if you could reply as soon as possible. Wishing you a happy new year.
