
Description
Dear developers,
After studying your work, I have the following questions:
- Why do we need to convert the square loss into an SPP (min-max) problem? What problems would arise if we directly minimized
$E[(f(x_i)-a)^2] + E[(f(x_j)-b)^2] + (m+b-a)^2$, i.e.:
```python
loss = self.mean((y_pred - self.a)**2 * pos_mask) \
     + self.mean((y_pred - self.b)**2 * neg_mask) \
     + (self.margin + self.mean(y_pred*neg_mask) - self.mean(y_pred*pos_mask))**2
```
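My understanding of why I would expect the two forms to be equivalent (please correct me if I am missing something, e.g. a constraint $\alpha \ge 0$): maximizing the SPP objective over $\alpha$ in closed form recovers the squared term,

$$\max_{\alpha}\ \big\{\, 2\alpha\,(m + b - a) - \alpha^2 \,\big\} = (m + b - a)^2, \quad \text{attained at } \alpha^{*} = m + b - a.$$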
- I noticed that in your code, $a$ in $E[(f(x_i)-a)^2]$ and $b$ in $E[(f(x_j)-b)^2]$ directly use `self.a` and `self.b`, while $a$ and $b$ in $(m+b-a)$ use the sample means (`self.mean(y_pred*neg_mask) - self.mean(y_pred*pos_mask)`). I would like to know the reason for that:
```python
loss = self.mean((y_pred - self.a)**2 * pos_mask) \
     + self.mean((y_pred - self.b)**2 * neg_mask) \
     + 2*self.alpha*(self.margin + self.mean(y_pred*neg_mask) - self.mean(y_pred*pos_mask)) \
     - self.alpha**2
```
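For context, here is how I read that loss as a self-contained module. This is my own minimal sketch, not your implementation: I assume labels in {0, 1}, treat `self.mean` as a plain batch mean, and make `a`, `b`, `alpha` learnable scalars, which may differ from what LibAUC actually does.

```python
import torch


class AUCMLossSketch(torch.nn.Module):
    """My reading of the SPP-form loss above (a sketch; assumptions noted in the issue text)."""

    def __init__(self, margin: float = 1.0):
        super().__init__()
        self.margin = margin
        self.a = torch.nn.Parameter(torch.zeros(1))      # learnable, stands in for E[f(x_i)]
        self.b = torch.nn.Parameter(torch.zeros(1))      # learnable, stands in for E[f(x_j)]
        self.alpha = torch.nn.Parameter(torch.zeros(1))  # dual variable of the min-max (SPP) form

    @staticmethod
    def mean(x):
        # Assumption: plain mean over the batch; the library may normalize differently.
        return torch.mean(x)

    def forward(self, y_pred, y_true):
        pos_mask = (y_true == 1).float()
        neg_mask = (y_true == 0).float()
        loss = self.mean((y_pred - self.a) ** 2 * pos_mask) \
             + self.mean((y_pred - self.b) ** 2 * neg_mask) \
             + 2 * self.alpha * (self.margin
                                 + self.mean(y_pred * neg_mask)
                                 - self.mean(y_pred * pos_mask)) \
             - self.alpha ** 2
        return loss
```

The asymmetry I am asking about is visible here: the first two terms use the learnable `self.a`/`self.b`, while the term multiplied by `self.alpha` uses the batch sample means.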
- Can I regard the design of the margin loss as transforming the square loss $(m+f(x_j)-f(x_i))^2$ into $\max[0,\ m+f(x_j)-f(x_i)]^2$, so that $m+f(x_j)$ is allowed to be less than or equal to $f(x_i)$, whereas the square loss only pushes it to be equal to $f(x_i)$? When this term is 0, is there a potential problem that the gradient cannot be updated? (A tiny check of what I mean is below.)
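Here is the tiny check I am referring to (my own snippet, relying only on standard PyTorch autograd): once the hinged term is inactive, its gradient is zero, so this term alone no longer updates the scores.

```python
import torch

# Case m + f(x_j) - f(x_i) < 0: the hinged square term evaluates to 0.
diff = torch.tensor(-0.5, requires_grad=True)
loss = torch.clamp(diff, min=0) ** 2
loss.backward()
print(loss.item(), diff.grad.item())  # 0.0 0.0 -> no gradient flows from this term
```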
- In the demo you provided, the AUC test score of AUCM based on PESG on CIFAR10 can reach 0.9245, while the value quoted in your paper *Large-scale Robust Deep AUC Maximization* is 0.715±0.008. Is this because the code or experimental settings have been updated since the paper?
I would be grateful if you could reply as soon as possible. I wish you a happy new year.