Hello, I have some questions about the article:
1. In equation 6, $y_t^i$ may be zero, so the loss will be infinite. How can this be solved?
2. During training, each input image (for example, a dog image) comes with its tag: an embedding is generated from the dog tag, and the parameters of the Prediction Network are generated from this embedding. At test time, however, we input an image without knowing its tag, so how are the parameters of the Prediction Network generated?
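
To make the first question concrete, here is a minimal sketch. It assumes the problematic term in equation 6 is a logarithm of $y_t^i$, which is only my reading and may not match the article. `EPS`, `naive_log_term`, and `clamped_log_term` are hypothetical names, and clamping is just one common workaround, not necessarily what the authors do.

```python
import math

# Hypothetical clamp value; not taken from the article.
EPS = 1e-12


def naive_log_term(y):
    """Assumed form of the problematic term: -log(y).

    math.log(0.0) raises ValueError in Python, so the zero case is
    mapped to infinity explicitly to mirror the behaviour described above.
    """
    if y == 0.0:
        return math.inf
    return -math.log(y)


def clamped_log_term(y, eps=EPS):
    """Same term, but y is clamped away from zero before the logarithm."""
    return -math.log(max(y, eps))


if __name__ == "__main__":
    print(naive_log_term(0.0))    # infinite
    print(clamped_log_term(0.0))  # large but finite
```

With the clamp, values of $y_t^i$ above `EPS` give the same result as before, and only exact or near zeros are affected. Is something like this done in the implementation?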