Dear author,
Thank you for your great work.
Some questions came up while I was reproducing the official code.
From what I understand, the key idea of L2P is to freeze a well-pretrained backbone (ViT) and train only a small set of prompts, which achieves impressive performance.
However, in the config for the domain-incremental setting on CORe50, the freeze field is an empty list.
When reproduced without any config modification in my environment, I got results (77.91%) similar to the paper.
Judging by these results, it seems the paper's reported numbers come from fully fine-tuning the backbone rather than freezing it.
1. Why didn't you freeze the backbone in the domain-incremental setting?
2. Was it written in the paper? I also read the supplementary and didn't see anything about it.
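To make the question concrete, here is a minimal pure-Python sketch of how I understand a freeze config list to select trainable parameters (the parameter names below are illustrative, not the repo's actual module names):

```python
# Sketch: a freeze list selects which parameter groups stay fixed.
# Names are hypothetical stand-ins for a ViT backbone, the L2P prompt
# pool, and the classifier head.
params = {
    "backbone.blocks.0.attn": "pretrained",
    "backbone.blocks.0.mlp": "pretrained",
    "prompt_pool": "learned",
    "head": "learned",
}

freeze = ["backbone"]   # what I expected for L2P
# freeze = []           # what the CORe50 config actually contains

# Anything whose name starts with a frozen prefix is excluded from training.
trainable = [name for name in params
             if not any(name.startswith(f) for f in freeze)]
print(trainable)  # ['prompt_pool', 'head']
```

With freeze = [] every group, including the backbone, would be trainable, i.e. full fine-tuning.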
Trivial question.
Only 99% of the samples of the entire CORe50 dataset are used, because subsample_rate is -1 in this part (test, train).
3. Is this the intended implementation?
And about positional embedding,
Before the release of the code version integrated with DualPrompt, the positional embedding was also added to prompts in L2P.
However, in the version of code that is integrated with DualPrompt, the positional embedding is no longer added to prompts (only added to image tokens) in L2P.
I think the positional embedding has a large impact on performance.
4. Which one is correct?
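To illustrate the two variants I am asking about, here is a NumPy sketch (shapes are illustrative; the prompt positional embedding in Variant A is a hypothetical stand-in for however the pre-DualPrompt code produced it):

```python
import numpy as np

# Shapes: B=2 images, N=197 tokens (1 CLS + 196 patches),
# P=5 prompt tokens, D=768 dims.
B, N, P, D = 2, 197, 5, 768
tokens = np.zeros((B, N, D))        # [CLS, image tokens] after patch embed
prompts = np.zeros((B, P, D))
pos_embed = np.random.randn(1, N, D)

# Variant A (pre-DualPrompt code, as I understand it): prompts also
# receive a positional embedding before being prepended.
pos_for_prompts = np.random.randn(1, P, D)  # hypothetical prompt positions
x_a = np.concatenate([prompts + pos_for_prompts, tokens + pos_embed], axis=1)

# Variant B (DualPrompt-integrated code, as I understand it): pos_embed
# is added to CLS/image tokens only; prompts are prepended without it.
x_b = np.concatenate([prompts, tokens + pos_embed], axis=1)

print(x_a.shape, x_b.shape)  # (2, 202, 768) (2, 202, 768)
```

The shapes match, but in Variant B the prompt tokens carry no positional signal at all, which is why I expect the two variants to behave differently.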
Additionally, when using L2P in the code integrated with DualPrompt,
the encoder input is ordered as [Prompts, CLS, Image tokens].
But in the code before the integration with DualPrompt, it is [CLS, Prompts, Image tokens].
5. Which one is correct?
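For clarity, the two orderings being compared are, in a NumPy sketch (shapes illustrative):

```python
import numpy as np

B, P, D = 2, 5, 768
cls_tok = np.zeros((B, 1, D))
prompts = np.ones((B, P, D))     # marked with 1s to show placement
img_tok = np.zeros((B, 196, D))

# DualPrompt-integrated code, as described above:
x_new = np.concatenate([prompts, cls_tok, img_tok], axis=1)
# Pre-integration L2P code, as described above:
x_old = np.concatenate([cls_tok, prompts, img_tok], axis=1)

# Same set of tokens, different positions; combined with positional
# embeddings or any position-dependent readout, this can change behavior.
print(x_new.shape == x_old.shape)  # True
```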
Please let me know if there is anything I missed.
Best,
Jaeho Lee.