Hello,
Thank you for always replying in detail and helping me with my research!
I have another question about the results. On OpenReview, I saw that you experimented with 1-p(unsure) instead of the p(yes) ratio. Are the results in the OpenReview general response the average performance across the models? If so, would you mind sharing the InternLM-8b debiasing results using 1-p(unsure)?
Thank you so much!