
Label Ranker: Self-aware Preference for Classification Label Position in Visual Masked Self-supervised Pre-trained Model (ACM ICMR 2025)

Peihao Xiang, Kaida Wu, and Ou Bai
HCPS Laboratory, Department of Electrical and Computer Engineering, Florida International University


Official TensorFlow implementation and Label Ranker code for Label Ranker: Self-aware Preference for Classification Label Position in Visual Masked Self-supervised Pre-trained Model.

Note: The .ipynb notebook is only a minimal example. In addition, the VideoMAE encoder must be pre-trained with the self-supervised method; this repository does not provide the pre-trained weights.

Overview

This paper investigates how the randomly initialized, unique encoding of classification label positions affects a visual masked self-supervised pre-trained model when fine-tuning on downstream classification tasks. Our findings indicate that different random initializations lead to significant variations in fine-tuned results, even when using the same allocation strategy for classification datasets. The accuracy gap between these results suggests that the visual masked self-supervised pre-trained model has an inherent preference for classification label positions. To investigate this, we compare it with a non-self-supervised visual pre-trained model and hypothesize that the masked self-supervised model exhibits a self-aware bias toward certain label positions.

To mitigate the instability caused by random encoding, we propose a classification label position ranking algorithm, Label Ranker. It reduces feature maps to 1-D with Linear Discriminant Analysis and position-rank encodes them by unsupervised feature clustering, using the similarity property of Euclidean distance. This algorithm ensures that label position encodings align with the model's inherent preference. Extensive ablation experiments using ImageMAE and VideoMAE models on the CIFAR-100, UCF101, and HMDB51 classification datasets validate our approach. Results demonstrate that our method effectively stabilizes classification label position encoding, improving fine-tuned performance for visual masked self-supervised models.


Fig. 1 Illustration of the Problem Origin. The impact of randomly initializing the unique position encoding of the classification labels.
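A minimal sketch of the problem in Fig. 1, under an assumed setup: the mapping from class names to output-head positions is typically a random permutation fixed at fine-tuning time, so two runs can train against different label position encodings. The label names and function below are illustrative, not part of this repository's API.

```python
# Illustrative only: random label-position encoding varies across seeds,
# which is the source of the fine-tuning instability studied in the paper.
import random

labels = ["cat", "dog", "bird", "fish"]

def random_position_encoding(seed):
    """Assign each class name a position in the classification head."""
    order = labels[:]
    random.Random(seed).shuffle(order)
    return {name: pos for pos, name in enumerate(order)}

print(random_position_encoding(0))
print(random_position_encoding(1))  # generally a different assignment
```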

Implementation details


Fig. 2 Label Ranker Processing: Linear Discriminant Analysis is used to reduce the dimension of features into 1-D and map them to the linear projection panel.


Fig. 3 Classification Label Sequence. Calculate the centroid and Euclidean distance of various 1-D projected feature points for position ranking.
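The two steps in Figs. 2 and 3 can be sketched as follows. This is a hypothetical reconstruction from the figure captions, not the repository's code: features are projected to 1-D with Linear Discriminant Analysis, then labels are ranked by the Euclidean distance of each class centroid to the global centroid of the projected points. The function and variable names are assumptions for illustration.

```python
# Sketch of the Label Ranker procedure (assumed from Figs. 2-3):
# 1) LDA reduces per-sample features to a 1-D projection,
# 2) labels are position-ranked by the distance of each class
#    centroid to the global centroid on that projection.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def label_ranker(features: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Return position_of[label] = rank of that label in the head."""
    # Project feature maps onto a single discriminant axis (1-D).
    lda = LinearDiscriminantAnalysis(n_components=1)
    projected = lda.fit_transform(features, labels).ravel()

    classes = np.unique(labels)
    # Centroid of each class on the 1-D projection panel.
    class_centroids = np.array(
        [projected[labels == c].mean() for c in classes]
    )
    # Global centroid of all projected feature points.
    global_centroid = projected.mean()
    # Euclidean distance of each class centroid to the global centroid.
    distances = np.abs(class_centroids - global_centroid)
    order = np.argsort(distances)
    # Invert the ordering to get a position for every label.
    position_of = np.empty_like(order)
    position_of[order] = np.arange(len(classes))
    return position_of

# Toy usage: 3 well-separated classes with 2-D Gaussian features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.1, size=(20, 2)) for m in (0.0, 1.0, 3.0)])
y = np.repeat([0, 1, 2], 20)
ranking = label_ranker(X, y)
print(ranking)  # a permutation of [0, 1, 2]
```

The clustering step here is simplified to a single global centroid; the paper's unsupervised feature clustering may group projected points differently before ranking.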

Main Results

CIFAR-100

Result_on_CIFAR-100

UCF101

Result_on_UCF101

HMDB51

Result_on_HMDB51

Contact

If you have any questions, please feel free to reach out to me at pxian001@fiu.edu.

Acknowledgments

This project is built upon ImageMAE, kerasMAE, and VideoMAE. Thanks for their great codebases.

License

This project is under the Apache License 2.0. See LICENSE for details.

Citation

If you find this repository helpful, please consider citing our work:

@article{202503.0003,
	doi = {10.20944/preprints202503.0003.v1},
	url = {https://doi.org/10.20944/preprints202503.0003.v1},
	year = 2025,
	month = {March},
	publisher = {Preprints},
	author = {Peihao Xiang and Kaida Wu and Ou Bai},
	title = {Label Ranker: Self-Aware Preference for Classification Label Position in Visual Masked Self-Supervised Pre-Trained Model},
	journal = {Preprints}
}

@inproceedings{10.1145/3731715.3733369,
author = {Xiang, Peihao and Bai, Ou},
title = {Label Ranker: Self-aware Preference for Classification Label Position in Visual Masked Self-supervised Pre-trained Model},
year = {2025},
isbn = {9798400718779},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3731715.3733369},
doi = {10.1145/3731715.3733369},
pages = {1535–1541},
numpages = {7},
location = {Chicago, IL, USA},
series = {ICMR '25}
}

About

[ACM ICMR 2025 Oral] TensorFlow code implementation of "Label Ranker: Self-aware Preference for Classification Label Position in Visual Masked Self-supervised Pre-trained Model"
